-
Notifications
You must be signed in to change notification settings - Fork 163
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add observation preparation job for aerosols DA to workflow #2624
Add observation preparation job for aerosols DA to workflow #2624
Conversation
…AA-EMC#2614) - Updating STMP and PTMP settings in host file for Orion and Hercules because they are cross mounted. - Also took the opportunity to finally update **SLURM_ACCOUNT** to **HPC_ACCOUT** in CI over rides. - Added a refactor of the `rocotostat.py` tool that is more pythonic and as a execute retry feature because the `rocotostat` utility on Orion has been failing sometimes.
* origin: Change GRIB2 parameter names and vertical levels for ocean/ice post (NOAA-EMC#2611) Add atmensanlfv3inc job (NOAA-EMC#2592) Global-workflow (AR) Generic updates for Gaea C5 (NOAA-EMC#2515) Update STMP and PTMP settings in host file for Orion and Hercules (NOAA-EMC#2614)
@CoryMartin-NOAA @andytangborn This PR is ready for review. Any comments and suggestions are welcome. |
jobs/JGLOBAL_PREP_AERO_OBS
Outdated
############################################## | ||
|
||
# Generate COM variables from templates | ||
YMD=${PDY} HH=${cyc} declare_from_tmpl -rx COM_OBS COM_CHEM_ANALYSIS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
COM_OBS
should be considered read-only. In development we copy obs from the dump archive to a location within our directories, but in operations we use them directly from obsproc. We can make new prep COM variables in config.com for any components that need them.
@CoryMartin-NOAA @guillaumevernieres I think there are other jobs recently that have the same issue that we just let through that will need to be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to have a bigger discussion on this. There are obs for the non-atmospheric components that are not being processed through the traditional obsproc channels (in particular the aerosol and SOCA datasets). I'll send an email to follow up and schedule a meeting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My solution would be to create a COM_<component>_PREP_TMPL
(or similar; waves already has a prep) to store pre-processed stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments
parm/config/gfs/config.prepaeroobs
Outdated
|
||
export OBSPROCYAML="${PARMgfs}/gdas/aero/obs/lists/gdas_aero_obsproc.yaml.j2" | ||
export OBSPROCEXE="${EXECgfs}/gdas_obsprovider2ioda.x" | ||
export VIIRS_DATA_DIR="/scratch2/NCEPDEV/stmp3/Yaping.Wang/VIIRS/AWS/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we'll have to change this to a more "official" location for at least the CI test period before the PR can be successfully merged in.
List needed raw obs files. | ||
Copy the raw obs files to $DATA/obs. | ||
Link over the needed executable. | ||
Generate corrosponding YMAL file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generate corrosponding YMAL file. | |
Generate corresponding YMAL file. |
""" | ||
self.task_config.COM_OBS_CHEM = os.path.join(self.task_config.COM_OBS, 'chem') | ||
if os.path.exists(self.task_config.COM_OBS_CHEM): | ||
rmdir(self.task_config.COM_OBS_CHEM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we positive we want to remove the directory if it exists?
HH = fshort[0][26:28] | ||
MM = fshort[0][28:30] | ||
fstart = to_datetime(yyyy + '-' + mm + '-' + dd + 'T' + HH + ':' + MM + 'Z') | ||
if sensor in file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might need to be more robust, for example, what if the directory path is:
/work2/noaa/cmartin/exp/use-n20/data/n21.nc
, it will be true for both n20 and n21.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still think this comment stands, but perhaps I'm worrying too much.
Can you do if sensor in os.path.basename(file)
or something like that?
dd = fshort[0][24:26] | ||
HH = fshort[0][26:28] | ||
MM = fshort[0][28:30] | ||
fstart = to_datetime(yyyy + '-' + mm + '-' + dd + 'T' + HH + ':' + MM + 'Z') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fstart = to_datetime(yyyy + '-' + mm + '-' + dd + 'T' + HH + ':' + MM + 'Z') | |
fstart = to_datetime(f'{yyyy}-{mm}-{dd}T{HH}:{MM}Z') |
@ypwang19 can you merge in develop and resolve the conflicts? I'll review what is here in the meantime |
export npe_prepobsaero=1 | ||
export nth_prepobsaero=1 | ||
export npe_node_prepobsaero=1 | ||
export memory_prepobsaero="96GB" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this prep job really need this much memory?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not, but it is somewhat intensive, but this is probably an order of magnitude too large, I can test next week with less memory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be addressed later.
FYI @ypwang19 is now on maternity leave, so I'll have to pick up this PR to get it finished, so there may be some delay in responses to reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good overall, just one fix for WCOSS2 and a few suggestions.
Co-authored-by: David Huber <69919478+DavidHuber-NOAA@users.noreply.github.com>
Co-authored-by: David Huber <69919478+DavidHuber-NOAA@users.noreply.github.com>
Co-authored-by: David Huber <69919478+DavidHuber-NOAA@users.noreply.github.com>
Co-authored-by: David Huber <69919478+DavidHuber-NOAA@users.noreply.github.com>
@aerorahul @DavidHuber-NOAA @WalterKolczynski-NOAA Yaping is now on maternity leave. What do you suggest in terms of getting this PR over the finish line? My suggestions in no particular order are:
|
parm/config/gfs/config.base
Outdated
@@ -178,6 +178,7 @@ export DO_WAVE="NO" | |||
export DO_OCN="NO" | |||
export DO_ICE="NO" | |||
export DO_AERO="NO" | |||
export DO_PREP_OBS_AERO="YES" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@CoryMartin-NOAA How about setting this to NO
for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I think that is all that is needed to ensure it doesn't break anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple of remaining issues, but those can be addressed later IMO. Once an issue is created, I'm good with merging this.
# Set variables used in the script | ||
############################################## | ||
|
||
export COMIN_OBS="${DATA}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will likely need to be updated, but that can wait since the new job is now off by default. Can you create an issue for this @CoryMartin-NOAA?
|
||
export OBSPROCYAML="${PARMgfs}/gdas/aero/obs/lists/gdas_aero_obsproc.yaml.j2" | ||
export OBSPROCEXE="${EXECgfs}/gdas_obsprovider2ioda.x" | ||
export VIIRS_DATA_DIR="/scratch2/NCEPDEV/stmp3/Yaping.Wang/VIIRS/AWS/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this could be wrapped into the new issue.
export npe_prepobsaero=1 | ||
export nth_prepobsaero=1 | ||
export npe_node_prepobsaero=1 | ||
export memory_prepobsaero="96GB" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be addressed later.
try: | ||
for file in allfiles: | ||
basename = os.path.basename(file) | ||
pattern = r"s(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{3})" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pattern
could be defined outside the for loop.
logger.info("Found %d matching files.", len(matching_files)) | ||
except FileNotFoundError: | ||
logger.error("The specified file/directory does not exist.") | ||
raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should raise FileNotFoundError(message)
* origin/develop: Add observation preparation job for aerosols DA to workflow (NOAA-EMC#2624) Remove ocean daily files (NOAA-EMC#2689) Update Jenkinsfile Add Hercules-EMC to the Jenkins configurable parameter list (NOAA-EMC#2685) Update gdas.cd and gsi_utils hashes (NOAA-EMC#2641) Add ability to use GEFS replay ICs (NOAA-EMC#2559) Replace `sleep` with `wait_for_file` (NOAA-EMC#2586) Add COM template for JEDI obs (NOAA-EMC#2678) Link both global-nest fix files and non-nest ones at the same time (NOAA-EMC#2632) Update ufs-weather-model (NOAA-EMC#2663) Add ability to process ocean/ice products specific to GEFS (NOAA-EMC#2561) Update cleanup job to use COMIN/COMOUT (NOAA-EMC#2649) Add overwrite to creat experiment in BASH CI (NOAA-EMC#2676) Add handling to select CRTM cloud optical table based on cloud scheme and update calcanal_gfs.py (NOAA-EMC#2645) Refs NOAA-EMC#2475
Add a prepaeroobs job to prepare aerosol obs files for DA. This job does quality control of the VIIRS aerosol raw observations and convert them to ioda format. Resolves #2623 --------- Co-authored-by: ypwang19 <yaping.wang@users.noreply.github.com> Co-authored-by: TerrenceMcGuinness-NOAA <terrence.mcguinness@cox.net> Co-authored-by: Cory Martin <cory.r.martin@noaa.gov> Co-authored-by: David Huber <69919478+DavidHuber-NOAA@users.noreply.github.com>
Description
This PR added a prepaeroobs job to prepare aerosol obs files for DA. This job does quality control of the VIIRS aerosol raw observations and convert them to ioda format.
Resolves #2623
Type of change
Change characteristics
How has this been tested?
Checklist