[FEATURE REQUEST] Store CEDS emission data in one file per year? #12

Closed
JiaweiZhuang opened this issue Aug 20, 2018 · 7 comments
Labels
category: Feature Request (New feature or request); topic: Input Data (Related to input data)

Comments

@JiaweiZhuang
Contributor

The newly-added CEDS data are 140 GB in total. This increases the size of the default HEMCO data directory from ~70 GB to ~210 GB.

The problem is that 65 years (1950-2014) of data are stored in a single file, but most users probably need only recent years. Would it be more reasonable to use one file per year? Other large datasets such as QFED are also stored on a per-year basis. This would save download time and reduce the size of the tutorial AMI on AWS from 200+ GB to ~80 GB.

Breaking the CEDS data into 65 yearly files can be done in a few lines of Python:

import xarray as xr

# MAINDIR: directory holding the original CEDS files (with a trailing slash)
ds = xr.open_mfdataset(MAINDIR+'*-em-anthro_CMIP_CEDS_195001-201412.nc')
for year in range(1950, 2015):
    ds.sel(time=str(year)).to_netcdf('~/output/CEDS_{}.nc'.format(year))

See this notebook for a walkthrough.
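A variant of the snippet above, as a minimal sketch rather than the method actually used for the published files: it processes one source file at a time and keeps the original file name, swapping only the date range for a single year. The helper names (`yearly_name`, `split_by_year`), the directory arguments, and the `{year}01-{year}12` naming convention are assumptions made for illustration.

```python
import glob
import os


def yearly_name(src_name, year):
    """Swap the 195001-201412 range in a CEDS file name for a single year."""
    return src_name.replace('195001-201412', '{0}01-{0}12'.format(year))


def split_by_year(src_dir, out_dir, years=range(1950, 2015)):
    """Write one file per source file per year (needs xarray and a netCDF backend)."""
    import xarray as xr  # imported lazily so yearly_name works without xarray

    os.makedirs(out_dir, exist_ok=True)
    pattern = os.path.join(src_dir, '*-em-anthro_CMIP_CEDS_195001-201412.nc')
    for src in sorted(glob.glob(pattern)):
        with xr.open_dataset(src) as ds:
            for year in years:
                out = os.path.join(out_dir, yearly_name(os.path.basename(src), year))
                # Partial string indexing selects every time step in that year
                ds.sel(time=str(year)).to_netcdf(out)
```

Opening each file separately avoids merging all source files into one dataset, so a user who needs only some of them can download those alone.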

@JiaweiZhuang
Contributor Author

JiaweiZhuang commented Aug 20, 2018

PS: Found 2 typos in the HEMCO data wiki:

  • Size of C2H6_2010/v2017-05 should be 540K, not 140 GB
  • $ROOT/NEI201ek/2018-04 -> $ROOT/NEI2011ek/2018-04

Also, the newly added CHEM_INPUTS/MODIS_LAI_201707/ has not been uploaded to the FTP server.

@msulprizio
Contributor

Hi Jiawei. We can certainly split the CEDS files into yearly files to simplify data downloads. I will look into adding this change to 12.1.0. We are including other CEDS fixes in that version.

@JiaweiZhuang
Contributor Author

Thanks!

@msulprizio
Contributor

Also, the typos have been fixed on the HEMCO data wiki page.

I can access the MODIS LAI data at
ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/CHEM_INPUTS/MODIS_LAI_201707/For_Olson_2001/. Can you not access that data?

@JiaweiZhuang
Contributor Author

Just found it. It turns out that I was searching inside ftp://ftp.as.harvard.edu/gcgrid/data/ExtData/GEOS_NATIVE/, which contains MODIS_LAI_200911 and MODIS_LAI_201204 but no MODIS_LAI_201707.

@msulprizio
Contributor

OK great!

@msulprizio
Contributor

A fix for this issue has now been pushed to the dev/12.1.0 branch in geos-chem-unittest. Yearly files can now be found in HEMCO/CEDS/v2018-08. In addition, the files in HEMCO/CEDS/v2018-04 have been compressed using nccopy -d1, bringing the directory size from 140 GB to 26 GB.
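For readers who want to apply the same `nccopy -d1` compression to their own copies, here is a minimal sketch that assumes the `nccopy` utility from the netCDF tools is on the `PATH`. The function names and the source/output directory layout are hypothetical, not taken from the GEOS-Chem scripts.

```python
import glob
import os
import subprocess


def nccopy_command(src, dst, level=1):
    """Build the nccopy call that rewrites src to dst with deflate compression."""
    if not 0 <= level <= 9:
        raise ValueError('deflate level must be between 0 and 9')
    return ['nccopy', '-d{}'.format(level), src, dst]


def compress_dir(src_dir, out_dir, level=1):
    """Compress every .nc file in src_dir into out_dir (hypothetical layout)."""
    os.makedirs(out_dir, exist_ok=True)
    for src in sorted(glob.glob(os.path.join(src_dir, '*.nc'))):
        dst = os.path.join(out_dir, os.path.basename(src))
        # check=True raises CalledProcessError if nccopy reports a failure
        subprocess.run(nccopy_command(src, dst, level), check=True)
```

Writing to a separate output directory leaves the originals untouched, so the compressed copies can be checked before anything is replaced.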

jimmielin pushed a commit to jimmielin/geos-chem that referenced this issue Dec 1, 2018
We have added comments to note that the sum of tagged O3 species geoschem#2 - geoschem#12
should match the total O3 species geoschem#1.

We also added a blurb reminding users to start with a zero-concentration
initial conditions file, and then to spin up for as many years as it
takes to get into steady-state.

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
@msulprizio msulprizio changed the title Store CEDS emission data in one file per year? [FEATURE REQUEST] Store CEDS emission data in one file per year? Sep 4, 2019
@msulprizio msulprizio added the category: Feature Request New feature or request label Sep 4, 2019
@msulprizio msulprizio added the topic: Input Data Related to input data label Dec 10, 2019
yantosca added a commit that referenced this issue May 12, 2022
- Rename files ending in .ccycle to end in .carboncycle

- Replace GeosCore/ccyclechem_mod.F90 w/ GeosCore/carboncycle_mod.F90,
  and KPP/ccycle/ccycle_Funcs.F90 w/ KPP/carboncycle/carboncycle_Funcs.F90.
  Also update every place where these modules are referenced
  in USE statements and CMakeLists throughout the code.

- Added a stub module KPP/stubs/stub_carboncycle_Funcs.F90 to allow
  compilation to continue for fullchem & Hg mechanisms.
- Added template run-directory files for carboncycle mechanism

- run/GCClassic/createRundir.sh now uses option #12 for creating
  carboncycle mechanism run directories

- Built the carboncycle mechanism with KPP 2.5.0

- Changed "ccycle" to "carboncycle" in KPP/OHreact_parser.py, so that
  it will set "OHreact = 0.0_dp" when writing the OH reactivity routine.

Signed-off-by: Bob Yantosca <yantosca@seas.harvard.edu>
lizziel pushed a commit that referenced this issue Apr 16, 2024
RC option: Run2 VMBarrier detects load imbalance