regeneration of netcdf #7

Closed
ifenty opened this issue Jun 25, 2019 · 4 comments

@ifenty
Contributor

ifenty commented Jun 25, 2019

Feedback from PO.DAAC

https://podaac.jpl.nasa.gov/PO.DAAC_DataManagementPractices

  1. Add time_coverage_resolution to native grid fields, P1M (monthly) or P1D (daily).

  2. Fix time_coverage_resolution on the interpolated files (remove extra "" fields)

  3. Remove all of our custom standard names : http://cfconventions.org/Data/cf-standard-names/67/build/cf-standard-name-table.html

  4. Add comments defining what all the variables are. Some are obvious, but not all of them, especially the ancillary file. Add this information to our "variable" JSON files.

  5. Remove non-essential grid information from the native grid files.

  6. Add metadata comment to every netcdf file telling the users that the grid information is in the netcdf grid file.

  7. UUID to each file

  8. Valid range is an ARRAY

  9. Time needs standard NAME

  10. What is SWAP dimension?

  11. Use valid range (min, max)

  12. SSH use Sea surface height above geoid

  13. Use comments instead of 'LONG NAME'

  14. Find out what 'grid mapping' means

  15. Use the same time format in time coverage start, time coverage end, and other places. ** Decide whether to include the 'T' separator or not.

  16. geospatial vertical units 'm'

  17. Change canonical units from (kg/m^3)/m to the format kg m-3 m-1, and use degree_C

  18. use cell bounds
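
Items 1, 7, 8, and 11 all come down to writing a few attributes consistently. A minimal sketch using only the Python standard library follows. The helper names `make_global_attrs` and `make_valid_range` are hypothetical, the attribute key `uuid` is an assumption about what PO.DAAC expects, and the temperature bounds are illustrative rather than actual ECCO values.

```python
import uuid

# ISO 8601 durations requested in item 1
RESOLUTIONS = {"monthly": "P1M", "daily": "P1D"}


def make_global_attrs(averaging_period: str) -> dict:
    """Return per-file global attributes (hypothetical helper)."""
    return {
        "time_coverage_resolution": RESOLUTIONS[averaging_period],
        # Item 7: a fresh random UUID for every file
        "uuid": str(uuid.uuid4()),
    }


def make_valid_range(vmin: float, vmax: float) -> list:
    """Items 8 and 11: valid_range as a two-element [min, max] array."""
    if vmin > vmax:
        raise ValueError("valid_range must be ordered as (min, max)")
    return [vmin, vmax]


attrs = make_global_attrs("monthly")
theta_range = make_valid_range(-2.5, 40.0)  # illustrative bounds only
```

The resulting dictionaries could then be merged into whatever attribute structure the file writer uses.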

@ifenty ifenty pinned this issue Jun 25, 2019
@ifenty
Contributor Author

ifenty commented Nov 27, 2019

@owang01
Fields that need to be added:

volume budget

  • sIceLoad snapshots (to calculate $\partial (\text{ETAN} + \text{sea ice} + \text{snow volume}) / \partial t$)

momentum budget (pending confirmation from Mazloff)

  • UVEL, VVEL, WVEL snapshots (to calculate $\partial momentum / \partial t$)
  • UVELMASS, VVELMASS snapshots [would these be useful too, since they are scaled by s*?]
  • Um_Diss, Vm_Diss, (momentum tendency from Dissipation)
  • Um_Advec, Vm_Advec, (momentum tendency from Advection)
  • Um_Cori , Vm_Cori, (momentum tendency from Coriolis)
  • Um_Ext, Vm_Ext, (external forcing)
  • TOTUTEND, TOTVTEND (these are actual tendencies)
  • VISrI_Um, VISrI_Vm (vertical viscous flux of momentum)
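
The reason snapshots are needed is that the time tendency over an averaging period is the difference between the snapshots that bracket it, divided by the elapsed time. A minimal sketch of that finite difference, with a hypothetical helper name and made-up velocity values (plain Python lists stand in for the real arrays):

```python
SECONDS_PER_DAY = 86400.0


def tendency_from_snapshots(field_start, field_end, dt_seconds):
    """Finite-difference tendency between two bracketing snapshots."""
    if dt_seconds <= 0:
        raise ValueError("dt_seconds must be positive")
    return [(b - a) / dt_seconds for a, b in zip(field_start, field_end)]


# Made-up UVEL snapshots (m s-1) at the start and end of a 30-day period
uvel_start = [0.10, -0.05, 0.00]
uvel_end = [0.12, -0.05, -0.03]
dudt = tendency_from_snapshots(uvel_start, uvel_end, 30 * SECONDS_PER_DAY)  # m s-2
```

In a closed budget this tendency would be compared against the sum of the time-averaged tendency terms listed above.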

@owang01
Contributor

owang01 commented Nov 27, 2019

@ifenty

The sIceLoad snapshots can be calculated simply by summing the SIheff and SIsnow snapshots, each weighted by its own density. Both SIheff and SIsnow snapshots are available at https://ecco.jpl.nasa.gov/drive/files/Version4/Release4/nctiles_monthly_snapshots for the monthly interval and at https://data.nas.nasa.gov/ecco/data.php?dir=/eccodata/llc_90/ECCOv4/Release4/nctiles_daily_snapshots for the daily interval.
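
A minimal sketch of that density-weighted sum. The density values are assumptions (910 and 330 kg m-3 are the usual MITgcm sea-ice package defaults) and should be checked against the values used in the actual run; the helper name and the thickness values are made up.

```python
RHO_ICE = 910.0   # kg m-3, assumed sea-ice density
RHO_SNOW = 330.0  # kg m-3, assumed snow density


def sea_ice_load(si_heff, si_snow, rho_ice=RHO_ICE, rho_snow=RHO_SNOW):
    """Ice-plus-snow mass per unit area (kg m-2) from thickness snapshots (m)."""
    return [rho_ice * h + rho_snow * s for h, s in zip(si_heff, si_snow)]


# Made-up effective ice and snow thicknesses (m) at three grid cells
load = sea_ice_load([0.0, 1.0, 2.0], [0.0, 0.1, 0.5])
```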

@ifenty
Contributor Author

ifenty commented Nov 28, 2019

We also need the LAST snapshots of all the snapshot fields (20XX-12-31)

@ifenty
Contributor Author

ifenty commented Jan 24, 2020

2019-01-24

issue 1:
latitude and longitude and depth need to have the 'axis' string attributes
axis = 'X'
axis = 'Y'
axis = 'Z'

issue 2
can we suppress the i, j, k variables?

issue 3
can the order of 'coordinates' be reconciled with the order of the actual array

issue 4
timestep should have better description (number of hours)

issue 5
time_bnds needs to say days since 1992-01-01
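
A minimal sketch of converting calendar dates into that unit with the standard library. The helper name is hypothetical, and the sketch assumes the reference time is midnight on 1992-01-01:

```python
from datetime import datetime

EPOCH = datetime(1992, 1, 1)
TIME_UNITS = "days since 1992-01-01"  # the units string requested in issue 5


def days_since_epoch(t: datetime) -> float:
    """Convert a datetime into fractional days since the reference date."""
    return (t - EPOCH).total_seconds() / 86400.0


# Bounds of the first monthly averaging period
time_bnds = [
    days_since_epoch(datetime(1992, 1, 1)),
    days_since_epoch(datetime(1992, 2, 1)),
]
```

The same `TIME_UNITS` string would then go on both `time` and `time_bnds`.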

issue 6
why is there a global attribute of "coordinates" that is only "time_bnds"

** issue 7 **
make 'date created' the same format (ISO) as the other times

** issue 7b**

Noticed that the ISO format for the times is not exactly identical everywhere.
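
One way to keep issues 7 and 7b from recurring is to route every timestamp through a single formatter. A minimal sketch, assuming the agreed convention is UTC with the 'T' separator and a trailing 'Z' (that choice is an assumption, not something decided in this thread); the helper name is hypothetical:

```python
from datetime import datetime, timezone

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # assumed convention


def iso_utc(t: datetime) -> str:
    """Format a datetime as ISO 8601 UTC; naive datetimes are taken as UTC."""
    if t.tzinfo is not None:
        t = t.astimezone(timezone.utc)
    return t.strftime(ISO_FORMAT)


time_coverage_start = iso_utc(datetime(1992, 1, 1))
date_created = iso_utc(datetime.now(timezone.utc))
```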

** issue 8 **
need the entire hierarchy of GCMD keywords, with a '>' between the hierarchy elements and a ',' between the different GCMD keywords.
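
A minimal sketch of building that string. The helper name is hypothetical, and the two keyword paths are illustrative and should be checked against the current GCMD keyword list:

```python
def format_gcmd_keywords(keyword_paths):
    """Join each hierarchy with ' > ' and separate the keywords with ', '."""
    return ", ".join(
        " > ".join(level.strip() for level in path) for path in keyword_paths
    )


# Illustrative keyword hierarchies, to be verified against GCMD
keyword_paths = [
    ["EARTH SCIENCE", "OCEANS", "OCEAN TEMPERATURE", "POTENTIAL TEMPERATURE"],
    ["EARTH SCIENCE", "OCEANS", "SALINITY/DENSITY", "SALINITY"],
]
keywords_attr = format_gcmd_keywords(keyword_paths)
```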

** issue 9 **
add 'comment' field to global attributes that includes

** issue 10 **
in the 'title field' include the "dataset name"
e.g.,
ECCO v4 Release 3 Potential Temperature and Salinity

issue 11
consider using the global attribute 'history' field : could include data
consider using the global attribute 'source' field : "source should name the model and version"

issue 12
data best practices

https://podaac.jpl.nasa.gov/PO.DAAC_DataManagementPractices

"references" | string | Published or web-based references that describe the data or methods used to produce it. Recommend URIs (such as a URL or DOI) for papers or other references. This attribute is defined in the CF conventions.

issue 13
use the "coverage_content_type" attribute on all variables and coordinates

        # ISO 19115-1 codes
        valid_ctypes = {
            'image',
            'thematicClassification',
            'physicalMeasurement',
            'auxiliaryInformation',
            'qualityInformation',
            'referenceInformation',
            'modelResult',
            'coordinate',
        }

for spatial measures we could use:
auxiliaryInformation
referenceInformation

for the majority of fields it'll be 'modelResult' or 'coordinate'
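
A minimal sketch of a check that flags variables whose `coverage_content_type` is missing or not one of the codes listed above. The helper name and the example variable-to-type assignments are hypothetical:

```python
VALID_CTYPES = {
    "image",
    "thematicClassification",
    "physicalMeasurement",
    "auxiliaryInformation",
    "qualityInformation",
    "referenceInformation",
    "modelResult",
    "coordinate",
}


def check_coverage_content_types(variable_attrs):
    """Return sorted names of variables with a missing or invalid type."""
    return sorted(
        name
        for name, attrs in variable_attrs.items()
        if attrs.get("coverage_content_type") not in VALID_CTYPES
    )


# Hypothetical per-variable attribute dictionaries
variable_attrs = {
    "THETA": {"coverage_content_type": "modelResult"},
    "time": {"coverage_content_type": "coordinate"},
    "rA": {"coverage_content_type": "referenceInformation"},
    "hFacC": {},  # left without the attribute on purpose
}
problems = check_coverage_content_types(variable_attrs)
```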

issue 14
NASA best practices specify that for satellite datasets there should not be a _FillValue for the time variable

issue 15
units should have this format m2 s-2
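
A minimal sketch of a converter from slash-and-caret unit strings to that format. The function name is hypothetical, and the parser is deliberately narrow: it handles only simple products and quotients, with no parentheses and no numeric numerators such as `1/s`.

```python
import re

# A unit name followed by an optional integer exponent, with or without '^'
_TOKEN = re.compile(r"([A-Za-z_]+)(?:\^?(-?\d+))?")


def to_cf_units(units: str) -> str:
    """Convert e.g. 'm^2/s^2' into 'm2 s-2'."""
    parts = []
    for i, chunk in enumerate(units.split("/")):
        sign = 1 if i == 0 else -1  # everything after a '/' is a divisor
        for token in chunk.replace("*", " ").split():
            match = _TOKEN.fullmatch(token)
            if match is None:
                raise ValueError(f"cannot parse unit token: {token!r}")
            exponent = sign * int(match.group(2) or 1)
            name = match.group(1)
            parts.append(name if exponent == 1 else f"{name}{exponent}")
    return " ".join(parts)


velocity_squared = to_cf_units("m^2/s^2")
density_gradient = to_cf_units("kg/m^3/m")
```

Strings that are already in the target format pass through unchanged, so the converter can be applied to every units attribute without checking first.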

issue 16
no need for nx, ny, or nz

issue 17
"processing_level" must have an underscore. ncdump shows a space, and panoply puts an underscore.

issue 18
compare file sizes using 0 as fill values vs. max float.

issue 19
compression (level 1 vs level 9)
https://www.unidata.ucar.edu/software/netcdf/workshops/2011/utilities/Nccopy.html

https://www.unidata.ucar.edu/blogs/developer/entry/netcdf_compression

issue 20
netcdf file naming conventions

issue 21
dataset naming convention

issue 22
compliance checker

https://podaac-tools.jpl.nasa.gov/mcc/

issue 23
can xarray use compression when generating netcdf?

http://xarray.pydata.org/en/stable/generated/xarray.Dataset.to_netcdf.html
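
It can: `Dataset.to_netcdf` accepts an `encoding` dictionary with per-variable `zlib`, `complevel`, and `_FillValue` entries when a netCDF4 backend is used. A minimal sketch that builds such a dictionary with the standard library only, so that levels 1 and 9 and different fill values (issues 18 and 19) can be compared; the helper name and the variable names are hypothetical, and the `to_netcdf` call is shown as a comment because it needs xarray and a dataset:

```python
def make_encoding(var_names, complevel=5, fill_value=None):
    """Build a per-variable encoding dict for xarray's to_netcdf."""
    if not 1 <= complevel <= 9:
        raise ValueError("complevel must be between 1 and 9")
    encoding = {}
    for name in var_names:
        enc = {"zlib": True, "complevel": complevel}
        if fill_value is not None:
            enc["_FillValue"] = fill_value
        encoding[name] = enc
    return encoding


encoding = make_encoding(["THETA", "SALT"], complevel=1)
# With xarray installed and a Dataset `ds` in hand:
# ds.to_netcdf("out.nc", encoding=encoding)
```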

** issue 24**
grid cell area

http://mathforum.org/library/drmath/view/63767.html
It is a consequence of a theorem of Archimedes (c. 287-212 BCE) that for a spherical model of the earth, the area of a cell spanning longitudes l0 to l1 (l1 > l0) and latitudes f0 to f1 (f1 > f0) equals

(sin(f1) - sin(f0)) * (l1 - l0) * R^2

where
    l0 and l1 are expressed in radians (not degrees or whatever).
    l1 - l0 is calculated modulo 2*pi (e.g., -179 - 179 = 2 degrees, not -358 degrees).
    R is the authalic Earth radius, almost exactly 6371 km.
The area of a lat-long rectangle is proportional to the difference in 
the longitudes. The area of the full band between latitudes lat1 and 
lat2, 2*pi*R^2 |sin(lat1)-sin(lat2)|, is the area between longitude 
lines differing by 360 degrees. Therefore the area we seek is

  A = 2*pi*R^2 |sin(lat1)-sin(lat2)| |lon1-lon2|/360
    = (pi/180)R^2 |sin(lat1)-sin(lat2)| |lon1-lon2|
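
A minimal sketch of that formula in Python, taking degrees as input and using the 6371 km radius quoted above. The function name is hypothetical, and the modulo step means a cell spanning a full 360 degrees of longitude would evaluate to zero width, so the sketch is only meant for individual grid cells.

```python
import math

R_AUTHALIC_M = 6371.0e3  # authalic Earth radius in metres, as quoted above


def cell_area(lat0_deg, lat1_deg, lon0_deg, lon1_deg, radius=R_AUTHALIC_M):
    """Area (m2) of a lat-lon cell on a sphere."""
    # Longitude difference taken modulo 360 so cells crossing the dateline work
    dlon = math.radians((lon1_deg - lon0_deg) % 360.0)
    dsin = abs(math.sin(math.radians(lat1_deg)) - math.sin(math.radians(lat0_deg)))
    return dsin * dlon * radius ** 2


one_degree_cell = cell_area(0.0, 1.0, 0.0, 1.0)       # cell touching the equator
dateline_cell = cell_area(0.0, 30.0, 179.0, -179.0)   # 2 degrees wide across the dateline
```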
