Allow discrete sampling geometries with 1-d data to be written as ragged arrays, and improve the compression process #287

Closed
davidhassell opened this issue Feb 20, 2024 · 0 comments · Fixed by #288
Labels
compression enhancement New feature or request
davidhassell commented Feb 20, 2024

Currently, a 1-d discrete sampling geometry (DSG) cannot be compressed so that it is written out to a netCDF file as a ragged array. For example:

>>> print(dsg)
Field: mole_fraction_of_ozone_in_air (ncvar%O3_TECO)
----------------------------------------------------
Data            : mole_fraction_of_ozone_in_air(ncdim%obs(11160)) ppb
Auxiliary coords: time(ncdim%obs(11160)) = [2017-07-03 11:15:07, ..., 2017-07-03 14:21:06] standard
                : altitude(ncdim%obs(11160)) = [2577.927001953125, ..., 151.16905212402344] m
                : air_pressure(ncdim%obs(11160)) = [751.6758422851562, ..., 1006.53076171875] hPa
                : latitude(ncdim%obs(11160)) = [52.56147766113281, ..., 52.0729866027832] degree_north
                : longitude(ncdim%obs(11160)) = [0.3171832859516144, ..., -0.6249311566352844] degree_east
                : cf_role=trajectory_id(cf_role=trajectory_id(1)) = [STANCO]

This can be solved by making it possible to insert the cf_role=trajectory_id dimension into the data and the appropriate metadata constructs, so that the field would look like this (note that the cf_role=trajectory_id construct remains unchanged):

Field: mole_fraction_of_ozone_in_air (ncvar%O3_TECO)
----------------------------------------------------
Data            : mole_fraction_of_ozone_in_air(cf_role=trajectory_id(1), ncdim%obs(11160)) ppb
Auxiliary coords: time(cf_role=trajectory_id(1), ncdim%obs(11160)) = [[2017-07-03 11:15:07, ..., 2017-07-03 14:21:06]] standard
                : altitude(cf_role=trajectory_id(1), ncdim%obs(11160)) = [[2577.927001953125, ..., 151.16905212402344]] m
                : air_pressure(cf_role=trajectory_id(1), ncdim%obs(11160)) = [[751.6758422851562, ..., 1006.53076171875]] hPa
                : latitude(cf_role=trajectory_id(1), ncdim%obs(11160)) = [[52.56147766113281, ..., 52.0729866027832]] degree_north
                : longitude(cf_role=trajectory_id(1), ncdim%obs(11160)) = [[0.3171832859516144, ..., -0.6249311566352844]] degree_east
                : cf_role=trajectory_id(cf_role=trajectory_id(1)) = [STANCO]

This can be done by adding a constructs keyword to cf.Field.insert_dimension that works in the same way as the keyword of the same name on cf.Field.transpose.

Edit: To be clear, this is about allowing a manipulation that turns a 1-d DSG into a 2-d one!
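
To make the intended manipulation concrete, here is a minimal conceptual sketch. It does not use cf at all: plain dicts and lists stand in for the field and its constructs, the function name insert_leading_dimension is hypothetical, and the numbers are toy values. It only shows the idea that the data and every coordinate spanning the obs dimension gain a leading size-1 trajectory dimension, while the trajectory_id construct is left alone.

```python
# Conceptual sketch only: plain dicts and lists stand in for cf constructs.
# Nothing here is part of the cf API.

def insert_leading_dimension(field, axis="obs", new_axis="trajectory"):
    """Return a copy of `field` in which the data and every coordinate
    that spans `axis` gain a leading size-1 `new_axis` dimension."""
    new = {"data": [list(field["data"])], "coords": {}}
    for name, (axes, values) in field["coords"].items():
        if axis in axes:
            # Coordinate spans the observation axis: make it 2-d too
            new["coords"][name] = ((new_axis,) + tuple(axes), [list(values)])
        else:
            # e.g. the trajectory_id construct: left unchanged
            new["coords"][name] = (tuple(axes), list(values))
    return new


# Toy 1-d DSG with three observations (illustrative values only)
dsg = {
    "data": [41.2, 40.8, 39.9],
    "coords": {
        "altitude": (("obs",), [2577.9, 1200.0, 151.2]),
        "trajectory_id": (("trajectory",), ["STANCO"]),
    },
}

dsg_2d = insert_leading_dimension(dsg)
print(dsg_2d["data"])                # [[41.2, 40.8, 39.9]]
print(dsg_2d["coords"]["altitude"])  # (('trajectory', 'obs'), [[2577.9, 1200.0, 151.2]])
```

With the proposed keyword, the equivalent step in cf would presumably be a single insert_dimension call with constructs=True, but the exact signature is for the PR to settle.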

Whilst we're at it, the compression process in cf.Field.compress could be improved to avoid the following situation: if the data contains trailing missing values at positions where the coordinates have non-missing values, then those coordinate values are currently lost.
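
One way to avoid this loss, sketched below under stated assumptions, is to take each row's element count from the longest valid run found in either the data or any of its coordinates, rather than from the data alone. This is not the cf.Field.compress implementation: the names valid_length and compress_ragged are hypothetical, None stands in for a masked value, and nested lists stand in for 2-d (trajectory, obs) arrays.

```python
# Conceptual sketch only, not the cf.Field.compress implementation.
MISSING = None  # stand-in for a masked (missing) value


def valid_length(row):
    """Number of elements up to and including the last non-missing one."""
    for i in range(len(row) - 1, -1, -1):
        if row[i] is not MISSING:
            return i + 1
    return 0


def compress_ragged(data, coords):
    """Compress 2-d rows into contiguous ragged form.

    The count for each row is the largest valid length found in the data
    or in any coordinate, so coordinate values survive even where the
    data has trailing missing values.
    """
    counts = [
        max([valid_length(data[i])]
            + [valid_length(c[i]) for c in coords.values()])
        for i in range(len(data))
    ]

    def flatten(rows):
        # Keep the first `n` elements of each row and concatenate them
        return [v for row, n in zip(rows, counts) for v in row[:n]]

    return counts, flatten(data), {k: flatten(v) for k, v in coords.items()}


# Toy input: the data rows end earlier than the time coordinate rows
data = [[1.0, 2.0, None, None], [3.0, None, None, None]]
coords = {"time": [[0, 10, 20, None], [0, 10, None, None]]}

counts, flat_data, flat_coords = compress_ragged(data, coords)
print(counts)               # [3, 2]
print(flat_data)            # [1.0, 2.0, None, 3.0, None]
print(flat_coords["time"])  # [0, 10, 20, 0, 10]
```

Had the counts come from the data alone they would be [2, 1], and the time values 20 (first row) and 10 (second row) would be dropped.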

PR to follow.
