Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

keep attrs of get_bitinformation #158

Merged
merged 15 commits into from Nov 23, 2022
Merged

keep attrs of get_bitinformation #158

merged 15 commits into from Nov 23, 2022

Conversation

aaronspring
Copy link
Collaborator

@aaronspring aaronspring commented Oct 26, 2022

closes #154

  • changes units to bitinfo_units
  • changes long_name to bitinfo_long_name
  • keep input attrs

@aaronspring aaronspring self-assigned this Oct 26, 2022
@aaronspring aaronspring added the enhancement New feature or request label Oct 26, 2022
@aaronspring aaronspring marked this pull request as ready for review October 26, 2022 17:36
Copy link
Owner

@observingClouds observingClouds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain why you would like to have the original units and long_name from the dataset? Do we loose these attributes when writing to netCDF/zarr?
The unit of the data that info_per_bit = xb.get_bitinformation(ds) contains is 1 and not e.g. ms-1 if the input data was wind speed.

@aaronspring
Copy link
Collaborator Author

I want to keep long_name and all other attrs for context. Now we only have the variable name and no context from the input data. Original unit might not be very important

@aaronspring
Copy link
Collaborator Author

image

Now the bitinformation of z, in reference to what z actually is, is easier to understand IMO.

@observingClouds
Copy link
Owner

The attributes are preserved in ds_bitrounded:

In [11]: ds_bitrounded.u
Out[11]: 
<xarray.DataArray 'u' (month: 2, level: 3, latitude: 241, longitude: 480)>
array([[[[ 1.250000e+00,  1.250000e+00,  1.250000e+00, ...,
         [ 3.500000e+00,  3.500000e+00,  3.500000e+00, ...,
           3.750000e+00,  3.500000e+00,  3.500000e+00]]]], dtype=float32)
Coordinates:
  * longitude  (longitude) float32 -180.0 -179.2 -178.5 ... 177.8 178.5 179.2
  * latitude   (latitude) float32 90.0 89.25 88.5 87.75 ... -88.5 -89.25 -90.0
  * level      (level) int32 200 500 850
  * month      (month) int32 1 7
Attributes:
    number_of_significant_digits:                2
    units:                                       m s**-1
    long_name:                                   U component of wind
    standard_name:                               eastward_wind
    _QuantizeBitRoundNumberOfSignificantDigits:  3

@aaronspring
Copy link
Collaborator Author

I know. I just refered to info_per_bit the output from get_bitinformation

@aaronspring aaronspring changed the title keep attrs keep attrs of get_bitinformation Nov 21, 2022
@observingClouds
Copy link
Owner

I don't really like this PR yet, mainly because the units, long_name and standard_name attributes are cf-conventions conform attributes that describe the attributes of the variable and not the ones they are derived from.
A compromise could be to prefix the attributes from the input dataset with e.g. source_, so that there would be source_units = m**2 s**-2, source_standard_name = geopotential etc. So instead of prefixing the bitinformation attributes, the original dataset attributes get a prefix. Would that help?

@aaronspring
Copy link
Collaborator Author

I like the source idea

@aaronspring aaronspring merged commit 064c90f into main Nov 23, 2022
@aaronspring aaronspring deleted the keep_attrs branch November 23, 2022 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

keep attrs in get_bitinformation
2 participants