Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

'gpm' is not a defined unit #907

Closed
sgdecker opened this issue Jul 31, 2018 · 12 comments · Fixed by #925
Closed

'gpm' is not a defined unit #907

sgdecker opened this issue Jul 31, 2018 · 12 comments · Fixed by #925
Labels
Area: Units Pertains to unit information Type: Enhancement Enhancement to existing functionality
Milestone

Comments

@sgdecker
Copy link
Contributor

sgdecker commented Jul 31, 2018

The following code:

from datetime import datetime
import numpy as np
import xarray as xr
from metpy.units import units
import metpy.calc as mpcalc

# Grab a recent GFS run on Grid 3
base_url = 'https://www.ncei.noaa.gov/thredds/dodsC/gfs-003-files/'
dt = datetime(2018, 7, 23, 0)
data = xr.open_dataset('{}{dt:%Y%m}/{dt:%Y%m%d}/gfs_3_{dt:%Y%m%d}_'
                       '{dt:%H}00_000.grb2'.format(base_url, dt=dt),
                       decode_times=True)

# Compute grid spacing in lat/lon space
x = data.lon.values.astype('float64')
y = data.lat.values.astype('float64')

lons = np.deg2rad(x)
lats = np.deg2rad(y)

# Compute lat/lon deltas
dlon = np.diff(lons)
dlon2d = np.ones((lats.size, dlon.size)) * dlon

dlat = np.diff(lats)
dlat2d = np.ones((dlat.size, lons.size)) * np.expand_dims(dlat, axis=1)

# Convert deltas to physical distances
earth_radius = data.LatLon_Projection.earth_radius * units.m

dx = earth_radius * dlon2d * np.expand_dims(np.cos(lats), axis=1)
dy = earth_radius * dlat2d

# Get the valid times from the file
vtimes = []
for t in range(data.time.size):
    vtimes.append(datetime.utcfromtimestamp(data.time[t].data.astype('O') / 1e9))

idx_500 = 18  # Selecting isobaric=500 returns the data from index 3; bug in xarray?
z = data['Geopotential_height_isobaric'][0, idx_500, :]

print(z[0,0])

lap = mpcalc.laplacian(z, deltas=(dy, dx), dim_order='yx')

generates this output:

<xarray.DataArray 'Geopotential_height_isobaric' ()>
array(5389.1934, dtype=float32)
Coordinates:
    lat       float32 90.0
    lon       float32 0.0
    reftime   datetime64[ns] ...
    time      datetime64[ns] 2018-07-23
    isobaric  float32 50000.0
Attributes:
    long_name:                      Geopotential height @ Isobaric surface
    units:                          gpm
    abbreviation:                   HGT
    grid_mapping:                   LatLon_Projection
    Grib_Variable_Id:               VAR_0-3-5_L100
    Grib2_Parameter:                [0 3 5]
    Grib2_Parameter_Discipline:     Meteorological products
    Grib2_Parameter_Category:       Mass
    Grib2_Parameter_Name:           Geopotential height
    Grib2_Level_Type:               100
    Grib2_Level_Desc:               Isobaric surface
    Grib2_Generating_Process_Type:  Forecast
---------------------------------------------------------------------------
UndefinedUnitError                        Traceback (most recent call last)
<ipython-input-22-780b9f53a409> in <module>()
----> 1 import codecs, os, ast;__pyfile = codecs.open('''/tmp/pyP8taUw''', encoding='''utf-8''');__code = __pyfile.read().encode('''utf-8''');__pyfile.close();os.remove('''/tmp/pyP8taUw''');__block = ast.parse(__code, '''/home/decker/src/git_repos/metpy/test/gfs_lap_bug.py''', mode='exec');__last = __block.body[-1];__isexpr = isinstance(__last,ast.Expr);__block.body.pop() if __isexpr else None;exec(compile(__block, '''/home/decker/src/git_repos/metpy/test/gfs_lap_bug.py''', mode='exec'));eval(compile(ast.Expression(__last.value), '''/home/decker/src/git_repos/metpy/test/gfs_lap_bug.py''', mode='eval')) if __isexpr else None

~/src/git_repos/metpy/test/gfs_lap_bug.py in <module>()
     45 print(z[0,0])
     46 
---> 47 lap = mpcalc.laplacian(z, deltas=(dy, dx), dim_order='yx')

~/src/git_repos/metpy/metpy/xarray.py in wrapper(*args, **kwargs)
    343     @functools.wraps(func)
    344     def wrapper(*args, **kwargs):
--> 345         args = tuple(a.metpy.unit_array if isinstance(a, xr.DataArray) else a for a in args)
    346         kwargs = {name: (v.metpy.unit_array if isinstance(v, xr.DataArray) else v)
    347                   for name, v in kwargs.items()}

~/src/git_repos/metpy/metpy/xarray.py in <genexpr>(.0)
    343     @functools.wraps(func)
    344     def wrapper(*args, **kwargs):
--> 345         args = tuple(a.metpy.unit_array if isinstance(a, xr.DataArray) else a for a in args)
    346         kwargs = {name: (v.metpy.unit_array if isinstance(v, xr.DataArray) else v)
    347                   for name, v in kwargs.items()}

~/src/git_repos/metpy/metpy/xarray.py in unit_array(self)
     31     def unit_array(self):
     32         """Return data values as a `pint.Quantity`."""
---> 33         return self._data_array.values * units(self._units)
     34 
     35     @unit_array.setter

~/local/anaconda3/envs/devel/lib/python3.6/site-packages/pint/registry.py in parse_expression(self, input_string, case_sensitive, **values)
    838         gen = tokenizer(input_string)
    839 
--> 840         return build_eval_tree(gen).evaluate(lambda x: self._eval_token(x,
    841                                                                         case_sensitive=case_sensitive,
    842                                                                         **values))

~/local/anaconda3/envs/devel/lib/python3.6/site-packages/pint/pint_eval.py in evaluate(self, define_op, bin_op, un_op)
     90         else:
     91             # single value
---> 92             return define_op(self.left)
     93 
     94 

~/local/anaconda3/envs/devel/lib/python3.6/site-packages/pint/registry.py in <lambda>(x)
    840         return build_eval_tree(gen).evaluate(lambda x: self._eval_token(x,
    841                                                                         case_sensitive=case_sensitive,
--> 842                                                                         **values))
    843 
    844     __call__ = parse_expression

~/local/anaconda3/envs/devel/lib/python3.6/site-packages/pint/registry.py in _eval_token(self, token, case_sensitive, **values)
    819             else:
    820                 return self.Quantity(1, UnitsContainer({self.get_name(token_text,
--> 821                                                                       case_sensitive=case_sensitive) : 1}))
    822         elif token_type == NUMBER:
    823             return ParserHelper.eval_token(token)

~/local/anaconda3/envs/devel/lib/python3.6/site-packages/pint/registry.py in get_name(self, name_or_alias, case_sensitive)
    473         candidates = self._dedup_candidates(self.parse_unit_name(name_or_alias, case_sensitive))
    474         if not candidates:
--> 475             raise UndefinedUnitError(name_or_alias)
    476         elif len(candidates) == 1:
    477             prefix, unit_name, _ = candidates[0]

UndefinedUnitError: 'gpm' is not defined in the unit registry

I expected the height units to be considered interchangeable with meters.

Python 3.6.5
0.8.0+53.g20f507e

@sgdecker
Copy link
Contributor Author

I guess I should say I would expect the code not to crash, as gpm and m are technically not the same.

@jthielen
Copy link
Collaborator

I've come across this quite often as well (with both gpm and various degrees_*). What I've done so far (as can be seen here) is implement a work-around by redefining the units, such as,

data['Geopotential_height_isobaric'].attrs['units'] = 'meters'

@dopplershift and @jrleeman, should we add these units that commonly come across in THREDDS converted files (gpm, degrees_north, etc.) to our pint registry?

@jrleeman
Copy link
Contributor

Does pint do anything with gpm? To me that's gallons/minute.

@dopplershift
Copy link
Member

Damn yankee--why not 'grams / minute'?

'gpm' is a bit of an abomination as it's not technically CF-compliant. One option is to add the units as aliases in pint, though that feels like a bit of a hack. Another option is to have parse_cf do some cleaning of the input metadata. Regardless, I'm not sure what to do about 'gpm'.

@jthielen
Copy link
Collaborator

If we add the following to units.py:

units.define('geopotential_meter = meter = gpm')

Then we get something like

>>> h = 5000 * units('gpm')
>>> h.units
<Unit('geopotential_meter')>
>>> h.to_base_units()
<Quantity(5000.0, 'meter')>

To me, this seems like the cleanest solution given how prevalent this bizarre unit is (but I could be convinced otherwise as well).

@dopplershift
Copy link
Member

That looks ok to me...but is it strictly correct?

@jthielen
Copy link
Collaborator

jthielen commented Jul 31, 2018

At least between (geometric) height and geopotential height, no, it is only approximately correct...here's the exact formula:

(Wallace and Hobbs 3.22)

But, in most cases in practice I've seen height and geopotential height assumed to be equal in the troposphere. Along with that, the unit I've usually seen for geopotential height is just regular meters.

If we're going with correctness of the unit entity itself, I would think so, since this defines a new unit geopotential_meter with abbreviation gpm that has a measure of 1 meter.

@lesserwhirls
Copy link
Contributor

*ducks and runs*

ducks

@dopplershift
Copy link
Member

So the downside of not just forcing it to meters is that when (if?) we use the unit attribute for labelling plots, the units on the plot will be "geopotential_meter".

@jrleeman
Copy link
Contributor

Having "meters" and "special meters" seems... silly, but that's a different rant. If the dimensionality is meters, then I think we stick to that. caveat emptor

@sgdecker
Copy link
Contributor Author

I think it is best to see the 'gpm' as essentially a marker that this distance is the result of a particular approximation. A common operation would be to multiply geopotential height by g0 to compute geopotential, and the result of that computation should have units of m2/s2 (or J/kg), with no gpm in sight.

Although I could see someone studying the mesosphere wanting to convert gpm to actual m, there is the extra complication that height differences will vary depending on z (i.e., a delta Z of 1 gpm is very close to a delta z of 1 m in the troposphere, but is something like a delta z of 0.97 m in the mesosphere, if I'm doing my math right). Or, in math terms, delta Z = (g/g0) delta z, where g is the actual gravitational acceleration (a function of height).

@jthielen
Copy link
Collaborator

jthielen commented Aug 2, 2018

After working on some more examples with this and looking at past discussions, I want to alter my previous comment on the correctness of 'geopotential_meter' as a unit. While I can definitely see usefulness in a marker/placeholder to represent that it isn't a true geometric length, I'm not really sure that the unit itself a good place to do that, since 'geopotential_meter' or 'gpm' isn't a proper unit in the sense of unit arithmetic (i.e. how can we know what we are multiplying by should or should not cancel out 'gpm' by units alone) or CF-compliance/UDUNITS compatibility.

So, I think I'd now prefer the solution of "fixing" the unit in parse_cf(), unless this is a change that can be made in THREDDS, which seems to be where this is all coming from.

(Also, I'm sorry for keeping on coming back to this, but it has been bothering me and I'd like to see a fix for it.)

@dopplershift dopplershift added Type: Enhancement Enhancement to existing functionality Area: Units Pertains to unit information labels Aug 2, 2018
@dopplershift dopplershift added this to the 0.9 milestone Aug 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Units Pertains to unit information Type: Enhancement Enhancement to existing functionality
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants