Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent arguments for gradient and advection #896

Closed
sgdecker opened this issue Jul 24, 2018 · 5 comments · Fixed by #1490
Closed

Inconsistent arguments for gradient and advection #896

sgdecker opened this issue Jul 24, 2018 · 5 comments · Fixed by #1490
Labels
Area: Calc Pertains to calculations Type: API Change Changes to how existing functionality works Type: Maintenance Updates and clean ups (but not wrong)

Comments

@sgdecker
Copy link
Contributor

The following code:

import numpy as np
import metpy.calc as mpcalc
from metpy.units import units

T = np.array(([1, 2, 3],
              [4, 4, 4],
              [1, 3, 7],
              [0, 2, 6])) * units.K
u = np.array(([5, 4, 4],
              [6, 6, 6],
              [6, 7, 6],
              [5, 5, 5])) * units.m/units.s
v = np.array(([4, 4, 4],
              [3, 2, 3],
              [1, 2, 3],
              [0, 1, 1])) * units.m/units.s

dx = np.array(([12, 10],
               [12, 10],
               [10, 8],
               [10, 8])) * units.m
dy = np.array(([6, 5, 5],
               [5, 5, 4],
               [5, 4, 4])) * units.m

np.set_printoptions(precision=4)

dTdy, dTdx = mpcalc.gradient(T, deltas=(dy, dx))
print(dTdx)
print(dTdy)

adv = mpcalc.advection(T, wind=(u, v), deltas=(dx, dy), dim_order='yx')
print(adv)

accurately computes the temperature gradient and temperature advection corresponding to the sample data provided in 'yx' order.

However, notice that doing this calculation properly requires providing the deltas as (dy, dx) in the gradient call, but (dx, dy) in the advection call. I would have expected the order to be the same for both calls. In particular, since the data is in yx order, I would expect the correct call to advection would include deltas=(dy, dx), but currently that is not the case.

A second inconsistency involves the dim_order argument: The gradient function doesn't use it, but omitting it from advection results in a stern warning from MetPy. I would expect:

  1. No warning from MetPy when dim_order is omitted in the advection call, and/or
  2. gradient to allow for the dim_order to be specified as well, interpreted the same way as in advection.

Version Info

Python 3.6.5
0.8.0+32.gb19c052

@sgdecker
Copy link
Contributor Author

sgdecker commented Jul 25, 2018

Upon further reflection, I think the cleanest, most intuitive API for these functions would include the following constraints:

  1. Grid storage order is an implementation detail fully encapsulated by the dim_order argument (which, if absent, means 'yx' order is assumed)
  2. The arrays returned by these functions always use the same dim_order as specified for the input (not a change from how things currently are, but included for completeness)
  3. The deltas argument, if supplied, always puts delta_x first, regardless of dim_order
  4. The gradient function always returns the x-derivative first, regardless of dim_order (After all, in the classroom, the gradient is usually written in Cartesian coords as (ds/dx, ds/dy) or ds/dx i + ds/dy j.)

I think these constraints would provide the cleanest interface in a typical workflow where the user reads in some netCDF data, computes some of these diagnostics, and makes some plots. The user would not even need to worry about dim_order in this workflow; he/she would only need to know that things involving delta-x go in first, or come out first (which is the "obvious" order).

Given these criteria, the interface for advection is OK in the code example above (although at some point it makes sense to remove the warning if dim_order is omitted), and it is the interface to gradient that needs work. The proper call should be:
dTdx, dTdy = mpcalc.gradient(T, deltas=(dx, dy)) (with dim_order='yx' optionally included)

@dopplershift
Copy link
Member

Thank you for the thoughtful comments digging into this, it really helps laying it all out like that. Responses to the list:

  1. Agreed. With the next release (0.9, will be released in August), 'yx' ordering is assumed and it no longer warns if you do not specify it.
  2. Yes
  3. This appears to be how things operate, though the docs (at least on advection) are horribly unclear on that. Definitely need to improve the docs on this.
  4. See below

From my perspective, the fundamental difference between advection and gradient is that advection is a physically-aware calculation, meaning it knows about different dimensions like x and y and u and v wind components, while gradient only knows that it gets N-dimensional arrays and (by default) needs to calculate N finite-difference derivative estimates along each of the N dimensions.

I agree that the semantic differences between advection and gradient are problematic, especially when trying to teach students. But let me throw another wrench into the works here: up until version 0.7, we didn't even have our own gradient function--we relied on numpy's. numpy.gradient has the same semantics as we currently have in MetPy. So if we change the order of metpy.calc.gradient to return (dx, dy, dz), are we creating new confusion?

A lot of this will hopefully get simpler when we completely support xarray, so that dx, etc. arguments can go away. I am 100% in favor of making sure MetPy is consistent and as easy to use as possible.

FYI, one work-around to the gradient confusion is to use first_derivative explicitly for each of the desired dimensions.

@sgdecker
Copy link
Contributor Author

I am confused by your reponse to point 3, since in my example the call to gradient requires supplying the deltas in the opposite order to what I suggest is intuitive. This is in my mind the messiest part of the current situation, where the deltas argument has to be supplied in two different orders in my example code for the correct calculations. But it sounds like that may disappear with full xarray support.

Regarding point 4, maybe gradient should become _gradient (for internal use by other functions), and a new gradient that always returns the x-derivative first should be what appears in the public API. But I agree that it may be better to wait till the xarray support is complete to revisit this.

@dopplershift
Copy link
Member

Sorry, for (3), deltas are x-first, except for gradient and laplacian.

Yes, xarray should make all of this moot in general, though I'd like things to function even if you're not using it. So we may want to make some changes in the future.

Thanks for all of the thoughtful feedback!

@dopplershift dopplershift added Type: Maintenance Updates and clean ups (but not wrong) Area: Calc Pertains to calculations Type: API Change Changes to how existing functionality works labels Aug 2, 2018
@sgdecker
Copy link
Contributor Author

sgdecker commented Jan 6, 2019

An added note that laplacian() also has the confusing ordering for the deltas argument. If you think about what's happening under the hood, I guess it makes sense, but fragments like:

# 700-hPa Temperature Advection
tadv_700 = mpcalc.advection(tmpk_700s, (uwnd_700s, vwnd_700s), (dx, dy),
                            dim_order='yx').to_base_units()
# Laplacian of Temperature Advection
lap_tadv_700 = mpcalc.laplacian(tadv_700, deltas=(dy, dx))

just aren't intuitive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Calc Pertains to calculations Type: API Change Changes to how existing functionality works Type: Maintenance Updates and clean ups (but not wrong)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants