Use PyProj instead of CartoPy for calculations #1455

Closed
dopplershift opened this issue Aug 9, 2020 · 3 comments · Fixed by #1483
Labels
Area: Calc Pertains to calculations Area: Cross-sections Pertains to making cross-sections through data Area: Projections Pertains to projecting coordinates between coordinate systems Area: Xarray Pertains to xarray integration
Milestone

Comments

@dopplershift
Member

So the work done in #1454 has revealed that we depend on CartoPy... a lot. Now, while I love CartoPy, I'm concerned that we cannot do much of the xarray smarts without CartoPy installed. parse_cf was initially focused on projections, so requiring CartoPy there made sense, but parse_cf now also includes the auto-detection of dimensions. None of that necessarily relies on CartoPy, yet you can't get any of it without CartoPy installed.

My thinking is we should maybe move more things to use PyProj directly. Under the hood, CartoPy is making calls to PROJ anyway, so let's call PROJ more directly (i.e. through its Python bindings). This would allow us to not have our calculations rely on a graphics toolkit. We could consider making PyProj a required dependency, and be a whole lot more functional out of the box than we currently are without CartoPy.

PyProj, as of 2.2 (June 2019), includes creating projections from CF metadata, so we could replicate pretty much all of what we're doing without adding a bunch more hand-coded mapping between CF and something. PyProj also has wheels on PyPI, so it's not nearly as much of a pain for users to install (and its only dependency is a library CartoPy already relies on).
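
For illustration, here is a minimal sketch of building a projection from CF metadata with PyProj alone, assuming pyproj >= 2.2. The attribute values are made up, and `crs_from_cf_attrs` is a hypothetical helper name, not an existing MetPy function. The guarded import only keeps the sketch importable where pyproj is missing.

```python
try:
    import pyproj
except ImportError:  # keep the sketch importable without pyproj
    pyproj = None

# Illustrative CF grid-mapping attributes (values are made up)
CF_ATTRS = {
    'grid_mapping_name': 'lambert_conformal_conic',
    'standard_parallel': 25.0,
    'longitude_of_central_meridian': 265.0,
    'latitude_of_projection_origin': 25.0,
}


def crs_from_cf_attrs(attrs):
    """Hypothetical helper: build a pyproj CRS from CF grid-mapping attributes."""
    if pyproj is None:
        raise ImportError('pyproj is required to build a CRS from CF metadata')
    return pyproj.CRS.from_cf(attrs)


if pyproj is not None:
    crs = crs_from_cf_attrs(CF_ATTRS)
    # Project lon/lat (degrees) to x/y (meters) with no CartoPy involved
    x, y = pyproj.Proj(crs)(-95.0, 25.0)
```

The point chosen here is the projection origin, so `x` and `y` should both come out very close to zero.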

@dopplershift dopplershift added the Area: Calc Pertains to calculations label Aug 9, 2020
@dopplershift
Member Author

dopplershift commented Aug 9, 2020

So a basic Python 3.8 conda environment with all our deps (numpy, scipy, matplotlib, pint, xarray, pandas, pooch, traitlets) + netCDF4 comes in at 427M on my Mac.

Installing PyProj adds proj and pyproj and comes in at 439M.

Installing CartoPy on top of that pulls in 38 other packages (including GDAL, postgresql, and, somehow, boost) and weighs in at 845M.

@dopplershift dopplershift added this to the 1.0 milestone Aug 9, 2020
@jthielen
Collaborator

jthielen commented Aug 9, 2020

I would think that the xarray stuff, except for cartopy_crs, assign_longitude_latitude and assign_y_x, should all work without CartoPy installed if it weren't for the unconditional imports at the top of xarray.py and plots/mapping.py.
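
As a rough sketch of the alternative to an unconditional import, the pattern below defers the failure until a CartoPy-only feature is actually requested. The names `optional_import` and `require_cartopy` are hypothetical and only illustrate the idea; this is not how MetPy's xarray.py is actually written.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Module import no longer fails when CartoPy is absent
ccrs = optional_import('cartopy.crs')


def require_cartopy(feature):
    """Hypothetical guard: raise only when a CartoPy-only feature is used."""
    if ccrs is None:
        raise ImportError(f'{feature} requires CartoPy to be installed')
    return ccrs
```

With something like this, only the accessors that genuinely need a CartoPy CRS would call `require_cartopy(...)`, and everything else would keep working without it.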

That being said, I think this is a really good idea. CartoPy has been the coordinate transformation library of choice because, back in 2018 when most of the xarray work in MetPy started, it was more likely to already be installed and nicer to work with, but PyProj has matured a lot since then over its 2.x releases. Also, with the xarray kinematics "magic" coming up soon in 1.0, PyProj would have become an optional-but-essentially-required dependency anyway (due to frequent use of lat_lon_grid_deltas under the hood), so making PyProj a required dependency and dropping the use of CartoPy for anything other than plotting makes a lot of sense.

So, it looks like there will be yet another antecedent PR to get in before #1353's replacement!

A couple of follow-up questions:

  • It might be a bit of a stretch to require the latest version of PyProj (2.6.1, May 3, 2020), but it comes with a few small but important bug fixes for CF attribute handling, and it has a working version of pyproj.proj.Proj.get_factors, which is something we've been wanting for a while for #893 (Generalize vorticity (and related functions) to allow for map projections). So, can we require 2.6.1, or should we just require 2.2 and then attempt to gracefully handle the bug cases and the case where get_factors is unavailable?
    • Not sure if it impacts the decision at all, but 3.0.0 looks imminent, so 2.6.x is likely the last of the 2.x releases
  • Should we change the scalar crs coordinate added by parse_cf and assign_crs to be a pyproj.CRS rather than metpy.plots.mapping.CFProjection?
    • This shouldn't change most end-user use cases (almost all examples use .metpy.cartopy_crs, in which we'd still return a CartoPy CRS), and would save re-constructing the CRS object whenever we need to use it.
    • This also lines us up with geoxarray, if that project ever gets going again.
    • However, until CartoPy has direct handling of PyProj CRS (or the pending conversion PR, SciTools/cartopy#1023 "WIP: Add from_proj4 function to create CRS from PROJ.4 definitions", gets resolved), we'd still have to keep CFProjection around in order to get our CartoPy CRS. Though, if we do make the crs coordinate a pyproj.CRS, then CFProjection should probably be made private API so that it is easier to remove once it is no longer needed.
  • Would it make sense to have some way of caching the pyproj.proj.Proj created from the CRS? It will be reused a lot, but I'm not sure how expensive it is to create. Perhaps storing it as an attribute on the scalar crs coordinate would make sense, if we decide to cache it.
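
On the caching question in the last bullet, one possible approach (a sketch only, not something decided in this thread) is to memoize the Proj on the CRS's WKT string instead of storing it on the coordinate. `proj_from_wkt` and `get_proj` are hypothetical names, and whether the `to_wkt()` call is cheap enough to make this worthwhile is an assumption that would need measuring.

```python
from functools import lru_cache

try:
    import pyproj
except ImportError:  # keep the sketch importable without pyproj
    pyproj = None


@lru_cache(maxsize=32)
def proj_from_wkt(wkt):
    """Hypothetical helper: build a Proj once per distinct WKT string."""
    return pyproj.Proj(pyproj.CRS.from_wkt(wkt))


def get_proj(crs):
    """Return a cached Proj for a pyproj.CRS, keyed on its WKT text."""
    return proj_from_wkt(crs.to_wkt())
```

Repeated calls to `get_proj` with an equivalent CRS would then return the same Proj object rather than constructing a new one each time.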

@jthielen jthielen added Area: Cross-sections Pertains to making cross-sections through data Area: Projections Pertains to projecting coordinates between coordinate systems Area: Xarray Pertains to xarray integration labels Aug 9, 2020
@jthielen
Collaborator

Some notes from today's telecon for things to target on this for 1.0:
