Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to pass coordinates to the grid method #326

Merged
merged 19 commits into from Nov 3, 2021

Conversation

santisoler
Copy link
Member

@santisoler santisoler commented May 19, 2021

Allow the grid method of BaseGridder to take the coordinates of the
preexisting grid either as 2d arrays (meshgrid) or as 1d arrays. Raise
a FutureWarning if the spacing, shape or region arguments are passed,
since they will be removed in Verde v2.0.0. Write a new meshgrid_from_1d
function that creates a meshgrid out of 1d horizontal arrays, but also checks
if they have a single dimensions and supports for passing extra coordinates as
2d-arrays.

This PR is an evolution of #285, so #285 should be closed if this PR is merged.

Reminders:

  • Run make format and make check to make sure the code follows the style guide.
  • Add tests for new features or tests that would have caught the bug that you're fixing.
  • Add new public functions/methods/classes to doc/api/index.rst and the base __init__.py file for the package.
  • Write detailed docstrings for all functions/classes/methods. It often helps to design better code if you write the docstrings first.
  • If adding new functionality, add an example to the docstring, gallery, and/or tutorials.
  • Add your full name, affiliation, and ORCID (optional) to the AUTHORS.md file (if you haven't already) in case you'd like to be listed as an author on the Zenodo archive of the next release.

Start drafting the addition of an optional coordinates argument to the
grid method of the BaseGridder.
It would allow use to predict on preexisting grids and obtain
a xr.Dataset as output.
Add a new example that passes the coordinates of a preexisting grid to
a Chain with a Spline inside.
Write a new meshgrid_from_1d function that creates a meshgrid out of 1d
horizontal arrays, but also checks if they have a single dimensions and
supports for passing extra coordinates as 2d-arrays.
@santisoler santisoler changed the title WIP Add option to pass coordinates to the grid method Add option to pass coordinates to the grid method Jun 17, 2021
Copy link
Member

@leouieda leouieda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few suggestions but nothing major. Really good work, as always 🙂

examples/predict_on_grid_coordinates.py Outdated Show resolved Hide resolved
coordinates = (easting, northing)

# Use the .grid() method to predict on the grid coordinates
grid = gridder.grid(coordinates)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using coordinates for the grid coordinates is going to be confusing. Probably best to use coordinates for data coordinates and grid_coordinates for grid coordinates throughout the code and examples.

Copy link
Member Author

@santisoler santisoler Aug 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agreed at first, but then I found that we already use grid_coordinates for the function that creates the coords of a regular grid. Should we use a different name for the coordinates argument? Maybe grid_coords?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. grid_coords is reasonable I guess

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having second thoughts on this. Is really an issue of having coordinates argument on methods like predict and in grid while the requirements are slightly different? I think having a common coordinates arguments for methods and functions in Verde is a feature and not an issue:

  • I can autocomplete on every call by just typing coor<TAB> (with grid_coords it wouldn't work, I would need to delete and start typing gri).
  • On some places it doesn't mind if the passed coordinates are scattered or regular gridded (forward modelling in Harmonica for instance). So those cases would be an exception where coordinates should be used, although grid coordinates could also be passed.
  • On the specific BaseGridder we would have two methods that do almost the same but for different kind of coordinates: predict for scattered coordiantes, grid for grid coordinates. I think that difference is sufficient for differentiating the distribution of the coords.

One last argument, if we use grid_coords, an simple example of the grid method would look like:

grid = gridder.grid(grid_coords=...)

I can see myself struggling to read that line out loud when teaching.

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're absolutely right. Typing spline.grid(grid_coords...) is awkward. So here is a proposal:

  1. Function arguments are coordinates unless you need to specify 2 sets (like distance_mask).
  2. For examples: coordinates means data coordinates and grid_coords means grid coordinates since there will usually be the 2 sets and we'll need to distinguish between them. That way the call could be spline.grid(coordinates=grid_coords... or spline.grid(coordinates=vd.grid_coordinates...).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I write this, I'm not entire happy with requiring giving coordinates all the time. But I don't know what else to do. I would want to write some examples using this syntax to see how it feels but my feeling is that it will be a bit of a blow to general usability.

Copy link
Member Author

@santisoler santisoler Aug 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the two points above. Just a reminder for the future: don't underestimate the old design decisions, there might be a reason why we took that road even if we don't remember it.

Feel free to try it out on examples, there's no rush for this PR. If we are going to change the default behavior of the base class for Verde gridders, it's good to put a lot of thinking into it.

On the note about passing coordinates every time I think we are already doing it, but hiding it through the spacing, shape, region, projection arguments. So the grid method currently does two things: creates a set of grid coordinates and predicts on them. I would like to split it in two:

  1. Leave the coordinate generation to the user. This way they can corroborate that the grid is the right one before they have to wait until the prediction is carried out. Or they can just use a predefined grid and skip this step.
  2. Use the grid method to predict on the given coordinates. It should also project the coords before running the prediction if any projection is passed. This simplifies the usage of the grid method, makes it easier to generalize to gridders like EQL with easting, northing and height (they can be part of the coordinates directly) and makes it easier to understand what the projection callback will be used for (I always struggle to think if the projection is applied before the prediction or after it and which coordinates should I use for region and spacing).

Finally, I think this new grid method will be doing a single task, and doing it well 😁 .

verde/base/base_classes.py Outdated Show resolved Hide resolved
verde/base/base_classes.py Outdated Show resolved Hide resolved
verde/utils.py Outdated Show resolved Hide resolved
verde/utils.py Outdated Show resolved Hide resolved
santisoler and others added 5 commits August 2, 2021 09:18
Co-authored-by: Leonardo Uieda <leouieda@gmail.com>
We will change every gridding example so they predefine grid coordinates
and pass them as the `coordinates` argument.
@santisoler
Copy link
Member Author

@leouieda I've added a FutureWarning if the region, spacing or shape arguments are passed, in agreement with our last Developer Meeting. Let me know if we need to change something else here. After this PR is merged, we should start changing the current examples so they create the grid coordinates right before calling the grid method.

Copy link
Member

@leouieda leouieda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some small comments on the docstring. Otherwise, this looks good to me 👍🏽

verde/base/base_classes.py Outdated Show resolved Hide resolved
should be a 2d dimensional grid with a shape of ``(northing.shape,
easting.shape)``. The arrays could either be Numpy arrays or
:class:`xarray.DataArray`s. If ``coordinates`` are passed, no
``region``, ``shape`` nor ``spacing`` is needed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the part about the extra coordinates since in Verde itself they are ignored. Other libraries may implement this differently as well so it’s best to leave unspecified.

The arrays could be all sorts of things (pandas Series, cupy, dask, etc) so saying “array” is best without needing to specify types.

“If coordinates are passes, region, shape, and spacing are ignored.”

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if you like it better now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants