Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interpolation update #85

Closed
wants to merge 60 commits into from
Closed

Interpolation update #85

wants to merge 60 commits into from

Conversation

jay-hennen
Copy link
Contributor

Most of PR #78 can still describe this. However, there are a few important additions:

  1. Automatic slicing: When doing interpolation, only the data within the bounding box of the points will be requested, unless otherwise specified. This minimum request slicing applies to all variables and grid variables. Now, the ONLY information that needs to be 100% loaded before these interpolation routines can be run are the 'node' coordinates (for the cell tree)
  2. Highly flexible functions. Interpolation can be as simple as this:
    sgrid.interpolate_var_to_points(points, sgrid.u, slices=[time_idx, v_idx])
    The variable in this case is lazy-loaded and sliced appropriately. There are a variety of other possible uses:
    sgrid.interpolate_var_to_points(points, sgrid.u[time_idx, v_idx])
    This pre-loads the whole variable slice
    sgrid.interpolate_var_to_points(points, nc.variables['u'], slices=[time_idx, depth_idx])
    You can use a raw variable or array-like of any kind, as long as it can be sliced, and has a shape that can fit on one of the grids.
    sgrid.interpolate_var_to_points(points, sgrid.u, indices=ind, alphas=alphas, slices=[time_idx, depth_idx])
    Considerable performance gains can be had by passing in pre-computed information, such as indices, alphas, or even translated indices (advanced). To see an example of this, you can see the interpolation function in demos/matlabanim.py.

Much of the code has stablilized. There are still improvements to be made, but no planned API or algorithm changes/additions.

@ChrisBarker-NOAA
Copy link
Contributor

NOTE: tests failing due to lack of cellTree@d pacckage. That is available on the NOAA-ORR-ERD anaconda channel:

https://anaconda.org/noaa-orr-erd

(at least for py2 -- py3 shouldn't be a big deal, but we haven't had a reason to do it)

We should probably put it up on PyPi too -- it is a self contained build, so should be pip-installable.

These will all need to be addressed before being merged.

@jay-hennen
Copy link
Contributor Author

cell_tree2d is now up on pypi.

@jay-hennen
Copy link
Contributor Author

@ocefpaf
Copy link
Member

ocefpaf commented Mar 3, 2016

@jay-hennen I am back and I will review this PR. I am also working on getting cell_tree2d on the IOOS conda channel (see ioos/conda-recipes#632) so we can get the tests going.

@ocefpaf ocefpaf self-assigned this Mar 3, 2016
@jay-hennen
Copy link
Contributor Author

Great! Thanks Filipe.

@ChrisBarker-NOAA
Copy link
Contributor

Note that we've got cellTree2d up on Pypi now. It requires a compiler (and we haven't put any binary wheels up), but the only dependencies are numpy and cython, so a lot of people should be able to pip install it out of the box.

(hmm, maybe we should ship the cyton generated source, so that's not a build dependency)

And this is weird -- if I do "pip install cell_tree2d", it finds it an downloads it. But if I go to the PYPi web site and search for it, I can't find it...

@ChrisBarker-NOAA
Copy link
Contributor

ping!

We need this functionality, and we'd rather not have to continue to work off a fork.

I'm pretty sure that this doesn't change the API, and no one will now it's there unless they want to use it :-)

It does introduce a new dependency, though, which is optional -- i.e. only gets imported if you need it. But we do need it for tests.

@ocefpaf
Copy link
Member

ocefpaf commented Mar 8, 2016

@jay-hennen cell_tree2d landed on the IOOS channel. Can you adjust this PR to run the tests?

@ocefpaf
Copy link
Member

ocefpaf commented Mar 8, 2016

I'm pretty sure that this doesn't change the API

pysgrid is new and API changes are OK.

and no one will now it's there unless they want to use it :-)

That argument is not 100% valid when talking about community software. Code health is important to attract users and contributors. That said, there is not code health issue with your additions, just failing due to the lack of the cell_tree2d package. As soon as you fixe that I will review and merge.

@jay-hennen
Copy link
Contributor Author

jay-hennen commented May 3, 2016

I found that the travis CI is failing because of differences in how masked arrays are dealt with (Travis is actually doing the correct thing). I'm going to look into cutting out masked arrays where possible anyway. They cause a lot of frustration and are a rather big performance hit.

Edit: How can I confirm that cell_tree2d is being properly installed by Travis? Nothing in this PR will work properly without that.

@ChrisBarker-NOAA
Copy link
Contributor

For cell_tree2d, maybe make sure there is a test or two that directly tests using cell_tree2d without much else. Then you can see if those are passing. Not really what should get tested by a lib using it, but it would be a way to test the environment.

Alternatively, we could make sure that the cell_tree2d tests are installed with the package, and I think we could add a line to run that test on Travis after instaling it.

jay-hennen and others added 5 commits June 27, 2016 14:41
DANGER: it segfaults seemingly randomly -- probably celltree2d causing it.
at least with cell_tree2d v. 0.1.3 and master.
@ayan-usgs
Copy link
Contributor

ayan-usgs commented Jul 18, 2016

This PR has been going on for awhile now. Where exactly are we with the cell_tree2d issues? While .travis.yml appears to be going for the conda package and previous comments report that the install problems have been fixed, the most recent travis run appears to have failed due to installation problems.

I think the package is looking for a newer version of cell_tree2d than 0.1.1 version in the IOOS organization.

@ChrisBarker-NOAA
Copy link
Contributor

I was hoping to get this cleaned up at the script sprints, but ended
up getting semi-random segfaults when running the tests. Probably due
to netcdf :-(

While a segfault is by definition a bug, we may be able to work around
it by re-structuring the tests to be really careful about netcdf
closing, etc. it can't be this unstable in general use.

I think Filipe is going to try to track that down.

  • CHB

@ocefpaf ocefpaf removed their assignment Jul 18, 2016
@ocefpaf
Copy link
Member

ocefpaf commented Jul 20, 2016

I was hoping to get this cleaned up at the script sprints, but ended
up getting semi-random segfaults when running the tests. Probably due
to netcdf :-(

@ChrisBarker-NOAA I can run all the tests with the pinned version we have on Travis-CI.
The failure here was not related to that. Now it is the lack of the latest cell_tree2d as @jay-hennen updated the requirements. But that can be solved by changing conda config --add channels ioos -f to conda config --add channels conda-forge -f

While a segfault is by definition a bug, we may be able to work around it by re-structuring the tests to be really careful about netcdfclosing, etc. it can't be this unstable in general use.

I cannot find any place where that happens. @ayan-usgs used with statements everywhere and he closes and removes the files properly.

I think Filipe is going to try to track that down.

I am changing it to use pytest.fixtures regardless in #92. Maybe that will help us move to the latest netCDF4-python version.

ocefpaf added a commit to ocefpaf/pysgrid that referenced this pull request Jul 21, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants