-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use xugrid in mesh methods #535
Conversation
Hi @Huite I started to use the xugrid Regridder to resample regular gridded to mesh (I was using mesh to geopandas.GeoDataFrame and then zonal stats before). With the Regridder, our tests are now failing on timeout as they run for too long. I did notice that the first time I used any of the regrid methods it takes very long but if I run the tests a second or third time it goes much faster. Is that something you recognize? |
Hi @hboisgon, Yes, I recognize this. There are two reasons for latency:
So I'm almost certain you're running into the first issue. How long does it take, exactly? And how long are you allowing it? There's another option, but I don't think it does any good. Numba can also do fully ahead of time compilation (AOT), if you specify all function signatures. I can do this for numba_celltree, because I know exactly what will end up going in every function (basically only 64-bit integers and 64-bit floats). However, this just moves the compilation to the setup phase, rather than during runtime. I don't think your total CI time will improve. Another option is just using Numba as a AOT compiler, and providing wheels. But in that case, I (almost) might as well use C++ or Fortran (or Rust) instead. For me, the primary benefits of numba is not having to distribute binaries! Alternatively, you could speed up testing a lot (potentially), if you would be able to re-use environments in CI workflows. I've investigated in the past, and it was hopeless, but maybe things have changed. |
There's another option, which is that I tell Numba to optimize less. I'm currently inlining relatively many functions in numba_celltree. This allows the compiler to optimize more aggressively and results in relevant speedups (e.g. 30-40%), but also doubles compilation time. As mentioned, in "normal use", this seems worth it since most users will incur the compilation time only once every few months or so, and then in their day to day use everything just goes 30% faster. That seems like it's well worth it -- but for CI it's different, because it's incurring the compilation cost every time. |
Best to be concrete of course... @Hofer-Julian Had a good suggestion here: use a fixture to force the compilation beforehand. That way, you still get the useful time out messages from pytest-timeout. It will still increase your CI time notably of course which is kinda meh, but installing always takes at least a few minutes anyway... You can get a session fixture as follows: @pytest.fixture(scope="session", autouse=True) |
I'm working on a better Ci setup that can make better uses of the chaches. This might help resolve this, though nothing is guaranteed, just FYI |
@Huite Do you know where numba_celltree stores it's artifacts? (so I can add them to the CI cache) |
I've never looked at it very carefully, but I think it's stored in the pycache for the package installation. On the my windows machine, that's: There's a number of files there with suffixes nbi and nbc (suggesting numba). At first glance, some sizes seem plausible for executable programs. |
Thanks for the input @Huite and tests @savente93 I've tried a couple of options. By forcing the compilation beforehand with I see two possible solutions:
For the first solution we currently run into an issue with pyflwdir which requires some maintenance, see test log. I'll see if I can get that sorted in Deltares/pyflwdir#36 after which we can decide for the best solution here. |
@savente93 can you review my proposed solution for the test timeout using The new code from @hboisgon looks good to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this solves the problem, I'm very happy with the solution. I have one small comment which may or may not be valid. If it isn't feel free to just ignore it and merged. Thank you for solving this :)
Issue addressed
Fixes #420
Explanation
Replaced using zonal stats in mesh2d generic setup methods by xugrid Regridder. Also moved the functions to mesh worklows for easy re-use.
Checklist
main