Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite simple and tiepoint modis interpolation in cython #38

Merged
merged 93 commits into from
Oct 25, 2022

Conversation

djhoese
Copy link
Member

@djhoese djhoese commented Mar 19, 2022

This is a continuation of #36 but includes mostly changes to the simple modis interpolation since that PR. Doing this though required restructuring of the cython modules so I added a _modis_utils.py. The results of a simple test with some dask array based processing:

image

This is about half the memory and half the processing time of a previous execution with the non-cython version. It also runs faster in the pure numpy case of a full granule, but not as fast as the #36 angle-based interpolation with pure numpy. So to put it another way, simple interpolation is faster and uses less memory than #36 if used with small and/or dask-based arrays, but it is slower than #36 for larger numpy arrays.

  • Closes #xxxx
  • Tests added
  • Tests passed
  • Passes git diff origin/main **/*py | flake8 --diff
  • Fully documented

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, makes it much easier to follow with split functions! just one inline question.

.github/workflows/ci.yaml Show resolved Hide resolved
@djhoese
Copy link
Member Author

djhoese commented Oct 24, 2022

@mraspaud I think this is ready for re-review. Refactoring definitely fixed some bugs that were just "lucky" that they didn't break the final results.

I'll make more PRs in the future when the coveralls github action is further along and merged with the upstream github action. I will merge this after I do some final performance checks to make sure my refactoring didn't break anything.

@djhoese
Copy link
Member Author

djhoese commented Oct 24, 2022

Here is creation of a true_color with geotiepoints main:

image

And here is this branch:

image

So 25s faster, less memory, and faster processing. Note this is computing the true_color and throwing away the chunks. I'm not exactly sure why the new version shows a near constant 8GB of memory usage at the second half of processing, but that isn't the interpolation part of the processing anyway.

@djhoese
Copy link
Member Author

djhoese commented Oct 24, 2022

And here's what that looks like with this PR just before my latest refactorings:

image

So the graph in my last comment (how this PR is now) is even faster and more memory efficient than before. Amazing.

@djhoese
Copy link
Member Author

djhoese commented Oct 24, 2022

Nnnoooo...the true_color geotiff is bad! It may be a nearest neighbor thing, let me check.

@djhoese
Copy link
Member Author

djhoese commented Oct 24, 2022

False alarm, it is a bad default setting in the EWA resampler in pyresample. Separate PR coming for that in pyresample.

@djhoese
Copy link
Member Author

djhoese commented Oct 24, 2022

So bottom line this PR makes this code better in every way. Let's just merge it. No need to review it. 😉

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a couple refactoring requests

geotiepoints/_modis_interpolator.pyx Outdated Show resolved Hide resolved
geotiepoints/_modis_interpolator.pyx Outdated Show resolved Hide resolved
@mraspaud
Copy link
Member

Regarding the performance results: did you test both modis_interpolator and simple_modis_interpolator? I think you presented just one result, which one is it?

@djhoese
Copy link
Member Author

djhoese commented Oct 25, 2022

Regarding the performance results: did you test both modis_interpolator and simple_modis_interpolator? I think you presented just one result, which one is it?

Good question. I realize now that this PR title is misleading as this PR has grown. I'll update that. The most recent profiling was all using Satpy's defaults which for the 1km data I was using is to use the modis_interpolator and not the simple. There were results of the simple interpolator somewhere else but now I can't find them. I'll do some more profiling after I clean up the code based on your suggestions.

Note I realized after doing it that some of these results were likely with Cython profiling/tracing enabled so likely slower than they will be in production.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@djhoese
Copy link
Member Author

djhoese commented Oct 25, 2022

main tiepoint interpolation:

image

main simple:

image

This PR tiepoint:

image

This PR simple:

image

So faster and more memory efficient, but some of these results seemed heavily dependent on what the rest of my system was doing. The simple interpolation definitely doesn't benefit from this work as much as the tiepoint-based interpolation does as it uses scipy to do the majority of the interpolation/extrapolation. I'm still very surprised by the simple interpolation in main. It is almost identical to the cython one for this PR.

@djhoese
Copy link
Member Author

djhoese commented Oct 25, 2022

The other thing to keep in mind with the above plots is that the interpolation is only the first 10 seconds or so since I am also doing EWA resampling and then computation of the arrays. So the peak memory at the right side of the plot is an EWA problem not a geotiepoint/modis interpolation problem.

@djhoese djhoese merged commit 1f0ad49 into pytroll:main Oct 25, 2022
@djhoese djhoese deleted the optimize-simple-modis-cython branch October 25, 2022 15:46
@djhoese djhoese changed the title Rewrite simple modis interpolation in cython Rewrite simple and tiepoint modis interpolation in cython Oct 25, 2022
@mraspaud
Copy link
Member

Great work, thanks for wrapping this up!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants