Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to pygeoprocessing 2.4.2 #1437

Merged
merged 10 commits into from
Nov 3, 2023

Conversation

emlys
Copy link
Member

@emlys emlys commented Oct 25, 2023

Description

Fixes #1421

enhancements

Replaced raster_calculator with raster_map in places where it simplified things.
There are still many places where raster_calculator is needed, mostly related to nodata masking. I didn't write full docstrings for the simple one-line functions that are passed to raster_map, hoping that they can be replaced with lambdas in the near future.

Replaced kernel creation functions with pygeoprocessing.kernels where possible.

Replaced utils.array_equals_nodata with pygeoprocessing.array_equals_nodata.

changes to tests

A few expected values in the UCM tests had to be updated because of changes to zonal_statistics. Out of almost 1500 building polygons in the test data, zonal stats results were different (and more correct) for 7 of them. I am guessing because we used to reproject the raster to match the vector; now we reproject the vector to match the raster, which should be more accurate. This probably only showed up in UCM because it uses so many polygons and they are small relative to the pixel size.

In one habitat quality test, we were incorrectly creating a raster from an int8 array and then inserting the value 255 into it. Previously this worked because pygeoprocessing incorrectly created a uint8 raster from the int8 array. Now pygeoprocessing copies over the type correctly, and so the 255 became a -1. Fixed by updating the original array dtype.

Checklist

  • Updated HISTORY.rst and link to any relevant issue (if these changes are user-facing)
  • Updated the user's guide (if needed)
  • Tested the Workbench UI (if relevant)

@@ -298,17 +298,6 @@
"filtered_[THREAT]_aligned.tif": {
"about": "Filtered threat raster",
"bands": {1: {"type": "ratio"}},
},
"kernels": {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was not actually used anywhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this looks like it's left over from the old convolution-based workflow we recently reverted. Thanks for catching this!

@emlys emlys requested a review from phargogh November 2, 2023 16:52
@emlys emlys self-assigned this Nov 2, 2023
@emlys emlys marked this pull request as ready for review November 2, 2023 16:53
Copy link
Member

@phargogh phargogh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thank you for this big update, @emlys ! It is so nice to see all of the array masking logic abstracted away to raster_map.

I just had two questions for you about how the model's behavior was changing in UCM and in Pollination that were resulting in updated regression values. It might save us some headaches (and forums questions!) down the road if we could identify and document in HISTORY what's changing that results in different outputs.

args=(storage_path_list, file_registry[output_key]),
func=pygeoprocessing.raster_map,
kwargs=dict(
op=sum_op,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, right! This would be a great place to use functools.partial, but we can't because it doesn't have a __name__. natcap/taskgraph#83

@@ -298,17 +298,6 @@
"filtered_[THREAT]_aligned.tif": {
"about": "Filtered threat raster",
"bands": {1: {"type": "ratio"}},
},
"kernels": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this looks like it's left over from the old convolution-based workflow we recently reverted. Thanks for catching this!

'avd_eng_cn': 3520213.280928277,
'avd_eng_cn': 3520217.313878,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these values changing because of a change in how we are masking? If the behavior is changing, I wonder if we should note that change in HISTORY

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These had to be updated because of changes to zonal_statistics. Out of almost 1500 building polygons in the test data, zonal stats results were different (and more correct) for 7 of them. I am guessing because we used to reproject the raster to match the vector; now we reproject the vector to match the raster, which should be more accurate. This probably only showed up in UCM because it uses so many polygons and they are small relative to the pixel size.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a general note in history.

self.assertAlmostEqual(result_sum, 58.669518, places=2)
self.assertAlmostEqual(result_sum, 58.407155, places=2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these changing because of the change in how we're building kernels? Or is it more about masking? Either way, it might be useful to note in the HISTORY why values are changing slightly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is because of the switch to pygeoprocessing.kernels, which is more correct in some cases. The kernel implementation in pollination didn't always produce symmetrical kernels:
image
With pygeoprocessing.kernels:
image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a note to history.

@emlys emlys requested a review from phargogh November 3, 2023 17:38
Copy link
Member

@phargogh phargogh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@phargogh phargogh merged commit 03d3c45 into natcap:main Nov 3, 2023
25 checks passed
@emlys emlys deleted the task/update-pygeoprocessing-2.4.2 branch October 3, 2024 22:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Migrate to pygeoprocessing kernel library
2 participants