remove per-feature logging to improve performance issue#685 #718
Looks like the GDAL trunk is using PROJ.6 now:
@snowman2 wow, great! I think I'll be able to have some review time tomorrow.
Another idea I had recently was to change: Line 320 in 013dec9
To instead iterate over: collection.schema['properties']
And set the property to use: set_field_null(cogr_feature, i)
If the property does not exist in
This would eliminate the need for this check: Line 1167 in 013dec9
But, I wanted to run the idea by you first.
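The idea above can be sketched in plain Python. This is only an illustration under stated assumptions, not Fiona's Cython code: `fields_in_schema_order` is a hypothetical helper, and the sketch assumes the truncated sentence refers to a property that is missing from the feature being written. A `None` value stands in for the field that `set_field_null` would null out.

```python
def fields_in_schema_order(schema_properties, feature_properties):
    """Return (index, key, value) triples in schema order.

    Hypothetical helper for illustration only. A key that is missing from
    the feature gets the value None, which is where the real writer would
    call set_field_null(cogr_feature, i).
    """
    return [
        (i, key, feature_properties.get(key))
        for i, key in enumerate(schema_properties)
    ]


if __name__ == "__main__":
    schema_properties = {"name": "str", "population": "int"}
    feature_properties = {"name": "Springfield"}
    for i, key, value in fields_in_schema_order(schema_properties, feature_properties):
        print(i, key, value)
```

One trade-off of driving the loop from the schema is that a key present in the feature but absent from the schema is silently skipped rather than reported.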
@@ -283,7 +283,7 @@ cdef class OGRGeomBuilder:
 coordinates = geometry.get('geometries')
 return self._buildGeometryCollection(coordinates)
 else:
-raise ValueError("Unsupported geometry type %s" % typename)
+raise UnsupportedGeometryTypeError("Unsupported geometry type %s" % typename)
👍
log.warning("Failed to encode %s using %s codec", key, encoding)
key_bytes = ogr_key
# log.debug("Normalizing schema type for key %r in schema %r to %r", key, collection.schema['properties'], schema_type)
key_bytes = strencode(ogr_key, encoding)
👍
I made a suggestion inline
@snowman2 what would you think about holding off on the schema/property checks for now? I think that's separable from eliminating the debug statements, which is a big win by itself. I'm inclined to remove the debug statements that you've commented out. I put them in when building the original feature and I don't think they are necessarily needed anymore.
I removed those in a separate commit.
I compared checking the properties against the alternative method, and I was surprised that it didn't make a significant difference. So I agree that it is an unnecessary change. It makes me wonder whether it would be a good idea to go through rasterio and fiona and remove all unnecessary log.debug/log.info statements for performance reasons.
@snorfalorpagus I tagged you for review of this, too.
I edited the PR title to reflect our decision to take a one-thing-at-a-time approach to optimization.
LGTM 👍
I saw issue #685 and took a stab at it based on the discussion, to see if I could help move things forward.
I commented out the debug log statements instead of deleting them so that you don't have to re-create them when you are debugging (and, as far as I know, no official decision has been reached on what to do with them). However, that carries the risk of them being re-introduced by accident when committing changes after debugging.
Using this example:
And timing it:
Before:
1min 3s ± 2.32 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
After:
15.3 s ± 1.13 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
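For readers who want to see the general effect without the original example, here is a minimal, self-contained sketch. It is not Fiona's code and not the benchmark timed above: the logger name, `write_features`, and `compare` are hypothetical, and the loop only pretends to write features. It compares a loop that emits one debug record per feature with the same loop without the call.

```python
import logging
import timeit

# Hypothetical logger for the demo; NullHandler discards every record.
log = logging.getLogger("per_feature_demo")
log.addHandler(logging.NullHandler())
log.propagate = False


def write_features(features, debug_each=False):
    """Pretend to write features, optionally logging once per feature."""
    written = 0
    for feature in features:
        if debug_each:
            log.debug("Writing feature %r", feature)
        written += 1
    return written


def compare(n=100_000, level=logging.DEBUG):
    """Return (seconds with per-feature logging, seconds without)."""
    log.setLevel(level)
    features = [{"id": i, "properties": {"name": str(i)}} for i in range(n)]
    with_log = timeit.timeit(lambda: write_features(features, True), number=1)
    without = timeit.timeit(lambda: write_features(features, False), number=1)
    return with_log, without


if __name__ == "__main__":
    with_log, without = compare()
    print(f"per-feature debug: {with_log:.3f}s, no logging: {without:.3f}s")
```

Passing `level=logging.WARNING` to `compare` shows the cost of the calls when debug output is disabled. Absolute numbers depend on the machine and the configured handlers, so they are not comparable to the timings reported above.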