New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Creating a Fiona Reader for shapefiles #1000
Conversation
lib/cartopy/io/shapereader.py
Outdated
self._data = [] | ||
|
||
with fiona.open(filename) as f: | ||
crs = f.crs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F841 local variable 'crs' is assigned to but never used
Awesome. Thanks for doing this @greglucas. I've long wanted to benefit from the performance that OGR/Fiona brings. Out of interest, did you consider making this a Feature? https://github.com/SciTools/cartopy/blob/master/lib/cartopy/feature.py#L147 How have you made use of this reader so far? If it is simply to be able to put it onto a matplotlib based GeoAxes, I'd be very tempted to look at following the Feature route. If it is to inspect and interact with the shapefile metadata you've put the code in the right place. If you haven't already, would you mind filling in the CLA at http://scitools.org.uk/governance.html#contributors. Thanks in advance. |
@pelson, The main reason I needed this was for plotting the fine scale (10m) ocean features faster. So, you're correct (for me at least) it is just being able to plot background/features on a GeoAxes. As for making it a Feature, I'm not sure what this would entail as there is currently no 'reader' within that class. The real benefit from this is in the reading of the shapefiles themselves to create the geometries. In #250 @shoyer mentioned using geopandas to get an order of magnitude speedup, which worked when I changed this line over to that. I've filled out the CLA now too. Thanks. |
It appears you forgot to set your git username for the first commit. There's also a conflicting file. Both could be fixed with a rebase. |
fccc548
to
b4976eb
Compare
Given the drop-in nature of the change, we would already be seeing the benefit downstream in all of the features that come from shapefiles. For that reason, I'm quite content for it to live where it does. I'm 👍 on this as it stands. |
Excellent, I also received the CLA form back a few days ago, but it is still blocked by the CLA checker. Is there anything I need to do to register myself here? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great. I'll add a what's new to the docs when releasing (unless somebody beats me to it).
When it comes to testing this, we should get away with being able to mirror the BasicReader's tests, and ensuring at least one of our test matrix entries includes Fiona.
Any other reviews/comments before I merge (probably be next week now)? |
I wasn't sure whether there is an option to run the test suite with Fiona installed and with it uninstalled as well, so I didn't add anything to the tests. If you have a way of running the tests on an install w/ and w/o Fiona I think that would be worthwhile for testing whether a specific reader is giving you the errors. I'll let you add into the What's New section during the next release. Looks good to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a nice addition. I left a comment/improvement, but I wouldn't hold this up for it.
lib/cartopy/io/shapereader.py
Outdated
:meth:`~Record.geometry` method. | ||
|
||
""" | ||
for i in range(len(self)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this be slightly cleaner and faster as:
for item in self._data:
yield item['geometry']
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dopplershift, Thanks for finding that! I've updated another spot below that as well. Too much copying from the BasicReader that was already implemented. Might want to refactor the design of the BasicReader to not require indexing in the future.
I was also struggling with the rebasing, but think I have it all updated now along with updating the "What's New" section.
lib/cartopy/io/shapereader.py
Outdated
{key: value for key, value in | ||
item.items() if key != 'geometry'}) | ||
|
||
if _HAS_FIONA: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E305 expected 2 blank lines after class or function definition, found 1
56eab4a
to
1c9e576
Compare
x_limits = self.x_limits | ||
y_limits = self.y_limits | ||
# Extend the limits a tiny amount to allow for precision mistakes | ||
epsilon = 1.e-10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how this is related to the intent of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is required to fix #403. See the very first comment in this chain where I discuss fixing two open issues with this PR.
This request fixes an issue with plotting 10m Oceans seen in #403. The issue is that the 10m and 110m ocean shapefiles go beyond the (-180, 180) limits within the Projection quick transform by ~1.e-12, so I added in an epsilon value of 1.e-10 to allow the quick transform to proceed as long as it is close. The 50m Oceans don't have this issue.
lib/cartopy/io/shapereader.py
Outdated
class FionaRecord(Record): | ||
""" | ||
A single logical entry from a shapefile, combining the attributes with | ||
their associated geometry. This is extends the standard Record to work |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is -> This
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have fixed the spelling too, and tried to rebase it into my master branch now.
@greglucas If you can rebase to resolve the conflicts, I'll merge this. |
…it is installed on the system already.
@dopplershift I think I've rebased appropriately now. It looks like some of the circleci builds are failing on conda installs though. |
Thanks for the hard work @greglucas! Sorry this hung out there so long. |
Thanks, @dopplershift! |
Amazing! Thanks for doing this @greglucas - this will be a major performance improvement. 👍 |
Not sure how best to pursue this but wanted to at least drop a quick note. This ended up changing the behavior of some code of mine. In one environment I had fiona installed and so records were read in from a shapefile with fiona and my code worked. Then I created a new environment from scratch to test my code, didn't have fiona installed, and my code didn't work anymore. Stumbling upon this issue is the only reason I was able to figure out the problem, which was super mysterious otherwise. Versions of packages had not changed nor had my shapefile so it looked completely random. No idea if this is a relatively unique problem or not. Is there a reasonable way to alert users to the importance of having fiona installed for the functionality of a call like |
@kthyng, when making this PR I attempted to stick as close to the standard reader as possible in what the functions return. My initial guess is that Fiona is a little more flexible in the inputs that it will allow/handle versus the standard Shapereader. Are you able to look at the geometries/records in the two different cases to see if there are even any geometries/shapes being produced with the basic reader? Fiona right now is just an optional dependency, so it really shouldn't be important whether someone has it or not from a results perspective, only a speed perspective, therefore this is a little perplexing. |
Yes sorry I didn't give more specific detail before (I was on my way out the door). In the I am not sure how to make a minimal working example with shapefiles or if only particular shapefiles would have this problem, but it is with the dataset available here: https://github.com/kthyng/harvey_inflow/tree/master/data/shapefiles/galv_ws_nad83_prj_albers_equal_area_conic_USGS And code to reproduce the difference is: fname = 'galv_ws_nad83_prj_albers_equal_area_conic_USGS'
basins = cartopy.io.shapereader.Reader(fname).records()
next(basins) With
Without |
I think it'd make sense to open up a new issue for this. cartopy/lib/cartopy/io/shapereader.py Lines 81 to 108 in f54a49f
A possible fix may be to call If you want to test it out on your data the change would be on lines 178/238: cartopy/lib/cartopy/io/shapereader.py Line 238 in f54a49f
|
Ok I can make an issue. I'll aim to do it in the next week. Thanks for your responses. |
This request fixes an issue with plotting 10m Oceans seen in #403. The issue is that the 10m and 110m ocean shapefiles go beyond the (-180, 180) limits within the Projection quick transform by ~1.e-12, so I added in an epsilon value of 1.e-10 to allow the quick transform to proceed as long as it is close. The 50m Oceans don't have this issue.
As mentioned in #403, even with this fix the 10m Oceans are still really slow because of making shapes initially. In #250 there is a request for using Fiona to read in shapefiles. I have essentially implemented the Geopandas Fiona reader to produce the same output as the current shapereader to try and provide continuity whether you have Fiona or not. There are now two readers, a BasicReader and FionaReader, that is set based on whether the Fiona import worked or not. I'm not sure this is the best way to go about this, so I would welcome feedback.
With these two changes I can load the 10m Ocean shapefile and plot them in several seconds.