Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creating a Fiona Reader for shapefiles #1000

Merged
merged 1 commit into from Jul 15, 2018

Conversation

greglucas
Copy link
Contributor

@greglucas greglucas commented Jan 9, 2018

This request fixes an issue with plotting 10m Oceans seen in #403. The issue is that the 10m and 110m ocean shapefiles go beyond the (-180, 180) limits within the Projection quick transform by ~1.e-12, so I added in an epsilon value of 1.e-10 to allow the quick transform to proceed as long as it is close. The 50m Oceans don't have this issue.

As mentioned in #403, even with this fix the 10m Oceans are still really slow because of making shapes initially. In #250 there is a request for using Fiona to read in shapefiles. I have essentially implemented the Geopandas Fiona reader to produce the same output as the current shapereader to try and provide continuity whether you have Fiona or not. There are now two readers, a BasicReader and FionaReader, that is set based on whether the Fiona import worked or not. I'm not sure this is the best way to go about this, so I would welcome feedback.

With these two changes I can load the 10m Ocean shapefile and plot them in several seconds.

self._data = []

with fiona.open(filename) as f:
crs = f.crs

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F841 local variable 'crs' is assigned to but never used

@pelson
Copy link
Member

pelson commented Jan 11, 2018

Awesome. Thanks for doing this @greglucas. I've long wanted to benefit from the performance that OGR/Fiona brings. Out of interest, did you consider making this a Feature? https://github.com/SciTools/cartopy/blob/master/lib/cartopy/feature.py#L147

How have you made use of this reader so far? If it is simply to be able to put it onto a matplotlib based GeoAxes, I'd be very tempted to look at following the Feature route. If it is to inspect and interact with the shapefile metadata you've put the code in the right place.

If you haven't already, would you mind filling in the CLA at http://scitools.org.uk/governance.html#contributors.

Thanks in advance.

@greglucas
Copy link
Contributor Author

@pelson, The main reason I needed this was for plotting the fine scale (10m) ocean features faster. So, you're correct (for me at least) it is just being able to plot background/features on a GeoAxes.

As for making it a Feature, I'm not sure what this would entail as there is currently no 'reader' within that class. The real benefit from this is in the reading of the shapefiles themselves to create the geometries. In #250 @shoyer mentioned using geopandas to get an order of magnitude speedup, which worked when I changed this line over to that.
https://github.com/SciTools/cartopy/blob/master/lib/cartopy/feature.py#L192
The reason I did it the current way is to just plug and play with the current storage of shapefile geometries/records without needing to worry about what reader you're using.

I've filled out the CLA now too. Thanks.

@QuLogic
Copy link
Member

QuLogic commented Jan 11, 2018

It appears you forgot to set your git username for the first commit. There's also a conflicting file. Both could be fixed with a rebase.

@greglucas greglucas force-pushed the master branch 2 times, most recently from fccc548 to b4976eb Compare January 12, 2018 04:06
@pelson pelson added the Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form label Jan 12, 2018
@greglucas
Copy link
Contributor Author

@QuLogic, I think I've got the commits fixed up with the right author now. However, if @pelson would rather have this in the Features section I'm willing to wait for ideas on that.

@pelson
Copy link
Member

pelson commented Jan 12, 2018

However, if @pelson would rather have this in the Features section I'm willing to wait for ideas on that.

Given the drop-in nature of the change, we would already be seeing the benefit downstream in all of the features that come from shapefiles. For that reason, I'm quite content for it to live where it does.

I'm 👍 on this as it stands.

@greglucas
Copy link
Contributor Author

Excellent, I also received the CLA form back a few days ago, but it is still blocked by the CLA checker. Is there anything I need to do to register myself here?

@pelson pelson added Type: Enhancement SemVer: minor Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form and removed Blocked: CLA needed See https://scitools.org.uk. Submit the form at: https://scitools.org.uk/cla/v4/form labels Jan 17, 2018
Copy link
Member

@pelson pelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. I'll add a what's new to the docs when releasing (unless somebody beats me to it).

When it comes to testing this, we should get away with being able to mirror the BasicReader's tests, and ensuring at least one of our test matrix entries includes Fiona.

@pelson
Copy link
Member

pelson commented Jan 17, 2018

Any other reviews/comments before I merge (probably be next week now)?

@greglucas
Copy link
Contributor Author

I wasn't sure whether there is an option to run the test suite with Fiona installed and with it uninstalled as well, so I didn't add anything to the tests. If you have a way of running the tests on an install w/ and w/o Fiona I think that would be worthwhile for testing whether a specific reader is giving you the errors.

I'll let you add into the What's New section during the next release.

Looks good to me.

Copy link
Contributor

@dopplershift dopplershift left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a nice addition. I left a comment/improvement, but I wouldn't hold this up for it.

:meth:`~Record.geometry` method.

"""
for i in range(len(self)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this be slightly cleaner and faster as:

for item in self._data:
    yield item['geometry']

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dopplershift, Thanks for finding that! I've updated another spot below that as well. Too much copying from the BasicReader that was already implemented. Might want to refactor the design of the BasicReader to not require indexing in the future.

I was also struggling with the rebasing, but think I have it all updated now along with updating the "What's New" section.

{key: value for key, value in
item.items() if key != 'geometry'})

if _HAS_FIONA:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E305 expected 2 blank lines after class or function definition, found 1

@greglucas greglucas force-pushed the master branch 3 times, most recently from 56eab4a to 1c9e576 Compare January 18, 2018 17:59
x_limits = self.x_limits
y_limits = self.y_limits
# Extend the limits a tiny amount to allow for precision mistakes
epsilon = 1.e-10
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this is related to the intent of this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is required to fix #403. See the very first comment in this chain where I discuss fixing two open issues with this PR.

This request fixes an issue with plotting 10m Oceans seen in #403. The issue is that the 10m and 110m ocean shapefiles go beyond the (-180, 180) limits within the Projection quick transform by ~1.e-12, so I added in an epsilon value of 1.e-10 to allow the quick transform to proceed as long as it is close. The 50m Oceans don't have this issue.

class FionaRecord(Record):
"""
A single logical entry from a shapefile, combining the attributes with
their associated geometry. This is extends the standard Record to work
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is -> This

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have fixed the spelling too, and tried to rebase it into my master branch now.

@dopplershift
Copy link
Contributor

@greglucas If you can rebase to resolve the conflicts, I'll merge this.

@greglucas
Copy link
Contributor Author

@dopplershift I think I've rebased appropriately now. It looks like some of the circleci builds are failing on conda installs though.

@dopplershift dopplershift merged commit c020372 into SciTools:master Jul 15, 2018
@dopplershift
Copy link
Contributor

Thanks for the hard work @greglucas! Sorry this hung out there so long.

@greglucas
Copy link
Contributor Author

Thanks, @dopplershift!

@pelson
Copy link
Member

pelson commented Jul 25, 2018

Amazing! Thanks for doing this @greglucas - this will be a major performance improvement. 👍

@kthyng
Copy link

kthyng commented Nov 12, 2019

Not sure how best to pursue this but wanted to at least drop a quick note.

This ended up changing the behavior of some code of mine. In one environment I had fiona installed and so records were read in from a shapefile with fiona and my code worked. Then I created a new environment from scratch to test my code, didn't have fiona installed, and my code didn't work anymore. Stumbling upon this issue is the only reason I was able to figure out the problem, which was super mysterious otherwise. Versions of packages had not changed nor had my shapefile so it looked completely random. No idea if this is a relatively unique problem or not.

Is there a reasonable way to alert users to the importance of having fiona installed for the functionality of a call like
reader = cartopy.io.shapereader.Reader(fname)?

@greglucas
Copy link
Contributor Author

@kthyng, when making this PR I attempted to stick as close to the standard reader as possible in what the functions return. My initial guess is that Fiona is a little more flexible in the inputs that it will allow/handle versus the standard Shapereader. Are you able to look at the geometries/records in the two different cases to see if there are even any geometries/shapes being produced with the basic reader?

Fiona right now is just an optional dependency, so it really shouldn't be important whether someone has it or not from a results perspective, only a speed perspective, therefore this is a little perplexing.

@kthyng
Copy link

kthyng commented Nov 13, 2019

Yes sorry I didn't give more specific detail before (I was on my way out the door).

In the fiona case, the records in my shapefile are read in as shapely Polygons, but without fiona they are read in as shapely MultiPolygons. They are just single polygons so Polygon seems to be the correct thing.

I am not sure how to make a minimal working example with shapefiles or if only particular shapefiles would have this problem, but it is with the dataset available here: https://github.com/kthyng/harvey_inflow/tree/master/data/shapefiles/galv_ws_nad83_prj_albers_equal_area_conic_USGS

And code to reproduce the difference is:

fname = 'galv_ws_nad83_prj_albers_equal_area_conic_USGS'
basins = cartopy.io.shapereader.Reader(fname).records()
next(basins)

With fiona, the result is

<Record: <shapely.geometry.polygon.Polygon object at 0x12276dc88>, {'WS_ID': 10030, 'area': 1013.3, 'WS_type': 'Gauged (Lake Houston)'}, >

Without fiona is it the same except for it is a shapely Multipolygon object.

@greglucas
Copy link
Contributor Author

I think it'd make sense to open up a new issue for this.
It looks like the issue is probably somewhere in the _make_geometry routines. At first glance, it looks like in the BasicReader implementation every polygon will be a multi-polygon, even if a shape is only a polygon.

def _create_polygon(shape):
if not shape.points:
return sgeom.MultiPolygon()
# Partition the shapefile rings into outer rings/polygons (clockwise) and
# inner rings/holes (anti-clockwise).
parts = list(shape.parts) + [None]
bounds = zip(parts[:-1], parts[1:])
outer_polygons_and_holes = []
inner_polygons = []
for lower, upper in bounds:
polygon = sgeom.Polygon(shape.points[slice(lower, upper)])
if polygon.exterior.is_ccw:
inner_polygons.append(polygon)
else:
outer_polygons_and_holes.append((polygon, []))
# Find the appropriate outer ring for each inner ring.
# aka. Group the holes with their containing polygons.
for inner_polygon in inner_polygons:
for outer_polygon, holes in outer_polygons_and_holes:
if outer_polygon.contains(inner_polygon):
holes.append(inner_polygon.exterior.coords)
break
polygon_defns = [(outer_polygon.exterior.coords, holes)
for outer_polygon, holes in outer_polygons_and_holes]
return sgeom.MultiPolygon(polygon_defns)

A possible fix may be to call sgeom.shape in the BasicReader rather than_make_geometry, but I don't know why those routines are there and if they are needed for anything else.

If you want to test it out on your data the change would be on lines 178/238:

yield _make_geometry(geometry_factory, shape)

@kthyng
Copy link

kthyng commented Nov 13, 2019

Ok I can make an issue. I'll aim to do it in the next week. Thanks for your responses.

@greglucas
Copy link
Contributor Author

@kthyng Actually, I just realized I had already implemented this and it is in master (no wonder it seemed familiar). Unfortunately, it was just after the previous release it looks like, so you would have to build from master to get this capability.
See: #1210
and issue #1217

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants