Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple input sources #2

Merged
merged 15 commits into from Apr 11, 2019

Conversation

ian-r-rose
Copy link
Collaborator

Fixes #1.

Still need to add a version for PostGIS and write a bunch of tests, but I wanted to get your feedback sooner rather than later @jacobtomlinson. What do you think?

@martindurant
Copy link
Member

I wouldn't bother explicitly labelling the superclass as abstract, since you might find later you want to have a geopandas-specific container (as opposed to general dataframe) and this would be the obvious place to do it.

@ian-r-rose
Copy link
Collaborator Author

I'm not sure I understand @martindurant, can you elaborate? My thought here was that these DataSources were per-data-source, and the GeoPandasSource was deferring how to load from different sources to its subclasses. This seems somewhat orthogonal to whether to elevate a GeoDataFrame to its own container-type.

@martindurant
Copy link
Member

They are indeed two separate issues, and you are right in your understanding - I was just thinking that the superclass could also be the container class. Or maybe not - just a suggestion. I just personally never actually label a class explicitly as abstract, but maybe that's just my style.

@ian-r-rose
Copy link
Collaborator Author

Got it, thanks. I'm spending some significant time in Python right now after a long-ish time in Javascript, so I'm still playing around with style choices : )

@ian-r-rose
Copy link
Collaborator Author

ian-r-rose commented Apr 9, 2019

ping @jacobtomlinson any thoughts here?

Copy link
Collaborator

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, I'd left it as the title still says WIP.

This looks awesome! Thanks for taking the time to do this.

The only issue I've seen is the missing name for the GeoPandasSQLSource.

There are a few lint issues that have been introduced, but I doubt it was correct before anyway. Feel free to optionally correct those.

class ShapefileSource(GeoPandasFileSource):
name="shapefile"

class GeoPandasSQLSource(GeoPandasSource):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing name attribute.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intention here was that this not actually be instantiated, and that users either select a postgis or sptialite source. It may end up being a distinction without a difference, but I kind of liked being more explicit about the source type. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to make sure that the correct classes and names end up in intake.registry. That could be as simple as not exposing classes you don't intend for end-users in the top level of the package.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be discovered by the registry as intended. The GeoPandasSQLSource is not exported in the __init__.py above.

intake_geopandas/geopandas.py Show resolved Hide resolved
@ian-r-rose
Copy link
Collaborator Author

Thanks for taking a look @jacobtomlinson! I still have it marked as WIP as I had hoped to get some tests working for Spatialite at least. I've had some trouble getting that working in CI, though.

@ian-r-rose
Copy link
Collaborator Author

Okay, this is ready for another look. I've added a couple of SpatiaLite tests that are based off of ones in GeoPandas. Unfortunately, they rely on bugfixes that are not in a published version, but I have verified that they pass locally using geopandas master, and marked them as xfail for now.

@ian-r-rose ian-r-rose changed the title [WIP] Multiple input sources Multiple input sources Apr 10, 2019
Copy link
Collaborator

@jacobtomlinson jacobtomlinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Thanks for this.

@jacobtomlinson jacobtomlinson merged commit 42b31d8 into intake:master Apr 11, 2019
@jacobtomlinson
Copy link
Collaborator

Released in 0.2.0

@ian-r-rose ian-r-rose deleted the multiple-input-sources branch April 11, 2019 15:21
@ian-r-rose
Copy link
Collaborator Author

Thanks for the review and quick publish!

@martindurant
Copy link
Member

Does this mean that intake-dcat can now be released?

@ian-r-rose
Copy link
Collaborator Author

I am waiting on dask/dask#4634, which is required for it to be able to access datasets from Socrata sites. Then I plan to publish a quick release to PyPI. It would be useful if we could also publish intake_geopandas there as well.

@martindurant
Copy link
Member

Yes, please, to all of that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants