Add basic support for remote resources to read_file
#531
Tested with GeoJSON files.
This might need more work, let me know if you have any suggestions! (Started at EuroScipy 2017 sprint, see also the related issue #441) |
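For readers following along, here is a minimal sketch of how a reader function might tell a remote resource from a local path and fetch the remote bytes first. It is not the code in this PR: the names `REMOTE_SCHEMES`, `is_url` and `resolve_source` are hypothetical, and the set of schemes is an assumption.

```python
from urllib.parse import urlparse
from urllib.request import urlopen

# Schemes treated as remote; this set is an assumption for the sketch.
REMOTE_SCHEMES = {"http", "https", "ftp"}


def is_url(path):
    """Return True if `path` is a string with one of the remote schemes."""
    if not isinstance(path, str):
        return False
    try:
        return urlparse(path).scheme in REMOTE_SCHEMES
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. a broken IPv6 host.
        return False


def resolve_source(path_or_url):
    """Return downloaded bytes for a URL, or the path unchanged otherwise."""
    if is_url(path_or_url):
        with urlopen(path_or_url) as response:
            return response.read()
    return path_or_url
```

In a sketch like this, the returned bytes could be handed to fiona's `BytesCollection`, while a local path would still go to `fiona.open`.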
I've added a custom marker for
and
Maybe something similar is already set up?
Codecov Report
@@ Coverage Diff @@
## master #531 +/- ##
==========================================
- Coverage 93.73% 93.66% -0.07%
==========================================
Files 14 14
Lines 1037 1058 +21
==========================================
+ Hits 972 991 +19
- Misses 65 67 +2
Related: #464 |
geopandas/io/file.py
Outdated
be opened and *kwargs* are keyword args to be passed to the `open` | ||
or (`BytesCollection`) method in the fiona library when opening the file. | ||
For more information on possible keywords, type: | ||
``import fiona; help(fiona.open)`` |
I think it would be valuable to expand the docs a bit here now with this added functionality. In the from_file method on a geodataframe, which wraps read_file, there's a basic example. Something like that would be useful here. Additionally, putting it in numpy style would help with the effort to make the formatting of the docs consistent.
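As an illustration of the numpy-style layout suggested above, a docstring could be shaped roughly like this. The wrapper name `read_file_documented` is hypothetical and exists only to carry the docstring; the wording of each section is an assumption, not the text that was merged.

```python
def read_file_documented(filename, **kwargs):
    """
    Return a GeoDataFrame read from a file or URL.

    Hypothetical wrapper used only to illustrate the docstring layout.

    Parameters
    ----------
    filename : str
        Either the absolute or relative path to the file, or a URL.
    **kwargs
        Keyword arguments passed on to fiona when opening the file.

    Returns
    -------
    geopandas.GeoDataFrame

    Examples
    --------
    >>> df = read_file_documented("nybb.shp")
    """
    # Imported lazily so the sketch can be loaded without geopandas installed.
    import geopandas

    return geopandas.read_file(filename, **kwargs)
```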
I was trying to add a URL example as well but I can't find a simple one with a short URL - I think I'll add something to the docs which then could be served via http://geopandas.org/ |
Just fixed some syntax in the docs, should be ok now. |
doc/source/io.rst
Outdated
|
||
gpd.read_file("https://d2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_land.geojson") | ||
url = "http://geopandas.orgd2ad6b4ur7yvpq.cloudfront.net/naturalearth-3.3.0/ne_110m_land.geojson" | ||
gpd.read_file(url) |
The url you changed it to isn't valid; it also isn't quite as clear anymore with the preceding sentence about reading a GeoJSON file from geojson.xyz (which is the case for the original url, though given that geojson.xyz doesn't appear in the url anywhere, it is not totally transparent then either).
Whoops, I had started thinking about using a "geopandas.org" URL, then realized it would probably not be a nice URL either and concentrated on fixing the syntax errors.
I amended the commit and all should now work again!
Did you push the change? I am still seeing the URL with geopandas in the online diff.
Just did, Monday morning ... (thanks for the heads-up!)
Looks good!
Added a few minor comments
doc/source/io.rst
Outdated
@@ -19,6 +19,12 @@ Any arguments passed to ``read_file()`` after the file name will be passed direc | |||
|
|||
Among other things, one can explicitly set the driver (shapefile, GeoJSON) with the ``driver`` keyword, or pick a single layer from a multi-layered file with the ``layer`` keyword. | |||
|
|||
Where supported in ``fiona`` *geopandas* can also load resources directly from |
I would put a comma between fiona and geopandas
Done.
geopandas/io/file.py
Outdated
|
||
Examples | ||
-------- | ||
>>> df = geopandas.read_file("nybb.shp") |
no indentation is needed here
Done.
geopandas/io/tests/test_io.py
Outdated
@@ -54,6 +54,13 @@ def test_read_file(self): | |||
lower_columns = [c.lower() for c in self.columns] | |||
assert (df.columns[:-1] == lower_columns).all() | |||
|
|||
@pytest.mark.webtest |
maybe just "web" instead of "webtest" ?
But it's fine to add a mark like this, we don't have anything set up like this yet.
Good suggestion, had just taken it from the py.test example in their docs.
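For reference, a `conftest.py` along the lines of the pytest documentation example could register a `web` marker and skip those tests unless they are requested. This is a sketch, not what the PR sets up: the `--run-web` option name and the marker description are assumptions.

```python
def pytest_addoption(parser):
    # Hypothetical opt-in flag; the name "--run-web" is an assumption.
    parser.addoption(
        "--run-web",
        action="store_true",
        default=False,
        help="run tests that need network access",
    )


def pytest_configure(config):
    # Register the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "web: tests that need network access")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--run-web"):
        return  # web tests were requested, leave them untouched
    # Imported here so the two hooks above need no pytest import.
    import pytest

    skip_web = pytest.mark.skip(reason="needs --run-web option to run")
    for item in items:
        if "web" in item.keywords:
            item.add_marker(skip_web)
```

With this in place, a plain `pytest` run would skip tests decorated with `@pytest.mark.web`, and `pytest --run-web` would include them.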
Thanks @rgieseke! |
Thanks @jdmcbr and @jorisvandenbossche ! |
* Add basic support for remote resources to `read_file`. Tested with GeoJSON files.
* TST: Use GeoJSON file from GeoPandas repo
* Switch to numpydoc style
* Fix syntax in docs
* Formatting and rename test marker
(cherry picked from commit a43d166)
When will the PyPI version of geopandas include this feature? I installed from the master branch and was able to load the remote file. However, I couldn't read the remote file with the PyPI version. Also, how feasible is it to add an optional argument for a cert file to handle requests to an https file?
On the
|
Always above 2.7.9 and up to 3.6. Anaconda versions. |
I might have misunderstood your use case. Do you want to use a local cert file?
Yes, you are correct. This is the requests equivalent:
import requests
cert = <path to cert>
r = requests.get(url, cert=cert, verify=False)
Documented here: http://docs.python-requests.org/en/master/user/advanced/#client-side-certificates
If we can pass a
There has recently been some discussion about this on the pandas issue tracker: pandas-dev/pandas#16716 (and subsequent PRs). That could serve as inspiration, or we could reuse what comes out of it.
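To make the client-certificate idea concrete, here is a rough standard-library sketch of fetching a URL while presenting a local cert file. Nothing like this exists in geopandas as far as this thread shows; `make_client_context` and `fetch_bytes` are hypothetical names. Unlike the requests snippet above with `verify=False`, this sketch leaves server verification switched on.

```python
import ssl
from urllib.request import urlopen


def make_client_context(certfile=None, keyfile=None):
    """Build an SSL context, optionally presenting a client certificate."""
    # The default context verifies the server certificate and hostname.
    context = ssl.create_default_context()
    if certfile is not None:
        # keyfile may be None when the private key is stored in certfile.
        context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    return context


def fetch_bytes(url, certfile=None, keyfile=None):
    """Download `url` and return its content as bytes."""
    context = make_client_context(certfile, keyfile)
    with urlopen(url, context=context) as response:
        return response.read()
```

The returned bytes could then be passed to whatever in-memory reader the file loader uses for remote resources.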