New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Beginnings of s3 support #426

Merged
merged 18 commits into from Nov 14, 2017

Conversation

Projects
None yet
6 participants
@youngpm
Contributor

youngpm commented Mar 4, 2017

@sgillies Here's the beginnings of some progress on #425. You can open a Collection with vsi=http or vsi=zip+http and it all works. I'm pointing at remote files in gdal's SVN repo in the couple of tests I added, not totally sure how you want to handle tests that live off the machine, particularly when I get to the S3 stuff.

Before I started, I made the test I wrote in #423 conform to the fixture style used in master (1st commit) and I had a test that was failing for me that I fixed (2nd commit).

Let me know what you think and I'll continue chipping away.

@youngpm

This comment has been minimized.

Show comment
Hide comment
@youngpm

youngpm Mar 6, 2017

Contributor

My http tests are failing because travis is sandboxing the http calls, not quite sure what to do about it. I tried serving it locally via the pytest-localhttp plugin,

def test_open_zip_http_local(httpserver, bytes_coutwildrnp_zip):
    httpserver.serve_content(bytes_coutwildrnp_zip)
    ds = fiona.open('zip+' + httpserver.url)

but no luck getting that to read, even from ogr. I suspect it doesn't recognize its a zip, I can curl from localhost when its serving just fine and upzip away.

Contributor

youngpm commented Mar 6, 2017

My http tests are failing because travis is sandboxing the http calls, not quite sure what to do about it. I tried serving it locally via the pytest-localhttp plugin,

def test_open_zip_http_local(httpserver, bytes_coutwildrnp_zip):
    httpserver.serve_content(bytes_coutwildrnp_zip)
    ds = fiona.open('zip+' + httpserver.url)

but no luck getting that to read, even from ogr. I suspect it doesn't recognize its a zip, I can curl from localhost when its serving just fine and upzip away.

@sgillies

This comment has been minimized.

Show comment
Hide comment
@sgillies

sgillies Mar 6, 2017

Member

@youngpm I tried fio info zip+http://svn.osgeo.org/gdal/trunk/autotest/ogr/data/poly.zip and got a result that looks great. This file is smaller than the VSI chunk size, however, and doesn't demonstrate any benefits other than not writing to disk. I've uploaded Fiona's coutwildrnp.zip to the Mapbox bucket and you're welcome to use that. If you do

CPL_CURL_VERBOSE=1 fio info zip+https://s3.amazonaws.com/mapbox/rasterio/coutwildrnp.zip

You'll see that it makes 6 partial GET requests for a total of 88470 bytes, about half of the zip file's total bytes.

I'll give the code a closer look now.

Member

sgillies commented Mar 6, 2017

@youngpm I tried fio info zip+http://svn.osgeo.org/gdal/trunk/autotest/ogr/data/poly.zip and got a result that looks great. This file is smaller than the VSI chunk size, however, and doesn't demonstrate any benefits other than not writing to disk. I've uploaded Fiona's coutwildrnp.zip to the Mapbox bucket and you're welcome to use that. If you do

CPL_CURL_VERBOSE=1 fio info zip+https://s3.amazonaws.com/mapbox/rasterio/coutwildrnp.zip

You'll see that it makes 6 partial GET requests for a total of 88470 bytes, about half of the zip file's total bytes.

I'll give the code a closer look now.

@sgillies

This comment has been minimized.

Show comment
Hide comment
@sgillies

sgillies Mar 6, 2017

Member

@youngpm your changes look good to me. I don't see anything I'd do differently.

Unless pytest-localhttp supports HEAD and partial GET requests, it won't be any use in testing this feature of GDAL. I don't quite understand what you mean by sandboxing. In testing Rasterio, we're able to make HTTPS requests without problems.

Member

sgillies commented Mar 6, 2017

@youngpm your changes look good to me. I don't see anything I'd do differently.

Unless pytest-localhttp supports HEAD and partial GET requests, it won't be any use in testing this feature of GDAL. I don't quite understand what you mean by sandboxing. In testing Rasterio, we're able to make HTTPS requests without problems.

@youngpm

This comment has been minimized.

Show comment
Hide comment
@youngpm

youngpm Mar 6, 2017

Contributor

@sgillies Thanks! I was thinking of the sandboxing because I am seeing this in Travis,

fiona.errors.FionaValueError: No dataset found at path '/vsizip/vsicurl/http://svn.osgeo.org/gdal/trunk/autotest/ogr/data/poly.zip' using drivers: *

which works locally for both of us. I'll take a closer look at what rasterio is doing! Should have time this week to keep at it, excited to get this in there!

Contributor

youngpm commented Mar 6, 2017

@sgillies Thanks! I was thinking of the sandboxing because I am seeing this in Travis,

fiona.errors.FionaValueError: No dataset found at path '/vsizip/vsicurl/http://svn.osgeo.org/gdal/trunk/autotest/ogr/data/poly.zip' using drivers: *

which works locally for both of us. I'll take a closer look at what rasterio is doing! Should have time this week to keep at it, excited to get this in there!

youngpm added some commits Mar 4, 2017

Make bytescollection tests use fixtures.
One test wasn't, now it does.
test_fio_ls_multi_layer test failure corrected.
The test that fio ls returned the layers depended on the order of the
layers, but on this system (Ubuntu 16.04, Python 3.5) the order was
reversed.
Beginnings of vsicurl support.
We can now open remote files via vsicurl, including remote zip files.
fiona.open can now open http uris.
Works for http + zip and presumably other combos too.
Initial S3 support.
E.G. fio info "s3://your-bucket/your-key" pointing at something OGR can
read should work.
@youngpm

This comment has been minimized.

Show comment
Hide comment
@youngpm

youngpm Mar 8, 2017

Contributor

@sgillies The latest commit should let you do (after you install boto3, didn't update the dependencies yet!)

fio info s3://your-bucket/your-key

It was surprisingly easy to enable this! I haven't put any tests in around this yet.

In rasterio, the env stuff is doing a lot more, I just subclassed GDALEnv, and I did it in sort of a weird place. Do you want to expose all the aws overrides like in rasterio? I personally like forcing folks to pull it from envvars/iam roles/creds file rather than directly specifying it to avoid credentials accidentally getting committed. Let me know what you think.

Contributor

youngpm commented Mar 8, 2017

@sgillies The latest commit should let you do (after you install boto3, didn't update the dependencies yet!)

fio info s3://your-bucket/your-key

It was surprisingly easy to enable this! I haven't put any tests in around this yet.

In rasterio, the env stuff is doing a lot more, I just subclassed GDALEnv, and I did it in sort of a weird place. Do you want to expose all the aws overrides like in rasterio? I personally like forcing folks to pull it from envvars/iam roles/creds file rather than directly specifying it to avoid credentials accidentally getting committed. Let me know what you think.

@sgillies

@youngpm I'm happy with the way you've done this. I'm okay with leaving out the overrides for now, though I may add them in the future. If I get temporary credentials from STS, for example, I'd like to be able to plug them directly in instead of writing to the credentials file or modifying environment variables.

Show outdated Hide outdated fiona/_drivers.pyx
@youngpm

This comment has been minimized.

Show comment
Hide comment
@youngpm

youngpm Mar 17, 2017

Contributor

Sorry I've been out most of the week, but should have time to refine this weekend.

Contributor

youngpm commented Mar 17, 2017

Sorry I've been out most of the week, but should have time to refine this weekend.

@Juanlu001

This comment has been minimized.

Show comment
Hide comment
@Juanlu001

Juanlu001 Aug 22, 2017

Contributor

This would be a great addition to Fiona! 👏 cc @slava-kerner @luciotorre

Contributor

Juanlu001 commented Aug 22, 2017

This would be a great addition to Fiona! 👏 cc @slava-kerner @luciotorre

@sgillies sgillies added this to the 1.8.0 milestone Nov 5, 2017

@sgillies sgillies added the enhancement label Nov 5, 2017

@sgillies

This comment has been minimized.

Show comment
Hide comment
@sgillies

sgillies Nov 5, 2017

Member

@youngpm I'm making a plan for the 1.8.0 release and would love to get this in. Any chance you could resolve the conflict above?

Member

sgillies commented Nov 5, 2017

@youngpm I'm making a plan for the 1.8.0 release and would love to get this in. Any chance you could resolve the conflict above?

@youngpm

This comment has been minimized.

Show comment
Hide comment
@youngpm

youngpm Nov 6, 2017

Contributor

@sgillies yeah I'll get on it today :)

Contributor

youngpm commented Nov 6, 2017

@sgillies yeah I'll get on it today :)

@youngpm

This comment has been minimized.

Show comment
Hide comment
@youngpm

youngpm Nov 7, 2017

Contributor

@sgillies I think this is ready, but the test failures in Travis are confusing me. The network tests work fine for me locally but are failing in Travis. The appveyor ones are even weirder, not sure whats going on there. Any thoughts?

Contributor

youngpm commented Nov 7, 2017

@sgillies I think this is ready, but the test failures in Travis are confusing me. The network tests work fine for me locally but are failing in Travis. The appveyor ones are even weirder, not sure whats going on there. Any thoughts?

@youngpm

This comment has been minimized.

Show comment
Hide comment
@youngpm
Contributor

youngpm commented Nov 7, 2017

@sgillies

This comment has been minimized.

Show comment
Hide comment
@sgillies

sgillies Nov 10, 2017

Member

@youngpm I'll take a close look at the build on Travis tomorrow.

Member

sgillies commented Nov 10, 2017

@youngpm I'll take a close look at the build on Travis tomorrow.

@sgillies

This comment has been minimized.

Show comment
Hide comment
@sgillies

sgillies Nov 11, 2017

Member

@youngpm we've got a sufficient GDAL build, the only problem is that we'd cached an insufficient build and that was getting picked up instead. Once I deleted Travis's master branch, all the curl tests pass 🎉 I'm sorry I didn't think of this earlier.

Member

sgillies commented Nov 11, 2017

@youngpm we've got a sufficient GDAL build, the only problem is that we'd cached an insufficient build and that was getting picked up instead. Once I deleted Travis's master branch, all the curl tests pass 🎉 I'm sorry I didn't think of this earlier.

@sgillies

This comment has been minimized.

Show comment
Hide comment
@sgillies

sgillies Nov 11, 2017

Member

Nice work!

$ fio cat zip+s3://mapbox/rasterio/coutwildrnp.zip | fio collect | geojsonio

untitled

Member

sgillies commented Nov 11, 2017

Nice work!

$ fio cat zip+s3://mapbox/rasterio/coutwildrnp.zip | fio collect | geojsonio

untitled

's3': ['boto3>=1.2.4'],
'test': ['pytest>=3', 'pytest-cov', 'boto3>=1.2.4', 'packaging'],
}

This comment has been minimized.

@sgillies

sgillies Nov 11, 2017

Member

Great!

@sgillies

sgillies Nov 11, 2017

Member

Great!

@sgillies

@youngpm thank you! 1.8.0 is going to 🚀

I'm requesting a few small changes.

@youngpm

This comment has been minimized.

Show comment
Hide comment
@youngpm

youngpm Nov 13, 2017

Contributor

@sgillies thanks for the review! I'll clean this up tonight!

Contributor

youngpm commented Nov 13, 2017

@sgillies thanks for the review! I'll clean this up tonight!

@coveralls

This comment has been minimized.

Show comment
Hide comment
@coveralls

coveralls Nov 14, 2017

Coverage Status

Coverage increased (+0.6%) to 76.489% when pulling f121726 on youngpm:s3-support into add7751 on Toblerity:master.

coveralls commented Nov 14, 2017

Coverage Status

Coverage increased (+0.6%) to 76.489% when pulling f121726 on youngpm:s3-support into add7751 on Toblerity:master.

@sgillies sgillies merged commit 22c7389 into Toblerity:master Nov 14, 2017

2 of 3 checks passed

continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage increased (+0.6%) to 76.489%
Details

@AndrewAnnex AndrewAnnex referenced this pull request Jan 16, 2018

Closed

cut release of 1.8a2? #534

else:
self.env = GDALEnv()
self.env = AWSGDALEnv()

This comment has been minimized.

@QuLogic

QuLogic Sep 25, 2018

Contributor

Is this right? Both sides of the if are the same.

I think these lines are forcing boto3 to be required even if not opening an s3 link.

@QuLogic

QuLogic Sep 25, 2018

Contributor

Is this right? Both sides of the if are the same.

I think these lines are forcing boto3 to be required even if not opening an s3 link.

This comment has been minimized.

@sgillies

sgillies Sep 25, 2018

Member

@QuLogic please see #629 for the current work on cloud storage support in Fiona.

@sgillies

sgillies Sep 25, 2018

Member

@QuLogic please see #629 for the current work on cloud storage support in Fiona.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment