How to extract features from in-memory zip files? #318

Closed
huevosabio opened this issue Jan 22, 2016 · 8 comments

@huevosabio

Hi,

I have a Flask application in which, at a certain point, the user uploads a zip file containing all the required shapefiles and accompanying files for the project. Currently, I extract the zip file to the server's file system and fetch the features using Fiona. However, I would like to avoid using the server's file system. In Python, I can use ZipFile to work with the archive's files entirely in memory. In Fiona, I know about the Virtual File System (VFS) support, but it requires a full path on the server, which would still mean using the file system.

Any ideas?

Thank you!
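
For reference, the in-memory ZipFile handling described above can be sketched with the standard library alone. This is a minimal sketch, not the poster's code: the helper names `list_zip_members` and `build_demo_zip` are hypothetical, and the archive members hold placeholder bytes rather than real shapefile data. In a Flask view the bytes would presumably come from the uploaded file object's `read()` call.

```python
import io
import zipfile


def list_zip_members(zip_bytes):
    """Return the member names of a zip archive held entirely in memory."""
    # BytesIO gives ZipFile a seekable file-like object, so nothing touches disk.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.namelist()


def build_demo_zip():
    """Build a small zip in memory (placeholder contents, not real shapefile data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in ("poly.shp", "poly.shx", "poly.dbf"):
            archive.writestr(name, b"placeholder")
    return buffer.getvalue()


demo_bytes = build_demo_zip()
print(list_zip_members(demo_bytes))
```

This only shows that the archive can be inspected without the file system; getting the features into Fiona from those bytes is the open question of this issue.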

@sgillies
Member

Fiona has BytesCollection (https://github.com/Toblerity/Fiona/blob/master/fiona/collection.py#L422), which works for bytes read from a GeoJSON file, and in https://svn.osgeo.org/gdal/trunk/autotest/gcore/vsizip.py I see hints that in-memory zip files should be possible. So far, my attempts to read the bytes of a zipped shapefile and pass them into BytesCollection aren't panning out.

This would be a nice feature, I agree. It'll take a bit of digging into http://www.gdal.org/cpl__vsi_8h.html to make it work.

@huevosabio
Author

Thanks for the quick response. I'll look into those links and see if I can come up with something once I have time. I'll work around it using the file system for the moment.

@sgillies
Member

A port of Rasterio's MemoryFile is the thing: it'll make this quite easy and fun to use.

@lpinner

lpinner commented Jan 24, 2017

@sgillies I was playing around with extracting data from zipped shapefiles following this GIS-SE question that you responded to. I got it working, but it seems to be a limitation in GDAL (<=2.1) that the vsizip (and vsitar) handler requires a '.zip' ('.tar') extension.

From cpl_vsi.h doco:

Starting with GDAL 2.2, an alternate syntax is available so as to enable chaining and not being dependent on .zip extension

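import requests

import fiona
from osgeo import gdal

# Download the zipped shapefile into memory
request = requests.get('https://github.com/OSGeo/gdal/blob/trunk/autotest/ogr/data/poly.zip?raw=true')
# Map the bytes to a GDAL in-memory (virtual) file and get its path
vsif = fiona.ogrext.buffer_to_virtual_file(bytes(request.content))
# GDAL <= 2.1 needs a '.zip' extension for the vsizip handler, so rename the file
vsiz = vsif + '.zip'
gdal.Rename(vsif, vsiz)

with fiona.Collection(vsiz, vsi='zip', layer='poly') as poly:
    for p in poly:
        print(p)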
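
For comparison, the two path forms can be built as plain strings. In the cpl_vsi.h documentation the alternate GDAL 2.2 syntax wraps the archive path in braces: /vsizip/{/path/to/the/archive}/path/inside/the/zip/file. The sketch below only constructs the strings and does not call GDAL; the helper name `vsizip_path` and the `/vsimem/upload` paths are hypothetical, chosen for illustration.

```python
def vsizip_path(archive_path, inner_path="", chained=False):
    """Build a GDAL /vsizip/ path string (illustrative helper, not part of Fiona or GDAL).

    chained=False: classic form, where the archive name must end in '.zip'
                   for GDAL <= 2.1.
    chained=True:  GDAL >= 2.2 brace form, which does not depend on the extension.
    """
    if chained:
        path = "/vsizip/{" + archive_path + "}"
    else:
        path = "/vsizip/" + archive_path
    if inner_path:
        # Append the path of the member inside the archive.
        path += "/" + inner_path.lstrip("/")
    return path


# Classic form: an absolute archive path yields a double slash after 'vsizip'.
print(vsizip_path("/vsimem/upload.zip", "poly.shp"))
# Brace form: the in-memory file needs no '.zip' suffix.
print(vsizip_path("/vsimem/upload", "poly.shp", chained=True))
```

With the brace form, the rename step in the workaround above should not be needed on GDAL 2.2 or later, though that is untested here.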

@sgillies
Member

@lpinner thanks for the pointer! I think I see an easy solution for version 1.7.2. Further down the road, Fiona is going to go in a different direction, but making BytesCollection more useful is a stepping stone.

@sgillies
Member

@lpinner here's the new usage I'm proposing for 1.7.2:

with open('tests/data/coutwildrnp.zip', 'rb') as src:
    zip_file_bytes = src.read()
 
with fiona.BytesCollection(zip_file_bytes) as col:
    print(col.name)

# Printed:
# 'coutwildrnp'

If there are multiple layers in the zipped bytes you can access them using the layer keyword argument.

I only want to do this for zip for now. Other archive formats don't work as well with GDAL/OGR and I don't see them much in the wild.
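
To find candidate values for the layer keyword argument, the zipped bytes can be inspected with the standard library first. This is a sketch under one loud assumption: that each layer is named after the base name of a .shp member, which matches the `layer='poly'` used with poly.zip earlier in this thread but is not confirmed here for every archive. The helper names `shapefile_layer_names` and `build_demo_zip` are hypothetical.

```python
import io
import posixpath
import zipfile


def shapefile_layer_names(zip_bytes):
    """List base names of .shp members in zipped bytes.

    Assumption: a layer is named after its .shp file's base name.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
    layers = []
    for name in names:
        # Zip member names use forward slashes regardless of platform.
        stem, ext = posixpath.splitext(posixpath.basename(name))
        if ext.lower() == ".shp":
            layers.append(stem)
    return sorted(layers)


def build_demo_zip():
    """Build an in-memory zip with placeholder members (not real shapefile data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in ("roads.shp", "roads.dbf", "data/parcels.SHP", "readme.txt"):
            archive.writestr(name, b"placeholder")
    return buffer.getvalue()


for layer_name in shapefile_layer_names(build_demo_zip()):
    # With real data and Fiona >= 1.7.2, each name could then be tried as:
    #     fiona.BytesCollection(zip_file_bytes, layer=layer_name)
    print(layer_name)
```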

sgillies added a commit that referenced this issue Jan 25, 2017
* Add extension to VSI filenames so we can access zipped bytes

Resolves #318

* Fix for nosetests

* Don't use os.path.join because of Windows

* Remove superfluous layer arg
@sgillies
Member

sgillies commented Jan 26, 2017

From @atlefren on Twitter https://twitter.com/atlefren/status/824630112834445312, a related question:

Does Fiona only read files, or is it possible to give it an open file stream?

You can use BytesCollection in Fiona 1.7.1, but not for zipped streams, which means you're limited to single-file formats (GeoJSON, etc). No zipped shapefile(s) support until 1.7.2.

It's actually GDAL that's limited to reading files, though it has a neat and very flexible virtual filesystem. Fiona takes the bytes you pass to BytesCollection and maps them into a GDAL in-memory file, which is then opened by OGR to read features.

@sgillies
Member

Closed in maint-1.7, not yet merged to master.
