Add support for opening a dataset memory without writing it to disc #99

Open
labarababa opened this issue Aug 13, 2019 · 5 comments
Labels
enhancement New feature or request prioriy - low Low priority issue

Comments

@labarababa

Is it possible to read a Dataset from a gzip-compressed file without writing to disc?

This works like a charm:

import cfgrib
import gzip
import shutil

with gzip.open("some_file", "rb") as f_in:
    with open("tmp", "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
grib_files = cfgrib.open_datasets("tmp", backend_kwargs={"indexpath": ""})

I want something like that:

with gzip.GzipFile('some_file', 'rb') as zipfile:
    bytes_content = zipfile.read()
grib_files = cfgrib.open_datasets(zipfile, backend_kwargs={"indexpath": ""})

But I am getting the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 4: invalid start byte

Any ideas?

Kind Regards.

@alexamici added the enhancement label Aug 15, 2019
@alexamici
Contributor

@labarababa I see the point of reading a GRIB from a string or from an open file; however, it is not a trivial change and I don't see it as high priority. Anyway, I'll keep the issue open as a feature request.

@alexamici added the prioriy - low label Aug 15, 2019
@Plantain

This is something that would be useful to me as well - the loading from memory part rather than the zip part.
Currently we download GRIB2 files to memory with Python, then write them out to disk solely in order to be able to open them with cfgrib.
Perhaps this bug should be renamed as I'm not sure the gzip part is relevant.
If you give me some pointers I will take a look at getting this implemented.

@alexamici changed the title from "Dataset from Zipfile without writing to disc" to "Add support for opening a dataset memory without writing it to disc" Sep 24, 2019
@alexamici
Contributor

Note that this is really a limitation in either ecCodes or in the internal eccodes bindings. In the near future I'll switch to using the new ecmwf/eccodes-python package to bind to ecCodes, and this feature request will belong there.

@juhi24

juhi24 commented Jan 13, 2021

I'm also looking forward to this feature. In the best-case scenario, cfgrib.open_datasets would accept a file-like object. In addition to in-memory objects, this could enable things like reading files directly from S3 object storage using boto3.
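
Until file-like objects are accepted, one possible stopgap is to hide the temporary file behind a context manager. The sketch below is not part of cfgrib: `spooled_to_tempfile` is a hypothetical helper name, and it still writes to disk, it only makes the copy short-lived and cleans it up automatically. It accepts any binary file-like object with a `read()` method, such as `gzip.open(...)` or `io.BytesIO(...)`.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def spooled_to_tempfile(fileobj, suffix=".grib"):
    """Copy a binary file-like object to a temporary file and yield its path.

    Hypothetical helper, not a cfgrib API. The file is deleted on exit.
    """
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        # Close the handle before yielding so other code can reopen the path.
        with tmp:
            shutil.copyfileobj(fileobj, tmp)
        yield tmp.name
    finally:
        os.remove(tmp.name)


# Intended usage (assumes cfgrib is installed; left commented out here):
#
# import gzip
# import cfgrib
#
# with gzip.open("some_file", "rb") as f_in, spooled_to_tempfile(f_in) as path:
#     datasets = cfgrib.open_datasets(path, backend_kwargs={"indexpath": ""})
#     # Data is read lazily, so load it before the temporary file is removed.
#     datasets = [ds.load() for ds in datasets]
```

The design choice here is to keep the file on disk only for the duration of the `with` block, which is why the datasets should be loaded into memory before leaving it.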

@TomAugspurger

I think ecmwf/eccodes-python#25 is the relevant issue in eccodes-python. The ecCodes FAQ entry at https://confluence.ecmwf.int/display/UDOC/How+do+I+decode+messages+from+a+byte+stream+-+ecCodes+FAQ may also be relevant.
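
To make the byte-stream idea concrete, here is a rough sketch of the first step such an approach would need: splitting an in-memory buffer into individual GRIB messages. It relies only on the GRIB indicator section layout (the `GRIB` marker, the edition number in octet 8, and the total message length in octets 5-7 for edition 1 or octets 9-16 for edition 2) and the closing `7777` marker. `split_grib_messages` is a hypothetical name, not a cfgrib or ecCodes function, and the sketch does not handle the non-standard length encoding used for very large edition 1 messages. Handing each message to `eccodes.codes_new_from_message` afterwards is an assumption that has not been tested here.

```python
from typing import List


def split_grib_messages(data: bytes) -> List[bytes]:
    """Split a bytes buffer into individual GRIB messages (editions 1 and 2).

    Hypothetical helper for illustration; bytes between messages are skipped.
    """
    messages = []
    pos = 0
    while True:
        start = data.find(b"GRIB", pos)
        # Stop when no marker is left or the indicator section is truncated.
        if start == -1 or start + 16 > len(data):
            break
        edition = data[start + 7]  # octet 8 holds the edition number
        if edition == 1:
            length = int.from_bytes(data[start + 4:start + 7], "big")
        elif edition == 2:
            length = int.from_bytes(data[start + 8:start + 16], "big")
        else:
            raise ValueError(f"unsupported GRIB edition: {edition}")
        end = start + length
        if length < 16 or end > len(data) or data[end - 4:end] != b"7777":
            raise ValueError(f"malformed GRIB message at offset {start}")
        messages.append(data[start:end])
        pos = end
    return messages


# Possible next step (assumption, requires the eccodes package):
#
# import eccodes
# for message in split_grib_messages(buffer):
#     handle = eccodes.codes_new_from_message(message)
#     ...
#     eccodes.codes_release(handle)
```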
