Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compression_wrapper compatibility issue #332

Closed
piskvorky opened this issue Jun 30, 2019 · 4 comments · Fixed by #333
Closed

compression_wrapper compatibility issue #332

piskvorky opened this issue Jun 30, 2019 · 4 comments · Fixed by #333

Comments

@piskvorky
Copy link
Owner

piskvorky commented Jun 30, 2019

This used to work in 1.5.6, perhaps even later:

import smart_open

some_fp = io.BytesIO()  # any file-like objectwrapped = smart_open.compression_wrapper(some_fp, some_filename, 'rb')

It doesn't work now, smart_open.compression_wrapper no longer exists.

What's the reason? I cannot find anything in the CHANGELOG.

@piskvorky piskvorky changed the title Compatibility issue compression_wrapper compatibility issue Jun 30, 2019
@piskvorky piskvorky changed the title compression_wrapper compatibility issue compression_wrapper compatibility issue Jun 30, 2019
@piskvorky
Copy link
Owner Author

piskvorky commented Jun 30, 2019

Upon investigation, it seems like:

from smart_open.smart_open_lib import _compression_wrapper
smart_open.compression_wrapper = _compression_wrapper

will restore this (desirable) functionality.

@mpenkov I'd strongly prefer to avoid such monkey-patching shenanigans. Do you see a better way?

If not, can we bring compression_wrapper back? (and perhaps other useful functions that suffered a similar fate)

@mpenkov
Copy link
Collaborator

mpenkov commented Jun 30, 2019

What is the use case here? What are you attempting to achieve, and what are the parameters you're passing to the wrapper?

@piskvorky
Copy link
Owner Author

piskvorky commented Jun 30, 2019

Parameter 1 = io.BufferedIOBase, parameter 2 = filename str, parameter 3 = mode = 'rb'.

We're wrapping an open file-like binary object so that it decompresses transparently.

@mpenkov
Copy link
Collaborator

mpenkov commented Jun 30, 2019

I demoted compression_wrapper to be an internal function (it's called _compression_wrapper) when I redesigned the smart_open library over a year ago. My reasoning for making it internal was:

  • The function wasn't publicly documented anywhere, unlike e.g. the obviously public smart_open and s3_iter
  • There was no example code for using that function
  • There were no unit tests exercising that function on its own

The main benefits of demoting such functions to internal is a cleaner API.

Here's a better way to satisfy your use case than the import-rename pattern you suggested:

>>> import io, gzip
>>>
>>> # prepare some gzipped data
>>> buf = io.BytesIO()
>>> with gzip.GzipFile(fileobj=buf, 'w') as fout: fout.write(b'this is a bytestring')
>>> buf.seek(0)
0
>>> # use case starts here
>>> buf.name = 'file.gz' # add a .name attribute so smart_open knows what compressor to use
>>> import smart_open
>>> smart_open.open(buf, 'rb').read() # will gzip-decompress transparently
b'this is a bytestring'

If your file object already has an appropriate .name, then you don't have to add one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants