Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SeekableBufferedInputBase.read(0) returns data #171

Closed
janpom opened this issue Feb 13, 2018 · 5 comments
Closed

SeekableBufferedInputBase.read(0) returns data #171

janpom opened this issue Feb 13, 2018 · 5 comments

Comments

@janpom
Copy link

janpom commented Feb 13, 2018

Steps to reproduce:

from smart_open.s3 import SeekableBufferedInputBase
input = SeekableBufferedInputBase(...)
assert input.read(0) == b''

I quickly investigated the problem. It seems the bug is in the read() method in the parent BufferedInputBase class:

    def read(self, size=-1):
        """Read up to size bytes from the object and return them."""
        if size <= 0:
            if len(self._buffer):
                from_buf = self._read_from_buffer(len(self._buffer))
            else:
                from_buf = b''
            self._current_pos = self._content_length
            return from_buf + self._raw_reader.read()

       ...

if size < 0 maybe?

@menshikh-iv
Copy link
Contributor

CC: @mpenkov

@janpom
Copy link
Author

janpom commented Feb 13, 2018

A quick workaround is to subclass SeekableBufferedInputBase and handle the special case correctly:

from smart_open.s3 import SeekableBufferedInputBase

class S3SeekableInput(SeekableBufferedInputBase):
    def read(self, size=-1):
        if size == 0:
            return b''
        else:
            return super(S3SeekableInput, self).read(size)

@mpenkov
Copy link
Collaborator

mpenkov commented Feb 13, 2018

@janpom I think that fixing the superclass' method is preferable to the override. Are you able to submit a PR? If not, I'll look into it during the week.

@janpom
Copy link
Author

janpom commented Feb 13, 2018

@mpenkov I wasn't suggesting the workaround as a fix. I was merely pointing out that users can work around the problem (outside of smart_open code).

The fix should definitely be be in the BufferedInputBase. Some more investigation would be worthwhile. The size < 0 is just my wild guess. I don't really have time for delivering a proper fix, sorry.

@mpenkov
Copy link
Collaborator

mpenkov commented Feb 14, 2018 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants