Avoid overfilling buffer when reading from Azure #767

Merged into piskvorky:develop on Sep 6, 2023 (1 commit)

Conversation

ronreiter (Contributor) commented Apr 6, 2023

Title

when read size > chunk size, return read size and not chunk size.

bugfix: read(MAX_READ_SIZE) returns read(CHUNK_SIZE) when read size > chunk size

Motivation

The read function is expected to return up to the requested number of bytes from the stream, stopping early only when the stream ends, and not just a single chunk. Returning only one chunk causes serious bugs for callers that use a call such as x.read(MAX_READ_SIZE) and do not expect to have to call read repeatedly until they have collected MAX_READ_SIZE bytes.

This bug does not happen when the read size is not specified.
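
The expected behavior can be illustrated with a minimal sketch. This is not the smart_open Azure reader and not the diff in this PR; `ChunkedReader`, `CHUNK_SIZE`, and `_fill_buffer` as written here are hypothetical names for a toy reader that fetches data in fixed-size chunks. The point it shows is that the buffer is filled until it holds the requested number of bytes (or the source runs out), rather than stopping after one chunk.

```python
import io

CHUNK_SIZE = 4  # hypothetical value, kept tiny so the effect is easy to see


class ChunkedReader:
    """Toy reader that pulls data from an in-memory source one chunk at a time."""

    def __init__(self, source: bytes, chunk_size: int = CHUNK_SIZE):
        self._source = io.BytesIO(source)
        self._chunk_size = chunk_size
        self._buffer = b""

    def _fill_buffer(self, size: int) -> None:
        # Keep fetching chunks until the buffer holds `size` bytes
        # or the source is exhausted.
        while len(self._buffer) < size:
            chunk = self._source.read(self._chunk_size)
            if not chunk:
                break
            self._buffer += chunk

    def read(self, size: int = -1) -> bytes:
        if size < 0:
            # No size given: return everything that is left.
            self._buffer += self._source.read()
            data, self._buffer = self._buffer, b""
            return data
        self._fill_buffer(size)
        # Hand back at most `size` bytes and keep the remainder buffered.
        data, self._buffer = self._buffer[:size], self._buffer[size:]
        return data


reader = ChunkedReader(b"0123456789")
first = reader.read(6)    # larger than CHUNK_SIZE, still returns 6 bytes
rest = reader.read(100)   # more than remains, returns what is left
```

A reader with the bug described above would instead return only 4 bytes from the first call, and the caller would silently lose data unless it looped on read itself.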

If you're fixing a bug, link to the issue number like so:

- Fixes #{issue_number}

If you're adding a new feature, then consider opening a ticket and discussing it with the maintainers before you actually do the hard work.

Tests

If you're fixing a bug, consider test-driven development:

  1. Create a unit test that demonstrates the bug. The test should fail.
  2. Implement your bug fix.
  3. The test you created should now pass.
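
As a sketch of what step 1 could look like for a bug of this kind, the test below requests more bytes than the underlying stream returns per call and asserts that the full amount comes back. The tests actually added in this PR are not shown on this page; `ShortRawStream` is a hypothetical helper, and the standard library's `io.BufferedReader` stands in for the reader under test.

```python
import io


class ShortRawStream(io.RawIOBase):
    """Hypothetical raw stream that yields at most `chunk_size` bytes per call."""

    def __init__(self, data: bytes, chunk_size: int):
        super().__init__()
        self._data = data
        self._pos = 0
        self._chunk_size = chunk_size

    def readable(self) -> bool:
        return True

    def readinto(self, b) -> int:
        # Copy at most one chunk into the caller's buffer; 0 signals EOF.
        n = min(len(b), self._chunk_size, len(self._data) - self._pos)
        b[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n


def test_read_larger_than_chunk():
    raw = ShortRawStream(b"0123456789", chunk_size=4)
    reader = io.BufferedReader(raw, buffer_size=4)
    # The read size (6) exceeds the chunk size (4); all 6 bytes must come back.
    assert reader.read(6) == b"012345"
    # Reading past the end returns only what remains.
    assert reader.read(100) == b"6789"
```

Written as a plain function with bare assertions, a test like this would be collected by pytest without extra setup.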

If you're implementing a new feature, include unit tests for it.

Make sure all existing unit tests pass.
You can run them locally using:

pytest smart_open

If there are any failures, please fix them before creating the PR (or mark it as WIP, see below).

Work in progress

If you're still working on your PR, include "WIP" in the title.
We'll skip reviewing it for the time being.
Once it's ready for review, remove the "WIP" from the title, and ping one of the maintainers (e.g. mpenkov).

Checklist

Before you create the PR, please make sure you have:

  • Picked a concise, informative and complete title
  • Clearly explained the motivation behind the PR
  • Linked to any existing issues that your PR will be solving
  • Included tests for any new functionality
  • Checked that all unit tests pass

Workflow

Please avoid rebasing and force-pushing to the branch of the PR once a review is in progress.
Rebasing can make your commits look a bit cleaner, but it also makes life more difficult for the reviewer, because they are no longer able to distinguish between code that has already been reviewed and unreviewed code.

ronreiter changed the title from "bugfix: when read size > chunk size, return read size and not chunk size" to "bugfix: read(READ_SIZE) returns read(CHUNK_SIZE) when read size > chunk size" on Apr 6, 2023
ronreiter changed the title from "bugfix: read(READ_SIZE) returns read(CHUNK_SIZE) when read size > chunk size" to "bugfix: read(MAX_READ_SIZE) returns read(CHUNK_SIZE) when read size > chunk size" on Apr 6, 2023
mpenkov added the bug label on Sep 6, 2023
mpenkov changed the title from "bugfix: read(MAX_READ_SIZE) returns read(CHUNK_SIZE) when read size > chunk size" to "Avoid overfilling buffer when reading from Azure" on Sep 6, 2023
mpenkov (Collaborator) commented Sep 6, 2023

Thank you @ronreiter !

@mpenkov mpenkov merged commit 44c7342 into piskvorky:develop Sep 6, 2023
beck3905 pushed a commit to beck3905/smart_open that referenced this pull request Sep 6, 2023
mpenkov pushed a commit that referenced this pull request Sep 7, 2023
* fix: ignore seek requests to the current position

* fix: adjust test to match new seek behavior

* run seek if it is the first time

* Add required import for example to work (#756)

If a person were to simply copy this code block it would use the built in `open` and would not work. Adding in the correct import makes this block a bit easier for a simple copy paste.

* run tests on py3.11 (#774)

* add type command to ftp (#781)

* Add python 3.11 to setup.py (#775)

* Fixes KeyError when retrieving empty but existing object from S3 (#771)

* fix: Fixes KeyError when retrieving empty file from S3

* Add test

* bugfix: when read size > chunk size, return read size and not chunk size (#767)

* undo formatting

* fix whitespace

* undo formatting

---------

Co-authored-by: Rusty Conover <rusty@conover.me>
Co-authored-by: Christian Jensen <christian@orbik.com>
Co-authored-by: tooptoop4 <33283496+tooptoop4@users.noreply.github.com>
Co-authored-by: Raphaël Cohen <raphael.cohen.utt@gmail.com>
Co-authored-by: Ron Reiter <ron.reiter@gmail.com>