- 
          
 - 
                Notifications
    
You must be signed in to change notification settings  - Fork 33.3k
 
gh-140607: Validate returned byte count in RawIOBase.read #140611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
While `RawIOBase.readinto` should return a count of bytes between 0 and the length of the given buffer, it is not required to. Add validation inside RawIOBase.read that the returned byte count is reasonable.
| 
           It looks like the IOBase C implementation has never checked range here. I think it's reasonable to back port to bugfix releases as a ValueError could be raised in other cases (not adding a new exception; just more cases) and a bad value allows access to unallocated memory. Cherry picks will likely need to be manual to cross the I/O test refactoring. The  cc: @vstinner  | 
    
Co-authored-by: Shamil <ashm.tech@proton.me>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| if (bytes_filled == -1 && PyErr_Occurred()) { | ||
| Py_DECREF(b); | ||
| return NULL; | ||
| } | ||
| if (bytes_filled < 0 || bytes_filled > n) { | ||
| Py_DECREF(b); | ||
| PyErr_Format(PyExc_ValueError, | ||
| "readinto returned '%zd' outside buffer size '%zd'", | ||
| bytes_filled, n); | ||
| return NULL; | ||
| } | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (bytes_filled == -1 && PyErr_Occurred())
    goto error;
if (bytes_filled < 0 || bytes_filled > n) {
    PyErr_Format(PyExc_ValueError,
                 "readinto returned '%zd' outside buffer size '%zd'",
                 bytes_filled, n);
    goto error;
}
error:
    Py_DECREF(b);
    return NULL;
maybe do it this way to remove duplication and improve readability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a personal preference not to use gotos although agree this function is one where Py_DECREF(b); return res; would deduplicate and reduce lines of code.
To do that the Py_DECREF(res); would need to turn into Py_CLEAR(res) (so res is set to NULL) + a couple cases earlier could use it as well... Not planning to do that but can if there's a strong press to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just kidding. In general, I also like using goto in C to factorize code to cleanup resources before exit. Here it's just a matter of taste. I'm fine with the current code 😀
Co-authored-by: Victor Stinner <vstinner@python.org>
          
 I prefer to not backport the change. Existing code may just work by luck and this change making Python more strict can break it.  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
        
          
                Misc/NEWS.d/next/Library/2025-10-25-21-04-00.gh-issue-140607.oOZGxS.rst
              
                Outdated
          
            Show resolved
            Hide resolved
        
      …OZGxS.rst Co-authored-by: Victor Stinner <vstinner@python.org>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| 
           Sorry, @cmaloney and @vstinner, I could not cleanly backport this to   | 
    
| 
           Sorry, @cmaloney and @vstinner, I could not cleanly backport this to   | 
    
          
 I read again the issue and I missed the  @cmaloney: Can you try to backport the change to 3.14? The automated backport failed.  | 
    
| 
           👍 will work on backports today  | 
    
| 
           GH-140728 is a backport of this pull request to the 3.14 branch.  | 
    
pythonGH-140611) While `RawIOBase.readinto` should return a count of bytes between 0 and the length of the given buffer, it is not required to. Add validation inside RawIOBase.read() that the returned byte count is valid. (cherry picked from commit 0f0a362) Co-authored-by: Cody Maloney <cmaloney@users.noreply.github.com> Co-authored-by: Shamil <ashm.tech@proton.me> Co-authored-by: Victor Stinner <vstinner@python.org>
pythonGH-140611) While `RawIOBase.readinto` should return a count of bytes between 0 and the length of the given buffer, it is not required to. Add validation inside RawIOBase.read() that the returned byte count is valid. (cherry picked from commit 0f0a362) Co-authored-by: Cody Maloney <cmaloney@users.noreply.github.com> Co-authored-by: Shamil <ashm.tech@proton.me> Co-authored-by: Victor Stinner <vstinner@python.org>
| 
           GH-140730 is a backport of this pull request to the 3.13 branch.  | 
    
…140611) (#140728) * [3.14] gh-140607: Validate returned byte count in RawIOBase.read (GH-140611) While `RawIOBase.readinto` should return a count of bytes between 0 and the length of the given buffer, it is not required to. Add validation inside RawIOBase.read() that the returned byte count is valid. (cherry picked from commit 0f0a362) Co-authored-by: Cody Maloney <cmaloney@users.noreply.github.com> Co-authored-by: Shamil <ashm.tech@proton.me> Co-authored-by: Victor Stinner <vstinner@python.org> * fixup: Use older attribute name --------- Co-authored-by: Shamil <ashm.tech@proton.me> Co-authored-by: Victor Stinner <vstinner@python.org>
…140611) (#140730) * [3.13] gh-140607: Validate returned byte count in RawIOBase.read (GH-140611) While `RawIOBase.readinto` should return a count of bytes between 0 and the length of the given buffer, it is not required to. Add validation inside RawIOBase.read() that the returned byte count is valid. (cherry picked from commit 0f0a362) Co-authored-by: Cody Maloney <cmaloney@users.noreply.github.com> Co-authored-by: Shamil <ashm.tech@proton.me> Co-authored-by: Victor Stinner <vstinner@python.org> * fixup: Use older attribute name --------- Co-authored-by: Shamil <ashm.tech@proton.me> Co-authored-by: Victor Stinner <vstinner@python.org>

While
readintoimplementations should return a count of bytes between 0 and the length of the given buffer, they are not required to. Add validation insideRawIOBase.readthat the returned byte count is reasonable.