Skip to content

Conversation

@cmaloney
Copy link
Contributor

@cmaloney cmaloney commented Nov 10, 2025

Account for when self->pos is equal to PY_SSIZE_T_MAX. The length of read was correctly set to zero but the asserts assumed self->pos couldn't reach PY_SSIZE_T_MAX. Return early to avoid edge cases.

Account for when self->pos is equal to PY_SSIZE_T_MAX. The length of
read was correctly set to zero but the asserts assumed
self->pos couldn't reach PY_SSIZE_T_MAX. Return early to avoid edge
cases.
}
}

assert(self->pos + len < PY_SSIZE_T_MAX);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it enough to fix this assertion?

Suggested change
assert(self->pos + len < PY_SSIZE_T_MAX);
assert(self->pos + len <= PY_SSIZE_T_MAX);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will but I worry about the later PyBytes_AS_STRING(self->buf) + self->pos as self->pos is PY_SSIZE_T_MAX and adding a large integer to a pointer then doing a memcpy from that computed address (even 0 bytes) feels like a path to undefined behavior to me. Happy to implement whichever you'd prefer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. If adding PY_SSIZE_T_MAX causes troubles, adding slightly smaller value will also cause troubles. Actually, if pos larger than the size of the array, ot is an undefined behaviour. So, a specoal case for empty read is needed too.

For `read_bytes_lock_held` the `size` parameter is always set to 0 by calling
code currently if self->pos > self->string_size. I added a range check for
self->pos as an extra safety as codepaths chagne but can remove if it's too much.

`write_bytes_lock_held`: `endpos` is a `size_t` so can hold
`2 * PY_SSIZE_T_MAX`. Value is bounds checked in `resize_buffer_lock_held`.

A number of cases use `scan_eol_lock_held` to move forward. That has code which
checks `self->pos >= self->string_size` and returns `0`. Callsites all seem to
handle that correctly.
@cmaloney
Copy link
Contributor Author

I audited the codepaths which did self->pos + buffer.

For read_bytes_lock_held the size parameter is always set to 0 by calling code currently if self->pos > self->string_size. I added a range check for self->pos as an extra safety as codepaths change but can remove if it's too much.

write_bytes_lock_held: endpos is a size_t so can hold 2 * PY_SSIZE_T_MAX. Value is bounds checked in resize_buffer_lock_held.

A number of cases use scan_eol_lock_held to move forward. That has code which checks self->pos >= self->string_size and returns 0. Callers seem to handle that correctly (early exit or call read_bytes_lock_held)..

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants