http.client derives response body framing by calling int() on the Content-Length header value and on the chunk-size line of a chunked body:
HTTPResponse.begin: self.length = int(length)
HTTPResponse._read_next_chunk_size: return int(line, 16)
RFC 9112 defines Content-Length = 1*DIGIT and chunk-size = 1*HEXDIG, but int() is more permissive: it accepts a leading +/-, underscores, surrounding whitespace, non-ASCII digits and, in base 16, a 0x prefix. As a result, values like Content-Length: +5 / 5_0 and chunk sizes -5, +5, 0x5, 1_f are accepted and used to frame the body, while an RFC-compliant front end would reject them or frame the message differently (CWE-444).
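The permissiveness described above can be checked directly against int(), without involving http.client. This is a small illustrative sketch; the names dec_inputs, hex_inputs, dec_parsed and hex_parsed are made up for this example.

```python
# Every input below violates 1*DIGIT or 1*HEXDIG, yet int() parses it.
dec_inputs = ["+5", "5_0", " 5 ", "\u0665"]   # "\u0665" is ARABIC-INDIC DIGIT FIVE
hex_inputs = ["-5", "+5", "0x5", "1_f"]

dec_parsed = {text: int(text) for text in dec_inputs}
hex_parsed = {text: int(text, 16) for text in hex_inputs}

print(dec_parsed)
print(hex_parsed)
```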
Reproducer:
import http.client, io

# Minimal socket stand-in: HTTPResponse only needs makefile().
class S:
    def __init__(s, d): s.f = io.BytesIO(d)
    def makefile(s, *a, **k): return s.f

def parse(raw):
    r = http.client.HTTPResponse(S(raw)); r.begin(); return r

raw = b'HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n+5\r\nHELLO\r\n0\r\n\r\n'
print(parse(raw).read())    # b'HELLO' -- '+5' is not a HEXDIG

raw = b'HTTP/1.1 200 OK\r\nContent-Length: 5_0\r\n\r\n' + b'A'*50
print(parse(raw).length)    # 50 -- '5_0' is not 1*DIGIT
The body-framing tokens should be validated against the grammar before being passed to int().
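One possible shape for such a check is sketched below. It is not the actual patch: the helper names parse_content_length and parse_chunk_size are hypothetical, and the sketch assumes the caller has already removed the trailing CRLF and any surrounding optional whitespace from the token.

```python
import re

# Explicit ASCII character classes; \d would also match non-ASCII digits in str patterns.
_DIGITS = re.compile(r"[0-9]+")         # Content-Length = 1*DIGIT
_HEXDIGS = re.compile(r"[0-9A-Fa-f]+")  # chunk-size = 1*HEXDIG


def parse_content_length(value: str) -> int:
    """Hypothetical helper: return the length, or raise ValueError if not 1*DIGIT."""
    if not _DIGITS.fullmatch(value):
        raise ValueError(f"invalid Content-Length: {value!r}")
    return int(value)


def parse_chunk_size(line: str) -> int:
    """Hypothetical helper: parse a chunk-size line with CRLF already removed."""
    token = line.partition(";")[0]  # drop any chunk extensions
    if not _HEXDIGS.fullmatch(token):
        raise ValueError(f"invalid chunk size: {token!r}")
    return int(token, 16)
```

With these helpers, "50" and "1f" parse as 50 and 31, while "+5", "5_0", "0x5" and "1_f" raise ValueError instead of silently framing the body.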
Linked PRs