Sanitize all multibyte chars in HpackHuffmanEncoder #13546
normanmaurer merged 5 commits into netty:4.1
codec-http2/src/main/java/io/netty/handler/codec/http2/HpackHuffmanEncoder.java
codec-http2/src/main/java/io/netty/handler/codec/http2/HpackEncoder.java
codec-http2/src/test/java/io/netty/handler/codec/http2/HpackEncoderTest.java
|
PR looks good to me, although CodeQL is complaining, almost certainly about some extra whitespace. 😄 Can we tighten up the commit message a touch? I think the core problem was that headers that ended up Huffman encoded were sanitized differently, specifically chars with values higher than 0xFF, which could result in unexpected control chars instead of the '?'. In both cases the headers are corrupted; one is just safer than the other. As an aside, and as your test demonstrates, we can still emit control chars; we just have to be explicit about them. |
bryce-anderson left a comment
Thank you @Lincong!
|
Thanks @bryce-anderson for the suggestion to improve the commit message. I have updated it; PTAL before we merge this PR! Thanks @normanmaurer for fixing the style violation (here). I have not fully set up my dev environment yet, so some style violations are not caught locally. I will make sure I can catch and fix such issues in future PRs. |
Motivation:

During encoding, Huffman-encoded headers are sanitized differently from non-Huffman-encoded headers in `HpackEncoder`. As a result, characters with code point values higher than 0xFF could be decoded to unexpected control chars instead of `'?'`.

Modification:

Change how each character is sanitized in `HpackHuffmanEncoder`. Specifically, use the new approach [1] to replace the old approach [2].

[1] `AsciiString.c2b(aChar) & 0xFF`
[2] `aChar & 0xFF`

The expected output is `'?'` (0x3F) if `aChar > 0xFF`. But with the old approach, if `aChar == 0x4E01`, `0x4E01 & 0xFF == 1`, which is incorrect.

Result:

All characters with code point values higher than 0xFF are decoded to `?`s regardless of whether Huffman encoding was used during encoding.

Fixes #13540

---------

Co-authored-by: Norman Maurer <norman_maurer@apple.com>
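To make the difference between the two approaches concrete, here is a minimal sketch. It does not depend on Netty: the `c2b` method below is a local stand-in that assumes `AsciiString.c2b` maps any char above 0xFF to `'?'`, and the names `SanitizeDemo`, `oldIndex`, and `newIndex` are hypothetical, chosen only for this illustration.

```java
final class SanitizeDemo {
    // Local stand-in for AsciiString.c2b (assumption: chars above 0xFF become '?').
    static byte c2b(char c) {
        return (byte) (c > 0xFF ? '?' : c);
    }

    // Old approach [2]: keep only the low byte, silently dropping the high byte.
    static int oldIndex(char c) {
        return c & 0xFF;
    }

    // New approach [1]: sanitize first, then mask to get an unsigned table index.
    static int newIndex(char c) {
        return c2b(c) & 0xFF;
    }

    public static void main(String[] args) {
        char multibyte = '\u4E01';
        // Old approach: 0x4E01 & 0xFF == 0x01, a control char (SOH).
        System.out.println(oldIndex(multibyte)); // 1
        // New approach: the char is replaced by '?' (0x3F).
        System.out.println(newIndex(multibyte)); // 63
        // Chars in the range 0x80..0xFF are unchanged by either approach.
        System.out.println(newIndex('\u00E9'));  // 233
    }
}
```

The `& 0xFF` mask is still needed in the new approach because a Java `byte` is signed, so values in 0x80..0xFF would otherwise become negative when used as an index.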
|
Thanks @normanmaurer for merging this PR! Do you know an ETA for |
|
@Lincong I think sometime next week |
I am not sure if this is something appropriate to ask for, but it will be super nice if |