ICU-23395 Fix out-of-bounds read in expandCompositCharAtNear()#3963
ICU-23395 Fix out-of-bounds read in expandCompositCharAtNear()#3963TristanInSec wants to merge 1 commit into
Conversation
39a7a26 to
1e6dc81
Compare
|
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
|
Please
|
|
Hi @markusicu, Done:
Note that this is a different bug class from the deserialization issues in #3961/#3962/#3964. The off-by-one in Best regards, |
|
Hi @TristanInSec , thank you! I started the CI checks, there are failures. Please take a look. Also, we have a Java version of this API: Would you mind taking a look at the Java code to see if it has similar buffer handling, and adding an equivalent unit test case? |
|
Hi @markusicu, thanks for the review! I fixed the CI failures -- the test was incorrectly treating I also checked the Java |
226fd33 to
6180349
Compare
|
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
|
Hi @markusicu, Checked the Java counterpart. The Java The other Could you re-trigger the CI on the latest commit when you get a chance? The test fix from the last push should resolve the earlier failures. Thanks, |
|
Hi, apologies for the delayed response. In general it would be a good idea to try to make your inputs as simple as possible so that we can actually tell what the desired behaviour would be in the test case. It should be possible (indeed it is possible, see below) to reproduce the issue on something simpler than your In addition, this test case surely does not exercise an out-of-bounds read; icu/icu4c/source/common/ushape.cpp Line 1501 in 5650473 which is comfortably larger than your sourceLength of 86.
I thought this might be an uninitialized read of icu/icu4c/source/common/ushape.cpp Line 1590 in 5650473 So in your test case, that There does seem to be an out-of-bounds read when sourceLength=300 or sourceLength≥301 (two different paths, either past the stack Please write a test that actually exercises that out-of-bounds read (both paths), with a test string that would actually succeed so we don’t get confused by the error code. (299 or 300 spaces followed by a a lam-alef ligature will do the job). |
Add bounds check (i < sourceLength - 1) before accessing dest[i+1] in the lamAlef expansion path. Without this, the loop reads 2 bytes past the buffer when i == sourceLength-1. Add C regression test in cbiditst.c and Java regression test in ArabicShapingRegTest.java. The Java implementation uses backward iteration with dest[i-1] guarded by i > start, so it is not affected. The Java test is added as a regression guard.
6180349 to
3a23cec
Compare
|
Thanks for the thorough review, you're absolutely right. I rewrote the test to exercise both buffer paths:
Both calls succeed without error. Updated the Java test with the same inputs for consistency. |
Add
i < sourceLength - 1bounds check before thedest[i+1]access inexpandCompositCharAtNear(). The existing loop bound allowsito reachsourceLength-1, makingdest[i+1]read one element past the allocation.Checklist