Found these in src/cpp-flex.skl while writing a libFuzzer harness against the runtime tables loader. All three trigger from yytables_fload(fp, scanner) (the public API a consumer calls when %option tables-file= is used). Reproduced against the unmodified loader in master @ 13c8018; same code is present in 2.6.4 unchanged.
Two heap-OOB reads and one integer-driven multi-GB allocation. ASAN traces below; PoC bytes are at the bottom of this issue.
1. heap-OOB read in strlen (yytbl_hdr_read)
src/cpp-flex.skl:4010
th->th_name = th->th_version + strlen (th->th_version) + 1;
th_version was just fread'd (line 4002) into a yyalloc(bytes) buffer (line 3997), where bytes = th_hsize - 14 is set at line 3996. Nothing guarantees those bytes contain a NUL. The header sanity check at line 3990 only requires hsize >= 16, so the smallest payload is 2 bytes. With those 2 bytes set to 'A','A', strlen walks past the heap allocation.
ASAN with a 16-byte input file:
==58898==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000000d2
READ of size 3 at 0x6020000000d2 thread T0
#0 strlen
#1 cl_tables_fload lex.cl_.c:2701
#2 main poc_strlen_oob.c:101
0x6020000000d2 is located 0 bytes after 2-byte region
allocated by thread T0 here:
#0 malloc
#1 cl_tables_fload lex.cl_.c:2701
(The lex.cl_.c:2701 frame is the public cl_tables_fload after Apple clang inlined yytbl_hdr_read and yytbl_fload. Internal frames separate at -O0 -fno-inline.)
2. heap-OOB read in strcmp (yytbl_fload)
src/cpp-flex.skl:4278
if (strcmp(th.th_name, key) != 0) {
Different root cause from #1. Even when the version section IS NUL-terminated, th_name is set unconditionally to th_version + strlen + 1. If the version's NUL lands at the LAST byte of the buffer, th_name points just past the heap region, and strcmp walks past it. PoC: 16-byte input where hsize=16, payload is 'A','\0'. ASAN reports READ of size 1 past a 2-byte region, attributed to cl_tables_fload.
The fix for #1 and #2 is the same shape: validate that th_version and th_name each find a NUL inside the buffer. Roughly:
size_t vlen = strnlen(th_version, bytes);
if (vlen >= bytes - 1) { yyfree(th_version,...); return -1; }
size_t nlen = strnlen(th_version + vlen + 1, bytes - vlen - 1);
if (nlen >= bytes - vlen - 1) { yyfree(th_version,...); return -1; }
th_name = th_version + vlen + 1;
3. integer-driven multi-GB allocation (yytbl_data_load)
src/cpp-flex.skl:4070-4072
if ((td.td_flags & YYTD_STRUCT)) {
bytes = sizeof(struct yy_trans_info) * td.td_lolen * (td.td_hilen ? td.td_hilen : 1);
} else {
bytes = td.td_lolen * (td.td_hilen ? td.td_hilen : 1) * dmap->dm_sz;
}
... yyalloc(bytes, ...);
td_lolen and td_hilen are flex_uint32_t read straight from the file, no validation. A 50-byte tables file with td_lolen = 0xF1238000 makes the loader call malloc(8091271168) (7.54 GB) for one record. On any consumer with realistic memory limits (containers, embedded, mobile) this is fatal: either the OOM-killer takes the process or malloc returns NULL and YY_FATAL_ERROR("out of dynamic memory in yytbl_data_load()") calls exit().
There's already a precedent fix for the build-time tool's allocators in commit 9c54eb6 build: detect overflow for [re]allocate_array. The runtime loader was never given the same guard. A two-line clamp before the multiplication would mirror that commit.
Reproduction
PoCs are standalone C main() programs that link the unmodified lex.cl_.c from flex --tables-file=cl.tab sample.l:
clang -g -O1 -fno-omit-frame-pointer \
-fsanitize=address,undefined -fno-sanitize-recover=undefined \
poc_strlen_oob.c lex.cl_.c -o poc_strlen_oob
ASAN_OPTIONS=detect_leaks=0:halt_on_error=1 ./poc_strlen_oob
All three reproduce in <10ms through the public cl_tables_fload (≡ yytables_fload). PoC files attached. Affects only scanners generated with %option tables-file= or --tables-file=.
Suggested fixes
- and 2.: replace
strlen with bounded strnlen and verify both NULs fall inside the buffer (snippet above).
- clamp
td_lolen and td_hilen to a sane upper bound before the multiplication, in the same spirit as 9c54eb6. Something like if (td_lolen > YYTBL_MAX_DIM || td_hilen > YYTBL_MAX_DIM) return -1; with YYTBL_MAX_DIM set to whatever real scanners actually need (a few million is more than enough; 32k DFA-states is the documented limit).
Happy to send a PR covering all three if you'd like.
Found via OSS-Fuzz-shaped libFuzzer harness, expanding what the upstream OSS-Fuzz projects/flex/ integration covers (the upstream flex-patch.diff neuters the m4 fork, which incidentally also keeps the loader off the fuzzed code path; that's why none of these have surfaced from continuous fuzzing).
Found these in
src/cpp-flex.sklwhile writing a libFuzzer harness against the runtime tables loader. All three trigger fromyytables_fload(fp, scanner)(the public API a consumer calls when%option tables-file=is used). Reproduced against the unmodified loader in master @ 13c8018; same code is present in 2.6.4 unchanged.Two heap-OOB reads and one integer-driven multi-GB allocation. ASAN traces below; PoC bytes are at the bottom of this issue.
1. heap-OOB read in
strlen(yytbl_hdr_read)src/cpp-flex.skl:4010th_versionwas justfread'd (line 4002) into ayyalloc(bytes)buffer (line 3997), wherebytes = th_hsize - 14is set at line 3996. Nothing guarantees those bytes contain a NUL. The header sanity check at line 3990 only requireshsize >= 16, so the smallest payload is 2 bytes. With those 2 bytes set to'A','A',strlenwalks past the heap allocation.ASAN with a 16-byte input file:
(The
lex.cl_.c:2701frame is the publiccl_tables_floadafter Apple clang inlinedyytbl_hdr_readandyytbl_fload. Internal frames separate at-O0 -fno-inline.)2. heap-OOB read in
strcmp(yytbl_fload)src/cpp-flex.skl:4278Different root cause from #1. Even when the version section IS NUL-terminated,
th_nameis set unconditionally toth_version + strlen + 1. If the version's NUL lands at the LAST byte of the buffer,th_namepoints just past the heap region, andstrcmpwalks past it. PoC: 16-byte input wherehsize=16, payload is'A','\0'. ASAN reportsREAD of size 1past a 2-byte region, attributed tocl_tables_fload.The fix for #1 and #2 is the same shape: validate that
th_versionandth_nameeach find a NUL inside the buffer. Roughly:3. integer-driven multi-GB allocation (yytbl_data_load)
src/cpp-flex.skl:4070-4072td_lolenandtd_hilenareflex_uint32_tread straight from the file, no validation. A 50-byte tables file withtd_lolen = 0xF1238000makes the loader callmalloc(8091271168)(7.54 GB) for one record. On any consumer with realistic memory limits (containers, embedded, mobile) this is fatal: either the OOM-killer takes the process ormallocreturns NULL andYY_FATAL_ERROR("out of dynamic memory in yytbl_data_load()")callsexit().There's already a precedent fix for the build-time tool's allocators in commit
9c54eb6 build: detect overflow for [re]allocate_array. The runtime loader was never given the same guard. A two-line clamp before the multiplication would mirror that commit.Reproduction
PoCs are standalone C
main()programs that link the unmodifiedlex.cl_.cfromflex --tables-file=cl.tab sample.l:All three reproduce in <10ms through the public
cl_tables_fload(≡yytables_fload). PoC files attached. Affects only scanners generated with%option tables-file=or--tables-file=.Suggested fixes
strlenwith boundedstrnlenand verify both NULs fall inside the buffer (snippet above).td_lolenandtd_hilento a sane upper bound before the multiplication, in the same spirit as9c54eb6. Something likeif (td_lolen > YYTBL_MAX_DIM || td_hilen > YYTBL_MAX_DIM) return -1;withYYTBL_MAX_DIMset to whatever real scanners actually need (a few million is more than enough; 32k DFA-states is the documented limit).Happy to send a PR covering all three if you'd like.
Found via
OSS-Fuzz-shaped libFuzzer harness, expanding what the upstream OSS-Fuzzprojects/flex/integration covers (the upstreamflex-patch.diffneuters the m4 fork, which incidentally also keeps the loader off the fuzzed code path; that's why none of these have surfaced from continuous fuzzing).