Segfault in ocamllex-generated code using 'shortest' #7760
Original bug ID: 7760
On my machine (amd64 Debian), the following program usually segfaults:
when compiled and run as:
This example is reduced from a larger lexer. The segfault only seems to occur when using 'shortest' instead of 'parse', but I'm not sure exactly which combination of features triggers the bug. The problem is reproducible using OCaml versions back to at least 3.11.2.
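For readers unfamiliar with the two features mentioned in this thread, here is a minimal sketch of an ocamllex rule that combines `shortest` with a captured sub-value (an `as` binding). It is a hypothetical illustration, not the reporter's reduced program, and it is not claimed to reproduce the crash; the file name `sketch.mll` and the names `token` and `word` are assumptions made for the example.

```ocaml
(* sketch.mll: hypothetical illustration only, not the program from this report. *)

(* 'shortest' selects the shortest matching prefix instead of the longest one
   (the default behaviour with 'parse'). *)
rule token = shortest
  | (['a'-'z']+ as word) ' '   { Some word }  (* 'as' captures a sub-value *)
  | eof                        { None }

{
(* Small driver: lex one token from a string and print the captured word. *)
let () =
  let lexbuf = Lexing.from_string "hello world " in
  match token lexbuf with
  | Some w -> print_endline w
  | None -> ()
}
```

Assuming a standard OCaml installation, a file like this would typically be processed with `ocamllex sketch.mll` and the generated `sketch.ml` compiled with `ocamlc`.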
Comment author: @let-def
I started investigating this issue.
The problem triggers when one branch captures sub-values (the
The automaton produced is correct (though not minimal :)), that's why the
If you don't capture sub-values, the lexer will use the
However, if one of the branches captures sub-values,
Btw, this is not an initialization issue (one might think that the position vector is too short). Rather, a tag is interpreted incorrectly: it consumes garbage values and writes at some arbitrary offset of lex_mem.
My next step will be to instrument bytecode generation to understand what goes wrong, but I am progressing slowly, as I have found few resources on that part :).
Comment author: @maranget
I think I have found the bug, but I am lacking time to submit
Basically, the problem originates from the table compaction function being