Buffer overrun in Lexer with string literal longer than 4096 characters #316

andreasabel opened this issue Nov 7, 2020 · 0 comments
Labels: bug, C++, C, lexer (concerning the generated lexer)


The generated lexer uses a fixed buffer that can overrun when lexing a long string literal:

"void YY_BUFFER_APPEND(char *s)",
" strcat(YY_PARSED_STRING, s); /* Do something better here! */",

(The authors were probably aware of the problem but did not fix it.)
The buffer length is currently fixed at 4096 characters. A longer string literal (e.g. 5000 characters) overruns the buffer, which leads to, for example, a crash (C) or a parse failure (C++).
This affects the C and C++ backends.
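
One possible direction for a fix is to replace the fixed array and the unchecked `strcat` with a buffer that grows on demand. The following is only a minimal sketch under that assumption: `StrBuf`, `strbuf_init`, `strbuf_append`, and `strbuf_free` are hypothetical names chosen for illustration and are not part of the generated lexer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable string buffer (illustrative names only). */
typedef struct {
    char  *data; /* NUL-terminated contents */
    size_t len;  /* bytes used, excluding the terminating NUL */
    size_t cap;  /* bytes allocated, including room for the NUL */
} StrBuf;

/* Initialise an empty buffer. Returns 0 on success, -1 if malloc fails. */
static int strbuf_init(StrBuf *b, size_t initial_cap)
{
    assert(b != NULL);
    if (initial_cap == 0) initial_cap = 1; /* always room for the NUL */
    b->data = malloc(initial_cap);
    if (b->data == NULL) return -1;
    b->data[0] = '\0';
    b->len = 0;
    b->cap = initial_cap;
    return 0;
}

/* Append s, doubling the capacity until the result (plus NUL) fits.
   Returns 0 on success, -1 if realloc fails (buffer left unchanged). */
static int strbuf_append(StrBuf *b, const char *s)
{
    assert(b != NULL && s != NULL);
    size_t n = strlen(s);
    size_t need = b->len + n + 1;
    if (need > b->cap) {
        size_t new_cap = b->cap;
        while (new_cap < need) new_cap *= 2;
        char *p = realloc(b->data, new_cap);
        if (p == NULL) return -1;
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, s, n + 1); /* copies the NUL as well */
    b->len += n;
    return 0;
}

/* Release the storage and reset the buffer to an empty state. */
static void strbuf_free(StrBuf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

With such a buffer, a 5000-character literal would trigger a reallocation instead of writing past the end of a 4096-byte array. Tracking `len` explicitly also avoids rescanning the accumulated string on every append, which `strcat` has to do.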

@andreasabel added the bug, C++, C, and lexer labels on Nov 7, 2020
@andreasabel added this to the 2.9 milestone on Nov 7, 2020
andreasabel added a commit that referenced this issue Nov 13, 2020
This bug surfaced when setting the buffer size to 2048 (previously it
was 2000, and going one byte over went unnoticed).

We need to maintain

  cur_ < buf_size

(not just cur_ <= buf_size).
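
To illustrate why the strict inequality matters, here is a minimal sketch of a character-wise append. It assumes that `cur_` is the index at which the terminating NUL is written, which the commit message does not spell out. The names `CharBuf`, `charbuf_init`, and `charbuf_push` are hypothetical, and the fields `cur` and `buf_size` merely mirror `cur_` and `buf_size` above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffer; `cur` is assumed to index the terminating NUL. */
typedef struct {
    char  *buf;
    size_t cur;      /* number of characters stored; buf[cur] == '\0' */
    size_t buf_size; /* number of bytes allocated */
} CharBuf;

/* Initialise an empty buffer. Returns 0 on success, -1 if malloc fails. */
static int charbuf_init(CharBuf *b, size_t size)
{
    if (size == 0) size = 1; /* always room for the NUL */
    b->buf = malloc(size);
    if (b->buf == NULL) return -1;
    b->buf[0] = '\0';
    b->cur = 0;
    b->buf_size = size;
    return 0;
}

/* Append one character. After the push the NUL sits at index cur + 1,
   so the buffer must grow as soon as cur + 1 would reach buf_size. */
static int charbuf_push(CharBuf *b, char c)
{
    if (b->cur + 1 >= b->buf_size) {
        size_t new_size = b->buf_size * 2;
        char *p = realloc(b->buf, new_size);
        if (p == NULL) return -1;
        b->buf = p;
        b->buf_size = new_size;
    }
    b->buf[b->cur++] = c;
    b->buf[b->cur] = '\0';
    assert(b->cur < b->buf_size); /* the invariant from the commit message */
    return 0;
}
```

If the check only guaranteed `cur <= buf_size`, the write `buf[cur] = '\0'` could land one byte past the allocation when `cur == buf_size`. Depending on the allocator's padding, such an overrun can go unnoticed at one size and surface at another.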
@andreasabel self-assigned this on Nov 20, 2020