Fix lexing of nested heredoc strings in token_get_all()
This fixes bug #60097.

Previously, two global variables, CG(heredoc) and CG(heredoc_len), were used to track the current heredoc label. To support nested heredoc strings, the *previous* heredoc label was assigned as the token value of T_START_HEREDOC, and language_parser.y assigned that value back to CG(heredoc).

This made the lexer depend on the parser. As a result, token_get_all(), which drives the lexer directly without running the parser, could not tokenize nested heredoc strings (and leaked memory). The same applies to the source-code highlighting functions.

The new approach is to maintain a heredoc_label_stack in the lexer, which contains all active heredoc labels. Since a token value is no longer required, T_START_HEREDOC and T_END_HEREDOC no longer carry one.

To make working with zend_ptr_stack more convenient in this context, I added a new function zend_ptr_stack_top(), which retrieves the top element of the stack (similar to zend_stack_top()).
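The label-stack idea can be illustrated with a small standalone sketch. This is not the code from this commit: `ptr_stack`, `heredoc_label`, `heredoc_enter()`, `heredoc_closes()` and `heredoc_leave()` are hypothetical names invented for the illustration, and the real zend_ptr_stack and lexer state are laid out differently. It only shows why a stack owned by the lexer is enough to track nested labels without any help from the parser.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an active heredoc label (not the Zend struct). */
typedef struct {
    char  *label;
    size_t length;
} heredoc_label;

/* Minimal growable stack of pointers (illustrative, not zend_ptr_stack). */
typedef struct {
    void **elements;
    size_t count;
    size_t capacity;
} ptr_stack;

static void ptr_stack_init(ptr_stack *s) {
    s->elements = NULL;
    s->count = 0;
    s->capacity = 0;
}

static int ptr_stack_push(ptr_stack *s, void *p) {
    if (s->count == s->capacity) {
        size_t new_cap = s->capacity ? s->capacity * 2 : 8;
        void **grown = realloc(s->elements, new_cap * sizeof *grown);
        if (!grown) return -1;
        s->elements = grown;
        s->capacity = new_cap;
    }
    s->elements[s->count++] = p;
    return 0;
}

/* Peek at the top element without removing it; NULL if the stack is empty. */
static void *ptr_stack_top(const ptr_stack *s) {
    return s->count ? s->elements[s->count - 1] : NULL;
}

static void *ptr_stack_pop(ptr_stack *s) {
    return s->count ? s->elements[--s->count] : NULL;
}

/* Called when the lexer sees an opening <<<LABEL: remember the label. */
static int heredoc_enter(ptr_stack *s, const char *label) {
    heredoc_label *h = malloc(sizeof *h);
    if (!h) return -1;
    h->length = strlen(label);
    h->label = malloc(h->length + 1);
    if (!h->label) { free(h); return -1; }
    memcpy(h->label, label, h->length + 1);
    if (ptr_stack_push(s, h) != 0) { free(h->label); free(h); return -1; }
    return 0;
}

/* Does this candidate text close the innermost active heredoc? */
static int heredoc_closes(const ptr_stack *s, const char *text, size_t len) {
    const heredoc_label *h = ptr_stack_top(s);
    return h && h->length == len && memcmp(h->label, text, len) == 0;
}

/* Called on the closing label: drop the innermost label and free it. */
static void heredoc_leave(ptr_stack *s) {
    heredoc_label *h = ptr_stack_pop(s);
    if (h) { free(h->label); free(h); }
}

/* Release any labels still active (e.g. unterminated input). */
static void ptr_stack_destroy(ptr_stack *s) {
    while (s->count) heredoc_leave(s);
    free(s->elements);
    ptr_stack_init(s);
}
```

In this sketch, entering `OUTER` and then `INNER` leaves `INNER` on top, so only `INNER` is accepted as a closing label until it is popped, after which `OUTER` becomes the active label again. Because the stack also owns the label memory, tearing it down frees any labels left over from unterminated input.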
Showing with 561 additions and 425 deletions.
- +3 −0 NEWS
- +0 −3 Zend/zend_compile.c
- +1 −3 Zend/zend_globals.h
- +0 −2 Zend/zend_highlight.c
- +3 −3 Zend/zend_language_parser.y
- +381 −375 Zend/zend_language_scanner.c
- +5 −0 Zend/zend_language_scanner.h
- +40 −34 Zend/zend_language_scanner.l
- +1 −1 Zend/zend_language_scanner_defs.h
- +5 −0 Zend/zend_ptr_stack.h
- +121 −0 ext/tokenizer/tests/bug60097.phpt
- +1 −4 ext/tokenizer/tokenizer.c