Fix grammar parsing issues to prevent stack overflow and hangs#18604
Fix grammar parsing issues to prevent stack overflow and hangs#18604pwilkin merged 5 commits intoggml-org:masterfrom
Conversation
|
This fixes #19845 |
|
@ggerganov This has been stuck for a while, can you take a look and let us know how to proceed? |
|
@pwilkin Would you like to take a look and review? |
|
@ggerganov Aye, can look. |
pwilkin
left a comment
There was a problem hiding this comment.
Please run editorchecker and fix the indentation issues, otherwise looks fine. See also my changes to the seen and let me know if you approve.
Reproduce stack overflow (or OOM) with ( [x]* )* found while adding
GBNF support to ripgrep-edit.
llama-server reproducer:
curl \
-X POST \
-d '{
"messages": [{ "role": "user", "content": "write yes" }],
"grammar": "root ::= ( [x]* )*"
}' \
-H "Content-Type: application/json" \
http://localhost:8811/v1/chat/completions
Fix a potential stack overflow in llama_grammar_advance_stack that
could occur when processing grammars with nullable symbols that lead
to infinite derivations of empty strings. The fix introduces cycle
detection by tracking visited stacks to prevent infinite recursion.
rg-edit regexp: llama_grammar_advance_stack
rg-edit extra-args: -A20
rg-edit directive: """Rewrite: fix the following segfault:
[..]
⚫ Testing segfault. Grammar:
root ::= ( [x]* )*
root ::= ( [x]* )*
Segmentation fault build/bin/test-grammar-integration"""
gptel-context:
(("~/llama.cpp/src/llama-grammar.cpp")
("~/llama.cpp/tests/test-grammar-integration.cpp")
("~/llama.cpp/grammars/./list.gbnf")
("~/llama.cpp/grammars/./json_arr.gbnf")
("~/llama.cpp/grammars/./json.gbnf")
("~/llama.cpp/grammars/./japanese.gbnf")
("~/llama.cpp/grammars/./english.gbnf")
("~/llama.cpp/grammars/./chess.gbnf")
("~/llama.cpp/grammars/./c.gbnf")
("~/llama.cpp/grammars/./arithmetic.gbnf")
("~/llama.cpp/grammars/./README.md"))
This change converts the function to an iterative approach using
explicit stacks, which prevents deep recursion and eliminates the risk
of stack overflow.
rg-edit regexp: llama_grammar_advance_stack
rg-edit extra-args: -A30
rg-edit directive: """Rewrite: fix the following segfault:
[..]
⚫ Testing segfault. Grammar:
root ::= ( [x]* )*
root ::= ( [x]* )*
Segmentation fault build/bin/test-grammar-integration
convert from recursive to interactive"""
gptel-context:
(("~/llama.cpp/src/llama-grammar.cpp")
("~/llama.cpp/tests/test-grammar-integration.cpp")
("~/llama.cpp/grammars/./list.gbnf")
("~/llama.cpp/grammars/./json_arr.gbnf")
("~/llama.cpp/grammars/./json.gbnf")
("~/llama.cpp/grammars/./japanese.gbnf")
("~/llama.cpp/grammars/./english.gbnf")
("~/llama.cpp/grammars/./chess.gbnf")
("~/llama.cpp/grammars/./c.gbnf")
("~/llama.cpp/grammars/./arithmetic.gbnf")
("~/llama.cpp/grammars/./README.md"))
v2: Added a `std::set` to perform tree-based lookups with O(N log N)
complexity. Testing with a parallel run of `test-grammar-integration`
shows a double-digit percentage increase in runtime. An
`unordered_set` with O(1) hashing was also evaluated, but the overhead
of constructing hash keys from pointers made it significantly slower
than the rbtree implementation that only requires an ordering
operator. The performance regression in the test suite appears
justified by the overall reduction in algorithmic complexity.
Co-developed-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>
This commit adds a new test case to the grammar integration tests that
specifically targets a hang scenario in the repetition grammar parser
found while adding GBNF support to ripgrep-edit.
llama-server reproducer:
curl \
-X POST \
-d '{
"messages": [{ "role": "user", "content": "write yes" }],
"grammar": "root ::= (([^x]*){0,99}){0,99}"
}' \
-H "Content-Type: application/json" \
http://localhost:8811/v1/chat/completions
The change introduces a maximum repetition threshold to avoid
excessive rule expansion during grammar parsing. When parsing
repetition patterns like {m,n}, the parser now calculates the
potential number of rules that would be generated and throws an error
if the product of previous rules and new rules exceeds the threshold.
A test case was added to verify the threshold is properly enforced for
deeply nested repetition patterns that would otherwise cause hangs.
Sure the set addition looks good. I would have kept it incremental, but the UI seems to suggest to fold it, so I folded it into the patch, I don't mind either ways. I also tried an unordered_set, but that requires building a key from all pointers in the stack vector and using test-grammar-integration as benchmark it was slower than the rbtree in the set. The set is also slower than the original linear vector but less (around 13% increase in runtime). |
|
"I also tried an unordered_set, but that requires building a key from all pointers in the stack vector and using test-grammar-integration as benchmark it was slower than the rbtree in the set." Yeah tried that as well but it required too much setup with the hash function. 13% is fine if it helps us prevent catastrophic times with some very big grammars. |
|
Oh, I'm sorry, should've pinged me earlier :) |
This pull request addresses some issues in the grammar parsing system that could lead to stack overflow and hangs when processing certain GBNF grammars. The fixes include:
Stack overflow prevention: Added cycle detection in
llama_grammar_advance_stackto prevent infinite recursion when processing grammars with nullable symbols that could lead to infinite derivations of empty strings.Iterative implementation: Converted the recursive
llama_grammar_advance_stackfunction to an iterative approach using explicit stacks, which eliminates the risk of stack overflow from deep recursion.Repetition threshold checking: Added a maximum repetition threshold to prevent excessive rule expansion during grammar parsing of deeply nested repetition patterns like
{m,n}.The repetition threshold value hasn't changed, but it now applies to all nested rules so it makes some valid grammar invalid, supposedly such previously valid grammars would hang or stack overflow.
Testing
The changes have been tested with:
New test cases have been added to verify:
( [x]* )*grammar is fixed