Fix grammar parsing issues to prevent stack overflow and hangs by aagit · Pull Request #18604 · ggml-org/llama.cpp

aagit · 2026-01-05T02:16:11Z

This pull request addresses some issues in the grammar parsing system that could lead to stack overflow and hangs when processing certain GBNF grammars. The fixes include:

Stack overflow prevention: Added cycle detection in llama_grammar_advance_stack to prevent infinite recursion when processing grammars with nullable symbols that could lead to infinite derivations of empty strings.
Iterative implementation: Converted the recursive llama_grammar_advance_stack function to an iterative approach using explicit stacks, which eliminates the risk of stack overflow from deep recursion.
Repetition threshold checking: Added a maximum repetition threshold to prevent excessive rule expansion during grammar parsing of deeply nested repetition patterns like {m,n}.
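
The first two items can be illustrated together with a minimal sketch. It is not the code from this PR: it uses a toy grammar model (`Elem`, `Pos`, `advance_stack`, and `nested_star_grammar` are hypothetical names) in place of llama.cpp's element arrays, and it only shows the general idea of replacing recursion with an explicit worklist and skipping stacks that were already scheduled.

```cpp
#include <cassert>
#include <set>
#include <tuple>
#include <utility>
#include <vector>

// Toy grammar model, far simpler than llama.cpp's element arrays (all names hypothetical).
struct Elem {
    bool is_ref;  // true: reference to rule `value`; false: terminal character `value`
    int  value;
};
using Alt     = std::vector<Elem>;  // one alternative: a sequence of elements
using Rule    = std::vector<Alt>;   // a rule: its list of alternatives
using Grammar = std::vector<Rule>;

// A position in the grammar: element `idx` of alternative `alt` of rule `rule`.
struct Pos {
    int rule, alt, idx;
    bool operator<(const Pos & o) const {
        return std::tie(rule, alt, idx) < std::tie(o.rule, o.alt, o.idx);
    }
};
using Stack = std::vector<Pos>;

// Expand rule references on top of `start` until every resulting stack is
// empty or has a terminal on top. Uses an explicit worklist, not recursion.
std::set<Stack> advance_stack(const Grammar & g, const Stack & start) {
    std::set<Stack>    seen{start};  // stacks already scheduled (cycle detection)
    std::set<Stack>    done;         // fully advanced stacks
    std::vector<Stack> todo{start};  // explicit worklist

    while (!todo.empty()) {
        Stack cur = std::move(todo.back());
        todo.pop_back();
        if (cur.empty()) { done.insert(cur); continue; }

        const Pos    top = cur.back();
        const Elem & e   = g[top.rule][top.alt][top.idx];
        if (!e.is_ref) { done.insert(cur); continue; }
        assert(e.value >= 0 && e.value < (int) g.size());

        const Rule & target = g[e.value];
        for (int a = 0; a < (int) target.size(); ++a) {
            Stack next(cur.begin(), cur.end() - 1);  // drop the reference itself
            if (top.idx + 1 < (int) g[top.rule][top.alt].size()) {
                next.push_back(Pos{top.rule, top.alt, top.idx + 1});  // what follows the reference
            }
            if (!target[a].empty()) {
                next.push_back(Pos{e.value, a, 0});  // start of the chosen alternative
            }
            if (seen.insert(next).second) {  // only schedule stacks not seen before
                todo.push_back(std::move(next));
            }
        }
    }
    return done;
}

// root ::= ( [x]* )* written out as three rules with empty alternatives.
Grammar nested_star_grammar() {
    const Rule root  = { Alt{ Elem{true, 1} } };                           // root  ::= outer
    const Rule outer = { Alt{ Elem{true, 2}, Elem{true, 1} }, Alt{} };     // outer ::= inner outer | (empty)
    const Rule inner = { Alt{ Elem{false, 'x'}, Elem{true, 2} }, Alt{} };  // inner ::= [x] inner | (empty)
    return Grammar{root, outer, inner};
}
```

In this toy model, the empty alternative of `inner` leads back to a stack that has already been expanded, which is the kind of loop that recursion without a `seen` set would follow forever. With the set, `advance_stack(nested_star_grammar(), Stack{Pos{0, 0, 0}})` terminates.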

The repetition threshold value is unchanged, but it now applies across all nested rules, so some grammars that were previously accepted are now rejected. Presumably those grammars would have hung or overflowed the stack anyway.

Testing

The changes have been tested with:

Existing test suites (test-llama-grammar, test-grammar-integration, test-grammar-parser)
The two llama-server curl reproducers mentioned in the commits
Manual verification with some ripgrep-edit sessions with GBNF enabled

New test cases have been added to verify:

The stack overflow case with ( [x]* )* grammar is fixed
The hang case with deeply nested repetition patterns is rejected

fiesh · 2026-02-26T17:20:28Z

This fixes #19845

0cc4m · 2026-03-10T15:42:22Z

@ggerganov This has been stuck for a while, can you take a look and let us know how to proceed?

ggerganov · 2026-03-10T16:48:17Z

@pwilkin Would you like to take a look and review?

pwilkin · 2026-03-10T17:24:31Z

@ggerganov Aye, can look.

pwilkin

Please run editorchecker and fix the indentation issues, otherwise looks fine. See also my changes to the seen and let me know if you approve.

Reproduce stack overflow (or OOM) with ( [x]* )* found while adding GBNF support to ripgrep-edit.

llama-server reproducer:

    curl \
      -X POST \
      -d '{ "messages": [{ "role": "user", "content": "write yes" }], "grammar": "root ::= ( [x]* )*" }' \
      -H "Content-Type: application/json" \
      http://localhost:8811/v1/chat/completions

Fix a potential stack overflow in llama_grammar_advance_stack that could occur when processing grammars with nullable symbols that lead to infinite derivations of empty strings. The fix introduces cycle detection by tracking visited stacks to prevent infinite recursion.

rg-edit regexp: llama_grammar_advance_stack
rg-edit extra-args: -A20
rg-edit directive: """Rewrite: fix the following segfault: [..] ⚫ Testing segfault. Grammar: root ::= ( [x]* )* root ::= ( [x]* )* Segmentation fault build/bin/test-grammar-integration"""
gptel-context: (("~/llama.cpp/src/llama-grammar.cpp")
 ("~/llama.cpp/tests/test-grammar-integration.cpp")
 ("~/llama.cpp/grammars/./list.gbnf")
 ("~/llama.cpp/grammars/./json_arr.gbnf")
 ("~/llama.cpp/grammars/./json.gbnf")
 ("~/llama.cpp/grammars/./japanese.gbnf")
 ("~/llama.cpp/grammars/./english.gbnf")
 ("~/llama.cpp/grammars/./chess.gbnf")
 ("~/llama.cpp/grammars/./c.gbnf")
 ("~/llama.cpp/grammars/./arithmetic.gbnf")
 ("~/llama.cpp/grammars/./README.md"))

This change converts the function to an iterative approach using explicit stacks, which prevents deep recursion and eliminates the risk of stack overflow.

rg-edit regexp: llama_grammar_advance_stack
rg-edit extra-args: -A30
rg-edit directive: """Rewrite: fix the following segfault: [..] ⚫ Testing segfault. Grammar: root ::= ( [x]* )* root ::= ( [x]* )* Segmentation fault build/bin/test-grammar-integration convert from recursive to interactive"""
gptel-context: (("~/llama.cpp/src/llama-grammar.cpp")
 ("~/llama.cpp/tests/test-grammar-integration.cpp")
 ("~/llama.cpp/grammars/./list.gbnf")
 ("~/llama.cpp/grammars/./json_arr.gbnf")
 ("~/llama.cpp/grammars/./json.gbnf")
 ("~/llama.cpp/grammars/./japanese.gbnf")
 ("~/llama.cpp/grammars/./english.gbnf")
 ("~/llama.cpp/grammars/./chess.gbnf")
 ("~/llama.cpp/grammars/./c.gbnf")
 ("~/llama.cpp/grammars/./arithmetic.gbnf")
 ("~/llama.cpp/grammars/./README.md"))

v2: Added a `std::set` to perform tree-based lookups with O(N log N) complexity. Testing with a parallel run of `test-grammar-integration` shows a double-digit percentage increase in runtime. An `unordered_set` with O(1) hashing was also evaluated, but the overhead of constructing hash keys from pointers made it significantly slower than the rbtree implementation that only requires an ordering operator. The performance regression in the test suite appears justified by the overall reduction in algorithmic complexity.

Co-developed-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>

This commit adds a new test case to the grammar integration tests that specifically targets a hang scenario in the repetition grammar parser found while adding GBNF support to ripgrep-edit.

llama-server reproducer:

    curl \
      -X POST \
      -d '{ "messages": [{ "role": "user", "content": "write yes" }], "grammar": "root ::= (([^x]*){0,99}){0,99}" }' \
      -H "Content-Type: application/json" \
      http://localhost:8811/v1/chat/completions

The change introduces a maximum repetition threshold to avoid excessive rule expansion during grammar parsing. When parsing repetition patterns like {m,n}, the parser now calculates the potential number of rules that would be generated and throws an error if the product of previous rules and new rules exceeds the threshold. A test case was added to verify the threshold is properly enforced for deeply nested repetition patterns that would otherwise cause hangs.
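
A rough sketch of such a check follows, under stated assumptions: the constant `kMaxRuleExpansion`, its value, and the function `checked_rule_count` are hypothetical and not taken from the PR, and modelling the "new rules" of a `{m,n}` repetition by its upper bound `n` is a simplification of whatever bookkeeping the parser really does.

```cpp
#include <algorithm>
#include <cstdint>
#include <stdexcept>

// Illustrative limit only; the text above does not state the actual value.
constexpr std::uint64_t kMaxRuleExpansion = 2000;

// Estimated rule count after wrapping an item that already expands to
// `prev_rules` rules in a {min,max} repetition with upper bound `max_times`.
// Throws std::runtime_error if the product would exceed the limit.
std::uint64_t checked_rule_count(std::uint64_t prev_rules, std::uint64_t max_times) {
    const std::uint64_t prev      = std::max<std::uint64_t>(1, prev_rules);
    const std::uint64_t new_rules = std::max<std::uint64_t>(1, max_times);
    // Compare by division so that prev * new_rules can never overflow.
    if (prev > kMaxRuleExpansion / new_rules) {
        throw std::runtime_error("repetition expands to too many rules");
    }
    return prev * new_rules;
}
```

With these illustrative numbers, the inner `{0,99}` of `(([^x]*){0,99}){0,99}` passes with an estimate of 99 rules, while the outer `{0,99}` would multiply that to 9801 and is rejected before any expansion happens.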

aagit · 2026-03-11T14:00:05Z

"Please run editorchecker and fix the indentation issues, otherwise looks fine. See also my changes to the seen and let me know if you approve."

Sure, the set addition looks good. I would have kept it incremental, but the UI seems to suggest folding it, so I folded it into the patch; I don't mind either way. I also tried an unordered_set, but that requires building a key from all pointers in the stack vector and using test-grammar-integration as benchmark it was slower than the rbtree in the set. The set is also slower than the original linear vector, but by less (around a 13% increase in runtime).
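
To make the trade-off concrete, here is a small sketch of the two container choices. It is an illustration under assumptions, not the PR's code: `Elem` is a stand-in for a grammar element, and `StackLess`, `StackHash`, `SeenTree`, and `SeenHash` are hypothetical names. The ordered set only needs a lexicographic comparison that can stop at the first differing pointer, while the hashed set has to fold every pointer of the stack into a key on each lookup.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <set>
#include <unordered_set>
#include <vector>

struct Elem { int type; int value; };    // stand-in for a grammar element
using Stack = std::vector<const Elem *>; // a stack of positions in the grammar

// Ordered set: only a strict weak ordering over stacks is required.
// std::less gives a total order even for pointers into different arrays.
struct StackLess {
    bool operator()(const Stack & a, const Stack & b) const {
        return std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end(),
                                            std::less<const Elem *>());
    }
};

// Hashed set: every pointer in the stack is mixed into one hash value,
// here with a common hash-combine pattern.
struct StackHash {
    std::size_t operator()(const Stack & s) const {
        std::size_t h = s.size();
        for (const Elem * p : s) {
            h ^= std::hash<const Elem *>()(p) + 0x9e3779b9u + (h << 6) + (h >> 2);
        }
        return h;
    }
};

using SeenTree = std::set<Stack, StackLess>;            // red-black tree in common implementations
using SeenHash = std::unordered_set<Stack, StackHash>;  // equality falls back to vector's operator==
```

Both containers answer "was this stack seen before?" through the return value of `insert`. Which one is faster depends on stack lengths and how early comparisons terminate, so the benchmark result reported above should not be assumed to generalize.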

pwilkin · 2026-03-11T14:20:28Z

"I also tried an unordered_set, but that requires building a key from all pointers in the stack vector and using test-grammar-integration as benchmark it was slower than the rbtree in the set."

Yeah tried that as well but it required too much setup with the hash function.

13% is fine if it helps us prevent catastrophic times with some very big grammars.

pwilkin · 2026-03-21T17:43:15Z

Oh, I'm sorry, should've pinged me earlier :)

aagit requested a review from ggerganov as a code owner January 5, 2026 02:16

loci-dev mentioned this pull request Jan 5, 2026

UPSTREAM PR #18604: Fix grammar parsing issues to prevent stack overflow and hangs auroralabs-loci/llama.cpp#818

Open

github-actions bot added the testing Everything test related label Jan 5, 2026

aagit mentioned this pull request Feb 25, 2026

Misc. bug: legal GBNF grammar leads to crash and makes server non-functional until restart #19845

Open

pwilkin self-assigned this Mar 10, 2026

pwilkin requested changes Mar 10, 2026

aagit added 5 commits March 11, 2026 14:48

aagit force-pushed the grammar-fixes branch from 3a50058 to 13c8d22 Compare March 11, 2026 13:50

aagit requested a review from pwilkin March 21, 2026 17:30

pwilkin approved these changes Mar 21, 2026

pwilkin merged commit 990e4d9 into ggml-org:master Mar 21, 2026
66 of 78 checks passed

pwilkin merged 5 commits into ggml-org:master from aagit:grammar-fixes
