Introduce GBNF grammar parameter: run flags, REPL and API #2754

Open · wants to merge 1 commit into main
Conversation

@zopieux commented Feb 26, 2024

This patch adds full support for the llama.cpp "grammar" option, which defaults to "" (i.e., no grammar constraint).

For backwards compatibility, the existing format parameter remains available in flags, the REPL and the API. Only one of grammar or format may be set at a time, since format: json is just a shortcut for the hard-coded JSON grammar.
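
Since the grammar is a plain string, it can also be sent through the REST API. A sketch of what a request might look like under this patch (the grammar field name in the JSON body is my assumption, mirroring the flag name):

$ curl http://localhost:11434/api/generate -d '{
    "model": "mistral-openorca",
    "prompt": "Is 3+5 equal to 8?",
    "grammar": "root ::= \"yes\" | \"no\"",
    "stream": false
  }'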

Tested locally:

$ ./ollama run mistral-openorca --grammar-file json-two-string-array.bnf \
  'Write the result of "3+5" followed by the english spelling of the number, as two JSON strings'
["8", "Eight"]

$ ./ollama run mistral-openorca --grammar ' root ::= "[" [0-9]+ ", \"" [a-z]+ "\"]" ' \
  'Write the result of "3+5" followed by the english spelling of the number, as a json number and a json string'
[8, "eight"]
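
The contents of json-two-string-array.bnf aren't shown here; for illustration, a minimal GBNF grammar producing output like the first test could look like this (a sketch, not the actual file used above):

root   ::= "[" string ", " string "]"
string ::= "\"" [A-Za-z0-9]+ "\""

The second test passes the same kind of constraint inline via --grammar instead of a file.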

This more or less conflicts with #2404, which instead went with an Options entry; AFAICT that approach is more geared towards creating derived models with an embedded grammar, whereas this patch is about a runtime override. It would be nice to eventually converge on something that does both, but today the concept of "format", which is very limiting (a list of well-known formats, and only JSON today), is a top-level parameter rather than an Options entry, and I wasn't sure how to make this play nice with #2404 without too many changes.

@i404788 commented Mar 6, 2024

Bug report when testing:

When provided with an invalid grammar (for the llama.cpp backend), it stops responding to new generate requests. It seems like ollama does not detect the failure and waits for tokens indefinitely.

@zopieux (Author) commented Mar 6, 2024

Bug report when testing:

When provided with an invalid grammar (for the llama.cpp backend), it stops responding to new requests

Yeah, in fact it literally segfaults and dumps core. Fixing this is beyond the scope of this PR, as ollama itself can't validate the grammar, but it should gracefully handle the failure.

@clevcode commented Mar 9, 2024

I would suggest making a simple standalone GBNF grammar validator based on the llama.cpp sources (as a C-exported function that can be linked and used directly by ollama through a cgo binding) and pre-validating the grammar before invoking llama.cpp. Simple to do, and essential if llama.cpp does not gracefully handle invalid grammars itself.
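
For illustration, a minimal sketch of such a binding, assuming a hypothetical C shim exporting gbnf_validate built from llama.cpp's grammar parser (the shim, its name, and the link flags are assumptions, not code that exists in this PR):

package grammar

/*
// Assumed: a shim library built from llama.cpp's grammar parser.
#cgo LDFLAGS: -lgbnf_shim
#include <stdbool.h>
#include <stdlib.h>
// Hypothetical C export; returns false if the grammar fails to parse.
bool gbnf_validate(const char *grammar);
*/
import "C"

import (
	"errors"
	"unsafe"
)

// Validate pre-checks a GBNF grammar string so an invalid grammar can
// fail the request up front instead of crashing the llama.cpp runner.
func Validate(grammar string) error {
	cs := C.CString(grammar)
	defer C.free(unsafe.Pointer(cs))
	if !C.gbnf_validate(cs) {
		return errors.New("invalid GBNF grammar")
	}
	return nil
}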
