
Added support for specifying an arbitrary GBNF compatible grammar #1606

Open · wants to merge 2 commits into main

Conversation


@clevcode clevcode commented Dec 19, 2023

This adds support for specifying an arbitrary GBNF-compatible grammar in the Modelfile, for models running on the llama.cpp backend.

Note that this is basically just the same PR as the one submitted by SyrupThinker in September (#565), and that has been mentioned in issue #1507 and #808 since then.

There are plenty of users that would appreciate this feature, so I really hope that it can get merged.

It's great that support for JSON output specifically has been added, by setting the corresponding GBNF grammar when JSON format is requested, but giving the user the ability to specify an arbitrary grammar opens up far more possibilities than that.

Pull request #830 adds support for specifying JSON schemas, which is another great convenience feature for a specific and common use case, but with support for arbitrary GBNF grammars, any model could output data in any format, including custom DSLs and text-based file formats in general.

This is a tremendously useful thing to have when building various types of automation-related applications, so I really hope that this can get merged to avoid having to maintain separate forks. Ollama is a great project; let's keep making it even better.
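As a concrete illustration of what this enables (my own sketch, not taken from the PR), a GBNF grammar that forces output into a tiny key=value configuration format:

```
root  ::= line+
line  ::= key "=" value "\n"
key   ::= [a-z] [a-z0-9_]*
value ::= [a-zA-Z0-9 ./:-]+
```

Dropped into a Modelfile via the `PARAMETER grammar """..."""` syntax this PR proposes, every token the model samples would be constrained to match that shape.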

in the Modelfile, for models running on the llama.cpp backend

Note that this is basically just the same PR as the one submitted
by SyrupThinker in September (ollama#565), and that has been mentioned
in issue ollama#1507 and ollama#808 since then.

There are plenty of users that would appreciate this feature, so
I really hope that it can get merged.

It's great that support for JSON grammar specifically has been
added, by setting the GBNF grammar in question when JSON format
is requested, but by providing the user with the ability to
specify an arbitrary grammar opens up for a lot more possibilities
than that

Pull request ollama#830 adds support for specifying JSON schemas, which
is yet another great convenience feature for a specific and common
usecase, but by adding support for arbitrary GBNF grammar it would
be possible to have any model outputting data in any type of format,
including custom DSLs and text-based file formats in general

This is a tremendously useful thing to have when building various
types of automation related applications, so I really hope that this
can get merged to avoid having to maintain separate forks. Ollama
is a great project, let's keep making it even better

clevcode commented Dec 19, 2023

A really simple Modelfile example, to ensure that a model only answers with a Python code block 😄

FROM deepseek-coder
PARAMETER grammar """
root ::= "\x60\x60\x60python3\n" [^\x60]+ "\n\x60\x60\x60"
"""

@clevcode

Theoretically, it would even be possible to enforce that the actual code produced is syntactically valid Python. Adding a good SYSTEM prompt helps a lot, of course.

Here's an example of it in action:

$ cat Modelfile 
FROM deepseek-coder:33b-instruct-q6_K

TEMPLATE """{{ .System }}
### Instruction:
{{ .Prompt }}
### Response:
"""

SYSTEM """You are an expert coding assistant, striving for excellence in everything you do.

Respond concisely, but without leaving out important details. Skip caveats and explanations that are obvious to advanced users though. Step-by-step thinking, logically and analytically. Strive to provide the best possible solutions.

Use complete markdown-based code blocks that can be passed directly to a Python interpreter.  Always try to provide complete solutions to whatever is being asked without anything similar to "TODO" comments. That being said, only address the specific request provided and assume that anything else has already been taken care of.
"""

PARAMETER grammar """
root ::= "\x60\x60\x60python3\n" [^\x60]+ "\n\x60\x60\x60"
"""

PARAMETER num_ctx 16384
$ ollama create coder-python
transferring model data 
reading model metadata 
creating template layer 
creating system layer 
creating parameters layer 
creating config layer 
using already created layer sha256:cee2b20336444a7fc764ae4a31d7c3ca135a2fab233714b15dd230aff93a7010 
using already created layer sha256:a3a0e9449cb691a12f4de1d03725fd41326614fdeaf5d80b28c51187da0bed0e 
using already created layer sha256:602d4199b3b775f993839cf879c0633c266b8e3dd07f18c51ce68754abd609dd 
using already created layer sha256:8893e08fa9f91f7dc39e24d27bdfaece4e9c86bb3269293ff8cea6cba98c872d 
using already created layer sha256:584fd87f75335d530f3f26e6f27c38cb4d98204ffd5161acf710d22d17b68e31 
using already created layer sha256:179c66e0d123a43313f24669830090abc1981994ef663e6720d4d5b862cd6201 
using already created layer sha256:0667a8032296b8d28afab2b222f7aaf91bd6dbac28cd05910aef5bf901e3b4ad 
writing manifest 
success 
$ ollama run coder-python print the 100th fibonacci number | grep -v '^```' | python3
218922995834555169026


clevcode commented Dec 19, 2023

Another example:

$ wget https://raw.githubusercontent.com/ggerganov/llama.cpp/master/examples/json-schema-to-grammar.py
$ cat > movie-schema.json << EOF
{
  "type": "object",
  "required": ["title", "director", "releaseDate"],
  "properties": {
    "title": {
      "type": "string"
    },
    "director": {
      "type": "string"
    },
    "releaseDate": {
      "type": "string",
      "format": "date"
    },
    "genre": {
      "type": "string",
      "enum": ["Action", "Comedy", "Drama", "Science Fiction"]
    },
    "duration": {
      "type": "string"
    },
    "cast": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "additionalItems": false
    }
  }
}
EOF
$ cat > Modelfile << EOF
FROM deepseek-coder:33b-instruct-q6_K

TEMPLATE """{{ .System }}
### Instruction:
{{ .Prompt }}
### Response:
"""

SYSTEM """You are an AI developed by OpenAI. You process data and respond with JSON"""

PARAMETER grammar """
$(python3 json-schema-to-grammar.py movie-schema.json)
"""

PARAMETER num_ctx 16384
EOF
$ ollama create movie-info
...
$ pip install strip-tags
$ curl -sSL -A Chromium -s https://www.imdb.com/title/tt1375666 | strip-tags | ollama run movie-info 
{ "cast": ["Leonardo DiCaprio", "Joseph Gordon-Levitt", "Elliot Page"], "director": "Christopher 
Nolan", "duration": "2 hours 28 minutes", "genre": "Action", "releaseDate": "July 16, 2010 (United 
Kingdom)", "title": "Inception" } 

@clevcode

PS. The reason I'm telling deepseek-coder that it's an AI developed by OpenAI is this pretty hilarious result 😄

"Telling mixtral that it is "ChatGPT developed by OpenAI" boosts humaneval score by 6%"

https://www.reddit.com/r/MistralAI/comments/18lhila/telling_mixtral_that_it_is_chatgpt_developed_by/

Telling non-OpenAI models that they were developed by OpenAI might actually boost their performance; "self-belief" matters ;)


shroominic commented Dec 23, 2023

LGTM!

@Yu-Vitaqua-fer-Chronos

Would love for this to be merged, it'd be very, very useful to be able to get responses formatted in JSON following a specific format, or as people have said here, to confine it to a language's grammar reliably :p

@nfsecurity

This would be awesome, particularly when you need a specific output like "yes or no".
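For a case like that, the grammar is trivial. A sketch using the `PARAMETER grammar` syntax proposed in this PR:

```
PARAMETER grammar """
root ::= "yes" | "no"
"""
```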

@DeNeutoy

This would enable a whole host of powerful applications if merged. Would love to see this get in!

@jayfalls

Please can we get this merged, what still needs to happen?

@ryanpeach

The diff is so small! Is this really all that is needed to enable it?


nperez commented Apr 1, 2024

It would be awesome to get this merged but apparently it has some bitrot now. Maybe @clevcode can resolve those so we can make it trivial for maintainers to merge and renew our calls for it?

Just want to add my voice to the others wanting this merged in. Grammar constrained generation is a must for building reliable tools on top of LLMs, and the hard work has already been done by the GGML folks.


jayfalls commented Apr 1, 2024

I'm willing to do it with the latest code, but @jquesnelle already did that over a month ago with #2404, and still no merge

No feedback from core team, so I'm wondering if they're just not seeing this pull in the ocean of pulls they have


nperez commented Apr 1, 2024

Ooof, yeah, now that I look more closely, you're right that there are multiple PRs specifically for this feature.
[screenshot: the list of open PRs for this feature]


oeway commented Apr 4, 2024

+1 Please! This is a great feature to have!

@mishushakov

Would love to use this for my project

@thiswillbeyourgithub

This is an awesome feature! Imagine what we could do if llama3 could output Python list formats! I'm eagerly waiting for this to up my "auto labelling" workflows in various apps.


qkxie commented Apr 23, 2024

Three months have passed, and this feature still hasn't been merged.

@alexclaydon

We love Ollama and have been using it for months, but we can’t wait much longer for GBNF, particularly now with the launch of viable small models like Llama3:8b. Would be really helpful to hear something from the core team on the roadmap for this - it seems such a glaring omission. We’re on the verge of throwing in the towel and switching over to llama.cpp.

@UmutAlihan

I am really looking forward to seeing support for GBNF, so that we can force models to produce proper JSON using Ollama. I really want to utilize Ollama further for use cases like this.


tionis commented Apr 30, 2024

Just a small note: this PR does not apply cleanly anymore. I merged it with some small modifications, just for testing purposes, and the grammars didn't seem to apply correctly.
I did, however, only do two very quick tests, so maybe I misused the API.

@trustedtomato

I applied this patch to an earlier version of Ollama, and it works nicely :) The only problem before it gets into upstream is that an incorrect grammar crashes the server. It would be much nicer if the server handled the error and just returned an error code along with the error message coming from llama.cpp.

@jacopofar

@trustedtomato I think llama.cpp does not return any error message; it just segfaults when it gets a wrong grammar file

@trustedtomato

@trustedtomato I think llama.cpp does not return any error message; it just segfaults when it gets a wrong grammar file

In the end something definitely segfaults, but there is an error message before that, which can be traced back to: https://github.com/ggerganov/llama.cpp/blob/1fd9c1741d864d01cd7ec6d67227b92d7bfabf22/common/grammar-parser.cpp#L299 and https://github.com/ggerganov/llama.cpp/blob/master/common/grammar-parser.cpp#L258
Here are the ollama serve logs just before the crash, showing that there is indeed an error message:

{"function":"update_slots","level":"INFO","line":1636,"msg":"slot released","n_cache_tokens":1176,"n_ctx":2048,"n_past":1175,"n_system_tokens":0,"slot_id":0,"task_id":0,"tid":"127570538399424","timestamp":1715168697,"truncated":false}
parse: error parsing grammar: expecting ::= at := [0-9] | [1-9] [0-9]*
natintarray ::= "[" ws (natint ("," ws natint)*)? ws "]"
string ::=
  "\"" (
    [^"\\\x7F\x00-\x1F] |
    "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F]) # escapes
  )* "\"" ws
stringarray ::= "[" ws (string ("," ws string)*)? ws "]"
answerprefix ::= "{" ws "\"answer\":" ws
answerpostfix ::= ws "}"
root ::=  answerprefix natintarray answerpostfix
llama_sampling_init: failed to parse grammar
{"function":"launch_slot_with_data","level":"INFO","line":827,"msg":"slot is processing task","slot_id":0,"task_id":100,"tid":"127570538399424","timestamp":1715168697}
Segmentation fault (core dumped)
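For what it's worth, the parse error in that log is self-explanatory: `expecting ::= at :=` indicates a rule was written with `:=` instead of GBNF's `::=`. Presumably it is the `natint` rule referenced by `natintarray`, so the fix (a guess, since the log truncates the rule name) would be:

```
natint ::= [0-9] | [1-9] [0-9]*
```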
