Added support for specifying an arbitrary GBNF compatible grammar #1606
base: main
Conversation
in the Modelfile, for models running on the llama.cpp backend.

Note that this is basically the same PR as the one submitted by SyrupThinker in September (ollama#565), which has been mentioned in issues ollama#1507 and ollama#808 since then. There are plenty of users who would appreciate this feature, so I really hope that it can get merged.

It's great that support for the JSON grammar specifically has been added, by setting the corresponding GBNF grammar when JSON format is requested, but giving the user the ability to specify an arbitrary grammar opens up a lot more possibilities than that.

Pull request ollama#830 adds support for specifying JSON schemas, which is yet another great convenience feature for a specific and common use case, but with support for arbitrary GBNF grammars it would be possible to have any model output data in any format, including custom DSLs and text-based file formats in general.

This is a tremendously useful thing to have when building various types of automation-related applications, so I really hope that this can get merged, to avoid having to maintain separate forks. Ollama is a great project, let's keep making it even better.
A really simple Modelfile example, to ensure that a model only answers with a Python code block 😄

FROM deepseek-coder
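The grammar attached to that Modelfile is not captured above, but a GBNF grammar constraining output to a single fenced Python code block could look roughly like this (a hypothetical sketch, not the exact grammar used in the example):

```gbnf
# Force the model to emit exactly one fenced Python code block.
root ::= "```python\n" code "```"
code ::= [^`]+
```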
Theoretically it would even be possible to enforce that the actual code produced is syntactically valid Python. Adding a good SYSTEM prompt helps a lot, of course. Here's an example of it in action:

[screenshot]
Another example:

[screenshot]
PS. The reason I'm telling deepseek-coder that it's an AI developed by OpenAI is this pretty hilarious result 😄 "Telling mixtral that it is "ChatGPT developed by OpenAI" boosts humaneval score by 6%": https://www.reddit.com/r/MistralAI/comments/18lhila/telling_mixtral_that_it_is_chatgpt_developed_by/ Telling non-OpenAI models that they were developed by OpenAI might actually boost their performance, "self-belief" matters ;)
LGTM!

Would love for this to be merged, it'd be very, very useful to be able to get responses formatted in JSON following a specific format, or, as people have said here, to confine output to a language's grammar reliably :p

This would be awesome, particularly when you need a specific output like "yes" or "no".
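In GBNF, a grammar restricting the model to exactly that kind of answer is a one-liner (a sketch of what such a grammar could look like):

```gbnf
# Only the literal strings "yes" or "no" are valid outputs.
root ::= "yes" | "no"
```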
This would enable a whole host of powerful applications if merged. Would love to see this get in!

Please can we get this merged? What still needs to happen?

The diff is so small! Is this really all that is needed to enable it?

It would be awesome to get this merged, but apparently it has some bitrot now. Maybe @clevcode can resolve those conflicts, so we can make it trivial for the maintainers to merge, and renew our calls for it? Just want to add my voice to the others wanting this merged in. Grammar-constrained generation is a must for building reliable tools on top of LLMs, and the hard work has already been done by the GGML folks.
I'm willing to do it with the latest code, but @jquesnelle already did that over a month ago with #2404, and still no merge. No feedback from the core team, so I'm wondering if they're just not seeing this pull request in the ocean of pull requests they have.

+1 Please! This is a great feature to have!

Would love to use this for my project.

This is an awesome feature! Imagine what we could do if llama3 could output Python list formats! I'm eagerly waiting for this to improve my "auto labelling" workflows in various apps.
Three months have passed, and this feature still hasn't been merged.

We love Ollama and have been using it for months, but we can't wait much longer for GBNF, particularly now with the launch of viable small models like Llama3:8b. It would be really helpful to hear something from the core team on the roadmap for this; it seems such a glaring omission. We're on the verge of throwing in the towel and switching over to llama.cpp.
I am really looking forward to seeing support for GBNF, so that we can force models to produce proper JSON using Ollama. I'd love to make further use of Ollama for such use cases as well.
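For the JSON case specifically, a minimal GBNF grammar might look something like the sketch below (heavily simplified, with no whitespace or string-escape handling; llama.cpp ships a more complete json.gbnf among its example grammars):

```gbnf
# Simplified JSON grammar sketch
root   ::= value
value  ::= object | array | string | number | "true" | "false" | "null"
object ::= "{" (pair ("," pair)*)? "}"
pair   ::= string ":" value
array  ::= "[" (value ("," value)*)? "]"
string ::= "\"" [^"]* "\""
number ::= "-"? [0-9]+ ("." [0-9]+)?
```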
Just a small note: this PR does not apply cleanly anymore. I did merge it, just for testing purposes, with some small modifications, and the grammars didn't seem to apply correctly.

I applied this patch to an earlier version of ollama, and it works nicely :) The only problem, before it gets into upstream, is that an incorrect grammar crashes the server. It would be much nicer if the server handled the error and just returned an error code with the error message coming from llama.cpp.

@trustedtomato I think llama.cpp does not return any error message; it just segfaults when it gets a wrong grammar file.

In the end something definitely segfaults, but I get an error message before that, which can be traced back to: https://github.com/ggerganov/llama.cpp/blob/1fd9c1741d864d01cd7ec6d67227b92d7bfabf22/common/grammar-parser.cpp#L299 and https://github.com/ggerganov/llama.cpp/blob/master/common/grammar-parser.cpp#L258