Add GBNF grammar support #2404
Conversation
Please add this. It works and looks good to me.
👍 Reviewed and it looks good to me; this would be really useful for the stuff I'm using Ollama for locally.
This is a very important feature for making most LLMs useful in real-world applications, not just free-form chat. Can we have this merged?
I really hope this can finally be merged. It's really odd to me that this simple but essential feature hasn't been merged yet, after multiple pull request submissions, while the inferior "JSON mode" — which is literally just a tiny subset of this, and quite useless without being able to control the actual fields and field types — has been merged for a long time. Ollama is such a great project in many regards, but the apparent resistance to adding this feature is really puzzling to me. Anyway, I'll continue running a custom fork until this is merged, if it ever is.
PS. https://www.promptingguide.ai/research/llm-tokenization As pointed out by Karpathy, due to tokenization, YAML is a much better format than JSON for producing structured output from an LLM. That is just one of many reasons for supporting an arbitrary user-specified grammar rather than a single fixed format. Including support for some pre-defined formats is fine, but to be useful in real-world scenarios (including function calling), you also need to be able to define a schema along with the format, to control the field names and types. With grammar-constrained sampling in place, schema support could be trivially added on top at the application level; right now, when you can only specify a format, it's pretty much useless. Specifying the schema in the prompt and "hoping and praying" that the output adheres to it is a horribly unreliable workaround, and a waste of tokens that requires additional application-level validation and brute-force retry logic.
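To illustrate the point above, here is a sketch of a GBNF grammar (llama.cpp's grammar notation) that pins down a specific JSON schema — fixed field names and types — which a bare "JSON mode" cannot express. The field names and the simplified string/number rules are purely illustrative.

```python
# Illustrative GBNF grammar (llama.cpp syntax) constraining output to a
# JSON object with exactly the fields "name" (string) and "age" (integer).
# The rules are deliberately simplified for the example.
person_grammar = r'''
root   ::= "{" ws "\"name\"" ws ":" ws string "," ws "\"age\"" ws ":" ws number ws "}"
string ::= "\"" [a-zA-Z ]* "\""
number ::= [0-9]+
ws     ::= [ \t\n]*
'''

# An application could generate such a grammar mechanically from a JSON
# Schema, which is the "schema support on top" the comment describes.
```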
What is holding this PR back from being merged?
+1, grammar support will make Ollama so much more useful.
Don't forget to update the Ollama Python client library when adding GBNF support on the server side. 😊
This is an updated version of #1606 that accounts for changes to the code since it was originally submitted.

Adds support for llama.cpp's GBNF grammars, which enable very specific steering of model outputs. This feature is already used on the backend when the `format` option is set to `json`, but this change allows any arbitrary grammar to be passed in. In the case where both `grammar` and `format` are specified, `format` is preferred.
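As a rough sketch of how this might be used once merged — assuming the option is exposed as a `grammar` field on the generate request, as the PR description suggests (the exact field name and placement may differ in the final API) — a client could constrain a model to a yes/no answer:

```python
import json

# Minimal GBNF grammar (llama.cpp syntax) restricting output to "yes" or "no".
yes_no_grammar = 'root ::= "yes" | "no"\n'

# Hypothetical /api/generate payload; the "grammar" option name is an
# assumption based on the PR description, not a confirmed API field.
payload = {
    "model": "llama2",          # any locally pulled model
    "prompt": "Is the sky blue? Answer yes or no.",
    "grammar": yes_no_grammar,  # per the PR, ignored if "format" is also set
    "stream": False,
}

# To send (requires a running Ollama server with this PR applied):
#   import requests
#   resp = requests.post("http://localhost:11434/api/generate", json=payload)

# The payload is plain JSON either way:
body = json.dumps(payload)
```

Note that, per the description above, setting `format` alongside `grammar` would cause `format` to win, so a client wanting full grammar control should omit `format`.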