
Add GBNF grammar support #2404

Open
wants to merge 1 commit into base: main
Conversation

jquesnelle

This is an updated version of #1606 that accounts for changes to the code since it was originally submitted.

Adds support for llama.cpp's GBNF grammars, which enable very specific steering of model outputs. This feature is already used on the backend when the format option is set to json, but this allows any arbitrary grammar to be passed in. In the case where both grammar and format are specified, the format is preferred.
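As a rough sketch of how this might be used from a client (the `grammar` request field is what this PR proposes; the payload shape otherwise follows Ollama's `/api/generate` request, and the model name here is just a placeholder):

```python
import json

# A minimal GBNF grammar (llama.cpp's grammar format) that constrains
# the model's entire output to the literal token sequence "yes" or "no".
YES_NO_GRAMMAR = """
root ::= "yes" | "no"
"""

# Hypothetical request payload for Ollama's /api/generate endpoint.
# The "grammar" field is the new parameter this PR adds; per the PR
# description, if "format" were also set, "format" would take precedence.
payload = {
    "model": "llama2",
    "prompt": "Is the sky blue? Answer yes or no.",
    "grammar": YES_NO_GRAMMAR.strip(),
    "stream": False,
}

print(json.dumps(payload, indent=2))
```

With a grammar like this, the sampler can never emit anything outside the grammar, so no application-level retry logic is needed for malformed answers.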

@v3ss0n

v3ss0n commented Feb 18, 2024

Please add this. It works and looks good to me.

@retrohacker

👍 Reviewed and it looks good to me; this would be really useful for the stuff I'm using Ollama for locally.

@v3ss0n

v3ss0n commented Feb 25, 2024

This is a very important feature for most LLMs to be useful in real-world applications that are not just free-form chats.
Otherwise, Ollama is only good for testing purposes, and for production we should just use llama.cpp (since they now also have an OpenAI-compatible API).

Can we have this merged?

@clevcode

clevcode commented Mar 9, 2024

I really hope this can finally be merged.

It's really odd to me that this simple but essential feature hasn't been merged yet, after multiple pull request submissions, while the inferior "JSON mode", which is literally just a tiny subset of this (and quite useless without being able to control the actual fields and field types), has been merged for a long time.

Ollama is such a great project in many regards, but the apparent resistance against adding this feature is really puzzling to me. Anyway, I'll continue just running a custom fork until this is merged, if it ever is.

@clevcode

clevcode commented Mar 9, 2024

PS. https://www.promptingguide.ai/research/llm-tokenization

As pointed out by Karpathy, due to tokenization, YAML is a much better format than JSON for producing structured output from an LLM. Just one of many reasons for adding support for actually being able to specify a grammar, rather than just a format.

Including support for some pre-defined formats is fine, but to actually be useful in real-world scenarios (including function calling), you also need to be able to define a schema along with the format, to control the field names and types.

By adding support for grammar-constrained sampling, schema support could be trivially layered on top at the application level. Right now, when you can only specify a format, it's pretty much useless: just stating the schema in the prompt and "hoping and praying" that the output will adhere to it is a horribly unreliable workaround, and a waste of tokens that requires additional application-level validation and brute-force retry logic.
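To illustrate the "schema on top of grammar" idea, here is a hypothetical application-level helper (not part of this PR or of Ollama's API) that compiles a flat field schema into a GBNF grammar, so field names and types are enforced by the sampler rather than by prompt engineering:

```python
# Hypothetical sketch: derive a GBNF grammar from a flat {name: type}
# schema, so a JSON object with exactly these keys and value types is
# the only thing the model can emit.

# GBNF fragments for a few primitive JSON value types.
TYPE_RULES = {
    "string": r'"\"" [^"]* "\""',
    "number": r'"-"? [0-9]+ ("." [0-9]+)?',
    "boolean": r'("true" | "false")',
}

def schema_to_gbnf(fields: dict) -> str:
    """Build a GBNF root rule for a JSON object with fixed keys/types."""
    pairs = ' "," '.join(
        # Each pair matches a literal "key": followed by its type's rule.
        f'"\\"{name}\\":" {TYPE_RULES[typ]}'
        for name, typ in fields.items()
    )
    return f'root ::= "{{" {pairs} "}}"'

grammar = schema_to_gbnf({"name": "string", "age": "number"})
print(grammar)
```

This only handles flat objects with a few primitive types; a real implementation would recurse into nested objects and arrays, much like llama.cpp's own JSON-schema-to-grammar converter.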

@Pytness

Pytness commented Mar 15, 2024

What is holding this PR back from being merged?

@lirc572

lirc572 commented Mar 16, 2024

+1, grammar support will make Ollama so much more useful.

@svjack

svjack commented Apr 30, 2024

Don't forget to update the Ollama Python client library when adding GBNF support on the server side. 😊

7 participants