
support for json (or other?) grammar? #1945

Open
kurtbuilds opened this issue Mar 27, 2024 · 1 comment

kurtbuilds commented Mar 27, 2024

llama.cpp now supports grammars:

https://til.simonwillison.net/llms/llama-cpp-python-grammars

Is that something that will come to candle?

It sounds like the approach taken in this Python library would be straightforward to port:

https://github.com/1rgs/jsonformer/blob/main/jsonformer/main.py

Basically, since you know the JSON schema, you emit the structural tokens (braces, keys, commas) directly via control flow, and only constrain the logit output when sampling a typed value.
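To make that concrete, here is a minimal sketch of the idea (not candle or jsonformer code; `sample_fn` stands in for a hypothetical logit-constrained LLM sampler): the generator walks the schema, writes structure itself, and only asks the model for leaf values, restricted by type.

```python
import json

def generate_value(typ, sample_fn):
    # Leaf values: restrict the (hypothetical) sampler to tokens valid for the type.
    if typ == "number":
        return sample_fn(allowed="digits")
    if typ == "boolean":
        return sample_fn(allowed="bool")
    if typ == "string":
        return json.dumps(sample_fn(allowed="text"))
    raise ValueError(f"unsupported type: {typ}")

def generate_json(schema, sample_fn):
    # Structural tokens come from control flow over the schema;
    # the model is only consulted for typed leaf values.
    if schema["type"] == "object":
        parts = []
        for key, sub in schema["properties"].items():
            parts.append(json.dumps(key) + ": " + generate_json(sub, sample_fn))
        return "{" + ", ".join(parts) + "}"
    return generate_value(schema["type"], sample_fn)

# Stand-in for a constrained sampler, so the sketch runs without a model.
def fake_sample(allowed):
    return {"digits": "42", "bool": "true", "text": "hello"}[allowed]

schema = {"type": "object",
          "properties": {"age": {"type": "number"}, "name": {"type": "string"}}}
out = generate_json(schema, fake_sample)
print(out)  # {"age": 42, "name": "hello"}
```

The output is guaranteed to parse as JSON matching the schema regardless of what the sampler returns for each constrained slot, which is the whole appeal of the approach.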

I started to work on this approach in a demo codebase... I'll report back on any progress.

Curious to hear from others about how feasible the approach is.

@ealmloff
Contributor

👋 I wrote an implementation of constrained sampling with candle here that might be useful as a reference. Here are a few things I found important:

  • Parsing must be incremental if you want reasonable speeds on longer sequences: re-parsing the whole output on every token is quadratic, which makes an FSM a good choice
  • You can accelerate generation by eagerly sampling the grammar: whenever the grammar forces a run of tokens, feed that run to the LLM in one batch instead of one token at a time
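The two points above can be sketched together. This is a toy illustration, not the linked implementation: a hand-written FSM over the grammar `'{' 'k' ':' ('a'|'b') '}'`, advanced one token at a time (incremental parsing), where states with exactly one legal token are accumulated into a batch rather than each costing a model call.

```python
# Transitions map state -> {token: next_state}. States with a single legal
# token are "forced" by the grammar and need no model call at all.
FSM = {
    0: {"{": 1},
    1: {"k": 2},
    2: {":": 3},
    3: {"a": 4, "b": 4},   # only here does the model actually choose
    4: {"}": 5},
}

def generate(pick_fn, state=0, end=5):
    out, forced_batch = [], []
    while state != end:
        legal = FSM[state]
        if len(legal) == 1:
            # Eager path: queue the forced token so a run of forced tokens
            # becomes one batched forward pass instead of N separate passes.
            tok, state = next(iter(legal.items()))
            forced_batch.append(tok)
        else:
            if forced_batch:
                out.extend(forced_batch)   # flush the batch in one step
                forced_batch = []
            tok = pick_fn(list(legal))     # model picks among legal tokens
            state = legal[tok]
            out.append(tok)
    out.extend(forced_batch)
    return "".join(out)

print(generate(lambda legal: legal[0]))  # {k:a}
```

Here `pick_fn` stands in for masked sampling from the model's logits; the FSM state advances per token, so no prefix is ever re-parsed.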
