Yaml Grammar #871

Open
eitanturok opened this issue May 6, 2024 · 1 comment
Labels
examples (Linked to usage examples), grammar, structured generation (Linked to structured generation)

Comments

@eitanturok
Contributor

eitanturok commented May 6, 2024

The LinkedIn Engineering Team recently wrote about their experience implementing tool use with LLMs. They explain that they structure every tool call in YAML rather than JSON, because YAML requires fewer tokens:

Since the parameters to the call have to match the input schema, we ask the LLM to output them in a structured manner. Most LLMs are trained on YAML and JSON for structured output. We picked YAML because it is less verbose, and hence consumes fewer tokens than JSON.

This seems to be a very important reason to support structured YAML generation.
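
As a rough illustration of the verbosity claim, here is a minimal sketch that writes one tool call both ways and compares character counts. The tool name and parameters are made up for the example, the YAML is written by hand rather than produced by a library, and character count is only a proxy for tokens: the actual saving depends on the tokenizer.

```python
import json

# Hypothetical tool call; the tool name and parameters are invented for illustration.
call = {
    "tool": "search_jobs",
    "parameters": {"keywords": "data engineer", "location": "Berlin", "limit": 5},
}

# Compact JSON (no spaces after separators), to give JSON its best case.
as_json = json.dumps(call, separators=(",", ":"))

# Hand-written YAML equivalent of the same call.
as_yaml = (
    "tool: search_jobs\n"
    "parameters:\n"
    "  keywords: data engineer\n"
    "  location: Berlin\n"
    "  limit: 5\n"
)


def size_report(json_text, yaml_text):
    """Return character counts for both serializations (a proxy, not a token count)."""
    return {"json_chars": len(json_text), "yaml_chars": len(yaml_text)}


report = size_report(as_json, as_yaml)
print(report)
```

Most of the difference comes from the quotes and braces that JSON requires around every key and string value.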

I found the grammar for YAML here and would love for Outlines to implement this.

Thoughts?

brandonwillard added the examples, structured generation, and grammar labels on May 9, 2024
@brandonwillard
Contributor

This should be possible with CFG-structured generation. There's a Lark grammar for YAML here, but it might need changes to satisfy any parser restrictions the current CFG-structured generation may have (e.g. LALR(1) only).

In the meantime, we can leave this open for people to try and report any issues or necessary changes.
