identify more robust output format for data synthesis #29
Comments
I have a few ideas for this that I'll try.
XML is the key. It's really the only key. Foundation LLMs have a lot of XML in their outputs, so they are super-primed to output it. Also, you can scrub inputs by ensuring the tags are unlikely to be guessed. Also, this is what LMQL is for!
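A minimal sketch of the tag-based idea, assuming the model is asked to wrap each pair in a randomized outer tag with `<q>` and `<a>` children. The names `make_tag`, `extract_pairs`, `q`, and `a` are hypothetical and not taken from the project:

```python
import re
import secrets


def make_tag(prefix: str = "pair") -> str:
    """Build a tag name with a random suffix so input text is unlikely to contain it."""
    return f"{prefix}_{secrets.token_hex(4)}"


def extract_pairs(text: str, tag: str) -> list[tuple[str, str]]:
    """Collect (question, answer) pairs, skipping blocks that lack a complete <q> or <a>."""
    block_re = re.compile(rf"<{re.escape(tag)}>(.*?)</{re.escape(tag)}>", re.DOTALL)
    q_re = re.compile(r"<q>(.*?)</q>", re.DOTALL)
    a_re = re.compile(r"<a>(.*?)</a>", re.DOTALL)

    pairs = []
    for block in block_re.finditer(text):
        body = block.group(1)
        q, a = q_re.search(body), a_re.search(body)
        if q is None or a is None:
            continue  # malformed pair: drop it and keep scanning
        pairs.append((q.group(1).strip(), a.group(1).strip()))
    return pairs
```

Because the extraction is regex-based rather than a strict XML parse, a pair with a missing inner closing tag is dropped without invalidating the rest of the output. A missing outer closing tag can still merge two neighboring blocks, so this is only a starting point.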
Good points. I'll dig into this direction.
I do want to add regex and LMQL to our platform, btw.
Best,
Matt
Dr. Matthew Mirman
https://www.mirman.com
On 30 Jun 2023, at 7:15 PM, Carter Tazio Schonwald wrote:
> Good points. I'll dig into this direction
Should be a bit more robust; we can revisit this later.
The first version uses JSON, which can often be malformed, and there's no good error recovery in that case. We need to identify and switch to a more "error-tolerant", self-aligning format (meaning we can skip a bad pair and recover useful outputs).
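To illustrate what "skip a bad pair and recover useful outputs" could look like, here is a rough sketch that uses one JSON object per line (JSON Lines) as a candidate format. This is an assumption for illustration, not the format the project settled on, and the `input`/`output` field names and the `parse_jsonl_lenient` helper are hypothetical:

```python
import json


def parse_jsonl_lenient(raw: str) -> tuple[list[dict], list[int]]:
    """Parse one JSON object per line; return (good records, 1-based numbers of bad lines)."""
    good, bad = [], []
    for lineno, line in enumerate(raw.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are ignored, not counted as errors
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            bad.append(lineno)  # malformed line: record it and move on
            continue
        # Assumed schema: each record is an object with "input" and "output" keys.
        if isinstance(record, dict) and "input" in record and "output" in record:
            good.append(record)
        else:
            bad.append(lineno)
    return good, bad
```

Since each line is parsed independently, a truncated or malformed record costs only that one pair, whereas a single syntax error inside one large JSON array makes the whole response unparseable.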