Replies: 2 comments 4 replies
-
Good point, we will address function-calling APIs soon. Our data so far shows that constrained grammars (like function calling) are worse than pure prompting, as also evidenced by the Berkeley Function Calling Leaderboard, where "prompt" beats "FC" nearly every time (see link), which is why we haven't focused much on it. In fact, function calling in the OpenAI APIs doesn't yet support enum values in your schemas (as of a month ago), nor does it support easy chain-of-thought. I think users generally opt for BAML not only for the easier type definitions (you can use `string[]` instead of making a wrapper List object), but also for things like the playground preview, the instant testing it enables, fuzzy JSON parsing, etc. In the future we might add a function-calling option. We'll revamp the comparison soon, and thanks for calling that out. We do want to make sure we're evaluating fairly.
-
Thanks for the additional info! I recognize that my perspective is skewed by the peculiarities of my use case, and I appreciate the additional context. I was unaware of how well prompt-only methods perform at generating reliable JSON! Speaking of fuzzy JSON parsing, is that something that can be used separately from BAML (Python), or is it tightly integrated? Since I'm particularly focused on guaranteeing that the generated JSON matches the specified schema, it seems like the fuzzy parsing could be an easy win. (Apologies for the discussion diversion here; I'd be happy to move to a separate discussion.)
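For readers curious what "fuzzy JSON parsing" means in practice: this is not BAML's actual parser, just a toy Python sketch of the idea — strip markdown code fences the model may emit and extract the first balanced `{...}` object from noisy output before handing it to `json.loads`. The function name `fuzzy_parse` and its heuristics are illustrative assumptions, not BAML's API:

```python
import json

def fuzzy_parse(raw: str) -> dict:
    """Toy sketch: pull the first balanced {...} object out of noisy LLM output."""
    # Strip common markdown code fences the model may wrap around JSON.
    text = raw.replace("```json", "").replace("```", "")
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    depth = 0
    in_string = False
    escaped = False
    # Scan for the matching closing brace, ignoring braces inside strings.
    for i, ch in enumerate(text[start:], start):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return json.loads(text[start : i + 1])
    raise ValueError("unbalanced JSON object")
```

A real fuzzy parser (BAML's included) handles far more cases than this, but even this much recovers output like `'Here you go:\n```json\n{"name": "Ada"}\n```'`.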
-
A lot of the complaints in the "Comparing Pydantic" workflow seem to stem from not using function calling, which makes me feel like the comparison is less valid. Many providers allow the user to specify the return JSON schema that the called "function" would use. Using the OpenAI API as an example: instead of using "json mode", users can provide the expected return schema in the `tools` array, set `tool_choice` to specify the schema to be returned, and then provide roughly the same instructions in the prompt to help the model perform the work prior to returning JSON. This workflow requires a lot less regex to parse the returned JSON, though it doesn't fix the issues regarding juggling different prompts and/or JSON schemas.