JSON mode in googleai plugin for Gemini 1.5 pro+flash #389
base: main
Conversation
This partially addresses #290
Really nice to see you taking this one on! Thanks!
I will defer to @mbleigh for a more thorough review, but I left some small comments in the meantime.
One thing I think we need to consider is how this should interact with (1) the existing output coercion and (2) conformance that happens via middleware. I'm thinking we'll want to ensure this is set to `application/json` in cases where we already know the user wants JSON (i.e. when `output?.format` or `output?.schema` is set). Is there value in allowing the end-user to set this independently? Does it conflict with the existing output format field that we have?
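The coercion being suggested could be sketched as a small helper. This is purely illustrative: `resolveResponseMimeType` and the `OutputOptions` shape are hypothetical names, not part of the plugin, but they capture the rule being proposed (an explicit JSON request wins over any manually configured MIME type).

```typescript
// Hypothetical sketch of the proposed coercion: if the caller already
// requested JSON output (via output.format or output.schema), force the
// Gemini config's responseMimeType to application/json.
interface OutputOptions {
  format?: 'text' | 'json';
  schema?: unknown;
}

function resolveResponseMimeType(
  output: OutputOptions | undefined,
  configured?: string
): string | undefined {
  if (output?.format === 'json' || output?.schema !== undefined) {
    return 'application/json';
  }
  // Otherwise honor whatever (if anything) the user set explicitly.
  return configured;
}
```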
js/plugins/googleai/src/gemini.ts
```ts
topK: request.config?.topK,
topP: request.config?.topP,
stopSequences: request.config?.stopSequences,
responseMimeType: request.config?.responseMimeType,
```
A couple quick thoughts.
- You'll want to add any new fields to the `GeminiConfigSchema` above for proper validation.
- What happens when apiVersion is set to `v1`?
> You'll want to add any new fields to the `GeminiConfigSchema` above for proper validation.
Sure thing, I'll have a look.
> What happens when apiVersion is set to `v1`?

If you mean "literally what happens now if you pass through the `responseMimeType` config param?", the Gemini service returns a 400 with a clear error message. This gets thrown all the way out of genkit.
```
GoogleGenerativeAIFetchError: [GoogleGenerativeAI Error]: Error fetching from https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent: [400 Bad Request] Json mode is not enabled for models/gemini-pro
```
Is this adequate, or should there be up-front handling? (I considered the `v1`/`v1beta` difference, but as there isn't currently any other differentiation in the model implementation, I thought I should go with the flow and not make the distinction.)
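If up-front handling were wanted, it could be as small as a guard like the following. This is a hypothetical sketch, not code from the PR; the function name and error message are illustrative.

```typescript
// Hypothetical pre-flight check: fail fast with a descriptive error instead
// of surfacing the service's 400 when JSON mode is requested on the v1 API.
function assertJsonModeSupported(
  apiVersion: 'v1' | 'v1beta',
  responseMimeType?: string
): void {
  if (apiVersion === 'v1' && responseMimeType === 'application/json') {
    throw new Error(
      'JSON mode (responseMimeType: application/json) requires apiVersion v1beta'
    );
  }
}
```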
Hi @njpearman, thanks for the great start! I probably should have given you some pointers before sending you off to write the PR, but a few high-level comments:
Does that make sense?
@mbleigh, @MichaelDoyle thanks for your feedback. All of this makes sense in principle, although I need to look into the data transformation part in more detail to see what that means. I'm hoping there are enough existing examples around the

One question: for clarity, does this mean that the

I've been digging around the output schema stuff a bit separately. My understanding is that if the

(Aside: I have some thoughts on the JSON validation, as I got burnt by the strictness yesterday. I feel there needs to be some sort of "lax" / "no validation" mode, given LLM wobbliness at conforming to a complex JSON schema. I haven't seen if this already exists, nor looked to see if there's an existing GH issue, though.)
Hey there, and sorry to crash your PR! I've spent a lot of time with the Genkit beta lately and came across issue #290 when looking for documentation on JSON validation. I just wanted to echo @njpearman's sentiment about getting burnt by strictness and how useful a "lax" mode would be. So far, I've found that most of the effort of using Genkit involves writing try-catch statements to wrap requests with fallbacks, because I get so many "FAILED_PRECONDITION: Generation resulted in no candidates matching provided output schema" errors. I really love working with Genkit overall, but I'm probably reinventing the wheel here by essentially building my own lax mode. Let me know if you ever want any user feedback, and thanks for the awesome library! 👍
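The try-catch fallback pattern described here can be sketched as a small helper. This is only an illustration of the workaround, under the assumption that you have the model's raw text and a validator throwing on mismatch; `parseLax` is not a genkit API.

```typescript
// Sketch of the "lax mode" workaround: validate the model's JSON output
// against a schema, but fall back to the raw parsed value instead of
// failing the whole request when validation throws.
function parseLax<T>(
  rawText: string,
  validate: (value: unknown) => T
): { value: T | unknown; strict: boolean } {
  const parsed: unknown = JSON.parse(rawText);
  try {
    // Strict path: the output conforms to the expected schema.
    return { value: validate(parsed), strict: true };
  } catch {
    // Lax path: keep the ad-hoc structure rather than throwing.
    return { value: parsed, strict: false };
  }
}
```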
@ariel-pettyjohn I've opened this discussion so we can move the validation convo there: https://github.com/firebase/genkit/discussions/497. I also agree that genkit is great! I'm keen to help push it forward.
```yaml
input:
  schema:
    food: string
output:
```
Should we define the output schema here as opposed to in the code?
For this prompt, there's no `output` schema defined either here or in the code; it is just specifying that the output is JSON mode. The LLM will (should!) return an ad-hoc JSON structure rather than a pre-defined one.
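For reference, a prompt front matter that requests JSON mode without pinning an output schema might look like the following. This is a sketch based on the snippet above; the model name and exact file contents in the testapp are assumptions.

```yaml
---
model: googleai/gemini-1.5-flash
input:
  schema:
    food: string
output:
  format: json
---
```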
Hey there, also coming in late to this thread. I am one of the Genkit devs, and I took a look at the PR. I left one small comment, but it looks pretty much good to go now!
@huangjeff5 thanks for the review and feedback. I've replied to your comment - let me know if you'd like me to do anything to help make that test clearer!

@huangjeff5 is any further work needed here?

Hey, just catching up on this thread after getting back from vacation. JSON mode was merged already in a different PR; I see some of these changes in the codebase already. However, the testapp is still useful! I think we may need to rebase the PR, though, since there are some merge conflicts (or just make a new PR to add the testapp).
Checklist (if applicable):
I haven't found the changelog - is there one? I also haven't documented this, as there doesn't seem to be anywhere for detailed Gemini config documentation.
This upgrades the `@google/generative-ai` package in the `googleai` plugin to allow support of JSON mode with Gemini 1.5. There are minimal changes to the code - I pulled out a variable to allow for a more strongly-typed config value.

I wondered if a `json` method would be worthwhile on the `GenerateResponse` implementation, but it seemed unnecessary seeing as `output` already does the job.

Note that this doesn't add JSON mode to the `vertexai` plugin, as the vertex package doesn't yet support the `response_mime_type`/`responseMimeType` configuration property (see googleapis/nodejs-vertexai#306).

I've tested this against both pro and flash using `testapps/json-mode/index.ts`, and get the following output. (JSON mode looks good; the jokes, not so much.)