Support for Google's Vertex AI #265
Hey @flexchar - Thanks for this PR! Vertex was a frequently requested integration in the community. We released some changes today related to code formatting, as well as changes to the provider API config structure. Can you please update your branch with the latest changes? Apart from the formatting changes, you will also have to make changes to
6a9a1de to 5d30f4c
It was so messy that I started out fresh and adapted the implementation. I've also introduced support for safety settings:
"safety_settings": [
  {
    "category": "HARM_CATEGORY_DANGEROUS_CONTENT",
    "threshold": "BLOCK_NONE"
  },
  {
    "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "threshold": "BLOCK_ONLY_HIGH"
  }
]
The Embedding API is not set up, since I don't have a payload or example to test with on the spot. It can be added down the road.
};
}

return {
You can replace this with the new functions that were released yesterday.
return generateInvalidProviderResponseError(response, VERTEX);
Please check some other provider integrations to get an idea about this.
  return `/models/${model}:generateContent`;
}
case 'stream-chatComplete': {
  return `/models/${model}:streamGenerateContent`;
Vertex allows sending ?alt=sse in the URL to get SSE instead of a JSON stream. I think that will make things easier for us. Should we make that change?
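To make the suggestion concrete, here is a minimal sketch of how the streaming endpoint could carry the `?alt=sse` query parameter. The function name `vertexEndpoint` and the `VertexFn` type are hypothetical and only mirror the `case` labels in the diff above; this is not the gateway's actual provider API config.

```typescript
// Hypothetical helper mirroring the switch in the diff above.
// Only the streaming route gets `?alt=sse`, so Vertex replies with
// server-sent events instead of a JSON array stream.
type VertexFn = 'chatComplete' | 'stream-chatComplete';

function vertexEndpoint(model: string, fn: VertexFn): string {
  switch (fn) {
    case 'chatComplete':
      return `/models/${model}:generateContent`;
    case 'stream-chatComplete':
      return `/models/${model}:streamGenerateContent?alt=sse`;
  }
}
```

With SSE, each event arrives as a `data: ...` line holding one complete JSON object, which should avoid having to re-assemble a partially delivered JSON array.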
Hey V, thanks for the review. Great catches. I'm having trouble getting streaming to work at all. This is the output I get:
It's extremely uninformative. Have you seen this before? I'm calling with |
This might be happening because, by default, Vertex sends a JSON stream. So you will have to add the provider here to convert the final response header to an event stream:
gateway/src/handlers/streamHandler.ts, lines 275 to 287 in 0454d41
Doesn't seem to change. I tried with I'm at a roadblock. I will try again later. If you have time to try it yourself, I'd appreciate it. I pushed the other updates. EDIT: It gets a bit more hopeful if I use I cannot explain where this
If you are using sse then you will have to handle the |
Hey @flexchar - Just checking up on this. Are there any blockers that you are facing for this PR?
Unfortunately I haven't had time to look since the last messages. It's the streaming that I need to figure out. I plan to take a stab at it again this weekend.
I'm back at it. I updated packages and began using Bun, which has superior error logging. It turns out that the chunk from Vertex AI is actually a string of multiple chunks...
{
"candidates": [
{
"content": {
"role": "model",
"parts": [
{
"text": "This is a test"
}
]
}
}
]
}
,
{
"candidates": [
{
"content": {
"role": "model",
"parts": [
{
"text": "."
}
]
},
"finishReason": "STOP",
"safetyRatings": [
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.08787644,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severityScore": 0.124425635
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.07821887,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severityScore": 0.050988145
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.17036992,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severityScore": 0.08787644
},
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE",
"probabilityScore": 0.034358688,
"severity": "HARM_SEVERITY_NEGLIGIBLE",
"severityScore": 0.06681233
}
]
}
],
"usageMetadata": {
"promptTokenCount": 5,
"candidatesTokenCount": 5,
"totalTokenCount": 10
}
}
This is why I would get a JSON.parse error, which was previously hidden by the way Wrangler/Node handles rejected promises (it would be worth looking into one day), in Expected a Response object
330 | chunk = chunk.trim();
331 | if (chunk === '[DONE]') {
332 | return `data: ${chunk}\n\n`;
333 | }
334 |
335 | let parsedChunk: GoogleGenerateContentResponse = JSON.parse(chunk);
^
SyntaxError: JSON Parse error: Unable to parse JSON string
at GoogleChatCompleteStreamChunkTransform (src/providers/google-vertex-ai/chatComplete.ts:335:52)
So now I will have to find out who is responsible for passing the
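For illustration, the failure above can be sidestepped by splitting the buffer into complete top-level JSON objects before calling `JSON.parse`. The sketch below is an assumption about one way to do that, not the fix that landed in this PR; `splitJsonObjects` is a hypothetical helper name. It tracks brace depth and string state, so braces inside string values are ignored, and it returns any unfinished tail so it can be prepended to the next chunk.

```typescript
// Hypothetical helper: extracts complete top-level JSON objects from a
// buffer that may hold several objects separated by commas/newlines
// (as in Vertex's default JSON array stream) plus an unfinished tail.
function splitJsonObjects(buffer: string): { objects: string[]; rest: string } {
  const objects: string[] = [];
  let depth = 0; // current nesting level of `{ }`
  let inString = false; // true while inside a JSON string literal
  let escaped = false; // true right after a backslash inside a string
  let start = -1; // index where the current top-level object began
  let consumed = 0; // index just past the last complete object

  for (let i = 0; i < buffer.length; i++) {
    const ch = buffer[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (depth > 0 && ch === '"') {
      inString = true;
    } else if (ch === '{') {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === '}' && depth > 0) {
      depth--;
      if (depth === 0) {
        objects.push(buffer.slice(start, i + 1));
        consumed = i + 1;
        start = -1;
      }
    }
  }
  // Anything after the last complete object is kept for the next chunk.
  return { objects, rest: buffer.slice(consumed) };
}
```

A stream transform could keep `rest` between calls, append each incoming chunk to it, and run `JSON.parse` only on the entries of `objects`.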
Got it! I tested the stream support using the official OpenAI library in TypeScript and Python.
It's wild to think that I wouldn't have solved it without Bun's help, which I have been a great fan of since last autumn. It helped me see the exact reply I got from GCP and how the
That being said, Bun closes the request too early and never returned a response, leading me to debug an issue that was never there in the first place. Switching back to Node/Wrangler got me to the working stage. So Bun wouldn't be ready to replace Node just yet. I would like to propose it for the future, because we could add tests using Bun as the test runner.
There was one more catch. I updated all packages to the latest versions to be sure I'm not dealing with a stale issue, and that turned out to be wise, because Visarg, you were right regarding
I also updated the
TL;DR:
Hey @flexchar - Awesome! I have reviewed the PR and it LGTM. I have added one minor comment. Once you address it, I will merge the PR. Please also fetch the latest changes from the main branch and resolve the merge conflicts. Thanks! Comment: #265 (comment)
Done 👍
Title:
Support for Vertex AI.
Vertex AI doesn't impose IP-based geolocation restrictions like Google AI Studio does, thus allowing use from within Europe.
Example config to be used with this provider:
Note
The API key for Google Cloud is typically short-lived (60 minutes) and can be retrieved in a variety of ways, including the client SDKs or the CLI:
gcloud auth application-default print-access-token
Motivation: (optional)
Related Issues: (optional)