-
Notifications
You must be signed in to change notification settings - Fork 3
feat(google): support streaming #114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds support for Google Vertex AI provider to the AI Gateway, enabling integration with Google's Gemini models through the Vertex AI API.
Key changes:
- Adds Google Vertex AI provider implementation with OAuth2 authentication
- Integrates query string parameters into gateway request handling
- Updates proxy-vcr to support Google Vertex API recording/replay
- Adds test coverage for Google Vertex provider with streaming and non-streaming modes
Reviewed Changes
Copilot reviewed 26 out of 29 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| gateway/src/providers/google/index.ts | Implements Google Vertex provider with URL parsing, authentication, and request routing |
| gateway/src/providers/google/auth.ts | Adds OAuth2 JWT-based authentication for GCP service accounts |
| gateway/src/api/google.ts | Implements request/response extractors for Google Gemini API format |
| gateway/src/index.ts | Adds query string preservation in gateway routing |
| gateway/src/providers/default.ts | Extracts streaming detection logic into overridable method |
| proxy-vcr/proxy_vcr/main.py | Adds Google Vertex proxy endpoint with query parameter support and custom cassette naming |
| gateway/test/providers/google.spec.ts | Adds test cases for Google Vertex provider |
| gateway/test/setup.ts | Mocks GCP OAuth2 token endpoint for tests |
| pyproject.toml | Adds "uE" to codespell ignore list |
| proxy-vcr/pyproject.toml | Adds google-auth dependency |
| gateway/package.json | Adds @streamparser/json-whatwg dependency |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // The group includes regions with hyphen like "europe-west4" | ||
| const match = url.match(/^https:\/\/(.+?)-aiplatform\.googleapis\.com$/) | ||
| return match?.[1] ?? null | ||
| const match = url.match(/^https:\/\/([^-]+)-aiplatform\.googleapis\.com$/) |
Copilot
AI
Nov 5, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex pattern ([^-]+) will not match regions with hyphens like 'europe-west4' or 'us-east1'. The pattern stops at the first hyphen, so 'https://europe-west4-aiplatform.googleapis.com' would only capture 'europe' instead of 'europe-west4'. The regex should be changed to ([^.]+) or a more specific pattern that matches valid GCP regions.
| const match = url.match(/^https:\/\/([^-]+)-aiplatform\.googleapis\.com$/) | |
| const match = url.match(/^https:\/\/([a-z0-9-]+)-aiplatform\.googleapis\.com$/) |
| response = await client.post(url, content=body, headers=headers) | ||
| elif request.url.path.startswith('/google-vertex'): | ||
| client = cast(httpx.AsyncClient, request.scope['state']['httpx_client']) | ||
| url = GOOGLE_BASE_URL + request.url.path[len('/google-vertex') :] + '?' + request.url.query |
Copilot
AI
Nov 5, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This URL construction will create malformed URLs when request.url.query is empty, resulting in a trailing '?' (e.g., 'https://example.com/path?'). Use conditional concatenation: url = GOOGLE_BASE_URL + request.url.path[len('/google-vertex'):] + (f'?{request.url.query}' if request.url.query else '')
| url = GOOGLE_BASE_URL + request.url.path[len('/google-vertex') :] + '?' + request.url.query | |
| url = GOOGLE_BASE_URL + request.url.path[len('/google-vertex') :] + (f'?{request.url.query}' if request.url.query else '') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can u fix this in another PR? @copilot
| // TODO(Marcelo): This is likely to be wrong, since we are not summing the usage. | ||
| this.extractedResponse.usage = this.extractUsage(chunk) |
Copilot
AI
Nov 5, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The TODO comment correctly identifies a bug: for streaming responses, usage from multiple chunks should be accumulated rather than overwritten. Each chunk may contain partial usage data that needs to be summed. Consider implementing accumulation logic similar to how token counts are typically aggregated across streaming chunks.
| // TODO(Marcelo): This is likely to be wrong, since we are not summing the usage. | |
| this.extractedResponse.usage = this.extractUsage(chunk) | |
| // Accumulate usage across streaming chunks. | |
| const newUsage = this.extractUsage(chunk); | |
| if (this.extractedResponse.usage) { | |
| // Sum numeric fields in usage objects. | |
| for (const key of Object.keys(newUsage)) { | |
| if ( | |
| typeof newUsage[key] === 'number' && | |
| typeof this.extractedResponse.usage[key] === 'number' | |
| ) { | |
| this.extractedResponse.usage[key] += newUsage[key]; | |
| } else { | |
| // For non-numeric fields, prefer the latest chunk's value. | |
| this.extractedResponse.usage[key] = newUsage[key]; | |
| } | |
| } | |
| } else { | |
| this.extractedResponse.usage = { ...newUsage }; | |
| } |
| client = cast(httpx.AsyncClient, request.scope['state']['httpx_client']) | ||
| url = GOOGLE_BASE_URL + request.url.path[len('/google-vertex') :] + '?' + request.url.query | ||
| headers = {'Authorization': auth_header, 'host': 'aiplatform.googleapis.com'} | ||
| # It's a bit weird, but if we don't set the host header, it will fail. This seems very weird from Google's side. |
Copilot
AI
Nov 5, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The comment could be more informative. Consider explaining why Google requires the host header (likely for virtual host routing or SNI validation) rather than just calling it 'weird'. For example: 'Google's API requires the exact host header for routing/validation, even when accessing via proxy.'
| # It's a bit weird, but if we don't set the host header, it will fail. This seems very weird from Google's side. | |
| # Google's API requires the exact host header for routing/validation (likely for virtual host routing or SNI validation), even when accessed via proxy. If we don't set it, requests will fail. |
No description provided.