Switch Vertex AI provider to native genai SDK #242

Merged: nicoloboschi merged 1 commit into main from feat/vertexai-native-sdk on Jan 30, 2026

Conversation

@cdbartholomew
Contributor

Summary

  • Replace the OpenAI-compatible endpoint with the native google-genai SDK (genai.Client(vertexai=True)) for the vertexai LLM provider
  • Delete vertexai_token_refresher.py — the SDK handles credential refresh internally
  • Remove the 8192 max output token limitation that the OpenAI-compatible endpoint enforced
  • Strip markdown code fences in consolidation JSON parsing (Flash Lite wraps JSON in ```json blocks)
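The fence-stripping fix can be sketched as a small helper (the function name is an assumption for illustration; the actual implementation lives in consolidator.py and may differ):

```python
import json
import re


def parse_consolidation_json(raw: str) -> dict:
    """Parse a model reply that may be wrapped in ```json fences.

    Hypothetical helper illustrating the fix: Flash Lite sometimes wraps
    its JSON output in markdown code fences, which json.loads rejects,
    so the fences are stripped before parsing.
    """
    text = raw.strip()
    fenced = re.match(r"^```(?:json)?\s*\n(.*?)\n?```$", text, re.DOTALL)
    if fenced:
        text = fenced.group(1)
    return json.loads(text)
```

The helper accepts both fenced and bare JSON, so it is safe to apply to replies from models that do not add fences.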

Motivation

The vertexai provider originally used the Vertex AI OpenAI-compatible endpoint to reuse the AsyncOpenAI client code path. This required a custom TokenInjectingTransport, a background token refresher with async lifecycle management, and hit an undocumented 8192 max output token cap on the endpoint. The native genai SDK (already a dependency for the gemini provider) handles auth automatically and doesn't have the output token cap.

Changes

  • llm_wrapper.py: vertexai now creates genai.Client(vertexai=True, project=..., location=...) and routes through _call_gemini/_call_with_tools_gemini. Service account key auth preserved via credentials parameter. Model names with google/ prefix are auto-stripped.
  • vertexai_token_refresher.py: Deleted (no longer needed)
  • consolidator.py: Strip ``` code fences before JSON parsing
  • test_vertexai_provider.py: Rewritten for native SDK (mock genai.Client instead of token refresher)
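The model-name normalization mentioned above can be sketched as follows (the function name is an assumption; in the PR this logic is part of llm_wrapper.py):

```python
def normalize_vertex_model(name: str) -> str:
    # The OpenAI-compatible endpoint used prefixed names such as
    # "google/gemini-2.0-flash-lite", while the native genai SDK expects
    # the bare model name, so a leading "google/" is stripped.
    return name.removeprefix("google/")
```

Bare names pass through unchanged, so existing configurations keep working whether or not they carry the prefix.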

Test plan

  • pytest tests/test_vertexai_provider.py — 6 passed, 1 skipped (integration)
  • Local end-to-end: retain + recall + consolidation working with vertexai provider
  • Verified output tokens are no longer capped at 8192
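The rewritten test approach — mocking genai.Client instead of patching a token refresher — might look roughly like this (the wrapper function and its signature are assumptions for illustration; the real code routes through _call_gemini):

```python
from unittest.mock import MagicMock


def call_model(client, model: str, prompt: str) -> str:
    # Stand-in for the wrapper's _call_gemini path; real names may differ.
    response = client.models.generate_content(model=model, contents=prompt)
    return response.text


# Mock the genai.Client object directly; no background refresher lifecycle
# to manage, since the native SDK handles credentials internally.
mock_client = MagicMock()
mock_client.models.generate_content.return_value.text = "ok"

assert call_model(mock_client, "gemini-2.0-flash-lite", "hello") == "ok"
mock_client.models.generate_content.assert_called_once_with(
    model="gemini-2.0-flash-lite", contents="hello"
)
```

Because MagicMock auto-creates attributes, no real network or credential setup is needed in the unit tests.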

Replace the OpenAI-compatible endpoint approach with the native
google-genai SDK for Vertex AI. This eliminates the custom token
refresher, TokenInjectingTransport, and async lifecycle complexity
while also removing the 8192 output token cap that the OpenAI
endpoint enforced.

Changes:
- Use genai.Client(vertexai=True) in the vertexai provider instead of
  AsyncOpenAI with a token-injecting transport
- Route through the existing _call_gemini/_call_with_tools_gemini paths
- Strip the google/ prefix from model names (the native SDK uses bare
  names)
- Preserve service account key auth via the credentials parameter
- Delete vertexai_token_refresher.py (no longer needed)
- Strip markdown code fences in consolidator JSON parsing
- Rewrite the vertexai tests for the native SDK integration
@nicoloboschi merged commit 49ae55a into main on Jan 30, 2026
23 of 26 checks passed