v0.4.0 — Vertex AI backend & task-typed embeddings

@dratner released this 18 May 02:14
cab919c

Google Vertex AI backend + task-typed embeddings (ADR-0009). The package now covers Morris's managed-AI posture — Claude and Gemini embeddings over Vertex/PSC, app-supplied auth, no static provider keys.

Added since v0.3.0

  • llms/providers/anthropic/anthropicvertex — Claude via Vertex AI as a separate leaf package so the base anthropic package stays Google-dependency-free; returns the same *anthropic.Client (all request/response/tool/cache translation + middleware reused).
  • anthropic.WithRequestOptions — low-level SDK escape hatch; API key optional when request options supply auth.
  • google.NewEmbeddings — Gemini/Vertex embeddings (gemini-embedding-001), order/ID-preserving, per-request dimension override.
  • llms.EmbeddingTask + EmbeddingInput.Title — provider-neutral, advisory; honored on Gemini, ignored on OpenAI (same pattern as CacheBreakpoint).
  • App-supplied credentials + PSC (Private Service Connect) endpoint/transport injection; no ADC (Application Default Credentials) discovery. The auth-vs-PSC-transport precedence is documented and covered by option-order tests.
  • Fail-closed truncation: genai cannot send autoTruncate:false, so AutoTruncate=true is Vertex-only and AutoTruncate=false requires a client-side MaxInputBytes guard. NewEmbeddings fails rather than looking safe while Vertex silently truncates.
  • gemini-embedding-001 is single-input: a multi-input request returns a typed bad_request — no fan-out, no hidden chunking exception (the app owns chunking).

Decisions of record

ADR-0009 (Vertex/PSC/Gemini-embeddings): the leaf-package shape, app-supplied-auth-only, the precedence rule, and the auto-truncate fail-closed refinement.

Out of scope (Morris / OpenTofu infrastructure)

Vertex API enablement, aiplatform IAM, PSC/restricted-API DNS, VPC-SC perimeter, egress lockdown, model/region config. Streaming and vLLM remain deliberately deferred.

Pre-1.0: v0.x minor versions may introduce breaking changes.