feat(embedding): add Voyage AI embedding provider #67
Add Voyage AI as an embedding provider option, supporting the Voyage 4 family (voyage-4, voyage-4-large, voyage-4-lite) with configurable dimensions (256, 512, 1024, 2048).

- New VoyageEmbeddingProvider class with batch support and retry logic
- Added 'voyage' as explicit EMBEDDING_PROVIDER option
- Voyage is first in auto-selection priority (generous free tier, high quality)
- Configurable via VOYAGE_API_KEY, VOYAGE_MODEL env vars
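The batch support listed above can be pictured as slicing the input texts into fixed-size chunks before each request. The sketch below is only an illustration: `iter_batches` and `BATCH_SIZE` are hypothetical names, and the value 128 is taken from the figure quoted in this PR's summary, not from the provider's source.

```python
from typing import Iterator, List, Sequence

# Assumption: 128 texts per request, as quoted in this PR's summary.
BATCH_SIZE = 128


def iter_batches(texts: Sequence[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` texts, preserving order."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(texts), size):
        yield list(texts[start:start + size])
```

Keeping the chunks in input order means the embeddings returned for each batch can be concatenated back into a list aligned with the original texts.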
📝 Walkthrough
Adds a new Voyage embedding backend and integrates it into provider selection, initialization, and docs/tests.
Sequence Diagram(s)
sequenceDiagram
participant App
participant VoyageProvider as VoyageEmbeddingProvider
participant VoyageAPI as Voyage API
participant OpenAIProvider as OpenAI Provider
participant Fallback as Other Providers
App->>App: read EMBEDDING_PROVIDER, VOYAGE_API_KEY, OPENAI_API_KEY
alt EMBEDDING_PROVIDER == "voyage" or auto prefers voyage
App->>VoyageProvider: instantiate(api_key, model, dimension)
VoyageProvider->>VoyageAPI: POST /v1/embeddings (inputs, model, output_dimension)
VoyageAPI-->>VoyageProvider: 200 + embeddings
VoyageProvider-->>App: initialized (provider_name)
else Voyage init fails
App->>OpenAIProvider: attempt instantiate (if key present)
OpenAIProvider-->>App: initialized or fails
alt all providers fail
App->>Fallback: use fastembed/placeholder
Fallback-->>App: provider initialized
end
end
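The fallback order in the diagram above can be sketched roughly as follows. This is a simplified illustration, not the project's initialization code: `select_provider`, the `factories` mapping, and the `"fallback"` key are hypothetical, and each factory is assumed to return the name of the provider it initialized or raise on failure.

```python
from typing import Callable, Dict, List, Mapping

# Hypothetical: a factory initializes a provider and returns its name, or raises.
ProviderFactory = Callable[[], str]


def select_provider(env: Mapping[str, str], factories: Dict[str, ProviderFactory]) -> str:
    """Try Voyage, then OpenAI (if a key is present), then the local fallback."""
    requested = env.get("EMBEDDING_PROVIDER", "auto")
    candidates: List[str] = []
    if requested == "voyage" or (requested == "auto" and env.get("VOYAGE_API_KEY")):
        candidates.append("voyage")
    if env.get("OPENAI_API_KEY"):
        candidates.append("openai")
    candidates.append("fallback")  # fastembed/placeholder in the diagram

    for name in candidates:
        try:
            return factories[name]()
        except Exception:
            # Initialization failed: move on to the next candidate.
            continue
    raise RuntimeError("no embedding provider could be initialized")
```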
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@automem/embedding/voyage.py`:
- Around line 109-121: The code currently assumes response.json() contains a
"data" list and indexes into data["data"], which raises an unhelpful KeyError on
unexpected API responses; update the method that calls response.json() and
builds embeddings (the block that assigns embeddings = [item["embedding"] for
item in data["data"]]) to first validate the JSON structure: verify that the
top-level object is a dict, contains a "data" key that is a list, and that each
item is a dict with an "embedding" key that is a list of numbers; if the API
returned an error shape (e.g., an "error" key) or the structure is invalid,
raise a clear, non-retryable exception (e.g., ValueError with a message
including response.text, response.status_code, self.model) instead of letting a
KeyError propagate so retry logic won't waste attempts on permanent failures;
keep the subsequent dimension check against self._dimension and include index i
and self.model in the error messages.
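The validation requested above might look something like the sketch below. It is an assumption-laden illustration rather than the provider's actual code: `parse_embeddings` is a hypothetical helper that takes the already-decoded JSON payload, and the error-shape check assumes failures are reported under a top-level `"error"` key as the comment suggests.

```python
from typing import Any, List


def parse_embeddings(payload: Any, expected_dim: int, model: str) -> List[List[float]]:
    """Validate a decoded response body and return its embedding vectors."""
    if not isinstance(payload, dict):
        kind = type(payload).__name__
        raise ValueError(f"Voyage ({model}): expected a JSON object, got {kind}")
    if "error" in payload:
        raise ValueError(f"Voyage ({model}): API returned an error: {payload['error']!r}")
    items = payload.get("data")
    if not isinstance(items, list):
        raise ValueError(f"Voyage ({model}): response has no 'data' list")

    embeddings: List[List[float]] = []
    for i, item in enumerate(items):
        if not isinstance(item, dict) or not isinstance(item.get("embedding"), list):
            raise ValueError(f"Voyage ({model}): item {i} has no 'embedding' list")
        vector = item["embedding"]
        # bool is a subclass of int, so exclude it explicitly.
        if not all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector):
            raise ValueError(f"Voyage ({model}): item {i} contains non-numeric values")
        if len(vector) != expected_dim:
            raise ValueError(
                f"Voyage ({model}): item {i} has {len(vector)} dimensions, "
                f"expected {expected_dim}"
            )
        embeddings.append(vector)
    return embeddings
```

Raising `ValueError` for every structural problem gives the caller a single exception type to treat as permanent.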
- Around line 123-137: The except block handling httpx.HTTPStatusError should
treat transient server errors (5xx) like 429 as retryable: in the except
httpx.HTTPStatusError as e handler, inspect e.response.status_code and if it's
429 or in the 500–599 range, set last_error = e, log the
rate-limited/temporary-server-error with logger.warning including
attempt/self.max_retries, perform exponential backoff sleep (use time.sleep(2 **
attempt)) and continue when attempt < self.max_retries; otherwise re-raise the
exception. Move the local import time out of the handler and add a module-level
import time at the top of the file. Ensure you reference the existing last_error
variable, self.max_retries, and logger in this change.
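As a rough sketch of the retry policy described above, the loop below treats 429 and any 5xx status as transient and backs off exponentially. To stay self-contained it does not use httpx: `send` is a hypothetical callable returning a `(status_code, body)` pair, and `post_with_retries` and `is_retryable` are illustrative names, not functions from the PR.

```python
import time
from typing import Callable, Tuple


def is_retryable(status_code: int) -> bool:
    """429 (rate limited) and 5xx (server error) are treated as transient."""
    return status_code == 429 or 500 <= status_code <= 599


def post_with_retries(
    send: Callable[[], Tuple[int, str]],
    max_retries: int,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `send` up to max_retries + 1 times, backing off 1s, 2s, 4s, ..."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status < 400:
            return body
        if is_retryable(status) and attempt < max_retries:
            sleep(2 ** attempt)
            continue
        # Permanent failure, or retries exhausted.
        raise RuntimeError(f"request failed with status {status}")
    raise RuntimeError("max_retries must be non-negative")
```

Passing `sleep` as a parameter is a design choice for this sketch only; it lets a test record the delays instead of waiting.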
🧹 Nitpick comments (1)
automem/embedding/voyage.py (1)
204-207: Consider adding an explicit `close()` method alongside `__del__`.

Relying solely on `__del__` for resource cleanup is fragile: it's not guaranteed to run in all interpreter scenarios. An explicit `close()` method would allow deterministic cleanup and make the class usable as a context manager in the future.

Suggested addition

```diff
+    def close(self) -> None:
+        """Close the HTTP client."""
+        if hasattr(self, "client"):
+            self.client.close()
+
     def __del__(self) -> None:
         """Clean up HTTP client."""
-        if hasattr(self, 'client'):
-            self.client.close()
+        self.close()
```
…ovider

- Add defensive validation for API response structure before accessing data['data']
- Check for 'data' key existence and validate it's a list, raise ValueError instead of KeyError
- Retry 5xx server errors in addition to 429 rate limit errors
- Move import time to module level for better organization

Addresses CodeRabbit review comments.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@automem/embedding/voyage.py`:
- Line 118: The list comprehension that sets embeddings = [item["embedding"] for
item in data["data"]] can raise a KeyError if any item lacks "embedding"; update
the code that handles data["data"] to validate each item (e.g., in the loop or
via a safe comprehension) by checking item is a dict, contains the "embedding"
key and that it's the expected type/shape, then either collect valid embeddings
or raise a clear ValueError indicating which item/index is malformed; reference
the variable embeddings, the source data["data"], and the items in that iterable
(e.g., the item variable used in the comprehension) when making the checks.
- Around line 130-144: The two overly long log lines in the Voyage request retry
block should be split so no source line exceeds 100 chars: in the
httpx.HTTPStatusError handler adjust the logger.warning call that currently
references logger.warning("Voyage %s (status %d), attempt %d/%d", error_type,
e.response.status_code, attempt + 1, self.max_retries + 1) by breaking the
formatted message across multiple source lines (e.g., build the format string or
assign it to a short-named variable before calling logger.warning) while
preserving the same placeholders and arguments (error_type,
e.response.status_code, attempt + 1, self.max_retries + 1); do the same for the
generic Exception handler's logger.warning("Voyage request failed, attempt
%d/%d: %s", attempt + 1, self.max_retries + 1, e) so both logger.warning
invocations remain functionally identical but each source line stays under 100
characters (refer to httpx.HTTPStatusError, logger.warning,
e.response.status_code, attempt, and self.max_retries to locate the calls).
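One way to shorten such a call without changing its output is to rely on Python's implicit concatenation of adjacent string literals. The sketch below assumes a hypothetical wrapper `log_retry` and logger name; only the format string and arguments come from the comment above.

```python
import logging

logger = logging.getLogger("voyage_example")  # hypothetical logger name


def log_retry(error_type: str, status_code: int, attempt: int, max_retries: int) -> None:
    # Adjacent string literals are joined at compile time, so the logged
    # message is identical while each source line stays short.
    logger.warning(
        "Voyage %s (status %d), "
        "attempt %d/%d",
        error_type,
        status_code,
        attempt + 1,
        max_retries + 1,
    )
```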
🧹 Nitpick comments (1)
automem/embedding/voyage.py (1)
211-214: `__del__` is not guaranteed to run; consider an explicit `close()` method.

Python's `__del__` can be skipped during interpreter shutdown or in reference cycles. For long-lived providers this is usually fine, but exposing a `close()` method (and optionally implementing `__enter__`/`__exit__`) makes resource management explicit and testable.
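A minimal sketch of the explicit-cleanup pattern suggested here, assuming only that the wrapped client exposes a `close()` method. `ClosableProvider` is a hypothetical stand-in, not the PR's `VoyageEmbeddingProvider`.

```python
class ClosableProvider:
    """Hypothetical stand-in for a provider that owns an HTTP client."""

    def __init__(self, client) -> None:
        self.client = client
        self._closed = False

    def close(self) -> None:
        """Close the client once; later calls are no-ops."""
        if not self._closed:
            self._closed = True
            self.client.close()

    def __enter__(self) -> "ClosableProvider":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Runs deterministically when the with-block exits, even on error.
        self.close()
```

With this shape, `with ClosableProvider(client) as provider: ...` releases the client at the end of the block without depending on garbage collection.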
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@automem/embedding/voyage.py`:
- Around line 105-164: The response-validation ValueError exceptions raised in
the embedding call (from the validation checks that inspect data["data"],
item["embedding"], and embedding lengths in the method that posts to
VOYAGE_API_URL using self.client.post and self._dimension) should not be
retried; add an explicit except ValueError handler before the broad except
Exception that immediately re-raises the ValueError (or otherwise lets it
propagate) so the deterministic structural/dimension errors are surfaced instead
of being retried by the retry loop using self.max_retries.
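The handler ordering described above can be sketched as follows. `call_with_retry` is a hypothetical helper, not the PR's method; the point is only that `except ValueError` precedes the broad `except Exception`, so deterministic validation failures leave the loop immediately.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    max_retries: int,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry transient failures, but let ValueError propagate at once."""
    last_error: Optional[Exception] = None
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ValueError:
            # Structural or dimension errors are deterministic: do not retry.
            raise
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                sleep(2 ** attempt)
    raise RuntimeError(f"failed after {max_retries + 1} attempts") from last_error
```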
- Line 8: Add the production dependency for httpx by adding "httpx>=0.27.0" to
requirements.txt (not just requirements-dev.txt) so the import in
automem/embedding/voyage.py succeeds at runtime; locate the import statement in
the voyage.py module to confirm the provider is covered and update the main
requirements.txt to include the required version constraint.
- Around line 118-135: After collecting and validating each embedding in the
response inside voyage.py, add a check that len(embeddings) equals the number of
input texts sent (the expected count used by generate_embedding /
generate_embeddings_batch); if they differ, raise a ValueError that includes the
expected count, actual count, and model name to fail fast and avoid downstream
IndexError or corrupted mappings. Place this check immediately after the
"Validate dimensions" loop (i.e., after the embeddings list is built and
validated) and reference the same request input length variable used when
calling the API in the generate_embedding / generate_embeddings_batch flow.
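The count check requested above amounts to a single comparison. A minimal sketch, assuming a hypothetical helper `check_embedding_count` that receives the validated embeddings and the texts that were sent:

```python
from typing import List, Sequence


def check_embedding_count(
    embeddings: Sequence[List[float]], texts: Sequence[str], model: str
) -> None:
    """Fail fast if the API returned a different number of vectors than inputs."""
    if len(embeddings) != len(texts):
        raise ValueError(
            f"Voyage ({model}): expected {len(texts)} embeddings, got {len(embeddings)}"
        )
```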
🧹 Nitpick comments (3)
automem/embedding/voyage.py (3)
73-79: Line 74 exceeds 100-character limit.

The format string on line 74 is ~102 characters wide. Black would flag this.

Proposed fix

```diff
 logger.info(
-    "Voyage embedding provider initialized (model=%s, dimensions=%d, timeout=%.1fs, retries=%d)",
+    "Voyage embedding provider initialized "
+    "(model=%s, dimensions=%d, timeout=%.1fs, retries=%d)",
     model,
     dimension,
     timeout,
     max_retries,
 )
```

As per coding guidelines, "maintain line length of 100 characters (enforced by Black)".
229-232: `__del__` is not guaranteed to run; consider a context manager or `atexit` hook.

CPython's `__del__` can be skipped during interpreter shutdown or if there are reference cycles. For a provider that lives the entire app lifetime this is usually fine, but if you want to be defensive, `atexit.register(self.client.close)` in `__init__` is more reliable.
29-71: Consider making `api_key` a private attribute.

`self.api_key` is publicly accessible on every instance. Since it's only needed to construct the `httpx.Client` headers (line 68) and not referenced again, storing it as `self._api_key` (or not storing it at all) reduces the surface for accidental secret exposure (e.g., via serialization or debugging).
🧹 Nitpick comments (1)
automem/embedding/voyage.py (1)
261-274: Consider guarding `__del__` against interpreter shutdown errors.

During interpreter shutdown, `httpx` or `self.client` may already be finalized, causing `__del__` to throw. A `try`/`except` is a common defensive pattern here.

Proposed fix

```diff
 def __del__(self):
     """Clean up HTTP client."""
-    self.close()
+    try:
+        self.close()
+    except Exception:
+        pass
```
Conflicts:
- INSTALLATION.md
- docs/ENVIRONMENT_VARIABLES.md
🤖 I have created a release *beep* *boop*

---

## [0.11.0](v0.10.1...v0.11.0) (2026-02-16)

### Features

* **api:** add GET /memory/<id> endpoint ([9625580](9625580))
* **api:** add GET /memory/<id> endpoint ([#68](#68)) ([205bdbe](205bdbe))
* **embedding:** add Voyage AI embedding provider ([a2ca403](a2ca403))
* **embedding:** add Voyage AI embedding provider ([#67](#67)) ([1bafc44](1bafc44))
* **embedding:** support OpenAI-compatible providers via OPENAI_BASE_URL ([#76](#76)) ([383a798](383a798)), closes [#70](#70)

### Bug Fixes

* add defensive validation and improve error handling in Voyage provider ([4c8433a](4c8433a))
* add httpx to prod deps, skip retry on ValueError, validate embedding count ([95c1e9d](95c1e9d))
* add missing metadata parsing and access tracking to GET /memory/<id> ([d115929](d115929))
* guard against bool values in numeric timestamp parsing ([2304c5c](2304c5c))
* **utils:** handle numeric timestamps and non-string values in time parser ([0bb222b](0bb222b))
* **utils:** handle numeric timestamps and non-string values in time parser ([#69](#69)) ([2ba2931](2ba2931))
* validate embedding items and fix line length ([4c0d278](4c0d278))

### Documentation

* **api:** include 500 error for GET /memory/{id} ([4b5a603](4b5a603))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Summary
Add Voyage AI as an embedding provider option, supporting the Voyage 4 family (voyage-4, voyage-4-large, voyage-4-lite) with configurable dimensions.
Changes
- New `VoyageEmbeddingProvider` class with batch support (128/batch), retry logic, and rate limit handling
- Added `voyage` as explicit `EMBEDDING_PROVIDER` option
- First in auto-selection priority when `VOYAGE_API_KEY` is set
- Configurable via `VOYAGE_API_KEY`, `VOYAGE_MODEL` env vars

Why Voyage?