feat(embedding): add Voyage AI embedding provider #67

Merged
jack-arturo merged 6 commits into verygoodplugins:main from jescalan:feat/voyage-embedding-provider on Feb 16, 2026

Conversation

@jescalan

Summary

Add Voyage AI as an embedding provider option, supporting the Voyage 4 family (voyage-4, voyage-4-large, voyage-4-lite) with configurable dimensions.

Changes

  • New VoyageEmbeddingProvider class with batch support (128/batch), retry logic, and rate limit handling
  • Added voyage as explicit EMBEDDING_PROVIDER option
  • Voyage is first in auto-selection priority when VOYAGE_API_KEY is set
  • Configurable via VOYAGE_API_KEY, VOYAGE_MODEL env vars
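
The batch support mentioned in the list above can be illustrated with a small chunking helper. This is a minimal sketch, not the code from this PR: the name `chunk_texts` and the constant `BATCH_SIZE` are hypothetical, and only the figure of 128 items per batch comes from the summary.

```python
from typing import Iterator, List, Sequence

BATCH_SIZE = 128  # batch size stated in the PR summary


def chunk_texts(texts: Sequence[str], batch_size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `batch_size` texts (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield list(texts[start : start + batch_size])
```

Each yielded list would then be sent as one embedding request, so 300 texts would produce three requests under this scheme.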

Why Voyage?

  • High-quality embeddings (approaching voyage-3-large quality)
  • Generous free tier
  • Shared embedding space across the voyage-4 family (can upgrade models without re-embedding)
  • Complements the existing OpenAI and Ollama providers

Add Voyage AI as an embedding provider option, supporting the Voyage 4
family (voyage-4, voyage-4-large, voyage-4-lite) with configurable
dimensions (256, 512, 1024, 2048).

- New VoyageEmbeddingProvider class with batch support and retry logic
- Added 'voyage' as explicit EMBEDDING_PROVIDER option
- Voyage is first in auto-selection priority (generous free tier, high quality)
- Configurable via VOYAGE_API_KEY, VOYAGE_MODEL env vars
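
The dimension constraint in the commit message above could be enforced with a check like the following. This is a sketch under assumptions: the function name `validate_dimension` is hypothetical, and it assumes the configured vector size arrives as a string (for example from a `VECTOR_SIZE` environment variable) and must be one of the four listed values.

```python
SUPPORTED_DIMENSIONS = (256, 512, 1024, 2048)  # values listed in the commit message


def validate_dimension(raw_value: str) -> int:
    """Parse a VECTOR_SIZE-style string and check it against the supported set."""
    try:
        dimension = int(raw_value)
    except ValueError:
        raise ValueError(f"VECTOR_SIZE must be an integer, got {raw_value!r}") from None
    if dimension not in SUPPORTED_DIMENSIONS:
        allowed = ", ".join(str(d) for d in SUPPORTED_DIMENSIONS)
        raise ValueError(f"Unsupported dimension {dimension}; expected one of: {allowed}")
    return dimension
```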
@coderabbitai

coderabbitai Bot commented Feb 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds a new Voyage embedding backend and integrates it into provider selection, initialization, and docs/tests. Introduces VoyageEmbeddingProvider (HTTP client, retries, validation), updates auto/explicit EMBEDDING_PROVIDER logic to prefer Voyage, and adds related docs, tests, and httpx dependency.

Changes

| Cohort / File(s) | Summary |
|---|---|
| App init / provider selection: `app.py` | Adds explicit voyage branch and updates auto selection to prefer Voyage → OpenAI → Ollama → fastembed → placeholder; ensures VECTOR_SIZE alignment, adjusts help/warning text and fallback semantics. |
| Voyage embedding implementation: `automem/embedding/voyage.py` | New VoyageEmbeddingProvider class: httpx client, request/response validation, retry/backoff, batch support, dimension checks, provider_name/close/context manager. |
| Tests: `tests/test_embedding_providers.py` | Extensive tests for Voyage provider: init headers/timeout, single/batch embeddings, response validation, retry/backoff behavior, and provider-selection scenarios (explicit and auto). |
| Dependency: `requirements.txt` | Adds httpx>=0.27.0 dependency for Voyage HTTP client usage. |
| Docs & install notes: `INSTALLATION.md`, `README.md`, `docs/ENVIRONMENT_VARIABLES.md` | Introduce VOYAGE_API_KEY / VOYAGE_MODEL env vars, document supported Voyage model dimensions and VECTOR_SIZE guidance, add voyage to EMBEDDING_PROVIDER options and update fallback/behavior descriptions. |
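
The auto-selection order described for `app.py` (Voyage → OpenAI → Ollama → fastembed → placeholder) amounts to trying candidates in priority order and keeping the first one that initializes. The sketch below illustrates that idea only; `pick_provider`, `AUTO_ORDER`, and the factory mapping are hypothetical names and do not reflect the actual structure of `app.py`.

```python
from typing import Callable, List, Mapping, Sequence, Tuple

# Priority order taken from the walkthrough; the names are illustrative labels.
AUTO_ORDER = ("voyage", "openai", "ollama", "fastembed", "placeholder")


def pick_provider(
    factories: Mapping[str, Callable[[], object]],
    order: Sequence[str] = AUTO_ORDER,
) -> Tuple[str, object]:
    """Return the first provider in `order` whose factory succeeds."""
    errors: List[str] = []
    for name in order:
        factory = factories.get(name)
        if factory is None:
            continue  # provider not configured (for example, no API key set)
        try:
            return name, factory()
        except Exception as exc:  # initialization failed, fall through to the next one
            errors.append(f"{name}: {exc}")
    raise RuntimeError("No embedding provider could be initialized: " + "; ".join(errors))
```

In this sketch a provider is simply left out of `factories` when its credentials are missing, so a configured but failing Voyage setup falls through to the next candidate instead of aborting startup.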

Sequence Diagram(s)

sequenceDiagram
  participant App
  participant VoyageProvider as VoyageEmbeddingProvider
  participant VoyageAPI as Voyage API
  participant OpenAIProvider as OpenAI Provider
  participant Fallback as Other Providers

  App->>App: read EMBEDDING_PROVIDER, VOYAGE_API_KEY, OPENAI_API_KEY
  alt EMBEDDING_PROVIDER == "voyage" or auto prefers voyage
    App->>VoyageProvider: instantiate(api_key, model, dimension)
    VoyageProvider->>VoyageAPI: POST /v1/embeddings (inputs, model, output_dimension)
    VoyageAPI-->>VoyageProvider: 200 + embeddings
    VoyageProvider-->>App: initialized (provider_name)
  else Voyage init fails
    App->>OpenAIProvider: attempt instantiate (if key present)
    OpenAIProvider-->>App: initialized or fails
    alt all providers fail
      App->>Fallback: use fastembed/placeholder
      Fallback-->>App: provider initialized
    end
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (9 files):

⚔️ .env.example (content)
⚔️ CLAUDE.md (content)
⚔️ INSTALLATION.md (content)
⚔️ README.md (content)
⚔️ app.py (content)
⚔️ automem/embedding/openai.py (content)
⚔️ docs/ENVIRONMENT_VARIABLES.md (content)
⚔️ requirements.txt (content)
⚔️ tests/test_embedding_providers.py (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: adding Voyage AI as a new embedding provider to the system. |
| Description check | ✅ Passed | The description clearly relates to the changeset, explaining the Voyage provider implementation, configuration options, and rationale. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 90.91%, which is sufficient. The required threshold is 80.00%. |



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@automem/embedding/voyage.py`:
- Around line 109-121: The code currently assumes response.json() contains a
"data" list and indexes into data["data"], which raises an unhelpful KeyError on
unexpected API responses; update the method that calls response.json() and
builds embeddings (the block that assigns embeddings = [item["embedding"] for
item in data["data"]]) to first validate the JSON structure: verify that the
top-level object is a dict, contains a "data" key that is a list, and that each
item is a dict with an "embedding" key that is a list of numbers; if the API
returned an error shape (e.g., an "error" key) or the structure is invalid,
raise a clear, non-retryable exception (e.g., ValueError with a message
including response.text, response.status_code, self.model) instead of letting a
KeyError propagate so retry logic won't waste attempts on permanent failures;
keep the subsequent dimension check against self._dimension and include index i
and self.model in the error messages.
- Around line 123-137: The except block handling httpx.HTTPStatusError should
treat transient server errors (5xx) like 429 as retryable: in the except
httpx.HTTPStatusError as e handler, inspect e.response.status_code and if it's
429 or in the 500–599 range, set last_error = e, log the
rate-limited/temporary-server-error with logger.warning including
attempt/self.max_retries, perform exponential backoff sleep (use time.sleep(2 **
attempt)) and continue when attempt < self.max_retries; otherwise re-raise the
exception. Move the local import time out of the handler and add a module-level
import time at the top of the file. Ensure you reference the existing last_error
variable, self.max_retries, and logger in this change.
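
The retry policy requested in the second item above (treat 429 and 5xx as transient, back off exponentially, re-raise everything else) can be sketched without any HTTP library. The `StatusError` class below is a hypothetical stand-in for `httpx.HTTPStatusError`, and `call_with_retries` is an illustrative helper, not the method in `voyage.py`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class StatusError(Exception):
    """Hypothetical stand-in for httpx.HTTPStatusError, carrying only a status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_retryable(status_code: int) -> bool:
    """429 (rate limited) and any 5xx response are treated as transient."""
    return status_code == 429 or 500 <= status_code <= 599


def call_with_retries(
    send: Callable[[], T],
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `send`, retrying transient failures with exponential backoff."""
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    attempt = 0
    while True:
        try:
            return send()
        except StatusError as exc:
            if not is_retryable(exc.status_code) or attempt >= max_retries:
                raise  # permanent failure, or retries exhausted
            sleep(2 ** attempt)  # waits 1, 2, 4, ... seconds between attempts
            attempt += 1
```

Passing `sleep` as a parameter is a design choice of this sketch: it lets tests record the backoff schedule instead of actually waiting.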
🧹 Nitpick comments (1)
automem/embedding/voyage.py (1)

204-207: Consider adding an explicit close() method alongside __del__.

Relying solely on __del__ for resource cleanup is fragile — it's not guaranteed to run in all interpreter scenarios. An explicit close() method would allow deterministic cleanup and make the class usable as a context manager in the future.

Suggested addition
+    def close(self) -> None:
+        """Close the HTTP client."""
+        if hasattr(self, "client"):
+            self.client.close()
+
     def __del__(self) -> None:
         """Clean up HTTP client."""
-        if hasattr(self, 'client'):
-            self.client.close()
+        self.close()

…ovider

- Add defensive validation for API response structure before accessing data['data']
- Check for 'data' key existence and validate it's a list, raise ValueError instead of KeyError
- Retry 5xx server errors in addition to 429 rate limit errors
- Move import time to module level for better organization

Addresses CodeRabbit review comments.
@coderabbitai coderabbitai Bot added the codex and enhancement labels Feb 10, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@automem/embedding/voyage.py`:
- Line 118: The list comprehension that sets embeddings = [item["embedding"] for
item in data["data"]] can raise a KeyError if any item lacks "embedding"; update
the code that handles data["data"] to validate each item (e.g., in the loop or
via a safe comprehension) by checking item is a dict, contains the "embedding"
key and that it's the expected type/shape, then either collect valid embeddings
or raise a clear ValueError indicating which item/index is malformed; reference
the variable embeddings, the source data["data"], and the items in that iterable
(e.g., the item variable used in the comprehension) when making the checks.
- Around line 130-144: The two overly long log lines in the Voyage request retry
block should be split so no source line exceeds 100 chars: in the
httpx.HTTPStatusError handler adjust the logger.warning call that currently
references logger.warning("Voyage %s (status %d), attempt %d/%d", error_type,
e.response.status_code, attempt + 1, self.max_retries + 1) by breaking the
formatted message across multiple source lines (e.g., build the format string or
assign it to a short-named variable before calling logger.warning) while
preserving the same placeholders and arguments (error_type,
e.response.status_code, attempt + 1, self.max_retries + 1); do the same for the
generic Exception handler's logger.warning("Voyage request failed, attempt
%d/%d: %s", attempt + 1, self.max_retries + 1, e) so both logger.warning
invocations remain functionally identical but each source line stays under 100
characters (refer to httpx.HTTPStatusError, logger.warning,
e.response.status_code, attempt, and self.max_retries to locate the calls).
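
The per-item validation asked for in the first item above might look roughly like this. It is a sketch under assumptions: `extract_embeddings` is a hypothetical standalone function operating on an already-decoded JSON payload, whereas the real provider performs these checks inside its request method.

```python
from typing import Any, List


def extract_embeddings(payload: Any, dimension: int, model: str) -> List[List[float]]:
    """Validate a decoded response and return its embeddings, or raise ValueError."""
    if not isinstance(payload, dict) or not isinstance(payload.get("data"), list):
        raise ValueError(f"Unexpected response shape from model {model}: {payload!r}")
    embeddings: List[List[float]] = []
    for i, item in enumerate(payload["data"]):
        # Each item must be a dict carrying a list under the "embedding" key.
        if not isinstance(item, dict) or not isinstance(item.get("embedding"), list):
            raise ValueError(f"Item {i} from model {model} has no 'embedding' list")
        vector = item["embedding"]
        if len(vector) != dimension:
            raise ValueError(
                f"Item {i} from model {model} has {len(vector)} dimensions, "
                f"expected {dimension}"
            )
        embeddings.append(vector)
    return embeddings
```

Raising `ValueError` with the item index gives a clearer failure than the `KeyError` the bare list comprehension would produce.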
🧹 Nitpick comments (1)
automem/embedding/voyage.py (1)

211-214: __del__ is not guaranteed to run; consider an explicit close() method.

Python's __del__ can be skipped during interpreter shutdown or in reference cycles. For long-lived providers this is usually fine, but exposing a close() method (and optionally implementing __enter__/__exit__) makes resource management explicit and testable.
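
An explicit `close()` with context-manager support, as suggested here, could be sketched as follows. `ManagedProvider` is a hypothetical class that wraps any object exposing a `close()` method; it is not the `VoyageEmbeddingProvider` from this PR.

```python
from typing import Any, Optional


class ManagedProvider:
    """Illustrative wrapper showing deterministic cleanup of an HTTP-style client."""

    def __init__(self, client: Any) -> None:
        self.client: Optional[Any] = client

    def close(self) -> None:
        """Close the client once; later calls are no-ops."""
        client = getattr(self, "client", None)
        if client is not None:
            client.close()
            self.client = None

    def __enter__(self) -> "ManagedProvider":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()  # runs even if the with-block raised
```

Used as `with ManagedProvider(client) as provider: ...`, the client is closed when the block exits, without relying on `__del__`.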


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@automem/embedding/voyage.py`:
- Around line 105-164: The response-validation ValueError exceptions raised in
the embedding call (from the validation checks that inspect data["data"],
item["embedding"], and embedding lengths in the method that posts to
VOYAGE_API_URL using self.client.post and self._dimension) should not be
retried; add an explicit except ValueError handler before the broad except
Exception that immediately re-raises the ValueError (or otherwise lets it
propagate) so the deterministic structural/dimension errors are surfaced instead
of being retried by the retry loop using self.max_retries.
- Line 8: Add the production dependency for httpx by adding "httpx>=0.27.0" to
requirements.txt (not just requirements-dev.txt) so the import in
automem/embedding/voyage.py succeeds at runtime; locate the import statement in
the voyage.py module to confirm the provider is covered and update the main
requirements.txt to include the required version constraint.
- Around line 118-135: After collecting and validating each embedding in the
response inside voyage.py, add a check that len(embeddings) equals the number of
input texts sent (the expected count used by generate_embedding /
generate_embeddings_batch); if they differ, raise a ValueError that includes the
expected count, actual count, and model name to fail fast and avoid downstream
IndexError or corrupted mappings. Place this check immediately after the
"Validate dimensions" loop (i.e., after the embeddings list is built and
validated) and reference the same request input length variable used when
calling the API in the generate_embedding / generate_embeddings_batch flow.
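
Two of the points above, surfacing `ValueError` immediately and checking that the number of embeddings matches the number of inputs, can be combined in one retry loop. The sketch below assumes a hypothetical `fetch` callable that takes the texts and returns a list of vectors; the real provider's method names and structure may differ.

```python
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def embed_with_retries(
    fetch: Callable[[Sequence[str]], List[Vector]],
    texts: Sequence[str],
    max_retries: int = 2,
) -> List[Vector]:
    """Call `fetch`, retrying generic failures but never validation errors."""
    last_error: Optional[Exception] = None
    for _ in range(max_retries + 1):
        try:
            embeddings = fetch(texts)
            if len(embeddings) != len(texts):
                raise ValueError(
                    f"Expected {len(texts)} embeddings, got {len(embeddings)}"
                )
            return embeddings
        except ValueError:
            raise  # deterministic failure: retrying cannot change the outcome
        except Exception as exc:
            last_error = exc  # possibly transient, try again
    raise RuntimeError(f"Request failed after {max_retries + 1} attempts") from last_error
```

The `except ValueError` clause has to come before the broad `except Exception`, otherwise the validation error would be swallowed and retried.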
🧹 Nitpick comments (3)
automem/embedding/voyage.py (3)

73-79: Line 74 exceeds 100-character limit.

The format string on line 74 is ~102 characters wide. Black would flag this.

Proposed fix
         logger.info(
-            "Voyage embedding provider initialized (model=%s, dimensions=%d, timeout=%.1fs, retries=%d)",
+            "Voyage embedding provider initialized "
+            "(model=%s, dimensions=%d, timeout=%.1fs, retries=%d)",
             model,
             dimension,
             timeout,
             max_retries,
         )

As per coding guidelines, "maintain line length of 100 characters (enforced by Black)".


229-232: __del__ is not guaranteed to run; consider a context-manager or atexit hook.

CPython's __del__ can be skipped during interpreter shutdown or if there are reference cycles. For a provider that lives the entire app lifetime this is usually fine, but if you want to be defensive, atexit.register(self.client.close) in __init__ is more reliable.


29-71: Consider making api_key a private attribute.

self.api_key is publicly accessible on every instance. Since it's only needed to construct the httpx.Client headers (line 68) and not referenced again, storing it as self._api_key (or not storing it at all) reduces the surface for accidental secret exposure (e.g., via serialization or debugging).


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
automem/embedding/voyage.py (1)

261-274: Consider guarding __del__ against interpreter shutdown errors.

During interpreter shutdown, httpx or self.client may already be finalized, causing __del__ to throw. A try/except is a common defensive pattern here.

Proposed fix
     def __del__(self):
         """Clean up HTTP client."""
-        self.close()
+        try:
+            self.close()
+        except Exception:
+            pass

# Conflicts:
#	INSTALLATION.md
#	docs/ENVIRONMENT_VARIABLES.md
@jack-arturo jack-arturo merged commit 1bafc44 into verygoodplugins:main Feb 16, 2026
7 checks passed
jack-arturo added a commit that referenced this pull request Feb 16, 2026
🤖 I have created a release *beep* *boop*
---


## [0.11.0](v0.10.1...v0.11.0) (2026-02-16)


### Features

* **api:** add GET /memory/<id> endpoint ([9625580](9625580))
* **api:** add GET /memory/<id> endpoint ([#68](#68)) ([205bdbe](205bdbe))
* **embedding:** add Voyage AI embedding provider ([a2ca403](a2ca403))
* **embedding:** add Voyage AI embedding provider ([#67](#67)) ([1bafc44](1bafc44))
* **embedding:** support OpenAI-compatible providers via OPENAI_BASE_URL ([#76](#76)) ([383a798](383a798)), closes [#70](#70)


### Bug Fixes

* add defensive validation and improve error handling in Voyage provider ([4c8433a](4c8433a))
* add httpx to prod deps, skip retry on ValueError, validate embedding count ([95c1e9d](95c1e9d))
* add missing metadata parsing and access tracking to GET /memory/<id> ([d115929](d115929))
* guard against bool values in numeric timestamp parsing ([2304c5c](2304c5c))
* **utils:** handle numeric timestamps and non-string values in time parser ([0bb222b](0bb222b))
* **utils:** handle numeric timestamps and non-string values in time parser ([#69](#69)) ([2ba2931](2ba2931))
* validate embedding items and fix line length ([4c0d278](4c0d278))


### Documentation

* **api:** include 500 error for GET /memory/{id} ([4b5a603](4b5a603))

---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Labels

codex enhancement New feature or request


2 participants