feat: add remote embedding via proxy with cache layer by samcm · Pull Request #79 · ethpandaops/panda

samcm · 2026-03-17T01:37:54Z

Move embedding computation from the local ONNX model to a remote API (OpenRouter) accessed through the proxy, with a cache layer to avoid redundant calls. The server checks embedding availability from the proxy's /datasources response and uses the remote embedder when available, falling back to local ONNX otherwise.

Extract an Embedder interface from the concrete ONNX struct, add a RemoteEmbedder that calls the proxy's new /embed endpoint, and add a generic cache package (memory + Redis) so the proxy can avoid redundant OpenRouter API calls.
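As a rough illustration of the interface extraction and the availability-based selection described above, here is a minimal sketch. The method signature, the `remoteEmbedder`/`localEmbedder` stand-in types, and `selectEmbedder` are hypothetical names chosen for this example; the actual types in pkg/embedding may look different.

```go
package main

import (
	"context"
	"errors"
)

// Embedder is an assumed shape for the extracted interface.
type Embedder interface {
	Embed(ctx context.Context, texts []string) ([][]float32, error)
}

// remoteEmbedder stands in for RemoteEmbedder, which would call the
// proxy's /embed endpoint. The HTTP call is omitted in this sketch.
type remoteEmbedder struct{ proxyURL string }

func (r *remoteEmbedder) Embed(ctx context.Context, texts []string) ([][]float32, error) {
	return nil, errors.New("remote embedding not implemented in this sketch")
}

// localEmbedder stands in for the ONNX-backed implementation.
type localEmbedder struct{}

func (l *localEmbedder) Embed(ctx context.Context, texts []string) ([][]float32, error) {
	return nil, errors.New("local embedding not implemented in this sketch")
}

// selectEmbedder returns the remote embedder when the proxy reported
// embedding as available (for example via its /datasources response),
// and the local one otherwise.
func selectEmbedder(proxyURL string, embeddingAvailable bool) Embedder {
	if embeddingAvailable {
		return &remoteEmbedder{proxyURL: proxyURL}
	}
	return &localEmbedder{}
}
```

Callers depend only on `Embedder`, so swapping or removing an implementation does not touch the code that consumes embeddings.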

Drop LocalEmbedder and the hugot/ONNX dependency entirely; embedding is handled by the proxy's remote service. Simplify searchruntime.Build to only use the remote embedder.

Add tests:
- pkg/cache: memory cache unit tests + Redis integration tests (testcontainers)
- pkg/embedding: RemoteEmbedder tests with mocked proxy endpoint
- pkg/proxy: EmbeddingService tests with mocked OpenRouter API
- pkg/resource: rewrite EIP index test to use a stub embedder

RemoteEmbedder now sends hashes to /embed/check first, then only sends text for uncached items to /embed. Single-item embeds skip the check and go directly to /embed.
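The two-phase flow can be sketched as follows. This is an illustration under several assumptions that the description does not confirm: that texts are hashed with SHA-256, that /embed/check returns the cached vectors keyed by hash, and that the two endpoints can be modelled as plain functions (`checkFunc` and `embedFunc` are hypothetical names).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
)

// checkFunc stands in for a call to /embed/check: given hashes, it returns
// the vectors that are already cached, keyed by hash (assumed behaviour).
type checkFunc func(hashes []string) (map[string][]float32, error)

// embedFunc stands in for a call to /embed: it returns one vector per text.
type embedFunc func(texts []string) ([][]float32, error)

// hashText returns a hex SHA-256 digest; the real hash scheme is an assumption.
func hashText(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// embedTwoPhase checks hashes first and sends text only for uncached items.
// A single item skips the check and goes straight to embed.
func embedTwoPhase(texts []string, check checkFunc, embed embedFunc) ([][]float32, error) {
	if len(texts) == 0 {
		return nil, nil
	}
	if len(texts) == 1 {
		return embed(texts)
	}
	hashes := make([]string, len(texts))
	for i, t := range texts {
		hashes[i] = hashText(t)
	}
	cached, err := check(hashes)
	if err != nil {
		return nil, err
	}
	out := make([][]float32, len(texts))
	var missIdx []int
	var missTexts []string
	for i, h := range hashes {
		if v, ok := cached[h]; ok {
			out[i] = v
			continue
		}
		missIdx = append(missIdx, i)
		missTexts = append(missTexts, texts[i])
	}
	if len(missTexts) == 0 {
		return out, nil
	}
	vecs, err := embed(missTexts)
	if err != nil {
		return nil, err
	}
	if len(vecs) != len(missTexts) {
		return nil, errors.New("embed returned an unexpected number of vectors")
	}
	// Place freshly computed vectors back at their original positions.
	for j, i := range missIdx {
		out[i] = vecs[j]
	}
	return out, nil
}
```

The check phase costs one extra round trip, which is why skipping it for a single item is a reasonable trade: the text payload saved would be small compared with the added latency.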

Cache:
- InMemoryCache and RedisCache now accept a TTL at construction time
- Embedding cache uses a 30-day TTL so orphaned entries expire
- Expired in-memory entries are skipped on read (lazy expiry)

Metrics (panda_proxy_embedding_*):
- embedding_requests_total{status}: OpenRouter API call count
- embedding_request_duration_seconds: API call latency
- embedding_tokens_total{type}: prompt/total token consumption
- embedding_cost_usd: cumulative estimated cost in USD
- embedding_items_total{source}: cache_hit vs cache_miss counts
- extractDatasourceType now recognizes /embed paths

Config:
- cost_per_token field on EmbeddingConfig (default $0.02/1M tokens)
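To make the TTL and lazy-expiry behaviour concrete, here is a minimal in-memory sketch. It is not the pkg/cache implementation: `memoryCache`, `newMemoryCache`, `fixedClock`, and the injectable `now` field are hypothetical, introduced so the expiry path can be exercised without sleeping.

```go
package main

import (
	"sync"
	"time"
)

// embeddingTTL mirrors the 30-day TTL mentioned for the embedding cache.
const embeddingTTL = 30 * 24 * time.Hour

type entry struct {
	value     []byte
	expiresAt time.Time
}

// memoryCache is a TTL cache whose expired entries are skipped on read.
type memoryCache struct {
	mu   sync.RWMutex
	ttl  time.Duration
	now  func() time.Time // injectable clock, useful in tests
	data map[string]entry
}

// newMemoryCache takes the TTL at construction time.
func newMemoryCache(ttl time.Duration) *memoryCache {
	return &memoryCache{ttl: ttl, now: time.Now, data: make(map[string]entry)}
}

// fixedClock returns a clock that always reports t.
func fixedClock(t time.Time) func() time.Time {
	return func() time.Time { return t }
}

// Set stores value under key with an expiry of now + ttl.
func (c *memoryCache) Set(key string, value []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = entry{value: value, expiresAt: c.now().Add(c.ttl)}
}

// Get returns the value unless it is missing or expired. Expired entries
// are not deleted here, only ignored (lazy expiry).
func (c *memoryCache) Get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.data[key]
	if !ok || !c.now().Before(e.expiresAt) {
		return nil, false
	}
	return e.value, true
}
```

Skipping on read keeps `Get` cheap and lock-friendly, at the cost of expired entries occupying memory until they are overwritten or swept by some other mechanism.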

When cost_per_token is not set in config, the EmbeddingService queries the API's /models endpoint at startup and extracts the prompt cost for the configured model. Falls back gracefully if the fetch fails.
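The startup lookup might look roughly like the sketch below, which covers only the parsing and fallback steps. The response shape (a `data` array whose items carry `id` and a string-valued `pricing.prompt`) is an assumption about the /models payload, and `promptCostPerToken` and `costOrFallback` are hypothetical helpers, not functions from the PR.

```go
package main

import (
	"encoding/json"
	"errors"
	"strconv"
)

// modelsResponse holds only the fields this sketch needs; the real
// /models payload is assumed, not confirmed, to have this shape.
type modelsResponse struct {
	Data []struct {
		ID      string `json:"id"`
		Pricing struct {
			Prompt string `json:"prompt"` // USD per prompt token, as a string
		} `json:"pricing"`
	} `json:"data"`
}

// promptCostPerToken extracts the prompt cost for the given model ID.
func promptCostPerToken(body []byte, model string) (float64, error) {
	var resp modelsResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, err
	}
	for _, m := range resp.Data {
		if m.ID == model {
			return strconv.ParseFloat(m.Pricing.Prompt, 64)
		}
	}
	return 0, errors.New("model not found in /models response")
}

// costOrFallback returns the fetched cost, or fallback when the lookup
// fails, so a bad response does not block startup.
func costOrFallback(body []byte, model string, fallback float64) float64 {
	cost, err := promptCostPerToken(body, model)
	if err != nil {
		return fallback
	}
	return cost
}
```

With a price of 0.00000002 USD per token, one million prompt tokens would be estimated at $0.02, which matches the default quoted for cost_per_token.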

samcm added 6 commits March 17, 2026 11:37

add /embed/check endpoint, two-phase embedding in RemoteEmbedder

8c566f7


fetch embedding cost-per-token from OpenRouter /models endpoint

37aa06d


remove cost_per_token config field, always fetch from API

a2efb35

samcm merged commit 88b6dc6 into master Mar 17, 2026
7 checks passed

samcm merged 6 commits into master from jolly-cow-748
