Add real TEI embedding client for semantic tool discovery #3844
aponcedeleonch wants to merge 1 commit into main
Codecov Report
@@            Coverage Diff             @@
##             main    #3844      +/-   ##
==========================================
- Coverage   66.75%   66.73%   -0.03%
==========================================
  Files         444      445       +1
  Lines       44065    44130      +65
==========================================
+ Hits        29415    29448      +33
- Misses      12343    12370      +27
- Partials     2307     2312       +5
Introduce a TEIClient that calls the HuggingFace Text Embeddings Inference HTTP API for vector embeddings, replacing the need for fake/deterministic embeddings in production. The store factory now selects between a real TEI client (when a service URL is configured), a fake client (for testing), or FTS5-only search.

Wire the full OptimizerConfig through the server instead of a boolean flag, so the embedding service URL and timeout are available when creating the store and client.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
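As a rough illustration of the HTTP call described in the commit message, here is a minimal sketch of a client that posts to a TEI-style /embed endpoint. The names teiClient, newTEIClient, embed, and embedAgainstStub are hypothetical and are not taken from the PR; only the request body fields and the 30s default timeout come from the diff. The sketch assumes the endpoint answers with a JSON array holding one float vector per input, and it talks to a local stub server so that it runs without a real TEI deployment.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// embedRequest mirrors the JSON body shown in this PR's diff.
type embedRequest struct {
	Inputs   []string `json:"inputs"`
	Truncate bool     `json:"truncate"`
}

// teiClient is a hypothetical stand-in for the PR's TEIClient.
type teiClient struct {
	baseURL string
	http    *http.Client
}

// newTEIClient applies the 30s default mentioned in the PR's doc comment.
func newTEIClient(baseURL string, timeout time.Duration) *teiClient {
	if timeout <= 0 {
		timeout = 30 * time.Second
	}
	return &teiClient{baseURL: baseURL, http: &http.Client{Timeout: timeout}}
}

// embed posts the inputs to <baseURL>/embed and decodes one vector per input.
// Assumption: the response body is a JSON array of float arrays.
func (c *teiClient) embed(ctx context.Context, inputs []string) ([][]float32, error) {
	body, err := json.Marshal(embedRequest{Inputs: inputs, Truncate: true})
	if err != nil {
		return nil, fmt.Errorf("marshal embed request: %w", err)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.baseURL+"/embed", bytes.NewReader(body))
	if err != nil {
		return nil, fmt.Errorf("build embed request: %w", err)
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := c.http.Do(req)
	if err != nil {
		return nil, fmt.Errorf("call embed endpoint: %w", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("embed endpoint returned status %d", resp.StatusCode)
	}
	var vectors [][]float32
	if err := json.NewDecoder(resp.Body).Decode(&vectors); err != nil {
		return nil, fmt.Errorf("decode embed response: %w", err)
	}
	if len(vectors) != len(inputs) {
		return nil, fmt.Errorf("expected %d embeddings, got %d", len(inputs), len(vectors))
	}
	return vectors, nil
}

// embedAgainstStub runs the client against a local stub server that returns
// a two-element vector per input (the input's byte length, then 1).
func embedAgainstStub(inputs []string) ([][]float32, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req embedRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		out := make([][]float32, len(req.Inputs))
		for i, in := range req.Inputs {
			out[i] = []float32{float32(len(in)), 1}
		}
		_ = json.NewEncoder(w).Encode(out)
	}))
	defer srv.Close()
	return newTEIClient(srv.URL, time.Second).embed(context.Background(), inputs)
}

func main() {
	vecs, err := embedAgainstStub([]string{"list files", "send email"})
	fmt.Println(vecs, err)
}
```

Checking that the number of returned vectors matches the number of inputs is a cheap guard against a misconfigured endpoint; whether the PR's client performs the same check is not visible in this excerpt.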
The PR expects the |
    // Timeout is the HTTP request timeout. Defaults to 30s.
    Timeout time.Duration
}
Suggestion: delete. This contains the same information as config.OptimizerConfig, so either pass that through to NewTEIClient or pass the two arguments, url and timeout, without a struct.
}

// NewTEIClient creates a new TEI embedding client that calls the specified endpoint.
func NewTEIClient(cfg TEIClientConfig) (*TEIClient, error) {
Can we make TEIClient private?
I think we can expose fewer public details from internal/similarity by having a single NewEmbeddingClient(optimizerConfig) (types.EmbeddingClient, error) function. Then store.go doesn't need to know which client is being constructed; that becomes an implementation detail of the package.
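A minimal sketch of the factory suggested above, under several assumptions: the EmbeddingClient interface and its Embed method stand in for types.EmbeddingClient (whose real method set is not shown here), OptimizerConfig copies only the three fields visible in the diff, and teiClient and fakeClient are hypothetical private implementations. Returning a nil client with a nil error is used here to signal FTS5-only search; the PR may signal that differently.

```go
package main

import (
	"context"
	"fmt"
	"net/url"
	"time"
)

// EmbeddingClient stands in for types.EmbeddingClient (method set assumed).
type EmbeddingClient interface {
	Embed(ctx context.Context, text string) ([]float32, error)
}

// OptimizerConfig copies only the fields that appear in the PR's diff.
type OptimizerConfig struct {
	EmbeddingServiceURL     string
	EmbeddingServiceTimeout time.Duration
	EmbeddingDimension      int
}

// teiClient is a private placeholder; the HTTP call is omitted in this sketch.
type teiClient struct {
	baseURL string
	timeout time.Duration
}

func (c *teiClient) Embed(context.Context, string) ([]float32, error) {
	return nil, fmt.Errorf("HTTP call to %s omitted in this sketch", c.baseURL)
}

// fakeClient returns a deterministic vector derived from the input bytes.
type fakeClient struct{ dim int }

func (c *fakeClient) Embed(_ context.Context, text string) ([]float32, error) {
	vec := make([]float32, c.dim)
	for i := 0; i < len(text); i++ {
		vec[i%c.dim] += float32(text[i])
	}
	return vec, nil
}

// NewEmbeddingClient picks an implementation from the config so that callers
// never name a concrete client type. A nil client with a nil error means no
// embedding backend is configured (FTS5-only search).
func NewEmbeddingClient(cfg *OptimizerConfig) (EmbeddingClient, error) {
	if cfg == nil {
		return nil, nil
	}
	switch {
	case cfg.EmbeddingServiceURL != "":
		u, err := url.Parse(cfg.EmbeddingServiceURL)
		if err != nil || u.Scheme == "" || u.Host == "" {
			return nil, fmt.Errorf("invalid embedding service URL %q", cfg.EmbeddingServiceURL)
		}
		timeout := cfg.EmbeddingServiceTimeout
		if timeout <= 0 {
			timeout = 30 * time.Second // default from the PR's doc comment
		}
		return &teiClient{baseURL: cfg.EmbeddingServiceURL, timeout: timeout}, nil
	case cfg.EmbeddingDimension > 0:
		return &fakeClient{dim: cfg.EmbeddingDimension}, nil
	default:
		return nil, nil
	}
}

func main() {
	for _, cfg := range []*OptimizerConfig{
		{EmbeddingServiceURL: "http://localhost:8080"},
		{EmbeddingDimension: 4},
		nil,
	} {
		client, err := NewEmbeddingClient(cfg)
		fmt.Printf("%T %v\n", client, err)
	}
}
```

With this shape, store.go would only hold an EmbeddingClient value and check it for nil, and both concrete types could stay unexported.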
// embedRequest is the JSON body sent to the TEI /embed endpoint.
type embedRequest struct {
    Inputs []string `json:"inputs"`
    Truncate bool `json:"truncate"`
nit: can you add a comment explaining what truncate does?
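One possible shape for the requested comment, as a sketch. The wording rests on my understanding of TEI's behaviour, which should be checked against the TEI API documentation: when truncate is true the server cuts inputs that exceed the model's maximum token length, and when it is false such inputs are rejected with an error. The helper encodeEmbedRequest is hypothetical and exists only to show the resulting JSON body.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embedRequest is the JSON body sent to the TEI /embed endpoint.
type embedRequest struct {
	// Inputs holds the texts to embed; one vector is expected per entry.
	Inputs []string `json:"inputs"`
	// Truncate asks the server to cut inputs that exceed the model's maximum
	// token length instead of rejecting the request (assumed TEI behaviour).
	Truncate bool `json:"truncate"`
}

// encodeEmbedRequest is a hypothetical helper that returns the request body
// as a string so the wire format is easy to inspect.
func encodeEmbedRequest(inputs []string, truncate bool) (string, error) {
	b, err := json.Marshal(embedRequest{Inputs: inputs, Truncate: truncate})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := encodeEmbedRequest([]string{"search the web"}, true)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(body)
}
```

For tool descriptions, which can be long, sending truncate as true trades a small loss of trailing text for not failing the whole batch.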
if cfg != nil {
    switch {
    case cfg.EmbeddingServiceURL != "":
        var err error
        embClient, err = similarity.NewTEIClient(similarity.TEIClientConfig{
            BaseURL: cfg.EmbeddingServiceURL,
            Timeout: cfg.EmbeddingServiceTimeout,
        })
        if err != nil {
            return nil, fmt.Errorf("failed to create TEI embedding client: %w", err)
        }
    case cfg.EmbeddingDimension > 0:
        embClient = similarity.NewFakeEmbeddingClient(cfg.EmbeddingDimension)
    }
As mentioned above, I think we should push the client construction logic into internal/similarity.
Closes: #3733
Depends on: #3839