
Add real TEI embedding client for semantic tool discovery#3844

Draft
aponcedeleonch wants to merge 1 commit into main from issue-3733/pr2-real-tei-client

Conversation

@aponcedeleonch
Member

@aponcedeleonch aponcedeleonch commented Feb 17, 2026

Closes: #3733
Depends on: #3839

Introduce a TEIClient that calls the HuggingFace Text Embeddings Inference HTTP API for vector embeddings, replacing the need for fake/deterministic embeddings in production. The store factory now selects between a real TEI client (when a service URL is configured), a fake client (for testing), or FTS5-only search.

Wire the full OptimizerConfig through the server instead of a boolean flag, so the embedding service URL and timeout are available when creating the store and client.

@github-actions github-actions bot added the size/M Medium PR: 300-599 lines changed label Feb 17, 2026
@codecov

codecov bot commented Feb 17, 2026

Codecov Report

❌ Patch coverage is 64.47368% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.73%. Comparing base (54d55a1) to head (d6c490f).
⚠️ Report is 1 commits behind head on main.

Files with missing lines                               Patch %   Lines
...g/vmcp/optimizer/internal/similarity/tei_client.go  75.00%    6 Missing and 6 partials ⚠️
pkg/vmcp/optimizer/store.go                            16.66%    10 Missing ⚠️
pkg/vmcp/server/server.go                              73.33%    2 Missing and 2 partials ⚠️
cmd/vmcp/app/commands.go                               0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3844      +/-   ##
==========================================
- Coverage   66.75%   66.73%   -0.03%     
==========================================
  Files         444      445       +1     
  Lines       44065    44130      +65     
==========================================
+ Hits        29415    29448      +33     
- Misses      12343    12370      +27     
- Partials     2307     2312       +5     


@aponcedeleonch aponcedeleonch force-pushed the issue-3733/pr2-real-tei-client branch from c3f6cbb to a61f9c2 Compare February 17, 2026 13:47
@aponcedeleonch aponcedeleonch force-pushed the issue-3733/pr2-real-tei-client branch from a61f9c2 to d1a1966 Compare February 17, 2026 15:11
@aponcedeleonch aponcedeleonch force-pushed the issue-3733/pr2-real-tei-client branch from d1a1966 to 284d6ea Compare February 17, 2026 15:16
Introduce a TEIClient that calls the HuggingFace Text Embeddings Inference
HTTP API for vector embeddings, replacing the need for fake/deterministic
embeddings in production. The store factory now selects between a real TEI
client (when a service URL is configured), a fake client (for testing), or
FTS5-only search.

Wire the full OptimizerConfig through the server instead of a boolean flag,
so the embedding service URL and timeout are available when creating the
store and client.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aponcedeleonch aponcedeleonch force-pushed the issue-3733/pr2-real-tei-client branch from 284d6ea to d6c490f Compare February 17, 2026 17:26
@aponcedeleonch
Member Author

This PR expects the EmbeddingServer to be ready and provided; that work is introduced in #3839. It's better to wait for that to be merged before proceeding with this one, although the code here shouldn't change much. I'll mark this PR as a draft for now so it isn't merged accidentally.

@aponcedeleonch aponcedeleonch marked this pull request as draft February 17, 2026 18:18

	// Timeout is the HTTP request timeout. Defaults to 30s.
	Timeout time.Duration
}
Contributor


Suggestion: delete. This contains the same information as the config.OptimizerConfig, so either pass that through to NewTEIClient or pass the two args, url and timeout, without a struct.

}

// NewTEIClient creates a new TEI embedding client that calls the specified endpoint.
func NewTEIClient(cfg TEIClientConfig) (*TEIClient, error) {
Contributor


Can we make TEIClient private?

I think we can expose fewer public details from internal/similarity by having one NewEmbeddingClient(optimizerConfig) (types.EmbeddingClient, error) function. Then, store.go doesn't have to be aware about what exact client is being constructed, it's just an implementation detail of the package.

// embedRequest is the JSON body sent to the TEI /embed endpoint.
type embedRequest struct {
	Inputs   []string `json:"inputs"`
	Truncate bool     `json:"truncate"`
Contributor


nit: can you comment what truncate does?
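For illustration, the field comments might read something like the sketch below. The description of `truncate` reflects my understanding of the TEI API (inputs longer than the model's maximum token length are cut to fit rather than rejected) and should be checked against the TEI documentation before being copied. `encodeEmbedRequest` is a hypothetical helper added only to show the resulting JSON.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// embedRequest is the JSON body sent to the TEI /embed endpoint.
type embedRequest struct {
	// Inputs is the batch of texts to embed; one vector is expected per entry.
	Inputs []string `json:"inputs"`
	// Truncate asks the server to cut inputs that exceed the model's maximum
	// token length instead of rejecting the request. (Assumed behavior; verify
	// against the TEI API reference.)
	Truncate bool `json:"truncate"`
}

// encodeEmbedRequest is a hypothetical helper that returns the request as JSON.
func encodeEmbedRequest(texts []string, truncate bool) (string, error) {
	b, err := json.Marshal(embedRequest{Inputs: texts, Truncate: truncate})
	if err != nil {
		return "", fmt.Errorf("marshal embed request: %w", err)
	}
	return string(b), nil
}

func main() {
	s, err := encodeEmbedRequest([]string{"find a file"}, true)
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // prints {"inputs":["find a file"],"truncate":true}
}
```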

Comment on lines +126 to +139
if cfg != nil {
	switch {
	case cfg.EmbeddingServiceURL != "":
		var err error
		embClient, err = similarity.NewTEIClient(similarity.TEIClientConfig{
			BaseURL: cfg.EmbeddingServiceURL,
			Timeout: cfg.EmbeddingServiceTimeout,
		})
		if err != nil {
			return nil, fmt.Errorf("failed to create TEI embedding client: %w", err)
		}
	case cfg.EmbeddingDimension > 0:
		embClient = similarity.NewFakeEmbeddingClient(cfg.EmbeddingDimension)
	}
Contributor


As mentioned above, I think we should push the client construction logic into internal/similarity.


Labels

size/M Medium PR: 300-599 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Wire a real embedding client for the EmbeddingServer into the optimizer

2 participants