Implement dynamic Ollama embedding dimension resolution with server probing #237
Merged
Co-authored-by: phact <1313220+phact@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Update embedding dimensions to be dynamic from Ollama API" to "Implement dynamic Ollama embedding dimension resolution with server probing" on Oct 9, 2025
…ding-dimension-resolution
phact approved these changes on Oct 9, 2025
Summary
Replaces hardcoded Ollama embedding dimensions with dynamic resolution that probes the running Ollama server to determine the actual vector dimension for the selected embedding model. Maintains static maps and default fallback for unknown models or failure cases.
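The resolution order described above (probe first, then the static map, then the default) could be sketched roughly as follows. This is an illustration rather than the code from this PR: `resolve_dimension_sketch`, `STATIC_DIMENSIONS`, and the injected `probe` callable are hypothetical names, and the single map entry is only an example. `VECTOR_DIM` (1536) is the default named in the description.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

VECTOR_DIM = 1536  # default fallback named in the PR description

# Hypothetical static map for illustration; the real mappings live in
# src/config/settings.py and are not reproduced here.
STATIC_DIMENSIONS: Dict[str, int] = {"nomic-embed-text": 768}

# A probe takes (endpoint, model_name) and returns the dimension, or 0 on failure.
Probe = Callable[[str, str], Awaitable[int]]


async def resolve_dimension_sketch(
    embedding_model: str,
    provider: Optional[str] = None,
    endpoint: Optional[str] = None,
    probe: Optional[Probe] = None,
    static_map: Dict[str, int] = STATIC_DIMENSIONS,
) -> int:
    """Resolve a vector dimension: probe, then static map, then default."""
    # Only probe when the provider is Ollama and an endpoint is configured.
    if provider == "ollama" and endpoint and probe is not None:
        dim = await probe(endpoint, embedding_model)
        if dim > 0:
            return dim
    # Probe skipped or failed: use the static map, then the default.
    return static_map.get(embedding_model, VECTOR_DIM)


if __name__ == "__main__":
    async def demo_probe(endpoint: str, model_name: str) -> int:
        return 1024  # stand-in for a real server probe

    print(asyncio.run(
        resolve_dimension_sketch("custom-model", "ollama", "http://localhost:11434", demo_probe)
    ))
```

Passing the probe in as a parameter is a choice made for this sketch so that the fallback order can be exercised without a running Ollama server.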
Motivation
Currently, OLLAMA_EMBEDDING_DIMENSIONS in src/config/settings.py contains a hardcoded subset of model→dimension mappings. This approach is brittle and quickly becomes stale as new models are released or custom models are used. Ollama exposes an embeddings API that returns the embedding vector, allowing us to infer the true dimension at runtime.
Changes
src/utils/embeddings.py (+120 lines)
New async functions:
_probe_ollama_embedding_dimension(endpoint, model_name) - Probes the Ollama server's /api/embeddings endpoint with a test string. Tries modern API format ({model, input}) first, then falls back to legacy format ({model, prompt}). Returns the embedding dimension from the response or 0 on failure.
resolve_embedding_dimension(embedding_model, provider, endpoint) - Main resolution function that conditionally probes Ollama when provider and endpoint are provided. Falls back to static maps via get_embedding_dimensions(), then to VECTOR_DIM (1536) for unknown models.
Modified function:
create_dynamic_index_body(embedding_model, provider, endpoint) - Changed from sync to async and added optional provider and endpoint parameters. Now calls await resolve_embedding_dimension() instead of direct static lookup.
src/main.py (+7 lines)
Modified init_index():
Behavior
Ollama with endpoint configured
{endpoint}/api/embeddings during index creation
VECTOR_DIM (1536)
Non-Ollama providers (OpenAI, watsonx)
VECTOR_DIM for unknown models
Error handling
All errors (network timeout, connection refused, invalid JSON, missing fields) result in a graceful fallback with logging. No exceptions propagate, so a failed probe cannot cause index creation to fail.
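A probe with this kind of error handling might look like the sketch below. It is not the implementation from this PR. The name `probe_dimension_sketch`, the `urllib`-based helper, the 5-second timeout, and the injectable `post_json` parameter are assumptions made for illustration, and the HTTP client actually used is not shown in this description. Accepting either an `embedding` list or an `embeddings` list of lists in the response is likewise an assumption about what the server may return.

```python
import asyncio
import json
import logging
import urllib.request
from typing import Any, Callable, Dict

logger = logging.getLogger(__name__)

PostJson = Callable[[str, Dict[str, Any]], Any]


def _post_json(url: str, payload: Dict[str, Any], timeout: float = 5.0) -> Any:
    """Blocking JSON POST using only the standard library (illustrative helper)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def _dimension_from_response(body: Any) -> int:
    """Return the vector length found in a response body, or 0 if none is found."""
    if not isinstance(body, dict):
        return 0
    vector = body.get("embedding")  # assumed shape: a flat list of floats
    if not vector:
        batch = body.get("embeddings")  # assumed shape: a list of vectors
        vector = batch[0] if isinstance(batch, list) and batch else None
    return len(vector) if isinstance(vector, list) else 0


async def probe_dimension_sketch(
    endpoint: str, model_name: str, post_json: PostJson = _post_json
) -> int:
    """Try the {model, input} payload, then {model, prompt}; return 0 on failure."""
    url = endpoint.rstrip("/") + "/api/embeddings"
    payloads = (
        {"model": model_name, "input": "test"},   # modern format, tried first
        {"model": model_name, "prompt": "test"},  # legacy format
    )
    for payload in payloads:
        try:
            # Run the blocking request in a worker thread to keep the event loop free.
            body = await asyncio.to_thread(post_json, url, payload)
            dimension = _dimension_from_response(body)
            if dimension > 0:
                return dimension
        except Exception as exc:  # timeouts, refused connections, invalid JSON
            logger.warning("Embedding probe failed for %s: %s", model_name, exc)
    return 0
```

Returning 0 instead of raising lets the caller treat every failure the same way and move on to the static map or `VECTOR_DIM`.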
Benefits
Testing
Validated behavior across 7 scenarios:
Deployment
Original prompt