We're currently using Microsoft.SemanticKernel.Connectors.Ollama (v1.54.0-alpha) to integrate local LLMs via Ollama. While the experience is great overall, we noticed that the current extension methods:
- `AddOllamaEmbeddingGenerator(...)`
- `AddOllamaChatCompletion(...)`

do not allow passing a custom `HttpClient`, which makes it impossible to configure critical options such as `Timeout`, custom headers, retry policies, or advanced diagnostics.
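For context, this is the kind of preconfigured client we would like to hand to the connector (the header name below is purely illustrative):

```csharp
// The kind of HttpClient configuration we would like to inject, which the
// current Ollama extension methods give us no way to pass in:
var httpClient = new HttpClient
{
    BaseAddress = new Uri("http://localhost:11434"), // default Ollama endpoint
    Timeout = TimeSpan.FromMinutes(10)               // raise the 100-second default
};
httpClient.DefaultRequestHeaders.Add("X-Request-Source", "semantic-kernel"); // illustrative header
```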
This is especially limiting for local models such as `llama3.2:3b` or `mistral`, whose first generation can take longer than the default 100-second `HttpClient.Timeout`, resulting in a `TaskCanceledException`.
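As a stop-gap we have been looking at wrapping a custom `HttpClient` in an OllamaSharp `OllamaApiClient` and registering that instance directly. This is an untested sketch and assumes the connector version exposes an overload accepting an `OllamaApiClient`:

```csharp
// Untested workaround sketch: wrap a custom HttpClient in an OllamaSharp
// OllamaApiClient and register the client instance directly.
using Microsoft.SemanticKernel;
using OllamaSharp;

var httpClient = new HttpClient
{
    BaseAddress = new Uri("http://localhost:11434"),
    Timeout = TimeSpan.FromMinutes(10) // avoid the 100-second default
};

var ollamaClient = new OllamaApiClient(httpClient, "llama3.2:3b");

var builder = Kernel.CreateBuilder();
builder.AddOllamaChatCompletion(ollamaClient); // assumes this overload exists in v1.54.0-alpha
var kernel = builder.Build();
```

Even if this works, it bypasses DI entirely, so a first-class extension point would still be valuable.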
We kindly suggest either:

- adding overloads that accept a preconfigured `HttpClient` (a hypothetical shape is sketched after this list), or
- allowing injection via the options/configuration pattern in DI.
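For the first option, the overloads could look something like this (names and parameters are illustrative only, not an existing API):

```csharp
// Hypothetical overloads (illustrative only, not part of the current package):
public static IKernelBuilder AddOllamaChatCompletion(
    this IKernelBuilder builder,
    string modelId,
    HttpClient httpClient,          // caller-supplied, fully configured client
    string? serviceId = null);

public static IKernelBuilder AddOllamaEmbeddingGenerator(
    this IKernelBuilder builder,
    string modelId,
    HttpClient httpClient,
    string? serviceId = null);
```

The connector would then build its internal client from the supplied `HttpClient` instead of creating its own.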
This change would align with .NET best practices around `HttpClient` management (e.g. `IHttpClientFactory`-managed clients) and give consumers full control over networking behavior when using Ollama.