
Conversation

ThomasVitale (Contributor):

  • Introduce support for Ollama model auto-pull at startup time
  • Enhance support for Ollama model auto-pull at run time
  • Update documentation about integrating with Ollama and managing models
  • Adopt Builder pattern in Ollama Model classes for better code readability
  • Unify Ollama model auto-pull functionality in production and test code
  • Improve integration tests for Ollama with Testcontainers

```java
FunctionCallbackContext functionCallbackContext) {
    this(ollamaApi, defaultOptions, functionCallbackContext, List.of());
}

private ChatModelObservationConvention observationConvention = DEFAULT_OBSERVATION_CONVENTION;

public OllamaChatModel(OllamaApi ollamaApi, OllamaOptions defaultOptions,
```
ThomasVitale (author):
The argument list grew so much that I didn't want to add even more overloaded constructors. Instead, I introduced a Builder to help make this whole initialisation code more readable.
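To illustrate the idea, here is a minimal, self-contained sketch of the Builder approach (the class and field names below are illustrative stand-ins, not the actual Spring AI API):

```java
import java.util.List;

// Illustrative sketch of the Builder pattern used to tame a growing constructor.
// ChatModelSketch and its fields are hypothetical, not Spring AI classes.
public class ChatModelSketch {

    private final String baseUrl;
    private final String defaultModel;
    private final List<String> toolNames;

    private ChatModelSketch(Builder builder) {
        this.baseUrl = builder.baseUrl;
        this.defaultModel = builder.defaultModel;
        this.toolNames = builder.toolNames;
    }

    public static Builder builder() {
        return new Builder();
    }

    public String describe() {
        return defaultModel + "@" + baseUrl + " tools=" + toolNames;
    }

    public static class Builder {

        private String baseUrl = "http://localhost:11434";
        private String defaultModel = "llama3.1";
        private List<String> toolNames = List.of();

        public Builder baseUrl(String baseUrl) {
            this.baseUrl = baseUrl;
            return this;
        }

        public Builder defaultModel(String defaultModel) {
            this.defaultModel = defaultModel;
            return this;
        }

        public Builder toolNames(List<String> toolNames) {
            this.toolNames = toolNames;
            return this;
        }

        public ChatModelSketch build() {
            return new ChatModelSketch(this);
        }
    }
}
```

Callers then set only the options they care about, and every unset option keeps a sensible default, which is exactly what a long list of overloaded constructors struggles to express.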

```java
public OllamaEmbeddingModel(OllamaApi ollamaApi, OllamaOptions defaultOptions) {
    this(ollamaApi, defaultOptions, ObservationRegistry.NOOP);
}

private EmbeddingModelObservationConvention observationConvention = DEFAULT_OBSERVATION_CONVENTION;

public OllamaEmbeddingModel(OllamaApi ollamaApi, OllamaOptions defaultOptions,
```
ThomasVitale (author):

Here too, I introduced a Builder.

```diff
@@ -954,13 +957,15 @@ public record ProgressResponse(
  * Download a model from the Ollama library. Cancelled pulls are resumed from where they left off,
  * and multiple calls will share the same download progress.
  */
-public ProgressResponse pullModel(PullModelRequest pullModelRequest) {
-    return this.restClient.post()
+public Flux<ProgressResponse> pullModel(PullModelRequest pullModelRequest) {
```
ThomasVitale (author):

Using the streaming option is great because it gives us continuous status updates on the download, which we can log to keep the user up to date. It also makes it easy to define timeouts and retries.
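The consumer side of that stream can be roughly approximated in plain Java, without Reactor (`ProgressUpdate` is a hypothetical stand-in for `ProgressResponse`): log each status update, and stop as soon as the server reports success.

```java
import java.util.List;

// Plain-Java approximation of consuming a stream of pull-progress updates:
// log every status and stop once the server reports "success".
public class PullProgressSketch {

    record ProgressUpdate(String status) { }

    static String consume(List<ProgressUpdate> updates) {
        String last = null;
        for (ProgressUpdate update : updates) {
            last = update.status();
            System.out.println("Pulling model - Status: " + last);
            if ("success".equals(last)) {
                break; // analogous to takeUntil(status == "success") on the Flux
            }
        }
        return last;
    }
}
```

With a real `Flux<ProgressResponse>`, the same shape falls out of `doOnNext` (logging) plus `takeUntil` (completion), as shown in the actual pipeline further down.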

```java
 */
public enum PullModelStrategy {

    /**
```
ThomasVitale (author):

When pulling models, there are two strategies here: always (guarantees the latest model version) and when_missing (pulls only if the model is absent, so the local copy could be stale).
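The decision each strategy makes can be sketched in a few lines (a self-contained illustration, not the actual Spring AI enum):

```java
// Sketch of the two pull strategies described above. The enum name and the
// shouldPull helper are illustrative, not the real Spring AI API.
public enum PullModelStrategySketch {

    ALWAYS,
    WHEN_MISSING;

    public boolean shouldPull(boolean modelAlreadyPresent) {
        return switch (this) {
            // Always re-pull: guarantees the latest published model version.
            case ALWAYS -> true;
            // Pull only when absent: cheaper, but the local copy may be stale.
            case WHEN_MISSING -> !modelAlreadyPresent;
        };
    }
}
```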

```diff
-ollamaApi.pullModel(new OllamaApi.PullModelRequest(model));
-logger.info("Completed pulling the '{}' model", model);
+var ollamaModelManager = new OllamaModelManager(ollamaApi);
+ollamaModelManager.pullModel(model, PullModelStrategy.WHEN_MISSING);
```
ThomasVitale (author):

The integration test setup now uses the same auto-pull functionality as the production code.

```diff
@@ -52,23 +52,24 @@ class OllamaChatModelFunctionCallingIT extends BaseOllamaIT {

 private static final Logger logger = LoggerFactory.getLogger(OllamaChatModelFunctionCallingIT.class);

-private static final String MODEL = OllamaModel.LLAMA3_1.getName();
+private static final String MODEL = "qwen2.5:3b";
```
ThomasVitale (author):

I got many failures with llama3.1, even after refining the prompt. With qwen2.5:3b all tests passed, and it's a smaller model too (so better suited for integration testing).

```java
        logger.info("Pulling the '{}' model - Status: {}", modelName,
                progressResponses.get(progressResponses.size() - 1).status());
    }
})
.takeUntil(progressResponses -> progressResponses.get(0) != null
        && progressResponses.get(0).status().equals("success"))
```
ThomasVitale (author):

Here we continuously print out the current status of the download, and the whole operation is configured with timeout and retry.
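The "retry until success, within an overall time budget" idea can be sketched in plain Java without Reactor (`attemptPull` is a hypothetical stand-in for one run of the real streaming pipeline):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Rough plain-Java sketch of retrying a model pull with an overall timeout.
// attemptPull stands in for one execution of the real Flux-based pipeline.
public class RetryWithTimeoutSketch {

    static boolean pullWithRetries(Supplier<Boolean> attemptPull, int maxAttempts, Duration timeout) {
        Instant deadline = Instant.now().plus(timeout);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (Instant.now().isAfter(deadline)) {
                return false; // overall time budget exceeded
            }
            if (attemptPull.get()) {
                return true;  // the pull reported "success"
            }
        }
        return false; // retries exhausted without success
    }
}
```

With Reactor, the equivalent behaviour comes from composing operators such as `timeout` and `retry` onto the progress `Flux` instead of writing the loop by hand.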

```java
@ConfigurationProperties(OllamaInitializationProperties.CONFIG_PREFIX)
public class OllamaInitializationProperties {

    public static final String CONFIG_PREFIX = "spring.ai.ollama.init";
```
ThomasVitale (author):

The naming tries to be consistent with other similar Spring Boot features, like spring.sql.init, which initialises database schemas.
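Under that prefix, an application.properties entry might look like this (the property name below is illustrative; check the project documentation for the exact keys bound to these properties):

```properties
# Hypothetical example under the spring.ai.ollama.init prefix,
# using the when_missing strategy discussed above.
spring.ai.ollama.init.pull-model-strategy=when_missing
```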

@tzolov tzolov self-assigned this Oct 18, 2024
@tzolov tzolov added this to the 1.0.0-M4 milestone Oct 18, 2024
@tzolov tzolov added the ollama label Oct 18, 2024
tzolov commented Oct 18, 2024

Great stuff @ThomasVitale , thanks.
I've been thinking that perhaps we can allow listing multiple models (additional property in OllamaInitializationProperties) to be pulled at init time?
We can discuss this for future improvements.

tzolov commented Oct 18, 2024

rebased and merged at 8eef6e6

@tzolov tzolov closed this Oct 18, 2024
@ThomasVitale

@tzolov thanks! I like the idea of the list of models. I'll work on a follow-up PR, along with a couple of improvements to add a bit more flexibility.

@ThomasVitale

@tzolov I have made some improvements and added the possibility to pull an explicit list of models: #1566
