
v0.1.23

@jmorganca released this 02 Feb 06:34 · 1183 commits to main since this release


New vision models

The LLaVA model family on Ollama has been updated to version 1.6, and now includes a new 34B version:

  • ollama run llava – a new 7B LLaVA model based on Mistral
  • ollama run llava:13b – a 13B LLaVA model
  • ollama run llava:34b – a 34B LLaVA model, one of the most powerful open-source vision models available

All three models share the following improvements:

  • More permissive licenses: LLaVA 1.6 models are distributed under the Apache 2.0 license or the LLaMA 2 Community License.
  • Higher image resolution: support for up to 4x more pixels, allowing the model to grasp more details.
  • Improved text recognition and reasoning capabilities: these models are trained on additional document, chart and diagram data sets.
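
To try one of these models against a local image through the REST API, the image is sent base64-encoded in the images field. A minimal sketch, where the prompt and image contents are placeholders:

curl http://localhost:11434/api/generate -d '{
  "model": "llava",
  "prompt": "What is in this picture?",
  "images": ["<base64-encoded contents of photo.jpg>"]
}'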

keep_alive parameter: control how long models stay loaded

When making API requests, the new keep_alive parameter can be used to control how long a model stays loaded in memory:

curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "keep_alive": "30s"
}'
  • If set to a positive duration (e.g. 20m, 1h, or a number of seconds such as 30), the model will stay loaded for the provided duration
  • If set to a negative duration (e.g. -1), the model will stay loaded indefinitely
  • If set to 0, the model will be unloaded immediately once finished
  • If not set, the model will stay loaded for 5 minutes by default
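
Combining the last two behaviors, a request with no prompt and keep_alive set to 0 can be used to unload a model on demand; a minimal sketch mirroring the request above:

curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "keep_alive": 0
}'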

Support for more Nvidia GPUs

  • GeForce GTX TITAN X, 980 Ti, 980, 970, 960, 950, 750 Ti, 750
  • GeForce GTX 980M, 970M, 965M, 960M, 950M, 860M, 850M
  • GeForce 940M, 930M, 910M, 840M, 830M
  • Quadro M6000, M5500M, M5000, M2200, M1200, M620, M520
  • Tesla M60, M40
  • NVS 810

What's Changed

  • New keep_alive API parameter to control how long models stay loaded
  • Image paths can now be provided to ollama run when running multimodal models (see the example after this list)
  • Fixed issue where downloading models via ollama pull would slow down to 99%
  • Fixed error when running Ollama with Nvidia GPUs and CPUs without AVX instructions
  • Support for additional Nvidia GPUs (compute capability 5)
  • Fixed issue where system prompt would be repeated in subsequent messages
  • ollama serve will now print prompt when OLLAMA_DEBUG=1 is set
  • Fixed issue where exceeding context size would cause erroneous responses in ollama run and the /api/chat API
  • ollama run will now allow sending messages without images to multimodal models
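
As an example of the image path support mentioned above, a path included in the prompt is picked up automatically (the file name here is a placeholder):

ollama run llava "What is in this image? ./photo.jpg"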


Full Changelog: v0.1.22...v0.1.23