
v0.1.2

jmorganca released this 12 Oct 18:46

New Models

  • Zephyr: a fine-tuned 7B version of Mistral, trained on a mix of publicly available, synthetic datasets, that performs as well as Llama 2 70B on many benchmarks
  • Mistral OpenOrca: a 7-billion-parameter model fine-tuned on top of Mistral 7B using the OpenOrca dataset

Examples

Ollama's examples have been updated with several new additions.

What's Changed

  • Download speeds for ollama pull have been significantly improved, from 60MB/s to over 1.5GB/s (25x faster) on fast network connections
  • The API now supports non-streaming responses. Set the stream parameter to false and endpoints will return all of their data in a single response (a consumption sketch follows this list):
    curl -X POST http://localhost:11434/api/generate -d '{
      "model": "llama2",
      "prompt": "Why is the sky blue?",
      "stream": false
    }'
    
  • Ollama can now be used with HTTP proxies (using HTTP_PROXY=http://<proxy>) and HTTPS proxies (using HTTPS_PROXY=https://<proxy>); see the example after this list
  • Fixed a "token too long" error when generating a response
  • q8_0, q5_0, q5_1, and f32 models will now use GPU on Linux
  • Revised the help text in ollama run to make it easier to read
  • Renamed the runner subprocess to ollama-runner
  • ollama create will now show feedback when reading model metadata
  • Fixed a "not found" error that appeared when running ollama pull
  • Improved video memory allocation on Linux to fix errors when using Nvidia GPUs
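
To consume the new non-streaming mode from a shell, the sketch below disables streaming and extracts the generated text from the single JSON object that comes back. It assumes a local server on the default port and that jq is installed:

    # Request a completion with streaming disabled, then pull the generated
    # text out of the single JSON response (the text is in the "response"
    # field; jq must be installed)
    curl -s -X POST http://localhost:11434/api/generate -d '{
      "model": "llama2",
      "prompt": "Why is the sky blue?",
      "stream": false
    }' | jq -r '.response'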
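
The proxy settings are ordinary environment variables read by the Ollama server process, so they can be set for a single invocation. A minimal sketch, assuming a placeholder proxy address (proxy.example.com:3128 is illustrative, not a real endpoint):

    # Start the server with its outbound traffic (e.g. model downloads)
    # routed through an HTTPS proxy; the address below is a placeholder
    HTTPS_PROXY=https://proxy.example.com:3128 ollama serve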

New Contributors

Full Changelog: v0.1.1...v0.1.2