Releases: AmpereComputingAI/llama.cpp

v1.2.3

02 Jul 22:47
855aa8d

Release notes:

  • The rebase allows llama-cpp-python to pick up the upstream CVE fix (GHSA-56xg-wfcc-g829)
  • Experimental support for the Q8R16 quantized format with optimized matrix-multiplication kernels (a loading sketch follows this list)
  • CMake files updated to build llama.aio on AmpereOne
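
A minimal sketch of loading a Q8R16-quantized model through the llama-cpp-python wheel shipped with these releases. The model path is a placeholder, and the call shown is the standard llama_cpp API rather than anything Q8R16-specific:

    # Sketch: load a GGUF model quantized with the experimental Q8R16 format.
    # "model-q8r16.gguf" is a placeholder for an already-quantized file.
    from llama_cpp import Llama

    llm = Llama(model_path="model-q8r16.gguf", n_threads=64)
    out = llm("Q: What does AmpereOne refer to? A:", max_tokens=32)
    print(out["choices"][0]["text"])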

v1.2.2

21 May 19:07

Release notes:

  • Fix llama-3 end-of-token issue
  • Update server to support ollama (v0.1.33)
  • llama.aio Docker image now runs server mode by default (a client sketch follows this list)
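
With the container starting in server mode, a client can reach it over HTTP. A minimal sketch, assuming the llama.cpp server defaults of port 8080 and the /completion endpoint; neither is confirmed by these notes:

    # Sketch: query a llama.aio container running in server mode.
    # Port 8080 and /completion follow llama.cpp server defaults;
    # adjust both to match the actual container configuration.
    import json
    import urllib.request

    req = urllib.request.Request(
        "http://localhost:8080/completion",
        data=json.dumps({"prompt": "Hello", "n_predict": 32}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["content"])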

SHA-256 hashes (a verification sketch follows the list):

  • 6c580006a8faf7b73a424b0020f1bda2684aa7e1796182f68bfa8b7fee08d991 llama_cpp_python-0.2.63-cp311-cp311-linux_aarch64.whl
  • 1ffde8093abe18f638fb89273dd56664dd7ff6b8c82383099ea620d18ab562a7 llama_aio_v1.2.2_b769bc1.tar.gz
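
A download can be checked against these digests before installing. A minimal sketch using only the Python standard library; the filename matches the tarball listed above:

    # Sketch: verify a release artifact against its published SHA-256 hash.
    import hashlib

    EXPECTED = "1ffde8093abe18f638fb89273dd56664dd7ff6b8c82383099ea620d18ab562a7"

    digest = hashlib.sha256()
    with open("llama_aio_v1.2.2_b769bc1.tar.gz", "rb") as f:
        # Read in 1 MiB chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)

    assert digest.hexdigest() == EXPECTED, "hash mismatch: do not install"
    print("OK:", digest.hexdigest())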