
Added Script To Upgrade llamafile Archives #412

Merged: 4 commits into Mozilla-Ocho:main on May 13, 2024

Conversation


mofosyne commented May 11, 2024

Versions differ, so repack:

$ llamafile-upgrade-engine mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile 
== Engine Version Check ==
Engine version from mistral-7b-instruct-v0.1-Q4_K_M-server: llamafile v0.4.1
Engine version from /usr/local/bin/llamafile: llamafile v0.8.4
== Repackaging / Upgrading ==
extracting...
Archive:  mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile
  inflating: /tmp/tmp.FtvmAfSWty/.symtab.amd64  
  inflating: /tmp/tmp.FtvmAfSWty/.symtab.arm64  
  inflating: /tmp/tmp.FtvmAfSWty/llamafile/compcap.cu  
  inflating: /tmp/tmp.FtvmAfSWty/llamafile/llamafile.h  
  inflating: /tmp/tmp.FtvmAfSWty/llamafile/tinyblas.cu  
  inflating: /tmp/tmp.FtvmAfSWty/llamafile/tinyblas.h  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-alloc.h  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-backend-impl.h  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-backend.h  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-cuda.cu  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-cuda.h  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-impl.h  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-metal.h  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-metal.m  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-metal.metal  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-quants.h  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml.h  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/server/public/completion.js  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/server/public/index.html  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/server/public/index.js  
  inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/server/public/json-schema-to-grammar.mjs  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Anchorage  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Beijing  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Berlin  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Boulder  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Chicago  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/GMT  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/GST  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Honolulu  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Israel  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Japan  
 extracting: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/London  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Melbourne  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/New_York  
  inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/UTC  
 extracting: /tmp/tmp.FtvmAfSWty/.cosmo  
 extracting: /tmp/tmp.FtvmAfSWty/.args  
 extracting: /tmp/tmp.FtvmAfSWty/mistral-7b-instruct-v0.1.Q4_K_M.gguf  
 extracting: /tmp/tmp.FtvmAfSWty/ggml-cuda.dll  
repackaging...
== Completed ==
Original File: mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile
Upgraded File: mistral-7b-instruct-v0.1-Q4_K_M-server.updated.llamafile

Versions match, no repack needed:

$ llamafile-upgrade-engine test.llamafile 
== Engine Version Check ==
Engine version from test: llamafile v0.8.4
Engine version from /usr/local/bin/llamafile: llamafile v0.8.4
Upgrade not required. Exiting...

Help Message

$ llamafile-upgrade-engine --help
Usage: llamafile-upgrade-engine [OPTION]... <old> (new)
Upgrade llamafile archives.

Options:
  -h, --help            display this help and exit
  -f, --force           skip version check
  -v, --verbose         verbose mode

Arguments:
  <old>                 the name of the old llamafile archive to be upgraded
  (new)                 the name of the new llamafile archive to be created
                        if not defined output will be <old>.updated.llamafile

Example:
  llamafile-upgrade-engine old.llamafile new.llamafile
  This command will upgrade old.llamafile to a new llamafile named new.llamafile.

When you run this program, it's recommended that you've
downloaded or installed an official llamafile-VERSION.zip
from https://github.com/Mozilla-Ocho/llamafile/releases
because they include prebuilt DLLs for CUDA and ROCm.
You can verify your llamafile has them w/ unzip -vl
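
For context, the manual extract-and-repack flow that the script automates can be sketched roughly as follows. This is a simplified, hypothetical illustration using stock `unzip`/`zip` only; the real script handles the APE binary plus embedded-zip layout with llamafile's `zipalign` tool and aligns the archive properly. The function and variable names here are illustrative, not taken from the script itself.

```shell
# Sketch (assumption): pull the weights and .args out of the old archive,
# copy the newer engine binary, and append the payload to it.
# A real llamafile must be repacked with zipalign, not a naive append.
upgrade_engine() {
  old=$1 engine=$2
  new=${3:-${old%.llamafile}.updated.llamafile}    # default output name
  tmp=$(mktemp -d)
  unzip -o "$old" '*.gguf' .args -d "$tmp" >/dev/null  # keep weights + runtime args
  cp "$engine" "$new"                                  # new engine becomes the base
  (cd "$tmp" && zip -q -0 payload.zip .args ./*.gguf)  # store payload uncompressed
  cat "$tmp/payload.zip" >> "$new"                     # naive append; zipalign does this with alignment
  rm -rf "$tmp"
  echo "Upgraded File: $new"
}
```

Usage would mirror the transcript above, e.g. `upgrade_engine old.llamafile /usr/local/bin/llamafile`, producing `old.updated.llamafile` when no third argument is given.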


mofosyne commented May 12, 2024

@jart: This is ready now. I recall you also mentioned build system hooks, but I'm not sure if that's relevant here. (edit: Ah, I see what you mean. Added another commit.)

jart (Collaborator) left a comment:

Looks good. Thank you!

@jart jart merged commit a86e7ce into Mozilla-Ocho:main May 13, 2024
@mofosyne mofosyne deleted the llamafile-upgrade-engine branch May 13, 2024 04:33