Replace hardcoded tweaks to llama.cpp with a more permanent solution #2

nsarrazin · 2023-03-21T07:08:25Z

Currently, the compiled llama.cpp binary we use only supports Alpaca. The source had to be modified to accept a model as a single file (Alpaca 13B ships as a single file, whereas LLaMA 13B is expected to come as a 2-part model). However, that change breaks compatibility with other LLaMA-based models.

The relevant changes are here:
https://github.com/nsarrazin/serge/blob/a837ea48e017289a21a9574b0fe862f541874a14/api/Dockerfile.api#L18-L20

We could make this more generic, but maybe it needs to be handled in llama.cpp instead? Not sure yet.
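
One possible direction for a more generic fix, as a rough sketch only: count the model's part files on disk instead of hardcoding the expected number. The helper name `count_model_parts` is hypothetical, and the naming scheme is an assumption (first part at the given path, further parts with `.1`, `.2`, ... appended); neither is confirmed against what llama.cpp or serge actually does.

```python
from pathlib import Path


def count_model_parts(model_path: str) -> int:
    """Count how many part files exist on disk for a model.

    Assumption (not confirmed by the issue): the first part lives at
    `model_path` and extra parts are named `model_path.1`, `model_path.2`, ...
    """
    base = Path(model_path)
    if not base.is_file():
        raise FileNotFoundError(f"model file not found: {base}")

    n_parts = 1
    # Keep counting while the next numbered part exists on disk.
    while Path(f"{base}.{n_parts}").is_file():
        n_parts += 1
    return n_parts
```

With something like this, a single-file Alpaca 13B and a 2-part LLaMA 13B could both be handled without patching the source, provided the loader can be told how many parts to read. Whether that is possible without changes in llama.cpp itself is still the open question.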

nsarrazin · 2023-03-22T11:21:10Z

Looks like I spoke too soon; we still need this fix for loading the 13B model.

nsarrazin added the enhancement label Mar 21, 2023

nsarrazin closed this as completed in 0dfd0f2 Mar 22, 2023

nsarrazin reopened this Mar 22, 2023

nsarrazin mentioned this issue Mar 22, 2023

Add an auto-download script that supports 7B, 13B & 30B models #14

Merged

nsarrazin closed this as completed in #14 Mar 22, 2023
