Hi,
First off, thanks for the OPENBLAS tip. That cuts down the initial prompt processing time by like 3-4x!
Was wondering if it's possible to use the generate function as an API from another Python file.
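For example, I was imagining something roughly like this (the module name, loader call, and generate() signature below are just my guesses at what such an interface might look like, not the actual code):

```python
# Hypothetical sketch -- the import path, load_model(), and generate()
# parameters here are guesses, not koboldcpp's real interface.
import koboldcpp

koboldcpp.load_model("models/ggml-model-q4_0.bin")  # guessed loader helper
text = koboldcpp.generate(
    prompt="Once upon a time",
    max_length=80,
    temperature=0.7,
)
print(text)
```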
Secondly, is it possible to update to the latest llama.cpp with a git pull from the llama.cpp repository, or do I have to wait for you to sync the changes and then git pull from koboldcpp?