Web server implementation of Llama. It can load and run a GGUF model file locally and provides a chat UI similar to WhatsApp. You can download a GGUF model file from HuggingFace.co and place it in the model folder, or run an npm command that will do it for you.
- Run
npm install
- Run
npm run download:q8
or
npm run download:q3
(lightweight AI model)
- Run
npm run start
- Browse to
http://localhost
- To use a different port, run
npm run start 8080
- Browse to
http://localhost:8080
- Install Forever:
npm install -g forever
- Start the server:
npm run forever
- Stop the server:
npm run stop
- For light mode, browse to
http://localhost/?lightmode
- For dark mode, browse to
http://localhost/?darkmode
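
The port argument and the theme query parameters above could be handled roughly as in the sketch below. This is an illustration only, not the project's actual code: the function names `resolvePort` and `themeFromUrl`, the fallback port 80 (inferred from the plain http://localhost URL), and the fallback theme are all assumptions.

```javascript
// Hypothetical helper: pick the listening port from the command-line arguments.
// With "npm run start 8080", the value "8080" would arrive as argv[2].
// The fallback of 80 is an assumption based on the http://localhost URL.
function resolvePort(argv, fallback = 80) {
  const raw = argv[2];
  if (raw === undefined) return fallback;
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${raw}`);
  }
  return port;
}

// Hypothetical helper: pick a theme from the ?lightmode / ?darkmode query keys.
// The fallback theme is an assumption; the real default is not documented here.
function themeFromUrl(url, fallback = "light") {
  const params = new URL(url).searchParams;
  if (params.has("darkmode")) return "dark";
  if (params.has("lightmode")) return "light";
  return fallback;
}

console.log(resolvePort(["node", "server.js", "8080"]));
console.log(themeFromUrl("http://localhost/?darkmode"));
```

Both helpers are pure functions, so they can be checked without starting the server or loading a model.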
The system prompt is defined in the strings.js file.
You are legally responsible for any damage that you could cause with this software.