A web server implementation of Llama. It loads and runs a GGUF model file locally and provides a chat UI similar to WhatsApp. You can download a GGUF model file from HuggingFace.co and place it in the model folder yourself, or run an npm command that will do it for you.
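If you prefer to fetch a model manually instead of using the `npm run download:*` scripts below, a minimal sketch of the manual route looks like this. The HuggingFace repository and file name are placeholders (pick whichever GGUF file you want to run), and the target directory is assumed to be the project's `model` folder:

```sh
# Hypothetical example: replace the URL with the actual GGUF file you want.
# The destination is assumed to be the project's "model" folder.
mkdir -p model
curl -L -o model/my-model.gguf \
  "https://huggingface.co/<user>/<repo>/resolve/main/<file>.gguf"
```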
- Run `npm install`
- Run `npm run download:q8` or `npm run download:q3` (lightweight AI model)
- Run `npm run start`
- Browse to `http://localhost`

To use a custom port:

- Run `npm run start 8080`
- Browse to `http://localhost:8080`

To keep the server running in the background with Forever:

- Install Forever: `npm install -g forever`
- Start the server: `npm run forever`
- Stop the server: `npm run stop`

To force a theme:

- Browse to `http://localhost/?lightmode`
- Browse to `http://localhost/?darkmode`
The characters and rules are defined in the `characters.json` file:

```json
{
  "characters": {
    "Assistant": {
      "system_prompt": "You are a useful AI assistant.",
      "welcome_message": "Hello, how can I help you today?"
    }
  },
  "rules": "You are not allowed to provide illegal advice or inappropriate content."
}
```
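As a sketch of how additional characters could be defined, following the same field names as the example above (the extra "Chef" character and its prompts are made up for illustration):

```json
{
  "characters": {
    "Assistant": {
      "system_prompt": "You are a useful AI assistant.",
      "welcome_message": "Hello, how can I help you today?"
    },
    "Chef": {
      "system_prompt": "You are a friendly chef who answers cooking questions.",
      "welcome_message": "Hi! What would you like to cook today?"
    }
  },
  "rules": "You are not allowed to provide illegal advice or inappropriate content."
}
```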
You are legally responsible for any damage you may cause with this software.