balena-serge

A chat interface based on llama.cpp for running Alpaca models.

Entirely self-hosted, no API keys needed. Fits on 4GB of RAM and runs on the CPU.

You can read more on the official project README.

Hardware required

LLaMA will simply crash if there isn't enough free memory for your chosen model.

  • 7B requires about 4.5GB of free RAM
  • 13B requires about 12GB free
  • 30B requires about 20GB free
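A quick way to check whether a model will fit is to read `MemAvailable` from `/proc/meminfo` on the device. This is a sketch for Linux hosts; the thresholds mirror the list above:

```shell
# Read available memory in kB from /proc/meminfo (Linux only)
avail_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
avail_gb=$((avail_kb / 1024 / 1024))
echo "Available memory: ${avail_gb} GB"

# Compare against the rough requirements above
if   [ "$avail_gb" -ge 20 ]; then echo "30B should fit"
elif [ "$avail_gb" -ge 12 ]; then echo "13B should fit"
elif [ "$avail_gb" -ge 5 ];  then echo "7B should fit"
else echo "Not enough free RAM for any model"
fi
```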

I've tested on an Intel NUC, but any amd64 or aarch64 device with at least 5GB of memory should work!

In theory the 8GB Raspberry Pi 4 model should work, but I haven't tried it myself!

Getting Started

You can one-click-deploy this project to balena using the button below:

[deploy with balena button]

Manual Deployment

Alternatively, deployment can be carried out by manually creating a balenaCloud account and application, flashing a device, downloading the project and pushing it via the balena CLI.
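The manual path can be sketched with the balena CLI; `my-serge-fleet` is a placeholder fleet name, not something defined by the project:

```shell
# Authenticate with balenaCloud (opens a browser for login)
balena login

# Create a fleet matching the device type you plan to flash
balena fleet create my-serge-fleet

# Download the project and push it to the fleet
git clone https://github.com/klutchell/balena-serge.git
cd balena-serge
balena push my-serge-fleet
```

`balena push` builds the containers on balena's build servers and rolls them out to every device in the fleet.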

Environment Variables

Name  Default  Purpose
TZ    UTC      The timezone in your location. Find a list of all timezone values here.
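For example, the TZ variable can be set fleet-wide with the balena CLI (`my-serge-fleet` is a placeholder fleet name, and `Europe/London` stands in for your own timezone):

```shell
# Set the timezone for every device in the fleet
balena env add TZ "Europe/London" --fleet my-serge-fleet
```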

Usage

Once your device joins the fleet, you'll need to allow some time for it to download the application containers.

When it's done, you should be able to access the app on port 80 of the device.
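A quick sanity check from another machine on the same network; `192.168.1.50` is a placeholder for your device's local IP address, which is shown in the balenaCloud dashboard:

```shell
# An HTTP 200 response means the web UI is up on port 80
curl -I http://192.168.1.50/
```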

You can read more on the official project README.

Contributing

Please open an issue or submit a pull request with any features, fixes, or changes.
