# cluster-inference

This version of cluster-inference locally replicates the OpenAI API from within a Docker container, using open-source LLMs running on GGML.

It costs very little these days to acquire a used dual-Xeon rack server with a couple hundred gigabytes of RAM. Tools like GGML let us run large language models on these machines without any graphics cards. Sure, it's slower, but we're not global conglomerates demanding immediate results from the model. We can afford to queue tasks up and let them run overnight.
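As a rough sketch of that overnight workflow (assuming the server from the Setup section below is running and that the replicated API exposes the usual /v1/completions route; the port, model name, and file names here are illustrative, not part of this repo):

```bash
# Overnight batch inference: read one prompt per line from prompts.txt
# and append each completion's JSON response to results.jsonl.
# Endpoint path, port, and model name are assumptions for illustration.
while IFS= read -r prompt; do
  curl -s http://dockerHostname:800/v1/completions \
    -H "Content-Type: application/json" \
    -d "{\"model\": \"mpt-7b-instruct\", \"prompt\": \"$prompt\", \"max_tokens\": 256}" \
    >> results.jsonl
  echo >> results.jsonl   # keep one JSON record per line
done < prompts.txt
```

Kick it off in a tmux session or under `nohup` before bed and collect the results in the morning. (Prompts containing quotes would need proper JSON escaping, e.g. via jq.)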

The potential benefits of combining this approach with something like AutoGPT cannot be overstated.

## Setup

1. Put your GGML models into the `/var/ai` directory on your Docker server. I use BitTorrent Sync to get them there easily from my NAS.

2. Clone this repository onto your Docker server and build the Dockerfile:

   ```bash
   git clone https://github.com/cjtrowbridge/cluster-inference
   cd cluster-inference
   make build
   ```

3. Open the web UI at `http://dockerHostname:800` to view your models and run inference.
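Since the container replicates the OpenAI API, a plain HTTP request should also work. A minimal sketch, assuming the standard /v1/completions route is exposed on the same port as the web UI and that the model name matches one of the files in `/var/ai` (both are assumptions; adjust to whatever the web UI reports):

```bash
# Hypothetical smoke test against the replicated OpenAI completions route.
curl http://dockerHostname:800/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "mpt-7b-instruct",
        "prompt": "Hello from my homelab!",
        "max_tokens": 64
      }'
```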

## Coming Soon

I want to incorporate BitTorrent support for model distribution, which just seems so obvious to me.

## Keep In Mind

A lot of attention is being paid to extending AI models to the edge, and all of this work benefits homelabs with far more CPU cores than GPU cores.

For best results, you want CPUs with the AVX, AVX2, or AVX512 instruction sets. My favorite so far is the PowerEdge R630 with E5-2600 v4 Xeons. This comes with 24 Xeon cores, and it's easy to find one with 192 GB of RAM or more for just a few hundred bucks. It will run any of the publicly available large language models with ease. I get 2-3 tokens/second running MPT-7B-Instruct on this hardware with no graphics card whatsoever.
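To see which of these instruction sets your CPUs actually support, grep the flags the Linux kernel reports:

```bash
# List any AVX-family flags the CPU advertises (no output means no AVX).
grep -o 'avx[a-z0-9_]*' /proc/cpuinfo | sort -u
```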

There are also builds of TensorFlow and GGML for non-AVX CPUs; they're just going to run much slower.
