---
layout: post
title: How to Self-Host Your Own Private AI Stack
date: 2024-07-08 08:00:00 -0500
categories: [homelab, self-hosted]
tags: [ai, docker, software, llm, ollama, whisper]
---
In this tutorial we'll walk through my local, private, self-hosted AI stack so that you can run it too.
{% include embed/youtube.html id='yoze1IxdBdM' %} 📺 Watch Video
- Nothing in this video was sponsored
If you're looking for the overview of this stack, you can check out the video here: Self-Hosted AI That's Actually Useful
- Machine Used for AI: Building My ULTIMATE, All-in-One, HomeLab Server
Here are some cards that are good for local AI. Keep in mind that it's always better to get a newer one for better CUDA support and more VRAM. 8 GB of VRAM should be enough for small models like the ones in this stack; however, 12-24 GB is probably best.
- MSI Gaming GeForce RTX 3060 12GB
- GIGABYTE GeForce RTX 3060 Gaming OC 12G
- NVIDIA GeForce RTX 3090 Founders Edition Graphics Card (Renewed) 24GB
- Zotac Gaming GeForce RTX 3090 Trinity OC, 24GB
- ASUS TUF Gaming GeForce RTX™ 4090 OG OC Edition Gaming Graphics Card 24GB
- MSI Gaming GeForce RTX 4060 8GB
- ASUS Dual GeForce RTX™ 4070 White OC Edition 12GB
You'll want a modern CPU. If you're going desktop class, here are a few I would choose:
- Intel Core i7-12700K Gaming Desktop Processor
- Intel Core i7-13700K Gaming Desktop Processor
- Intel Core i7-14700K Gaming Desktop Processor
For flash storage, I always go with these SSDs
I am running Ubuntu Server 24.04 LTS
Installing NVIDIA Drivers
If you need help, you can check out this article but here are the commands I ran.
Install the recommended drivers for your graphics card:
sudo ubuntu-drivers install
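If you want to double-check which driver version `ubuntu-drivers` picked (useful for matching the `nvidia-utils` package below), here's a quick sketch:

```bash
# List the installed NVIDIA driver packages and their versions
dpkg -l | grep -i nvidia-driver

# Or check the version of the installed kernel module
modinfo nvidia | grep ^version
```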
Install NVIDIA tools
Be sure you install the version that matches your driver from above
sudo apt install nvidia-utils-535
Then reboot your machine
sudo reboot
Once the machine is back up, check to be sure your drivers are functioning properly
nvidia-smi
Here are the packages and repos we'll be using:
- Ollama
- Open WebUI
- ComfyUI
- Stable Diffusion web UI
- pluja/whishper
- HuggingFace
- Home Assistant Wyoming Protocol
- Continue Code Assistant
- searXNG
- MacWhisper
I am using Traefik as the only entry point into this stack. No ports are exposed on the host. If you don't want to use Traefik, just comment out the labels (and optionally rename the network named `traefik`). You will then need to expose ports for `open-webui`, `stable-diffusion-webui`, and `whisper` in your Docker compose file.
If you need help installing Traefik, see this post on installing traefik 3 on Docker
Note: If using `traefik` (or any reverse proxy), remember that all of your internal DNS records will point to this machine! e.g. if the IP of the machine running this stack is `192.168.0.100`, you'll need a DNS record like `chat.local.example.com` that points to `192.168.0.100`.
{: .prompt-info }
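Once those records exist, a quick sanity check from any machine on your network (a sketch using `dig`; the hostnames and IP are the placeholders from the note above):

```bash
# Each service hostname should resolve to the Docker host running this stack
dig +short chat.local.example.com     # expect 192.168.0.100
dig +short ollama.local.example.com   # expect 192.168.0.100
dig +short whisper.local.example.com  # expect 192.168.0.100
```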
This stack contains middleware for basic auth so that the Ollama API is secured with a username and password. This is optional. If you don't want to use basic auth, just remove the auth middleware labels from the `ollama` service in your compose file.
Otherwise, here's how you create the credential:
Hashing your password for `traefik` middleware:
echo $(htpasswd -nB ollamauser) | sed -e s/\\$/\\$\\$/g
You'll then want to place this in your `.env` here using the `OLLAMA_API_CREDENTIALS` variable. This is then used in the `ollama` service in your compose file.
If you want to create a hash value for Basic Auth (used for the Continue extension), you'll need to use the credential from above.
echo -n 'ollamauser:ollamapassword!' | base64
If you run into issues, you can always visit the NVIDIA Container Toolkit documentation.
Configure the production repository
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
Update the packages list from the repository:
sudo apt-get update
Install the NVIDIA Container Toolkit packages
sudo apt-get install -y nvidia-container-toolkit
Configure the container runtime by using the `nvidia-ctk` command and restart Docker:
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
This will fail if you don't have Docker installed yet.
If you need to install Docker, see this post on how to install Docker and Docker Compose.
After installing Docker you will need to reconfigure the runtime
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
This will test to make sure that the NVIDIA Container Toolkit can access the NVIDIA driver.
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
You should see the same output as running `nvidia-smi` without Docker.
Stacks live in /opt/stacks
Here is the folder structure. Most subfolders are created when binding volumes.
├── ai-stack
│ ├── .env
│ ├── compose.yaml
│ ├── ollama
│ ├── open-webui
│ ├── searxng
│ ├── stable-diffusion-webui-docker
│ └── whisper
├── home-assistant-stack
│ ├── compose.yaml
│ ├── faster-whisper
│ ├── home-assistant
│ └── wyoming-piper
If you run into any folder permission errors while running any of this, you can simply change the owner to yourself using the following command. Please replace the user and group with your own user and group.
sudo chown serveradmin:serveradmin -R /opt/stacks
My `ai-stack` `.env` is pretty minimal:
OLLAMA_API_CREDENTIALS=
DB_USER=
DB_PASS=
WHISHPER_HOST=https://whisper.local.example.com
WHISPER_MODELS=tiny,small
PUID=
PGID=
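If you're unsure what to use for `PUID` and `PGID`, they're just the numeric UID and GID of the user that owns `/opt/stacks` (in my case `serveradmin`):

```bash
id -u serveradmin   # value for PUID
id -g serveradmin   # value for PGID
```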
Here is my compose.yaml
You'll want to create this in the root of your stack folder (see folder structure above)
The command I use to start, build, and remove orphans is:
docker compose up -d --build --force-recreate --remove-orphans
otherwise you can use
docker compose up -d --build
There are additional steps you'll need to do before starting this stack. Please continue on to the end.
Here are 2 Docker compose files that you can use on your system.
The stack is the one I use in the video as well as at home. If you want to use the general stack without traefik and macvlan, see the general Docker compose stack
Before running this, you will need to create the network for Docker to use.
This might already exist if you are using traefik. If so skip this step.
docker network create traefik
This will create the `macvlan` network. Adjust accordingly.
docker network create -d macvlan \
--subnet=192.168.20.0/24 \
--gateway=192.168.20.1 \
-o parent=eth1 \
iot_macvlan
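You can verify that both networks exist before bringing anything up:

```bash
docker network ls | grep -E 'traefik|iot_macvlan'
docker network inspect iot_macvlan   # confirm the subnet, gateway, and parent interface
```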
compose.yaml
services:
# Ollama
ollama:
image: ollama/ollama:latest
container_name: ollama
restart: unless-stopped
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- OLLAMA_KEEP_ALIVE=24h
- ENABLE_IMAGE_GENERATION=True
- COMFYUI_BASE_URL=http://stable-diffusion-webui:7860
networks:
- traefik
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./ollama:/root/.ollama
labels:
- "traefik.enable=true"
- "traefik.http.routers.ollama.rule=Host(`ollama.local.example.com`)"
- "traefik.http.routers.ollama.entrypoints=https"
- "traefik.http.routers.ollama.tls=true"
- "traefik.http.routers.ollama.tls.certresolver=cloudflare"
- "traefik.http.routers.ollama.middlewares=default-headers@file"
- "traefik.http.routers.ollama.middlewares=ollama-auth"
- "traefik.http.services.ollama.loadbalancer.server.port=11434"
- "traefik.http.routers.ollama.middlewares=auth"
- "traefik.http.middlewares.auth.basicauth.users=${OLLAMA_API_CREDENTIALS}"
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
# open web ui
open-webui:
image: ghcr.io/open-webui/open-webui:latest
container_name: open-webui
restart: unless-stopped
networks:
- traefik
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- 'OLLAMA_BASE_URL=http://ollama:11434'
- ENABLE_RAG_WEB_SEARCH=True
- RAG_WEB_SEARCH_ENGINE=searxng
- RAG_WEB_SEARCH_RESULT_COUNT=3
- RAG_WEB_SEARCH_CONCURRENT_REQUESTS=10
- SEARXNG_QUERY_URL=http://searxng:8080/search?q=<query>
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./open-webui:/app/backend/data
labels:
- "traefik.enable=true"
- "traefik.http.routers.open-webui.rule=Host(`chat.local.example.com`)"
- "traefik.http.routers.open-webui.entrypoints=https"
- "traefik.http.routers.open-webui.tls=true"
- "traefik.http.routers.open-webui.tls.certresolver=cloudflare"
- "traefik.http.routers.open-webui.middlewares=default-headers@file"
- "traefik.http.services.open-webui.loadbalancer.server.port=8080"
depends_on:
- ollama
extra_hosts:
- host.docker.internal:host-gateway
searxng:
image: searxng/searxng:latest
container_name: searxng
networks:
- traefik
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./searxng:/etc/searxng
depends_on:
- ollama
- open-webui
restart: unless-stopped
# stable diffusion
stable-diffusion-download:
build: ./stable-diffusion-webui-docker/services/download/
image: comfy-download
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./stable-diffusion-webui-docker/data:/data
stable-diffusion-webui:
build: ./stable-diffusion-webui-docker/services/comfy/
image: comfy-ui
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- CLI_ARGS=
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./stable-diffusion-webui-docker/data:/data
- ./stable-diffusion-webui-docker/output:/output
stop_signal: SIGKILL
tty: true
deploy:
resources:
reservations:
devices:
- driver: nvidia
device_ids: ['0']
capabilities: [compute, utility]
restart: unless-stopped
networks:
- traefik
labels:
- "traefik.enable=true"
- "traefik.http.routers.stable-diffusion.rule=Host(`stable-diffusion.local.example.com`)"
- "traefik.http.routers.stable-diffusion.entrypoints=https"
- "traefik.http.routers.stable-diffusion.tls=true"
- "traefik.http.routers.stable-diffusion.tls.certresolver=cloudflare"
- "traefik.http.services.stable-diffusion.loadbalancer.server.port=7860"
- "traefik.http.routers.stable-diffusion.middlewares=default-headers@file"
# whisper
mongo:
image: mongo
env_file:
- .env
networks:
- traefik
restart: unless-stopped
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./whisper/db_data:/data/db
- ./whisper/db_data/logs/:/var/log/mongodb/
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- MONGO_INITDB_ROOT_USERNAME=${DB_USER:-whishper}
- MONGO_INITDB_ROOT_PASSWORD=${DB_PASS:-whishper}
command: ['--logpath', '/var/log/mongodb/mongod.log']
translate:
container_name: whisper-libretranslate
image: libretranslate/libretranslate:latest-cuda
env_file:
- .env
networks:
- traefik
restart: unless-stopped
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./whisper/libretranslate/data:/home/libretranslate/.local/share
- ./whisper/libretranslate/cache:/home/libretranslate/.local/cache
user: root
tty: true
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- LT_DISABLE_WEB_UI=True
- LT_LOAD_ONLY=${LT_LOAD_ONLY:-en,fr,es}
- LT_UPDATE_MODELS=True
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
whisper:
container_name: whisper
pull_policy: always
image: pluja/whishper:latest-gpu
env_file:
- .env
networks:
- traefik
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./whisper/uploads:/app/uploads
- ./whisper/logs:/var/log/whishper
- ./whisper/models:/app/models
restart: unless-stopped
labels:
- "traefik.enable=true"
- "traefik.http.routers.whisper.rule=Host(`whisper.local.example.com`)"
- "traefik.http.routers.whisper.entrypoints=https"
- "traefik.http.routers.whisper.tls=true"
- "traefik.http.routers.whisper.tls.certresolver=cloudflare"
- "traefik.http.services.whisper.loadbalancer.server.port=80"
- "traefik.http.routers.whisper.middlewares=default-headers@file"
depends_on:
- mongo
- translate
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- PUBLIC_INTERNAL_API_HOST=${WHISHPER_HOST}
- PUBLIC_TRANSLATION_API_HOST=${WHISHPER_HOST}
- PUBLIC_API_HOST=${WHISHPER_HOST:-}
- PUBLIC_WHISHPER_PROFILE=gpu
- WHISPER_MODELS_DIR=/app/models
- UPLOAD_DIR=/app/uploads
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
networks:
traefik:
external: true
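Once the stack is up, a couple of quick checks are worth running (a sketch; the hostname and credentials are the placeholders used earlier in this post):

```bash
# All containers should show as running
docker compose ps

# Ollama through Traefik, authenticating with the basic auth user created above
curl -u 'ollamauser:ollamapassword!' https://ollama.local.example.com/api/version
```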
This Docker compose stack does not use traefik and also exposes the port on the host for each service. If you don't want to expose the port, comment that section out. If you want to use the stack with traefik and macvlan, see the stack I used in the video
Before running this, you will need to create the network for Docker to use.
docker network create ai-stack
services:
ollama:
image: ollama/ollama:latest
container_name: ollama
restart: unless-stopped
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- OLLAMA_KEEP_ALIVE=24h
- ENABLE_IMAGE_GENERATION=True
- COMFYUI_BASE_URL=http://stable-diffusion-webui:7860
networks:
- ai-stack
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./ollama:/root/.ollama
ports:
- "11434:11434" # Add this line to expose the port
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
open-webui:
image: ghcr.io/open-webui/open-webui:latest
container_name: open-webui
restart: unless-stopped
networks:
- ai-stack
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- 'OLLAMA_BASE_URL=http://ollama:11434'
- ENABLE_RAG_WEB_SEARCH=True
- RAG_WEB_SEARCH_ENGINE=searxng
- RAG_WEB_SEARCH_RESULT_COUNT=3
- RAG_WEB_SEARCH_CONCURRENT_REQUESTS=10
- SEARXNG_QUERY_URL=http://searxng:8080/search?q=<query>
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./open-webui:/app/backend/data
depends_on:
- ollama
extra_hosts:
- host.docker.internal:host-gateway
ports:
- "8080:8080" # Add this line to expose the port
searxng:
image: searxng/searxng:latest
container_name: searxng
networks:
- ai-stack
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./searxng:/etc/searxng
depends_on:
- ollama
- open-webui
restart: unless-stopped
ports:
- "8081:8080" # Add this line to expose the port
stable-diffusion-download:
build: ./stable-diffusion-webui-docker/services/download/
image: comfy-download
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./stable-diffusion-webui-docker/data:/data
stable-diffusion-webui:
build: ./stable-diffusion-webui-docker/services/comfy/
image: comfy-ui
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- CLI_ARGS=
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./stable-diffusion-webui-docker/data:/data
- ./stable-diffusion-webui-docker/output:/output
stop_signal: SIGKILL
tty: true
deploy:
resources:
reservations:
devices:
- driver: nvidia
device_ids: ['0']
capabilities: [compute, utility]
restart: unless-stopped
networks:
- ai-stack
ports:
- "7860:7860" # Add this line to expose the port
mongo:
image: mongo
env_file:
- .env
networks:
- ai-stack
restart: unless-stopped
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./whisper/db_data:/data/db
- ./whisper/db_data/logs/:/var/log/mongodb/
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- MONGO_INITDB_ROOT_USERNAME=${DB_USER:-whishper}
- MONGO_INITDB_ROOT_PASSWORD=${DB_PASS:-whishper}
command: ['--logpath', '/var/log/mongodb/mongod.log']
ports:
- "27017:27017" # Add this line to expose the port
translate:
container_name: whisper-libretranslate
image: libretranslate/libretranslate:latest-cuda
env_file:
- .env
networks:
- ai-stack
restart: unless-stopped
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./whisper/libretranslate/data:/home/libretranslate/.local/share
- ./whisper/libretranslate/cache:/home/libretranslate/.local/cache
user: root
tty: true
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- LT_DISABLE_WEB_UI=True
- LT_LOAD_ONLY=${LT_LOAD_ONLY:-en,fr,es}
- LT_UPDATE_MODELS=True
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
ports:
- "5000:5000" # Add this line to expose the port
whisper:
container_name: whisper
pull_policy: always
image: pluja/whishper:latest-gpu
env_file:
- .env
networks:
- ai-stack
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./whisper/uploads:/app/uploads
- ./whisper/logs:/var/log/whishper
- ./whisper/models:/app/models
restart: unless-stopped
depends_on:
- mongo
- translate
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- PUBLIC_INTERNAL_API_HOST=${WHISHPER_HOST}
- PUBLIC_TRANSLATION_API_HOST=${WHISHPER_HOST}
- PUBLIC_API_HOST=${WHISHPER_HOST:-}
- PUBLIC_WHISHPER_PROFILE=gpu
- WHISPER_MODELS_DIR=/app/models
- UPLOAD_DIR=/app/uploads
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
ports:
- "8000:80" # Add this line to expose the port
networks:
ai-stack:
external: true
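With the ports exposed on the host, you can test each service directly from the Docker host (a sketch; adjust the ports if you changed them):

```bash
curl http://localhost:11434/api/version   # Ollama API
curl -I http://localhost:8080             # Open WebUI
curl -I http://localhost:8000             # Whishper
```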
Before starting the stack, in the root of the `ai-stack` folder, you'll want to clone the repo (or just copy the necessary files). (This will create the folder for you.)
git clone https://github.com/AbdBarho/stable-diffusion-webui-docker.git
After cloning, you'll want to make a change to the Dockerfile:
nano stable-diffusion-webui-docker/services/comfy/Dockerfile
I commented out the pin to a specific commit hash and just grabbed the latest ComfyUI.
FROM pytorch/pytorch:2.3.0-cuda12.1-cudnn8-runtime
ENV DEBIAN_FRONTEND=noninteractive PIP_PREFER_BINARY=1
RUN apt-get update && apt-get install -y git && apt-get clean
ENV ROOT=/stable-diffusion
RUN --mount=type=cache,target=/root/.cache/pip \
git clone https://github.com/comfyanonymous/ComfyUI.git ${ROOT} && \
cd ${ROOT} && \
git checkout master && \
# git reset --hard 276f8fce9f5a80b500947fb5745a4dde9e84622d && \
pip install -r requirements.txt
WORKDIR ${ROOT}
COPY . /docker/
RUN chmod u+x /docker/entrypoint.sh && cp /docker/extra_model_paths.yaml ${ROOT}
ENV NVIDIA_VISIBLE_DEVICES=all PYTHONPATH="${PYTHONPATH}:${PWD}" CLI_ARGS=""
EXPOSE 7860
ENTRYPOINT ["/docker/entrypoint.sh"]
CMD python -u main.py --listen --port 7860 ${CLI_ARGS}
If you cloned the repo and want to verify your changes, you can do so with:
git diff
You should see something like
diff --git a/services/comfy/Dockerfile b/services/comfy/Dockerfile
index 2de504d..a84c8ce 100644
--- a/services/comfy/Dockerfile
+++ b/services/comfy/Dockerfile
@@ -9,7 +9,7 @@ RUN --mount=type=cache,target=/root/.cache/pip \
git clone https://github.com/comfyanonymous/ComfyUI.git ${ROOT} && \
cd ${ROOT} && \
git checkout master && \
- git reset --hard 276f8fce9f5a80b500947fb5745a4dde9e84622d && \
+# git reset --hard 276f8fce9f5a80b500947fb5745a4dde9e84622d && \
pip install -r requirements.txt
WORKDIR ${ROOT}
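After changing the Dockerfile, rebuild just that service so the edit is picked up (run from the `ai-stack` folder):

```bash
docker compose build stable-diffusion-webui
docker compose up -d stable-diffusion-webui
```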
You'll want to grab any models you like from HuggingFace. I am using stabilityai/stable-diffusion-3-medium
You'll want to download all of the models and then transfer them to your server and put them in the appropriate folders
Models will need to be placed in the `Stable-diffusion` folder.
stable-diffusion-webui-docker/data/models/Stable-diffusion
Models are any files in the root of `stable-diffusion-3-medium` that have the extension `*.safetensors`.
For CLIPs, you'll need to create this folder (because it doesn't exist):
mkdir stable-diffusion-webui-docker/data/models/CLIPEncoder
In there you'll place your CLIPs from the `text_encoders` folder of `stable-diffusion-3-medium`.
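Here's a rough sketch of one way to download the files and drop them into place (it assumes you've accepted the model license on HuggingFace and logged in with `huggingface-cli`; the file names are examples from the `stable-diffusion-3-medium` repo):

```bash
# Install the HuggingFace CLI and download the checkpoint plus text encoders
pip install -U "huggingface_hub[cli]"
huggingface-cli login
huggingface-cli download stabilityai/stable-diffusion-3-medium sd3_medium.safetensors --local-dir ./sd3-medium
huggingface-cli download stabilityai/stable-diffusion-3-medium --include "text_encoders/*" --local-dir ./sd3-medium

# Copy them into the folders ComfyUI expects (see paths above)
cp ./sd3-medium/sd3_medium.safetensors stable-diffusion-webui-docker/data/models/Stable-diffusion/
cp ./sd3-medium/text_encoders/*.safetensors stable-diffusion-webui-docker/data/models/CLIPEncoder/
```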
You'll need to download the same workflows to the machine that accesses ComfyUI so you can import them into the browser.
Example workflows are also available on HuggingFace in the Stable Diffusion 3 Medium repo
If you're going to spend all of that time downloading these model files, you should also spend a few minutes verifying them. I typically do this once they are on the server running the AI stack.
shasum -a 256 ./sd3_medium.safetensors
This should output something like:
cc236278d28c8c3eccb8e21ee0a67ebed7dd6e9ce40aa9de914fa34e8282f191 ./sd3_medium.safetensors
You'll want to be sure the checksum matches the checksum from the source (HuggingFace, etc).
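If you're verifying several files, you can put the published checksums in a file and let `shasum` do the comparison for you (the hash below is just the one from the example above):

```bash
# One "<sha256>  <filename>" pair per line, copied from the model card
cat > sd3-checksums.txt <<'EOF'
cc236278d28c8c3eccb8e21ee0a67ebed7dd6e9ce40aa9de914fa34e8282f191  ./sd3_medium.safetensors
EOF

shasum -a 256 -c sd3-checksums.txt   # prints "OK" for each file that matches
```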
Please see folder structure above
Before running this, you will need to create the network for Docker to use.
This might already exist if you are using traefik. If so skip this step.
docker network create traefik
This will create the `macvlan` network. Adjust accordingly.
docker network create -d macvlan \
--subnet=192.168.20.0/24 \
--gateway=192.168.20.1 \
-o parent=eth1 \
iot_macvlan
---
services:
homeassistant:
container_name: homeassistant
networks:
iot_macvlan:
ipv4_address: 192.168.20.202 #optional, I am using mac vlan, if you don't want to, remove iot_macvlan and don't create the network above
traefik:
image: ghcr.io/home-assistant/home-assistant:stable
depends_on:
- faster-whisper-gpu
- wyoming-piper
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./home-assistant/config:/config
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
restart: unless-stopped
labels:
- "traefik.enable=true"
- "traefik.http.routers.homeassistant.rule=Host(`homeassistant.local.example.com`)"
- "traefik.http.routers.homeassistant.entrypoints=https"
- "traefik.http.routers.homeassistant.tls=true"
- "traefik.http.routers.homeassistant.tls.certresolver=cloudflare"
- "traefik.http.routers.homeassistant.middlewares=default-headers@file"
- "traefik.http.services.homeassistant.loadbalancer.server.port=8123"
faster-whisper-gpu:
image: lscr.io/linuxserver/faster-whisper:gpu
container_name: faster-whisper-gpu
networks:
- traefik
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
- WHISPER_MODEL=tiny-int8
- WHISPER_BEAM=1 #optional
- WHISPER_LANG=en #optional
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./faster-whisper/data:/config
restart: unless-stopped
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
wyoming-piper:
container_name: wyoming-piper
networks:
- traefik
image: rhasspy/wyoming-piper # no gpu
volumes:
- /etc/localtime:/etc/localtime:ro
- /etc/timezone:/etc/timezone:ro
- ./wyoming-piper/data:/data
environment:
- PUID=${PUID:-1000}
- PGID=${PGID:-1000}
restart: unless-stopped
command: --voice en_US-lessac-medium
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: 1
capabilities: [gpu]
networks:
traefik:
external: true
iot_macvlan:
external: true
I am using Basic Auth Middleware with traefik. Please see traefik section for details on how to set this up.
I am using Continue for code completion and integrated chat.
Example config.
If you aren't going to use auth, remove the `requestOptions` key. If you are going to use auth, please replace the `xxx` with the value from above.
{
"models": [
{
"title": "Ollama (Self-Hosted)",
"provider": "ollama",
"model": "AUTODETECT",
"completionOptions": {},
"apiBase": "https://ollama.local.example.com",
"requestOptions": {
"headers": {
"Authorization": "Basic xxx"
}
}
}
],
"tabAutocompleteModel": {
"title": "Starcoder 3b",
"provider": "ollama",
"model": "starcoder2:3b",
"apiBase": "https://ollama.local.example.com",
"requestOptions": {
"headers": {
"Authorization": "Basic xxx"
}
}
}
}
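To confirm that the value you pasted into `requestOptions` actually works, you can hit the Ollama API with the same header from any machine (a sketch using the placeholder hostname; `xxx` is the base64 value generated earlier):

```bash
curl -H "Authorization: Basic xxx" https://ollama.local.example.com/api/tags
```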
🛍️ Check out the new Merch Shop at https://l.technotim.live/shop
⚙️ See all the hardware I recommend at https://l.technotim.live/gear
🚀 Don't forget to check out the 🚀Launchpad repo with all of the quick start source files
🤝 Support me and help keep this site ad-free!