Docker image available: s2-tts with Q8_0 GGUF for 12GB VRAM #22

@orrinwitt

Description

Hey! I built a Docker image that packages s2.cpp with the Q8_0 GGUF model for easy deployment on GPUs with 12 GB of VRAM.

Image: ghcr.io/orrinwitt/s2-tts:latest
Repo: https://github.com/orrinwitt/s2-tts

Features:

  • Q8_0 model and tokenizer baked into the image (no external downloads)
  • CUDA support via NVIDIA Container Toolkit
  • HTTP server mode on port 3030 (/generate endpoint)
  • Voice cloning via multipart form data
  • S2-Pro [bracket] emotion tag syntax supported
  • Runs on RTX 3060 12GB with room to spare

Quick start:

services:
  s2-tts:
    image: ghcr.io/orrinwitt/s2-tts:latest
    ports:
      - "3030:3030"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['1']
              capabilities: [gpu]
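Once the container is up, a /generate call from Python might look like the sketch below. This is a minimal, hedged example: it assumes the endpoint accepts a JSON body with a "text" field, which is an assumption on my part — check the repo README for the actual request schema.

```python
import json
import urllib.request

# Assumed request schema: JSON with a "text" field.
# "[excited]" is a hypothetical S2-Pro bracket emotion tag; see the README
# for the actual tag names and text formatting rules.
payload = {"text": "[excited] Hello from s2-tts!"}

req = urllib.request.Request(
    "http://localhost:3030/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With the container running, sending the request would return audio bytes:
# with urllib.request.urlopen(req) as resp:
#     audio = resp.read()
```

Only stdlib is used here so the snippet works inside any client environment without extra dependencies.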

The server is API-only (no WebUI) and designed for programmatic use. See the README for full docs, including emotion tags, voice cloning, and text formatting rules.
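For the voice-cloning path, the request is multipart form data. The sketch below assembles a multipart body by hand with only the standard library; the field names ("text", "reference") are hypothetical placeholders — the actual names are documented in the repo README.

```python
import uuid

def build_multipart(fields, files):
    """Assemble a multipart/form-data body (stdlib only).

    `fields` maps names to text values; `files` maps names to
    (filename, content_type, raw_bytes) tuples.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f'--{boundary}\r\n'
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f'{value}\r\n').encode()
        )
    for name, (filename, ctype, data) in files.items():
        parts.append(
            (f'--{boundary}\r\n'
             f'Content-Disposition: form-data; name="{name}"; '
             f'filename="{filename}"\r\n'
             f'Content-Type: {ctype}\r\n\r\n').encode() + data + b"\r\n"
        )
    parts.append(f"--{boundary}--\r\n".encode())
    return boundary, b"".join(parts)

# Hypothetical field names: "text" for the prompt, "reference" for the
# voice sample to clone. Placeholder bytes stand in for a real WAV file.
boundary, body = build_multipart(
    {"text": "Clone this voice."},
    {"reference": ("voice.wav", "audio/wav", b"RIFF....WAVE")},
)
# POST `body` to http://localhost:3030/generate with the header:
#   Content-Type: multipart/form-data; boundary=<boundary>
```

Hand-rolling the body keeps the example dependency-free; in practice a library like requests handles the boundary and encoding for you.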

Thanks for the great work on s2.cpp!
