martintsan/X-Image

X-Image

Self-hosted AI image generation.

What is X-Image?

X-Image is a self-hosted image generation application — think WordPress, but for AI images. It runs on your own hardware with a single NVIDIA GPU and provides a web UI, REST API, and CLI for generating images.

Under the hood, X-Image uses the Z-Image-Turbo model (6B parameters) by Tongyi-MAI, quantized to GGUF format for efficient inference via stable-diffusion.cpp.

Features

  • Text-to-image — generate images from natural language prompts
  • Image-to-image — transform existing images with a text prompt
  • Inpainting — edit specific regions of an image using a mask
  • Bilingual prompts — supports both English and Chinese (powered by Qwen3 text encoder)
  • Web UI — clean Next.js interface with gallery, lightbox, and progress indicator
  • REST API — A1111-compatible endpoints for integration with existing tools
  • CLI — single-command generation via generate.sh
  • Docker — one-command deployment with GPU passthrough

System Requirements

| Resource | Minimum |
|---|---|
| GPU | NVIDIA with 6 GB VRAM (Q6_K) or 4 GB (Q4_K) |
| RAM | 12 GB |
| Disk | 15 GB free |
| Driver | NVIDIA 525+ (CUDA 12.x) |
| OS | Linux (tested on Ubuntu 22.04+) |

Quick Start

One command to install everything — builds from source, downloads models (~8.6 GB), and starts X-Image as systemd services:

curl -fsSL https://get.x-image.app | bash

The installer will prompt you to choose between bare-metal (systemd services) or Docker (Compose with GPU passthrough). You can also pass a flag to skip the prompt:

bash install.sh --bare-metal   # build from source, systemd services
bash install.sh --docker       # Docker Compose with GPU passthrough

Once complete, the web UI is at http://localhost:3100.
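
To confirm the stack came up, you can probe the backend's health endpoint (the `/api/v1/health` route is listed under REST API below; the web UI on :3100 talks to this backend on :8100):

```shell
# Ask the FastAPI backend whether it is healthy.
# Prints the HTTP status code: 200 means up, 000 means not reachable.
STATUS=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:8100/api/v1/health || true)
echo "backend health check: HTTP $STATUS"
```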

Manual Installation

If you prefer to set things up yourself:

1. Build the inference engine

cd stable-diffusion.cpp
mkdir build && cd build
cmake .. -DSD_CUDA=ON
make -j$(nproc)

2. Install Python dependencies

python -m venv .venv
.venv/bin/pip install -r requirements.txt

3. Download models

Download the model files (see Models below) into the models/ directory.

4. Start the backend

.venv/bin/python -m uvicorn server.app:app --host 0.0.0.0 --port 8100

5. Start the frontend

cd web
npm install
npm run dev

Docker (manual)

# Place model files in models/ directory, then:
docker compose up --build

Models

Download the following models and place them in the models/ directory:

| Component | File | Format |
|---|---|---|
| Diffusion (DiT) | `models/z_image_turbo-Q6_K.gguf` | GGUF 6-bit |
| VAE | `models/vae/split_files/vae/ae.safetensors` | safetensors |
| Text Encoder | `models/text_encoder/Qwen3-4B-Instruct-2507-Q4_K_M.gguf` | GGUF 4-bit |

The diffusion model is quantized from Tongyi-MAI/Z-Image-Turbo on Hugging Face.
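
Before starting the backend, it is worth verifying that all three files are where the server expects them (paths exactly as in the table above):

```shell
# Verify the three model files listed above exist under models/.
missing=0
for f in models/z_image_turbo-Q6_K.gguf \
         models/vae/split_files/vae/ae.safetensors \
         models/text_encoder/Qwen3-4B-Instruct-2507-Q4_K_M.gguf; do
  if [ -f "$f" ]; then
    echo "ok:      $f"
  else
    echo "missing: $f"
    missing=$((missing + 1))
  fi
done
echo "$missing model file(s) missing"
```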

Architecture

Browser (Next.js :3100) → FastAPI (:8100) → sd-server C++ (:7860) → DiT model
| Component | Description |
|---|---|
| `sd-server` | C++ inference engine (stable-diffusion.cpp); runs GGUF-quantized models |
| FastAPI | Python API layer; manages the sd-server lifecycle and proxies generation requests |
| Next.js | Web frontend with generation form, image gallery, and lightbox |

Configuration

All settings are configured via environment variables (see server/config.py):

| Variable | Default | Description |
|---|---|---|
| `SD_SERVER_PORT` | `7860` | Internal C++ server port |
| `FASTAPI_PORT` | `8100` | Public API port |
| `DEFAULT_WIDTH` | `1024` | Default image width |
| `DEFAULT_HEIGHT` | `1024` | Default image height |
| `DEFAULT_STEPS` | `8` | Default denoising steps |
| `DEFAULT_CFG_SCALE` | `1.0` | Default CFG scale |
| `DIFFUSION_MODEL` | `z_image_turbo-Q6_K.gguf` | Diffusion model filename |
| `VAE_MODEL` | `vae/split_files/vae/ae.safetensors` | VAE model path |
| `LLM_MODEL` | `text_encoder/Qwen3-4B-Instruct-2507-Q4_K_M.gguf` | Text encoder path |
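
As an illustration, defaults can be overridden by exporting variables before launching the backend. The values below are arbitrary examples; note that the manual-install uvicorn command passes its port explicitly, so keep `--port` in sync with `FASTAPI_PORT`:

```shell
# Override a few defaults before starting the backend.
# Variable names are from the table above (server/config.py); values are examples.
export FASTAPI_PORT=8200
export DEFAULT_WIDTH=768
export DEFAULT_HEIGHT=768
export DEFAULT_STEPS=4
echo "backend will listen on port $FASTAPI_PORT with ${DEFAULT_WIDTH}x${DEFAULT_HEIGHT} defaults"
```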

Usage

Web UI

Open http://localhost:3100 in your browser. The interface provides tabs for text-to-image, image-to-image, and inpainting modes.

CLI

# Text-to-image
./generate.sh -p "A cat on the moon"

# Image-to-image
./generate.sh -p "oil painting style" -i photo.png -s 0.6

# Custom dimensions
./generate.sh -p "A landscape" -o landscape.png -W 1536 -H 768

REST API

| Method | Endpoint | Content-Type | Description |
|---|---|---|---|
| POST | `/api/v1/txt2img` | `application/json` | Text-to-image |
| POST | `/api/v1/img2img` | `multipart/form-data` | Image-to-image |
| POST | `/api/v1/inpaint` | `multipart/form-data` | Inpainting |
| GET | `/api/v1/health` | n/a | Health check |
| GET | `/api/v1/samplers` | n/a | List samplers |
| GET | `/api/v1/schedulers` | n/a | List schedulers |
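
A minimal text-to-image call might look like the following. The JSON field names (`width`, `height`, `steps`, `cfg_scale`) are assumptions based on the defaults table and the A1111-style API, so check your server's schema if a field is rejected:

```shell
# Write the request payload, then POST it to the txt2img endpoint.
# Field names are assumed from the A1111-compatible API and the config defaults.
cat > /tmp/txt2img.json <<'EOF'
{
  "prompt": "A cat on the moon",
  "width": 1024,
  "height": 1024,
  "steps": 8,
  "cfg_scale": 1.0
}
EOF
curl -s -X POST http://localhost:8100/api/v1/txt2img \
     -H 'Content-Type: application/json' \
     -d @/tmp/txt2img.json \
  || echo "backend not reachable on :8100"
```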

Local Deployment (systemd)

The installer sets up systemd user services automatically. To manage them:

# Start / stop both services
systemctl --user start ximage.target
systemctl --user stop ximage.target

# Check status
systemctl --user status ximage-backend ximage-frontend

# Follow logs
journalctl --user -u ximage-backend -f

| Service | Port |
|---|---|
| Next.js frontend | 3100 |
| FastAPI backend | 8100 |
| sd-server (internal) | 7860 |

License

Apache 2.0 — see LICENSE.

Built on Z-Image-Turbo by Tongyi-MAI, licensed under Apache 2.0.

