Multi-modal AI Generation Gateway Platform
A unified gateway that aggregates multiple AI providers and models — text, image, and video generation — behind a single API with credit-based billing, rate limiting, and a full admin console.
Zephyr is a multi-tenant, multi-provider, multi-modal AIGC (AI-Generated Content) platform. It acts as a smart gateway between end-users and upstream AI providers (e.g. Volcengine, OpenAI, Runway), providing:
- Unified API access across text, image, and video generation models
- Capability-centric model catalog — models declare their abilities and parameter schemas; the UI renders dynamically
- Credit-based billing with per-model, per-capability pricing (per-call, per-token, per-second)
- API Key pool management — weighted round-robin dispatch across multiple provider keys with QPS / RPM / concurrency limits
- Server-rendered admin & client dashboards built with Jinja2 + HTMX
┌──────────────┐     ┌──────────────────────────────────────────────────┐
│   Browser    │────▶│                FastAPI (uvicorn)                 │
│  (HTMX +     │◀────│ ├── Auth (JWT + cookie)                          │
│  WebSocket)  │     │ ├── Client Portal (model market, playground)     │
└──────────────┘     │ ├── Admin Console (providers, models, users)     │
                     │ ├── Generation API (REST + WebSocket progress)   │
                     │ ├── Assets API (upload / download)               │
                     │ └── Middleware (CORS, CSRF, Rate-limit)          │
                     └──────────┬───────────┬───────────────────────────┘
                                │           │
                     ┌──────────▼──┐  ┌─────▼──────┐
                     │ PostgreSQL  │  │   Redis    │
                     │ (async via  │  │  (cache,   │
                     │  asyncpg)   │  │ rate-limit,│
                     └─────────────┘  │  broker)   │
                                      └─────┬──────┘
                                            │
                     ┌──────────────────────▼───────────────────────┐
                     │          Celery Worker (solo pool)           │
                     │ ├── Image generation tasks                   │
                     │ ├── Video generation tasks (async polling)   │
                     │ └── Text generation (streaming via WS)       │
                     └──────────────────────┬───────────────────────┘
                                            │
                          ┌─────────────────┼─────────────────┐
                          ▼                 ▼                 ▼
                    ┌────────────┐  ┌──────────────┐   ┌────────────┐
                    │ Volcengine │  │  Volcengine  │   │   MinIO    │
                    │   (LLM)    │  │ (Image/Video)│   │  (Object   │
                    └────────────┘  └──────────────┘   │  Storage)  │
                                                       └────────────┘
| Layer | Technology |
|---|---|
| Runtime | Python 3.12 |
| API Framework | FastAPI + Uvicorn |
| Async ORM | SQLAlchemy 2.0 (asyncpg driver) |
| Migrations | Alembic |
| Task Queue | Celery 5 + Redis broker |
| Task Monitor | Flower |
| Database | PostgreSQL 16 |
| Cache / Rate-Limit | Redis 7 |
| Object Storage | MinIO |
| Templating | Jinja2 + HTMX |
| Auth | JWT (PyJWT) + bcrypt |
| Encryption | Fernet (cryptography) — for API key storage |
| Logging | structlog |
| Linting | Ruff |
| Type Checking | mypy |
Zephyr/
├── src/
│   ├── main.py  # FastAPI app, routers, middleware, lifespan
│   ├── core/
│   │   ├── config.py  # Pydantic Settings (env-backed)
│   │   ├── database.py  # AsyncSession factory
│   │   ├── redis.py  # Redis connection pool
│   │   ├── celery_app.py  # Celery instance
│   │   ├── security.py  # JWT encode/decode, password hashing
│   │   ├── models.py  # SQLAlchemy base model
│   │   ├── all_models.py  # Imports all models for mapper resolution
│   │   ├── logging.py  # structlog configuration
│   │   └── middleware/
│   │       ├── auth.py  # Cookie/Bearer auth dependency
│   │       ├── csrf.py  # CSRF token generation & validation
│   │       └── rate_limiter.py  # Redis-backed sliding-window rate limiter
│   ├── modules/
│   │   ├── auth/  # Login, register, JWT issuance
│   │   ├── client/  # Server-rendered user portal (market, playground, history)
│   │   ├── admin/  # Admin console (providers, models, keys, users, billing)
│   │   ├── generation/  # Generation API, Celery worker, WebSocket progress
│   │   ├── billing/  # Credit ledger service, transactions
│   │   ├── assets/  # File upload / download (MinIO-backed)
│   │   └── users/  # User ORM models
│   ├── integrations/
│   │   └── providers/
│   │       ├── base.py  # Abstract provider interface
│   │       ├── factory.py  # Provider instantiation by name
│   │       ├── key_manager.py  # Weighted key selection + concurrency tracking
│   │       ├── models.py  # Provider & Model ORM
│   │       ├── volcengine_llm.py
│   │       ├── volcengine_image.py
│   │       └── volcengine_video.py
│   ├── services/
│   │   └── rate_limit_service.py
│   ├── utils/
│   │   ├── crypto.py  # Fernet encrypt/decrypt helpers
│   │   ├── pagination.py  # Offset-limit pagination helper
│   │   ├── pricing.py  # Capability-aware credit cost calculator
│   │   ├── storage.py  # MinIO client wrapper
│   │   └── validators.py  # Reusable Pydantic validators
│   └── static/
│       └── sw.js  # Service worker
├── alembic/  # Database migrations
├── tests/
│   ├── conftest.py  # Async fixtures (in-memory SQLite)
│   ├── test_generation.py
│   ├── test_billing.py
│   └── test_assets.py
├── scripts/  # Operational scripts (seed data, migrations)
├── docs/
│   └── PRD.md  # Full product requirements document (Chinese)
├── docker-compose.yml  # Full-stack local environment
├── Dockerfile
├── pyproject.toml  # Project metadata, tool config (ruff, mypy, pytest)
├── requirements.txt
└── .env.example  # Environment variable template
- Python 3.12+
- Docker & Docker Compose (for infrastructure services)
- uv or pip (for Python dependency management)
git clone <repository-url> && cd Zephyr
cp .env.example .env
# Edit .env: fill in JWT_SECRET, ENCRYPTION_KEY, and storage credentials

The compose file references external named volumes. Create them once:
docker volume create aigc_pgdata
docker volume create aigc_redisdata
docker volume create aigc_miniodata

docker compose up -d postgres redis minio

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

alembic upgrade head

python scripts/create_admin.py

python scripts/seed_volcengine.py

# Terminal 1: API server
uvicorn src.main:app --reload --port 8000
# Terminal 2: Celery worker
celery -A src.core.celery_app worker --pool=solo --loglevel=info
# Terminal 3: Flower (optional, task monitoring)
celery -A src.core.celery_app flower --port=5555

Or bring up everything with Docker Compose:
docker compose up -d

| Service | URL |
|---|---|
| Client Portal | http://localhost:8000/market |
| Admin Console | http://localhost:8000/admin |
| API Docs (Swagger) | http://localhost:8000/docs |
| Flower (Task Monitor) | http://localhost:5555 |
| MinIO Console | http://localhost:9001 |
Models are not simple "image" or "text" types. Each model declares a list of capabilities — a single model can support both text_to_video and image_to_video. Each capability carries:
- param_schema: declarative field definitions the UI renders as dynamic forms
- ui_hints: layout strategy (simple, studio, chat), result display type, example prompts
- pricing: per-call, per-token, or per-second billing with conditional modifiers
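
To make the capability model above more concrete, here is a minimal sketch of one model declaring two capabilities. The class names (`Pricing`, `Capability`, `ModelEntry`), the field layout inside `param_schema` and `ui_hints`, the model name, and the prices are all assumptions made for illustration; they are not taken from Zephyr's actual ORM or schema, and conditional pricing modifiers are left out.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Any


@dataclass
class Pricing:
    # Hypothetical shape: one billing unit plus a price per unit, in credits.
    unit: str  # "call", "token", or "second"
    credits_per_unit: Decimal


@dataclass
class Capability:
    name: str
    param_schema: dict[str, dict[str, Any]]  # field name -> field definition
    ui_hints: dict[str, Any]
    pricing: Pricing

    def estimate_cost(self, units: int) -> Decimal:
        """Credit cost for `units` billing units (calls, tokens, or seconds)."""
        return self.pricing.credits_per_unit * units


@dataclass
class ModelEntry:
    name: str
    capabilities: list[Capability] = field(default_factory=list)

    def capability(self, name: str) -> Capability:
        """Look up a declared capability by name."""
        for cap in self.capabilities:
            if cap.name == name:
                return cap
        raise KeyError(f"{self.name} does not declare capability {name!r}")


# One model, two capabilities, each with its own schema and price (all hypothetical).
video_model = ModelEntry(
    name="example-video-model",
    capabilities=[
        Capability(
            name="text_to_video",
            param_schema={
                "prompt": {"type": "string", "required": True},
                "duration": {"type": "integer", "min": 1, "max": 10, "default": 5},
            },
            ui_hints={"layout": "studio", "result": "video"},
            pricing=Pricing(unit="second", credits_per_unit=Decimal("2")),
        ),
        Capability(
            name="image_to_video",
            param_schema={
                "image": {"type": "file", "required": True},
                "duration": {"type": "integer", "min": 1, "max": 10, "default": 5},
            },
            ui_hints={"layout": "studio", "result": "video"},
            pricing=Pricing(unit="second", credits_per_unit=Decimal("3")),
        ),
    ],
)
```

Under these assumed prices, a 5-second `text_to_video` request would be estimated with `video_model.capability("text_to_video").estimate_cost(5)`. Keeping the schema and pricing on the capability, rather than on the model, is what lets one model expose differently priced modes.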
- Pre-deduction — credits are estimated and held before generation starts
- Settlement — actual cost is calculated on completion; difference is refunded or charged
- Full refund on failure
- Transaction ledger — every movement (recharge, consume, refund, adjustment) is recorded
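
The billing lifecycle above can be sketched with a small in-memory account. This is an illustration only: `CreditAccount` and its method names are hypothetical, the real service persists transactions in PostgreSQL, and whether a balance may go negative when the actual cost exceeds the estimate is an assumption made here for simplicity.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class CreditAccount:
    balance: Decimal
    held: Decimal = Decimal("0")
    # Each ledger entry is (transaction type, signed amount applied to balance).
    ledger: list[tuple[str, Decimal]] = field(default_factory=list)

    def pre_deduct(self, estimate: Decimal) -> None:
        """Hold the estimated cost before generation starts."""
        if estimate > self.balance:
            raise ValueError("insufficient credits")
        self.balance -= estimate
        self.held += estimate
        self.ledger.append(("consume", -estimate))

    def settle(self, estimate: Decimal, actual: Decimal) -> None:
        """Release the hold and reconcile it against the actual cost."""
        self.held -= estimate
        diff = estimate - actual
        if diff > 0:
            # Over-estimated: give the difference back.
            self.balance += diff
            self.ledger.append(("refund", diff))
        elif diff < 0:
            # Under-estimated: charge the shortfall (assumed to be allowed).
            self.balance += diff
            self.ledger.append(("consume", diff))

    def fail(self, estimate: Decimal) -> None:
        """Generation failed: refund the full held amount."""
        self.held -= estimate
        self.balance += estimate
        self.ledger.append(("refund", estimate))


account = CreditAccount(balance=Decimal("100"))
account.pre_deduct(Decimal("30"))               # hold 30 credits up front
account.settle(Decimal("30"), Decimal("22"))    # job cost 22, so 8 is refunded
```

Because every balance change appends a signed ledger entry, the sum of the ledger always equals the net change in balance, which makes the account auditable after the fact.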
Provider API keys are managed in a pool with:
- Weighted round-robin selection
- Per-key QPS / RPM / concurrency limits enforced via Redis
- Health tracking — error counts, auto-disable on repeated failures
- Encrypted storage — keys are Fernet-encrypted at rest
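
The selection and health-tracking behaviour listed above could look roughly like the following. This sketch uses the "smooth" weighted round-robin variant, which is one common way to implement weighted dispatch; whether `key_manager.py` uses this exact variant is not stated, so treat it as an assumption. `PoolKey`, `KeyPool`, and the `max_errors` threshold are hypothetical names, and the Redis-enforced QPS / RPM / concurrency limits and Fernet encryption are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class PoolKey:
    key_id: str
    weight: int
    enabled: bool = True
    error_count: int = 0
    current: int = 0  # running score used by smooth weighted round-robin


class KeyPool:
    def __init__(self, keys: list[PoolKey], max_errors: int = 3) -> None:
        self.keys = keys
        self.max_errors = max_errors  # assumed auto-disable threshold

    def pick(self) -> PoolKey:
        """Select the next key, spreading picks in proportion to weight."""
        candidates = [k for k in self.keys if k.enabled]
        if not candidates:
            raise RuntimeError("no enabled keys in pool")
        total = sum(k.weight for k in candidates)
        for k in candidates:
            k.current += k.weight
        # The highest running score wins; ties go to the earliest key in the list.
        chosen = max(candidates, key=lambda k: k.current)
        chosen.current -= total
        return chosen

    def report_failure(self, key: PoolKey) -> None:
        """Count an upstream error and disable the key after repeated failures."""
        key.error_count += 1
        if key.error_count >= self.max_errors:
            key.enabled = False

    def report_success(self, key: PoolKey) -> None:
        """A successful call resets the consecutive error count."""
        key.error_count = 0


pool = KeyPool([PoolKey("a", weight=5), PoolKey("b", weight=1), PoolKey("c", weight=1)])
order = [pool.pick().key_id for _ in range(7)]
# order is ["a", "a", "b", "a", "c", "a", "a"]
```

Compared with naive weighted round-robin, the smooth variant interleaves the lighter keys instead of sending five consecutive requests to the heaviest one, which tends to be kinder to per-key rate limits.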
ruff check src/
ruff format src/

mypy src/

pytest

Tests use an async in-memory SQLite backend via aiosqlite, configured through the conftest.py fixtures.
See .env.example for the full list. Key variables:
| Variable | Description |
|---|---|
| DATABASE_URL | PostgreSQL async connection string |
| REDIS_URL | Redis URL for caching and rate limiting |
| JWT_SECRET | Secret for signing JWT tokens |
| ENCRYPTION_KEY | 32-byte key for Fernet API key encryption |
| CELERY_BROKER_URL | Redis URL for Celery task broker |
| MINIO_ENDPOINT | MinIO server address |
| MINIO_ACCESS_KEY / MINIO_SECRET_KEY | MinIO credentials |
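
When filling in `.env`, the two secrets can be generated with the standard library alone. The sketch below assumes that `ENCRYPTION_KEY` is expected in the standard Fernet format (32 random bytes, URL-safe base64-encoded) and that `JWT_SECRET` can be any sufficiently long random string; check `.env.example` and `src/utils/crypto.py` before relying on either assumption. The helper names are hypothetical.

```python
import base64
import os
import secrets


def generate_fernet_key() -> str:
    """Return 32 random bytes, URL-safe base64-encoded (the standard Fernet key format)."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


def generate_jwt_secret(num_bytes: int = 32) -> str:
    """Return a random URL-safe string built from `num_bytes` bytes of entropy."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed lines into .env; never commit them.
    print(f"JWT_SECRET={generate_jwt_secret()}")
    print(f"ENCRYPTION_KEY={generate_fernet_key()}")
```

Generate the encryption key once per environment and keep it stable: rotating it without re-encrypting stored provider keys would leave them undecryptable.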