Algovate/Zephyr

⚡ Zephyr

Multi-modal AI Generation Gateway Platform


A unified gateway that aggregates multiple AI providers and models — text, image, and video generation — behind a single API with credit-based billing, rate limiting, and a full admin console.


Overview

Zephyr is a multi-tenant, multi-provider, multi-modal AIGC (AI-Generated Content) platform. It acts as a smart gateway between end-users and upstream AI providers (e.g. Volcengine, OpenAI, Runway), providing:

  • Unified API access across text, image, and video generation models
  • Capability-centric model catalog — models declare their abilities and parameter schemas; the UI renders dynamically
  • Credit-based billing with per-model, per-capability pricing (per-call, per-token, per-second)
  • API Key pool management — weighted round-robin dispatch across multiple provider keys with QPS / RPM / concurrency limits
  • Server-rendered admin & client dashboards built with Jinja2 + HTMX

Architecture

┌──────────────┐     ┌──────────────────────────────────────────────────┐
│   Browser    │────▶│  FastAPI (uvicorn)                               │
│  (HTMX +     │◀────│  ├── Auth (JWT + cookie)                        │
│   WebSocket) │     │  ├── Client Portal (model market, playground)   │
└──────────────┘     │  ├── Admin Console (providers, models, users)   │
                     │  ├── Generation API (REST + WebSocket progress) │
                     │  ├── Assets API (upload / download)             │
                     │  └── Middleware (CORS, CSRF, Rate-limit)        │
                     └──────────┬───────────┬──────────────────────────┘
                                │           │
                     ┌──────────▼──┐  ┌─────▼───────┐
                     │ PostgreSQL  │  │   Redis     │
                     │ (async via  │  │ (cache,     │
                     │  asyncpg)   │  │  rate-limit,│
                     └─────────────┘  │  broker)    │
                                      └──────┬──────┘
                                             │
                     ┌───────────────────────▼──────────────────────┐
                     │         Celery Worker (solo pool)            │
                     │  ├── Image generation tasks                  │
                     │  ├── Video generation tasks (async polling)  │
                     │  └── Text generation (streaming via WS)      │
                     └──────────┬──────────────────────────────────┘
                                │
              ┌─────────────────┼─────────────────┐
              ▼                 ▼                  ▼
       ┌────────────┐   ┌──────────────┐   ┌────────────┐
       │ Volcengine │   │ Volcengine   │   │ MinIO      │
       │   (LLM)    │   │ (Image/Video)│   │ (Object    │
       └────────────┘   └──────────────┘   │  Storage)  │
                                           └────────────┘

Tech Stack

Layer               Technology
Runtime             Python 3.12
API Framework       FastAPI + Uvicorn
Async ORM           SQLAlchemy 2.0 (asyncpg driver)
Migrations          Alembic
Task Queue          Celery 5 + Redis broker
Task Monitor        Flower
Database            PostgreSQL 16
Cache / Rate-Limit  Redis 7
Object Storage      MinIO
Templating          Jinja2 + HTMX
Auth                JWT (PyJWT) + bcrypt
Encryption          Fernet (cryptography), for API key storage
Logging             structlog
Linting             Ruff
Type Checking       mypy

Project Structure

Zephyr/
├── src/
│   ├── main.py                  # FastAPI app, routers, middleware, lifespan
│   ├── core/
│   │   ├── config.py            # Pydantic Settings (env-backed)
│   │   ├── database.py          # AsyncSession factory
│   │   ├── redis.py             # Redis connection pool
│   │   ├── celery_app.py        # Celery instance
│   │   ├── security.py          # JWT encode/decode, password hashing
│   │   ├── models.py            # SQLAlchemy base model
│   │   ├── all_models.py        # Imports all models for mapper resolution
│   │   ├── logging.py           # structlog configuration
│   │   └── middleware/
│   │       ├── auth.py          # Cookie/Bearer auth dependency
│   │       ├── csrf.py          # CSRF token generation & validation
│   │       └── rate_limiter.py  # Redis-backed sliding-window rate limiter
│   ├── modules/
│   │   ├── auth/                # Login, register, JWT issuance
│   │   ├── client/              # Server-rendered user portal (market, playground, history)
│   │   ├── admin/               # Admin console (providers, models, keys, users, billing)
│   │   ├── generation/          # Generation API, Celery worker, WebSocket progress
│   │   ├── billing/             # Credit ledger service, transactions
│   │   ├── assets/              # File upload / download (MinIO-backed)
│   │   └── users/               # User ORM models
│   ├── integrations/
│   │   └── providers/
│   │       ├── base.py          # Abstract provider interface
│   │       ├── factory.py       # Provider instantiation by name
│   │       ├── key_manager.py   # Weighted key selection + concurrency tracking
│   │       ├── models.py        # Provider & Model ORM
│   │       ├── volcengine_llm.py
│   │       ├── volcengine_image.py
│   │       └── volcengine_video.py
│   ├── services/
│   │   └── rate_limit_service.py
│   ├── utils/
│   │   ├── crypto.py            # Fernet encrypt/decrypt helpers
│   │   ├── pagination.py        # Offset-limit pagination helper
│   │   ├── pricing.py           # Capability-aware credit cost calculator
│   │   ├── storage.py           # MinIO client wrapper
│   │   └── validators.py        # Reusable Pydantic validators
│   └── static/
│       └── sw.js                # Service worker
├── alembic/                     # Database migrations
├── tests/
│   ├── conftest.py              # Async fixtures (in-memory SQLite)
│   ├── test_generation.py
│   ├── test_billing.py
│   └── test_assets.py
├── scripts/                     # Operational scripts (seed data, migrations)
├── docs/
│   └── PRD.md                   # Full product requirements document (Chinese)
├── docker-compose.yml           # Full-stack local environment
├── Dockerfile
├── pyproject.toml               # Project metadata, tool config (ruff, mypy, pytest)
├── requirements.txt
└── .env.example                 # Environment variable template

Quick Start

Prerequisites

  • Python 3.12+
  • Docker & Docker Compose (for infrastructure services)
  • uv or pip (for Python dependency management)

1. Clone & Configure

git clone <repository-url> && cd Zephyr
cp .env.example .env
# Edit .env — fill in JWT_SECRET, ENCRYPTION_KEY, and storage credentials

2. Create Docker Volumes

The compose file references external named volumes. Create them once:

docker volume create aigc_pgdata
docker volume create aigc_redisdata
docker volume create aigc_miniodata

3. Start Infrastructure

docker compose up -d postgres redis minio

4. Install Dependencies

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

5. Run Migrations

alembic upgrade head

6. Create Admin User

python scripts/create_admin.py

7. Seed Provider Data (Optional)

python scripts/seed_volcengine.py

8. Start the Application

# Terminal 1 — API server
uvicorn src.main:app --reload --port 8000

# Terminal 2 — Celery worker
celery -A src.core.celery_app worker --pool=solo --loglevel=info

# Terminal 3 — Flower (optional, task monitoring)
celery -A src.core.celery_app flower --port=5555

Or bring up everything with Docker Compose:

docker compose up -d

9. Open in Browser

Service                URL
Client Portal          http://localhost:8000/market
Admin Console          http://localhost:8000/admin
API Docs (Swagger)     http://localhost:8000/docs
Flower (Task Monitor)  http://localhost:5555
MinIO Console          http://localhost:9001

Key Concepts

Capability-Centric Models

Models are not simple "image" or "text" types. Each model declares a list of capabilities — a single model can support both text_to_video and image_to_video. Each capability carries:

  • param_schema — declarative field definitions the UI renders as dynamic forms
  • ui_hints — layout strategy (simple, studio, chat), result display type, example prompts
  • pricing — per-call, per-token, or per-second billing with conditional modifiers
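
To make the capability idea concrete, here is a minimal sketch of how such a catalog entry and a cost estimate could be modeled. The class names, field layout, model ID, and `estimate_cost` helper are assumptions for illustration, not Zephyr's actual ORM or the contents of `utils/pricing.py`, and conditional pricing modifiers are left out.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Any


@dataclass(frozen=True)
class Pricing:
    """Hypothetical pricing rule; unit is 'call', 'token', or 'second'."""
    unit: str
    credits_per_unit: Decimal


@dataclass(frozen=True)
class Capability:
    """One ability of a model, e.g. text_to_video (field names are assumed)."""
    name: str
    param_schema: dict[str, Any]
    ui_hints: dict[str, Any]
    pricing: Pricing


@dataclass(frozen=True)
class ModelEntry:
    """A catalog entry that declares any number of capabilities."""
    model_id: str
    capabilities: tuple[Capability, ...] = ()

    def capability(self, name: str) -> Capability:
        for cap in self.capabilities:
            if cap.name == name:
                return cap
        raise KeyError(f"{self.model_id} does not support {name}")


def estimate_cost(cap: Capability, *, tokens: int = 0, seconds: float = 0.0) -> Decimal:
    """Estimate credits for one request under the capability's pricing unit."""
    rule = cap.pricing
    if rule.unit == "call":
        return rule.credits_per_unit
    if rule.unit == "token":
        return rule.credits_per_unit * tokens
    if rule.unit == "second":
        return rule.credits_per_unit * Decimal(str(seconds))
    raise ValueError(f"unknown pricing unit: {rule.unit}")


# One model, two capabilities, each with its own schema and price.
prompt_field = {"type": "string", "required": True}
video_model = ModelEntry(
    model_id="example-video-model",  # placeholder ID, not a real model
    capabilities=(
        Capability("text_to_video", {"prompt": prompt_field},
                   {"layout": "studio"}, Pricing("second", Decimal("2"))),
        Capability("image_to_video", {"image_url": prompt_field},
                   {"layout": "studio"}, Pricing("second", Decimal("3"))),
    ),
)
cost = estimate_cost(video_model.capability("text_to_video"), seconds=5)
```

Keeping pricing on the capability rather than on the model is what lets the same model charge differently for text_to_video and image_to_video.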

Credit Billing

  1. Pre-deduction — credits are estimated and held before generation starts
  2. Settlement — actual cost is calculated on completion; difference is refunded or charged
  3. Full refund on failure
  4. Transaction ledger — every movement (recharge, consume, refund, adjustment) is recorded
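
The lifecycle above can be sketched with an in-memory account. This is an illustrative model only: `CreditAccount` and its method names are hypothetical, and a real ledger would run inside database transactions rather than on a Python object.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class CreditAccount:
    """Hypothetical in-memory stand-in for a user's credit balance and ledger."""
    balance: Decimal
    ledger: list[tuple[str, Decimal]] = field(default_factory=list)

    def _record(self, kind: str, amount: Decimal) -> None:
        # Every balance movement is appended to the ledger.
        self.balance += amount
        self.ledger.append((kind, amount))

    def pre_deduct(self, estimate: Decimal) -> None:
        """Hold the estimated cost before generation starts."""
        if estimate > self.balance:
            raise ValueError("insufficient credits")
        self._record("consume", -estimate)

    def settle(self, estimate: Decimal, actual: Decimal) -> None:
        """Refund the surplus, or charge the shortfall, once the cost is known."""
        diff = estimate - actual
        if diff > 0:
            self._record("refund", diff)
        elif diff < 0:
            self._record("consume", diff)

    def refund_failed(self, estimate: Decimal) -> None:
        """Return the full hold when generation fails."""
        self._record("refund", estimate)


account = CreditAccount(balance=Decimal("100"))

# Successful job: 30 credits held, 24 actually used, 6 refunded.
account.pre_deduct(Decimal("30"))
account.settle(Decimal("30"), Decimal("24"))

# Failed job: 10 credits held, then fully refunded.
account.pre_deduct(Decimal("10"))
account.refund_failed(Decimal("10"))
```

After both jobs the balance reflects only the 24 credits actually consumed, and the ledger holds four entries that explain how it got there.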

API Key Pool

Provider API keys are managed in a pool with:

  • Weighted round-robin selection
  • Per-key QPS / RPM / concurrency limits enforced via Redis
  • Health tracking — error counts, auto-disable on repeated failures
  • Encrypted storage — keys are Fernet-encrypted at rest
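
As an illustration of weighted selection with a concurrency cap, the sketch below uses the smooth weighted round-robin scheme. It is an assumption about one reasonable approach, not a copy of `key_manager.py`: the class and field names are hypothetical, and state lives in process memory here, whereas the README says limits are enforced via Redis.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PoolKey:
    """Hypothetical pool entry; a real key would also carry QPS/RPM limits."""
    key_id: str
    weight: int
    max_concurrency: int
    in_flight: int = 0
    current: int = 0  # running score used by smooth weighted round-robin
    enabled: bool = True


class KeyPool:
    def __init__(self, keys: list[PoolKey]) -> None:
        self.keys = list(keys)

    def acquire(self) -> Optional[PoolKey]:
        """Pick the next key by weight, skipping disabled or saturated keys."""
        eligible = [
            k for k in self.keys
            if k.enabled and k.in_flight < k.max_concurrency
        ]
        if not eligible:
            return None
        total = sum(k.weight for k in eligible)
        for k in eligible:
            k.current += k.weight
        chosen = max(eligible, key=lambda k: k.current)
        chosen.current -= total
        chosen.in_flight += 1
        return chosen

    def release(self, key: PoolKey) -> None:
        """Mark one request on this key as finished."""
        key.in_flight -= 1


pool = KeyPool([
    PoolKey("a", weight=3, max_concurrency=2),
    PoolKey("b", weight=1, max_concurrency=2),
])
picks = []
for _ in range(8):
    key = pool.acquire()
    picks.append(key.key_id)
    pool.release(key)
```

Over eight sequential requests, key "a" is chosen three times as often as key "b", and the picks are interleaved rather than sent in bursts.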

Development

Linting & Formatting

ruff check src/
ruff format src/

Type Checking

mypy src/

Testing

pytest

Tests use an async in-memory SQLite backend via aiosqlite, configured through the conftest.py fixtures.

Environment Variables

See .env.example for the full list. Key variables:

Variable                             Description
DATABASE_URL                         PostgreSQL async connection string
REDIS_URL                            Redis URL for caching and rate limiting
JWT_SECRET                           Secret for signing JWT tokens
ENCRYPTION_KEY                       URL-safe base64-encoded 32-byte Fernet key used to encrypt stored API keys
CELERY_BROKER_URL                    Redis URL for the Celery task broker
MINIO_ENDPOINT                       MinIO server address
MINIO_ACCESS_KEY / MINIO_SECRET_KEY  MinIO credentials
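
Values for the two secrets can be generated with the standard library alone. The helper names below are hypothetical. The sketch assumes `ENCRYPTION_KEY` is passed to Fernet unchanged, which expects 32 random bytes in URL-safe base64, and that `JWT_SECRET` can be any sufficiently long random string.

```python
import base64
import os
import secrets


def generate_fernet_key() -> str:
    """Return 32 random bytes, URL-safe base64-encoded (44 characters)."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


def generate_jwt_secret(num_bytes: int = 32) -> str:
    """Return a random URL-safe string suitable as a signing secret."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste these two lines into .env.
    print(f"ENCRYPTION_KEY={generate_fernet_key()}")
    print(f"JWT_SECRET={generate_jwt_secret()}")
```

Generate the values once per environment and keep them out of version control. If `ENCRYPTION_KEY` is replaced later, API keys encrypted under the old value can no longer be decrypted.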
