AI-powered creative content studio — turn ideas into polished slides and concept matrices.
Lucid is a containerized web application with two first-class creation modes: a 6-stage carousel pipeline that converts rough drafts into pixel-perfect social media slides, and a Concept Matrix Generator that maps any theme or cross-axis relationship into a streamed n×n visual grid. Built with a FastAPI backend and a React/TypeScript frontend, it uses Google Gemini for creative heavy lifting and a deterministic rendering engine for reliable typography.
Why a pipeline instead of a single prompt? Because Generative AI struggles with rendering text inside images. Asking one model to generate a complete, correctly-spelled, well-laid-out slide in a single shot reliably produces hallucinations and warped artifacts.
Lucid solves this with a Progressive Pipeline Architecture — AI as a chain of specialized workers, not a magic black box:
- Separation of Concerns: The AI handles structural writing and generates text-free background imagery. A deterministic engine (Pillow/PIL) handles all typography.
- Grounding & Context: Each stage's output grounds the next. Search-grounded research feeds a structured draft → Draft is split into slide text → Text informs global visual styles → Styles dictate slide-specific image prompts → Prompts generate clean backgrounds → Text is accurately layered on top.
- Human-in-the-Loop: Six discrete stages let you audit, edit, and course-correct at any step without starting over.
Lucid uses a strict, bi-directional 6-stage workflow. You can regenerate individual slides, edit any output, or step backward without losing downstream configuration.
Chat with an AI research assistant grounded in real-time Google Search to brainstorm ideas, gather facts, and explore angles. When ready, click Create Draft to synthesise the conversation into a structured draft — or skip if you already have content.
Paste or refine your draft text. The AI acts as a content strategist, splitting it into a logically flowing sequence of slides with distinct Titles and Bodies.
Customise language, slide count, and toggle title generation.
The AI analyses your slide text and generates distinct style proposals (e.g. Minimalist geometric, warm watercolour washes, dark moody photography). Picking one here guarantees that all images generated later share a consistent visual vocabulary.
Instead of generating images directly, the AI first writes highly specific, text-free image prompts for each individual slide, prepending the global style chosen in Stage 3.
Edit or regenerate individual prompts before spending API calls on images.
Lucid sends all image prompts in parallel to Gemini and renders clean, consistent 4:5 background images (1080×1350 px).
The deterministic rendering engine takes over. Drag, drop, and resize text boxes on the canvas. Adjust fonts, weights, colours, drop shadows, and strokes. The system binary-searches for the mathematically perfect font size to fit your text. Hit Export ZIP to download your ready-to-post carousel.
The Concept Matrix Generator creates an n×n grid of AI-generated cells. Each cell represents the intersection of two descriptive axes and carries its own title, body text, and optional background image. Cells stream to the UI as they complete — you see the grid fill in live.
Matrices are fully independent of the carousel pipeline. They have their own home screen, project list, settings, and views.
When creating a matrix you choose one of two modes:
| Mode | What you provide | What the AI does |
|---|---|---|
| Theme | A theme string (e.g. "The philosophy of time and consciousness") | Picks n distinct diagonal concepts from the theme; invents a unique pair of descriptive axes for each concept; then populates all n×n cells at the intersection of those axes. Grid is square (n×n). |
| Description | A cross-axis relationship (e.g. "feels like a certain generation but is actually from a different one") | A single LLM call derives both axis labels and n shared labels for rows and columns, producing a rectangular grid (rows×cols). |
- Theme mode: choose n ∈ {2, 3, 4, 5, 6} → produces an n×n grid (4–36 cells).
- Description mode: choose rows and columns independently from {2, 3, 4, 5, 6}.
Each matrix is generated with one of four style modes that shape the tone of every cell:
| Style | Character |
|---|---|
| Neutral | Balanced, informative |
| Fun | Playful, informal |
| Absurd | Unexpected, surreal juxtapositions |
| Academic | Analytical, precise |
Clicking Generate Matrix starts an async generation job on the backend. Cells are generated in parallel (bounded by max_concurrency) and streamed to the UI via SSE. You can watch the grid populate cell by cell without waiting for the full job to finish.
A generation job has four statuses: Pending → Generating → Complete / Failed.
Images are optional and can be added at any time:
- At creation time: check Generate images for each cell in the New Matrix modal.
- After generation: click Generate Images from the matrix view to bulk-generate backgrounds for all cells.
- Per cell: click Regen on any individual cell to regenerate its image without touching others.
Images are produced by Gemini from an AI-written prompt that combines the cell's content with the matrix's global style.
If some cells fail during generation (or produce low-quality content), click Revalidate from the matrix view. This runs a validation pass that identifies failed or weak cells and regenerates only those. You can optionally supply a user comment (e.g. "cells should be more specific") which is injected into the validator prompt as extra instructions.
Progress streams via the same SSE endpoint as initial generation.
Once a matrix is generated, three views are available:
| View | Description |
|---|---|
| Grid | Default interactive view. Shows all cells in the n×n grid with axis labels. Click any cell to expand it. |
| Poster | Full-page visual layout designed for screenshots or sharing — renders all cells in a clean grid with large axis headers. |
| Reveal | Presentation mode: cycles through cells one at a time for storytelling or slide-by-slide review. |
Click Settings in the Matrix workspace to configure:
| Setting | Description |
|---|---|
| Text Model | Gemini model ID used for cell text generation |
| Image Model | Gemini model ID used for cell image generation |
| Diagonal Temperature | LLM temperature for seed concept generation (higher = more surprising) |
| Axes Temperature | Temperature for descriptor/axis generation |
| Cell Temperature | Temperature for off-diagonal cell content |
| Validation Temperature | Temperature for the revalidation pass |
| Max Concurrency | Number of parallel LLM calls during generation (1–20) |
| Max Retries | Automatic retry attempts for failed cells (0–5) |
Settings are persisted in matrix_settings.json and can be reset to factory defaults.
- Binary Search Text Fitting: The
RenderingServiceuses an O(log n) binary search algorithm to find the mathematically perfect font size for a given bounding box — text never overflows or runs short. - Fuzzy Font Matching: The
FontManagerindexes all TTF/OTF files on startup. If the UI requests "Inter 600" but only "Inter 700" is available, it gracefully uses the nearest weight. - Debounced Sync & Render Loop:
useDebouncedRendereliminates React state race conditions. Rapid keystrokes trigger a single synchronous database save followed by a deterministic PIL render, using monotonic request IDs to drop stale in-flight responses. - SSE Streaming: Both slide text regeneration and matrix cell generation use Server-Sent Events. The
useStreamingTextanduseMatrixStreamhooks share a commonparseSSELineutility. - Hot-Swappable, Validated Prompts: LLM instructions live in version-controlled
.promptfiles editable via the UI. The backend runs AST-style validation to ensure required{variables}aren't accidentally deleted. - Concurrency Control:
bounded_gather()— a thinasyncio.Semaphorewrapper — limits concurrent LLM calls for both the carousel pipeline and matrix generator, preventing 429 errors. - Per-Template Prompt Overrides: Root
.promptfiles are shared defaults. Files inprompts/carousel/orprompts/painting/override only the templates that need different behaviour.
| Key | Description |
|---|---|
global_defaults |
Default language, slide count, title inclusion |
image |
Output resolution (default: 1080×1350, aspect ratio 4:5) |
style |
Base typography: font family, colours, stroke properties |
stage_instructions |
Pre-seeded instructions for specific pipeline stages |
Any project configuration can be saved as a reusable Template. Future projects launched from a template inherit its slide count, configuration defaults, and prompt logic. Templates are managed via the Templates button on the carousel home screen.
Matrix generation parameters (models, temperatures, concurrency, retries) are configured through the Settings button in the Matrix workspace and persisted in matrix_settings.json.
The fastest way to run Lucid with zero local setup:
- Set your API key as a Codespaces secret — go to github.com/settings/codespaces, add a secret named
GOOGLE_API_KEYwith your key from aistudio.google.com/apikey, and grant it access to this repository. - Open the Codespace — click the badge above or go to codespaces.new/FloLey/Lucid.
- The app builds and launches automatically. The browser tab for port 5173 opens when ready.
The
GOOGLE_API_KEYsecret is injected into the Docker Compose environment automatically — no.envfile needed.
Lucid is built for containerized, Docker-first development.
Prerequisites:
- Docker & Docker Compose
- A Google Generative AI API Key (https://aistudio.google.com/apikey)
git clone https://github.com/FloLey/Lucid.git
cd Lucid
cp .env.example .envAdd your GOOGLE_API_KEY to the .env file. (Without an API key the app falls back to gradient placeholder images and text generation is unavailable.)
echo '{}' > config.jsondocker-compose up --buildDocker downloads required open-source fonts during the build.
- App:
http://localhost:5173 - API docs:
http://localhost:8000/docs
Run Lucid on a VPS with the frontend and API reachable only from devices on your Tailscale network.
Instead of binding ports to 0.0.0.0, we bind exclusively to the Tailscale IP (100.x.y.z). Docker's iptables rules only forward traffic that arrives on that interface — no public exposure, no UFW rules needed for app ports.
- Docker & Docker Compose on the VPS
- Tailscale installed and connected (
tailscale up) - Tailscale on any client device you want to access from
git clone https://github.com/FloLey/Lucid.git
cd Lucid
cp .env.example .env
# Edit .env and set GOOGLE_API_KEY
echo '{}' > config.json./scripts/prod_up.shThe script detects your Tailscale IPv4 address, binds the frontend (port 5173) and backend (port 8000) to it, builds, and starts in detached mode.
Tailscale IP detected: 100.64.0.12
Starting Lucid (production mode) on 100.64.0.12 ...
Lucid is running (Tailscale-private access only):
Frontend : http://100.64.0.12:5173
API docs : http://100.64.0.12:8000/docs
./scripts/prod_down.shThe lucid-data volume (database + generated images) is preserved. To also remove the data volume: docker compose down -v.
sudo ufw default deny incoming
sudo ufw allow ssh
sudo ufw allow in on tailscale0
sudo ufw enableexport LUCID_BIND_IP="$(tailscale ip -4 | head -n1)"
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d --buildAll routes are under the /api prefix. The full interactive docs are at http://localhost:8000/docs.
| Route | Method | Description |
|---|---|---|
/api/projects/ |
POST |
Create a new project (optionally from a template) |
/api/projects/{id}/reorder |
POST |
Reorder slides within a project |
/api/stage-research/chat |
POST |
Send a message to the search-grounded research assistant |
/api/stage-research/extract-draft |
POST |
Synthesise the research conversation into a draft |
/api/stage-draft/generate |
POST |
Convert draft text → slide array |
/api/stage-draft/regenerate-stream |
POST |
Regenerate a single slide's text via SSE |
/api/stage-style/generate |
POST |
Generate visual style proposals |
/api/stage-prompts/generate |
POST |
Generate per-slide image prompts |
/api/stage-images/generate |
POST |
Generate background images from prompts |
/api/stage-typography/apply-all |
POST |
Composite text over backgrounds via PIL |
/api/export/zip |
POST |
Download project as a ZIP archive |
/api/prompts/validate |
POST |
Validate .prompt template edits without saving |
| Route | Method | Description |
|---|---|---|
/api/matrix/ |
POST |
Start a new matrix generation job (theme or description mode) |
/api/matrix/ |
GET |
List all matrices |
/api/matrix/{id} |
GET |
Get a matrix with all cells |
/api/matrix/{id}/events |
GET |
SSE stream of generation progress |
/api/matrix/{id}/generate-images |
POST |
Bulk-generate background images for an existing matrix |
/api/matrix/{id}/revalidate |
POST |
Run a validation pass; accepts {"user_comment": "..."} for extra instructions |
Local dev note: if running without Docker, use install-deps.sh to install Node/Python requirements and run download_fonts.py to fetch fonts.