Local-first text-to-image application wired for a host-native Stable Diffusion 1.5 runtime using a local checkpoint. The backend is FastAPI, the frontend is React + TypeScript, and model execution stays isolated in a dedicated Windows virtual environment.
Default model/runtime paths:
- models/Realistic_Vision/Realistic_Vision_V5.1_fp16-no-ema.safetensors
- models/Realistic_Vision/generate_sd15.py
- models/Realistic_Vision/requirements.runtime.txt
- stable-diffusion-v1.5/
- models/Realistic_Vision/.venv/Scripts/python.exe
Model binaries are intentionally local assets and should not be committed. Keep models/Realistic_Vision/generate_sd15.py and models/Realistic_Vision/requirements.runtime.txt in Git, then place any SD 1.5-compatible .safetensors checkpoint at LOCAL_IMAGE_MODEL_MAIN_PATH. Optional extra checkpoints can live in ignored folders such as models/checkpoints/.
The stable-diffusion-v1.5/ folder must provide a Diffusers-compatible layout for Stable Diffusion 1.5 (configs and tokenizer assets). It matches what Hugging Face publishes as runwayml/stable-diffusion-v1-5. Your single-file checkpoint path stays in LOCAL_IMAGE_MODEL_MAIN_PATH; the checkpoint does not need to live inside stable-diffusion-v1.5/.
If the directory is missing pieces or you want a full upstream snapshot locally:
- Install the Hugging Face CLI (for example pip install huggingface_hub).
- Download the model into the repo root:
huggingface-cli download runwayml/stable-diffusion-v1-5 --local-dir .\stable-diffusion-v1.5
That download is large on disk; Git ignore rules exclude weight shards (for example *.bin) under stable-diffusion-v1.5/ so they are not committed by mistake.
Stable Diffusion v1.5 assets are subject to the CreativeML Open RAIL-M license. See NOTICE and the license summary on Hugging Face.
After downloading, run the manual sd15_runtime_check.py command under Manual preflight below to verify your layout, checkpoint path, and runtime Python.
Create the dedicated Stable Diffusion runtime environment:
.\scripts\setup-sd15-runtime.ps1
Optional CUDA wheel selector:
.\scripts\setup-sd15-runtime.ps1 -PyTorchComputePlatform cu126
The setup script will:
- create models/Realistic_Vision/.venv if it is missing
- install CUDA-enabled PyTorch into that venv
- install diffusers, transformers, accelerate, safetensors, and Pillow
- run scripts/sd15_runtime_check.py
Manual preflight:
.\models\Realistic_Vision\.venv\Scripts\python.exe .\scripts\sd15_runtime_check.py `
--generator-script-path .\models\Realistic_Vision\generate_sd15.py `
--runtime-source-dir .\stable-diffusion-v1.5 `
--model-path .\models\Realistic_Vision\Realistic_Vision_V5.1_fp16-no-ema.safetensors
Environment variables use the LOCAL_IMAGE_ prefix.
LOCAL_IMAGE_MODEL_MAIN_PATH=models/Realistic_Vision/Realistic_Vision_V5.1_fp16-no-ema.safetensors
LOCAL_IMAGE_DEFAULT_MODEL_ID=realistic_vision
LOCAL_IMAGE_TEXT_ENCODER_PATH=stable-diffusion-v1.5/text_encoder
LOCAL_IMAGE_VAE_PATH=stable-diffusion-v1.5/vae
LOCAL_IMAGE_GENERATOR_SCRIPT_PATH=models/Realistic_Vision/generate_sd15.py
LOCAL_IMAGE_RUNTIME_SOURCE_DIR=stable-diffusion-v1.5
LOCAL_IMAGE_MODEL_PYTHON_EXECUTABLE=models/Realistic_Vision/.venv/Scripts/python.exe
LOCAL_IMAGE_RUNTIME_CUDA_ALLOC_CONF=expandable_segments:True
LOCAL_IMAGE_GENERATION_TIMEOUT_SECONDS=1800
LOCAL_IMAGE_API_KEY=
LOCAL_IMAGE_EXPOSE_ABSOLUTE_PATHS=false
LOCAL_IMAGE_MODEL_PYTHON_EXECUTABLE can be relative to the repo root or absolute.
Set LOCAL_IMAGE_API_KEY to require X-Local-Image-API-Key on diagnostics,
image listing, generation, and image-tool requests. Keep
LOCAL_IMAGE_EXPOSE_ABSOLUTE_PATHS=false unless you specifically need full
local paths in diagnostics.
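The variables above could be read along these lines. This is a minimal sketch, not the backend's actual settings code: `RuntimeSettings`, `resolve_path`, and `load_settings` are hypothetical names, and only a handful of the variables are covered. It illustrates the rule that relative paths are anchored at the repo root while absolute paths are used as given, and that an empty LOCAL_IMAGE_API_KEY means no key is required.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

PREFIX = "LOCAL_IMAGE_"


@dataclass(frozen=True)
class RuntimeSettings:
    """Hypothetical container for a subset of the LOCAL_IMAGE_ variables."""
    model_main_path: Path
    model_python_executable: Path
    generation_timeout_seconds: int
    api_key: Optional[str]
    expose_absolute_paths: bool


def resolve_path(value: str, repo_root: Path) -> Path:
    """Keep absolute paths unchanged; anchor relative ones at the repo root."""
    path = Path(value)
    return path if path.is_absolute() else repo_root / path


def load_settings(env: Mapping[str, str], repo_root: Path) -> RuntimeSettings:
    """Build settings from a mapping such as os.environ, using the documented defaults."""
    def get(name: str, default: str = "") -> str:
        return env.get(PREFIX + name, default).strip()

    return RuntimeSettings(
        model_main_path=resolve_path(
            get("MODEL_MAIN_PATH",
                "models/Realistic_Vision/Realistic_Vision_V5.1_fp16-no-ema.safetensors"),
            repo_root,
        ),
        model_python_executable=resolve_path(
            get("MODEL_PYTHON_EXECUTABLE",
                "models/Realistic_Vision/.venv/Scripts/python.exe"),
            repo_root,
        ),
        generation_timeout_seconds=int(get("GENERATION_TIMEOUT_SECONDS", "1800")),
        # An empty value is treated as "no API key configured".
        api_key=get("API_KEY") or None,
        expose_absolute_paths=get("EXPOSE_ABSOLUTE_PATHS", "false").lower() == "true",
    )
```

Passing `os.environ` and the repository root to `load_settings` would give one frozen object that the rest of a service can share.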
The configured checkpoint is exposed by GET /api/v1/config as available_models. Generation requests can send model_id; if omitted, the backend uses LOCAL_IMAGE_DEFAULT_MODEL_ID.
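A client-side sketch of the two rules above, using only the standard library. The helper names `build_config_request` and `resolve_model_id` are hypothetical, the base URL is an assumption (this document does not state the port), and the exact JSON shape of available_models is not described here, so the sketch works on a plain list of ids.

```python
import urllib.request
from typing import Optional, Sequence


def build_config_request(base_url: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Prepare GET /api/v1/config, attaching the API key header when one is set."""
    headers = {"Accept": "application/json"}
    if api_key:
        headers["X-Local-Image-API-Key"] = api_key
    url = base_url.rstrip("/") + "/api/v1/config"
    return urllib.request.Request(url, headers=headers, method="GET")


def resolve_model_id(requested: Optional[str],
                     available_ids: Sequence[str],
                     default_id: str) -> str:
    """Mirror the documented fallback: an omitted model_id means the default id."""
    model_id = requested or default_id
    if model_id not in available_ids:
        raise ValueError(f"unknown model_id: {model_id!r}")
    return model_id
```

The request object can be passed to `urllib.request.urlopen` once the backend is running, for example with a base URL such as `http://127.0.0.1:8000` (an assumed port).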
Start backend and frontend:
.\scripts\start-local-dev.ps1
Useful endpoints:
- GET /api/v1/config
- GET /api/v1/model/status
- POST /api/v1/generate
- POST /api/v1/generate/edit
- POST /api/v1/tools/background-removal
- POST /api/v1/tools/retouch
- POST /api/v1/tools/upscale
- POST /api/v1/tools/denoise
- POST /api/v1/tools/crop
- POST /api/v1/tools/color-correction
- GET /api/v1/images
- the API keeps a single long-lived image service in app.state
- model status checks are cached briefly to reduce repeated subprocess preflight
- image listing is cached briefly and invalidated after successful generation
- generation runs in a thread and is serialized with a process-wide lock
- each generation request uses one backend-registered checkpoint id
- image-to-image requests accept a multipart source image through POST /api/v1/generate/edit
- the web app can apply browser-side crop/aspect and color/exposure edits before submitting the source image
- local image tools can preprocess image-to-image sources for background removal, retouching, upscaling, denoising, crop, and color correction
- width and height must be divisible by 8
- if a request hits CUDA OOM, the backend retries smaller preset sizes with the same orientation before failing
- backend scripts bind to 127.0.0.1 by default
- mutating endpoints reject browser requests from origins outside LOCAL_IMAGE_CORS_ORIGINS
- optional LOCAL_IMAGE_API_KEY protects diagnostics, image listing, generation, and image-tool requests
- config, status, and image list responses redact absolute local paths by default
- API docs and OpenAPI are disabled unless LOCAL_IMAGE_DEBUG=true
- generated images are served only for raster image extensions
- image uploads are size-limited and decoded with dimension/decompression-bomb checks
- API responses include basic hardening headers such as X-Content-Type-Options: nosniff
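Two of the rules in the list above, the divisible-by-8 check and the out-of-memory retry at smaller sizes with the same orientation, can be sketched as follows. This is an illustration under stated assumptions, not the backend's code: `PRESET_SIZES` is a made-up table (the real presets are not listed in this document), the function names are hypothetical, and `MemoryError` stands in for whatever CUDA OOM error the runtime actually reports.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Size = Tuple[int, int]

# Hypothetical preset table; every entry is divisible by 8.
PRESET_SIZES: List[Size] = [
    (768, 768), (640, 640), (512, 512), (384, 384),
    (768, 512), (576, 384),
    (512, 768), (384, 576),
]


def validate_dimensions(width: int, height: int) -> None:
    """Reject sizes that are non-positive or not divisible by 8."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width % 8 or height % 8:
        raise ValueError("width and height must be divisible by 8")


def orientation(width: int, height: int) -> str:
    if width == height:
        return "square"
    return "landscape" if width > height else "portrait"


def fallback_sizes(width: int, height: int,
                   presets: Sequence[Size] = PRESET_SIZES) -> List[Size]:
    """Smaller presets with the same orientation, largest first."""
    target = orientation(width, height)
    smaller = [(w, h) for w, h in presets
               if orientation(w, h) == target and w * h < width * height]
    return sorted(smaller, key=lambda s: s[0] * s[1], reverse=True)


def generate_with_fallback(width: int, height: int,
                           render: Callable[[int, int], T]) -> T:
    """Try the requested size first, then each smaller candidate on OOM."""
    validate_dimensions(width, height)
    last_error = None
    for w, h in [(width, height)] + fallback_sizes(width, height):
        try:
            return render(w, h)
        except MemoryError as exc:  # stand-in for a CUDA OOM error
            last_error = exc
    raise RuntimeError("out of memory at every candidate size") from last_error
```

Trying candidates from largest to smallest keeps the result as close as possible to the requested resolution.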
Run dependency audits from the app directories:
cd apps/api
.\.venv\Scripts\python.exe -m pip_audit
cd ..\web
bun audit
The image-to-image form includes a pre-generation source editor. Lightweight crop/aspect and color/exposure edits run in the browser for instant feedback. Heavier tools call dedicated backend endpoints and return edited PNG files that become the active source image for generation.
Implemented tools:
- background removal using local border-color alpha estimation
- object removal / retouching using a painted mask and local blur/median fill
- upscaling with Lanczos resampling and optional sharpening
- denoising with median filtering and light sharpening
- centered aspect-ratio crop
- brightness, contrast, saturation, and warmth correction
These tools are intentionally separate from Stable Diffusion generation so they can be tested and evolved independently.
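As one example of how small these tools can be, the centered aspect-ratio crop reduces to integer arithmetic on the image size. The sketch below is an assumption about how such a crop could be computed, not the project's implementation; `centered_crop_box` is a hypothetical name.

```python
from typing import Tuple


def centered_crop_box(width: int, height: int,
                      aspect_w: int, aspect_h: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered box
    with aspect ratio aspect_w:aspect_h inside a width x height image."""
    if min(width, height, aspect_w, aspect_h) <= 0:
        raise ValueError("all arguments must be positive")
    # Compare width/height with aspect_w/aspect_h by cross-multiplying,
    # which avoids floating-point rounding.
    if width * aspect_h > height * aspect_w:
        # Image is too wide: keep the full height and trim the sides.
        crop_w = height * aspect_w // aspect_h
        crop_h = height
    else:
        # Image is too tall (or already matches): keep the full width.
        crop_w = width
        crop_h = width * aspect_h // aspect_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)
```

The returned tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it could be passed to that method directly.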
This project's source code is licensed under the Apache License 2.0. Third-party models, configs, and vendored components may use other terms; see NOTICE.
- Runtime Python missing: set LOCAL_IMAGE_MODEL_PYTHON_EXECUTABLE correctly or re-run .\scripts\setup-sd15-runtime.ps1
- Runtime not GPU-ready: verify the runtime venv has a CUDA-enabled PyTorch build and check GET /api/v1/model/status
- Missing checkpoint or base model assets: confirm LOCAL_IMAGE_MODEL_MAIN_PATH points to an existing SD 1.5-compatible checkpoint and the config assets exist under stable-diffusion-v1.5/
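The file-system parts of these checks can be approximated with a few path tests. The sketch below is not scripts/sd15_runtime_check.py, whose contents are not shown here; `find_layout_problems` is a hypothetical helper, and the text_encoder and vae subfolders are taken from the default LOCAL_IMAGE_TEXT_ENCODER_PATH and LOCAL_IMAGE_VAE_PATH values. It does not test GPU readiness.

```python
from pathlib import Path
from typing import List


def find_layout_problems(runtime_python: Path,
                         checkpoint: Path,
                         runtime_source_dir: Path) -> List[str]:
    """Collect human-readable problems; an empty list means the paths look fine."""
    problems: List[str] = []
    if not runtime_python.is_file():
        problems.append(
            f"runtime Python missing at {runtime_python}; "
            "check LOCAL_IMAGE_MODEL_PYTHON_EXECUTABLE"
        )
    if not checkpoint.is_file():
        problems.append(
            f"checkpoint missing at {checkpoint}; check LOCAL_IMAGE_MODEL_MAIN_PATH"
        )
    elif checkpoint.suffix.lower() != ".safetensors":
        problems.append(f"checkpoint is not a .safetensors file: {checkpoint}")
    # Subfolders assumed from the default text encoder and VAE paths.
    for sub in ("text_encoder", "vae"):
        if not (runtime_source_dir / sub).is_dir():
            problems.append(f"base model assets missing: {runtime_source_dir / sub}")
    return problems
```

Printing the returned list before starting the backend gives a quick first pass; GET /api/v1/model/status remains the place to confirm that the runtime is GPU-ready.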