Dor-J/LocalLlmToImage

Local LLM To Image

Local-first text-to-image application wired for a host-native Stable Diffusion 1.5 runtime using a local checkpoint. The backend is FastAPI, the frontend is React + TypeScript, and model execution stays isolated in a dedicated Windows virtual environment.

Runtime Assets

Default model/runtime paths:

  • models/Realistic_Vision/Realistic_Vision_V5.1_fp16-no-ema.safetensors
  • models/Realistic_Vision/generate_sd15.py
  • models/Realistic_Vision/requirements.runtime.txt
  • stable-diffusion-v1.5/
  • models/Realistic_Vision/.venv/Scripts/python.exe

Model binaries are intentionally local assets and should not be committed. Keep models/Realistic_Vision/generate_sd15.py and models/Realistic_Vision/requirements.runtime.txt in Git, then place any SD 1.5-compatible .safetensors checkpoint at LOCAL_IMAGE_MODEL_MAIN_PATH. Optional extra checkpoints can live in ignored folders such as models/checkpoints/.

Stable Diffusion v1.5 base directory (stable-diffusion-v1.5/)

This folder must provide a Diffusers-compatible layout for Stable Diffusion 1.5 (configs and tokenizer assets). It matches what Hugging Face publishes as runwayml/stable-diffusion-v1-5. Your single-file checkpoint path stays in LOCAL_IMAGE_MODEL_MAIN_PATH; it does not need to live inside stable-diffusion-v1.5/.
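As a rough illustration of what "Diffusers-compatible layout" means here, the sketch below lists which expected entries are absent from the base directory. The entry list is an assumption based on the public runwayml/stable-diffusion-v1-5 layout, and `missing_sd15_entries` is a hypothetical helper; the project's own scripts/sd15_runtime_check.py may check a different set.

```python
from pathlib import Path

# Assumed entries, taken from the public runwayml/stable-diffusion-v1-5 layout.
# The project's real preflight script may look for a different set.
EXPECTED_ENTRIES = (
    "model_index.json",
    "tokenizer",
    "text_encoder",
    "unet",
    "vae",
    "scheduler",
)


def missing_sd15_entries(base_dir: Path) -> list[str]:
    """Return the expected entries that do not exist under base_dir."""
    return [name for name in EXPECTED_ENTRIES if not (base_dir / name).exists()]


if __name__ == "__main__":
    missing = missing_sd15_entries(Path("stable-diffusion-v1.5"))
    print("layout looks complete" if not missing else f"missing: {missing}")
```

An empty result only says the entries exist; it does not validate their contents.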

If the directory is missing pieces or you want a full upstream snapshot locally:

  1. Install the Hugging Face CLI (for example pip install huggingface_hub).
  2. Download the model into the repo root:
huggingface-cli download runwayml/stable-diffusion-v1-5 --local-dir .\stable-diffusion-v1.5

That download is large on disk; Git ignore rules exclude weight shards (for example *.bin) under stable-diffusion-v1.5/ so they are not committed by mistake.

Stable Diffusion v1.5 assets are subject to the CreativeML Open RAIL-M license. See NOTICE and the license summary on Hugging Face.

After downloading, run the sd15_runtime_check.py command shown under Manual preflight (in Runtime Setup below) to verify the directory layout, the checkpoint path, and the runtime Python.

Runtime Setup

Create the dedicated Stable Diffusion runtime environment:

.\scripts\setup-sd15-runtime.ps1

Optional CUDA wheel selector:

.\scripts\setup-sd15-runtime.ps1 -PyTorchComputePlatform cu126

The setup script will:

  • create models/Realistic_Vision/.venv if it is missing
  • install CUDA-enabled PyTorch into that venv
  • install diffusers, transformers, accelerate, safetensors, and Pillow
  • run scripts/sd15_runtime_check.py

Manual preflight:

.\models\Realistic_Vision\.venv\Scripts\python.exe .\scripts\sd15_runtime_check.py `
  --generator-script-path .\models\Realistic_Vision\generate_sd15.py `
  --runtime-source-dir .\stable-diffusion-v1.5 `
  --model-path .\models\Realistic_Vision\Realistic_Vision_V5.1_fp16-no-ema.safetensors

Backend Configuration

Environment variables use the LOCAL_IMAGE_ prefix.

LOCAL_IMAGE_MODEL_MAIN_PATH=models/Realistic_Vision/Realistic_Vision_V5.1_fp16-no-ema.safetensors
LOCAL_IMAGE_DEFAULT_MODEL_ID=realistic_vision
LOCAL_IMAGE_TEXT_ENCODER_PATH=stable-diffusion-v1.5/text_encoder
LOCAL_IMAGE_VAE_PATH=stable-diffusion-v1.5/vae
LOCAL_IMAGE_GENERATOR_SCRIPT_PATH=models/Realistic_Vision/generate_sd15.py
LOCAL_IMAGE_RUNTIME_SOURCE_DIR=stable-diffusion-v1.5
LOCAL_IMAGE_MODEL_PYTHON_EXECUTABLE=models/Realistic_Vision/.venv/Scripts/python.exe
LOCAL_IMAGE_RUNTIME_CUDA_ALLOC_CONF=expandable_segments:True
LOCAL_IMAGE_GENERATION_TIMEOUT_SECONDS=1800
LOCAL_IMAGE_API_KEY=
LOCAL_IMAGE_EXPOSE_ABSOLUTE_PATHS=false

LOCAL_IMAGE_MODEL_PYTHON_EXECUTABLE can be absolute or relative to the repo root. Set LOCAL_IMAGE_API_KEY to require an X-Local-Image-API-Key header on diagnostics, image listing, generation, and image-tool requests. Keep LOCAL_IMAGE_EXPOSE_ABSOLUTE_PATHS=false unless you specifically need full local paths in diagnostics.
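One way to read these variables is sketched below: it applies the LOCAL_IMAGE_ prefix, resolves a relative interpreter path against the repo root, and treats an empty API key as "not set". `RuntimeSettings` and `load_settings` are hypothetical names for illustration only, and the actual backend may load its settings differently.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

PREFIX = "LOCAL_IMAGE_"


@dataclass(frozen=True)
class RuntimeSettings:  # hypothetical container, not the project's own class
    python_executable: Path
    api_key: Optional[str]
    expose_absolute_paths: bool
    generation_timeout_seconds: int


def load_settings(env: Mapping[str, str], repo_root: Path) -> RuntimeSettings:
    def get(name: str, default: str = "") -> str:
        # Missing and blank values both fall back to the default.
        return env.get(PREFIX + name, "").strip() or default

    exe = Path(get("MODEL_PYTHON_EXECUTABLE",
                   "models/Realistic_Vision/.venv/Scripts/python.exe"))
    if not exe.is_absolute():
        exe = repo_root / exe  # relative paths are taken from the repo root
    return RuntimeSettings(
        python_executable=exe,
        api_key=get("API_KEY") or None,
        expose_absolute_paths=get("EXPOSE_ABSOLUTE_PATHS", "false").lower() == "true",
        generation_timeout_seconds=int(get("GENERATION_TIMEOUT_SECONDS", "1800")),
    )
```

Passing `os.environ` as `env` reads the live process environment; passing a plain dict keeps the function easy to test.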

The configured checkpoint is exposed by GET /api/v1/config as available_models. Generation requests can send model_id; if omitted, the backend uses LOCAL_IMAGE_DEFAULT_MODEL_ID.
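A minimal client-side sketch of a generation request follows. It only builds the request object and sends nothing. The port in `BASE_URL`, the JSON body, and the `prompt` field name are assumptions made for illustration; only `model_id`, the endpoint path, and the X-Local-Image-API-Key header come from this README.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://127.0.0.1:8000"  # assumed port; the README only states 127.0.0.1


def build_generate_request(
    prompt: str,
    model_id: Optional[str] = None,
    api_key: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST /api/v1/generate request."""
    payload = {"prompt": prompt}  # "prompt" is an assumed field name
    if model_id is not None:
        # Omit model_id to let the backend use LOCAL_IMAGE_DEFAULT_MODEL_ID.
        payload["model_id"] = model_id
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["X-Local-Image-API-Key"] = api_key
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen` with a generous timeout, since generation can take a long time.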

Local Development

Start backend and frontend:

.\scripts\start-local-dev.ps1

Useful endpoints:

  • GET /api/v1/config
  • GET /api/v1/model/status
  • POST /api/v1/generate
  • POST /api/v1/generate/edit
  • POST /api/v1/tools/background-removal
  • POST /api/v1/tools/retouch
  • POST /api/v1/tools/upscale
  • POST /api/v1/tools/denoise
  • POST /api/v1/tools/crop
  • POST /api/v1/tools/color-correction
  • GET /api/v1/images

Generation Behavior

  • the API keeps a single long-lived image service in app.state
  • model status checks are cached briefly to avoid repeated subprocess preflight runs
  • image listing is cached briefly and invalidated after successful generation
  • generation runs in a thread and is serialized with a process-wide lock
  • each generation request uses one backend-registered checkpoint id
  • image-to-image requests accept a multipart source image through POST /api/v1/generate/edit
  • the web app can apply browser-side crop/aspect and color/exposure edits before submitting the source image
  • local image tools can preprocess image-to-image sources for background removal, retouching, upscaling, denoising, crop, and color correction
  • width and height must be divisible by 8
  • if a request hits CUDA OOM, the backend retries smaller preset sizes with the same orientation before failing
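The last two points can be sketched as follows: reject sizes that are not multiples of 8, then on an out-of-memory error retry with smaller presets of the same orientation. The `PRESETS` table, the `OutOfMemory` exception, and the `render` callback are all hypothetical stand-ins; the backend's real presets and error handling are not documented here.

```python
from typing import Callable, List, Optional, Tuple, TypeVar

Size = Tuple[int, int]
T = TypeVar("T")

# Hypothetical preset table; the backend's real presets are not listed in this README.
PRESETS: Tuple[Size, ...] = (
    (768, 768), (512, 512),   # square
    (768, 512), (640, 384),   # landscape
    (512, 768), (384, 640),   # portrait
)


class OutOfMemory(RuntimeError):
    """Stand-in for a CUDA OOM error reported by the runtime."""


def validate_size(width: int, height: int) -> None:
    if width <= 0 or height <= 0 or width % 8 or height % 8:
        raise ValueError("width and height must be positive multiples of 8")


def orientation(size: Size) -> str:
    w, h = size
    return "square" if w == h else ("landscape" if w > h else "portrait")


def fallback_sizes(requested: Size) -> List[Size]:
    """Smaller presets with the requested orientation, largest first."""
    area = requested[0] * requested[1]
    same = [p for p in PRESETS
            if orientation(p) == orientation(requested) and p[0] * p[1] < area]
    return sorted(same, key=lambda p: p[0] * p[1], reverse=True)


def generate_with_fallback(render: Callable[[int, int], T], width: int, height: int) -> T:
    """Try the requested size first, then each fallback, re-raising the last OOM."""
    validate_size(width, height)
    last_error: Optional[OutOfMemory] = None
    for w, h in [(width, height), *fallback_sizes((width, height))]:
        try:
            return render(w, h)
        except OutOfMemory as exc:
            last_error = exc
    assert last_error is not None  # the loop ran at least once and never returned
    raise last_error
```

Only the OOM case triggers a retry; any other exception from `render` propagates immediately.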

Security Defaults

  • backend scripts bind to 127.0.0.1 by default
  • mutating endpoints reject browser requests from origins outside LOCAL_IMAGE_CORS_ORIGINS
  • optional LOCAL_IMAGE_API_KEY protects diagnostics, image listing, generation, and image-tool requests
  • config, status, and image list responses redact absolute local paths by default
  • API docs and OpenAPI are disabled unless LOCAL_IMAGE_DEBUG=true
  • generated images are served only for raster image extensions
  • image uploads are size-limited and decoded with dimension/decompression-bomb checks
  • API responses include basic hardening headers such as X-Content-Type-Options: nosniff
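Three of these defaults are simple enough to sketch with the standard library: the optional API-key check, the raster-extension allowlist, and path redaction. The function names and the exact extension set are assumptions for illustration, not the backend's actual code.

```python
import hmac
from pathlib import PureWindowsPath
from typing import Optional

# Assumed allowlist; the README only says "raster image extensions".
RASTER_EXTENSIONS = {".png", ".jpg", ".jpeg", ".webp"}


def api_key_ok(configured: Optional[str], presented: Optional[str]) -> bool:
    """Check the X-Local-Image-API-Key value against the configured key."""
    if not configured:
        return True  # no key configured, so the check is skipped
    if presented is None:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(configured.encode(), presented.encode())


def is_servable_image(filename: str) -> bool:
    """Allow only files whose extension is on the raster allowlist."""
    return PureWindowsPath(filename).suffix.lower() in RASTER_EXTENSIONS


def redact_path(path: str, expose_absolute: bool = False) -> str:
    """Return only the file name unless absolute paths are explicitly exposed."""
    return path if expose_absolute else PureWindowsPath(path).name
```

`PureWindowsPath` is used because the runtime targets Windows; it accepts both `\` and `/` separators and does not touch the filesystem.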

Run dependency audits from the app directories:

cd apps/api
.\.venv\Scripts\python.exe -m pip_audit

cd ..\web
bun audit

Image Tools

The image-to-image form includes a pre-generation source editor. Lightweight crop/aspect and color/exposure edits run in the browser for instant feedback. Heavier tools call dedicated backend endpoints and return edited PNG files that become the active source image for generation.

Implemented tools:

  • background removal using local border-color alpha estimation
  • object removal / retouching using a painted mask and local blur/median fill
  • upscaling with Lanczos resampling and optional sharpening
  • denoising with median filtering and light sharpening
  • centered aspect-ratio crop
  • brightness, contrast, saturation, and warmth correction

These tools are intentionally separate from Stable Diffusion generation so they can be tested and evolved independently.
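As an example of how small these tools can be, the centered aspect-ratio crop reduces to integer arithmetic on the image size. The sketch below computes a crop box in the (left, top, right, bottom) order that Pillow's `Image.crop` accepts. `centered_crop_box` is a hypothetical helper, not the project's implementation.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


def centered_crop_box(width: int, height: int, aspect_w: int, aspect_h: int) -> Box:
    """Largest centered box with ratio aspect_w:aspect_h inside width x height."""
    if min(width, height, aspect_w, aspect_h) <= 0:
        raise ValueError("dimensions and aspect terms must be positive")
    # Compare width/height with aspect_w/aspect_h by cross-multiplying,
    # which avoids floating-point rounding.
    if width * aspect_h > height * aspect_w:
        # Image is wider than the target ratio: keep full height, trim the sides.
        crop_w, crop_h = height * aspect_w // aspect_h, height
    else:
        # Image is taller than (or equal to) the target ratio: keep full width.
        crop_w, crop_h = width, width * aspect_h // aspect_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)
```

For a 1920x1080 source and a 1:1 target this keeps the full height and trims 420 pixels from each side.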

License

This project's source code is licensed under the Apache License 2.0. Third-party models, configs, and vendored components may use other terms; see NOTICE.

Troubleshooting

  • Runtime Python missing: set LOCAL_IMAGE_MODEL_PYTHON_EXECUTABLE correctly or re-run .\scripts\setup-sd15-runtime.ps1
  • Runtime not GPU-ready: verify the runtime venv has a CUDA-enabled PyTorch build and check GET /api/v1/model/status
  • Missing checkpoint or base model assets: confirm LOCAL_IMAGE_MODEL_MAIN_PATH points to an existing SD 1.5-compatible checkpoint and the config assets exist under stable-diffusion-v1.5/
