A Tkinter desktop app that uses Stable Diffusion XL (SDXL) via the diffusers library to generate images from text prompts, with optional reference-image conditioning (img2img).
# NVIDIA GPU (recommended for RTX 5080 / 16 GB VRAM)
install_cuda.bat # creates venv, installs CUDA PyTorch + deps
venv\Scripts\activate
python main.py
To pre-download the configured model:
python .\scripts\download_model.py
Model will be saved into models_cache/<model_id_safe_name>.
- Text-to-image and img2img generation with SDXL.
- Reference images (up to 3): images are blended and used as the init image for img2img with adjustable strength.
- Model ID dropdown (default models: stabilityai/stable-diffusion-xl-base-1.0, SG161222/RealVisXL_V5.0, RunDiffusion/Juggernaut-XL-v9).
- Adjustable inference steps (10–100, default 35).
- CFG scale slider (defaults to the realism-friendly 4–5 range).
- Adjustable output size (width/height) with size presets.
- Style presets (Photoreal portrait, Anime, Cinematic).
- Optional LoRA attachment with preset picklist (Fabricated Reality, Objective Reality, Face Helper XL) that auto-downloads weights on first use.
- SDXL optimizations (applied automatically):
  - fp16 inference, for about 50% less VRAM than fp32.
  - Attention slicing.
  - VAE slicing/tiling for lower VRAM usage.
  - Optional xFormers memory-efficient attention (if installed).
  - DPMSolver++ (Karras) scheduler for sharper outputs.
- Upscale 2x (Lanczos) and AI Upscale (SD x4 upscaler pipeline).
- Image preview, auto-caching, and save as PNG/JPEG.
- Dark / light mode toggle.
- Python 3.10+ (recommended).
- A modern NVIDIA GPU with ≥ 8 GB VRAM is strongly recommended.
- This project is tuned for SDXL fp16 and works very well with 16 GB VRAM GPUs (e.g. RTX 5080).
- Internet access on first run to download model weights from Hugging Face.
Python dependencies are listed in requirements.txt:
transformers==4.57.1
accelerate==1.12.0
safetensors==0.7.0
numpy
pillow
diffusers[torch]==0.35.2
peft==0.18.1
huggingface-hub>=0.34.0,<1.0
For NVIDIA GPUs, use install_cuda.bat, which installs PyTorch with CUDA 12.6 support. Then:
pip install -r requirements.txt
Optional, but recommended for faster attention on GPU:
pip install xformers
To start the app:
python main.py
This opens the Halfax Image Generator window.
- Prompt / Negative prompt — Enter descriptive text. Use style presets to auto-fill.
- Model — Pick from the dropdown or paste any compatible SDXL model ID.
- Steps — 10–100, default 35. 20–35 is usually enough.
- W×H — Output resolution. Clamped 256–1536, multiples of 8. Use size presets for common SDXL resolutions.
- Reference images — Add up to 3 images. They are blended equally and used as the init image for img2img generation. Adjust Strength (0.1 = subtle influence, 1.0 = fully replace with ref). Without references, pure text-to-image is used.
- Generate — Loads the model (if needed) and generates. Progress bar tracks steps.
- Upscale 2x / AI Upscale — Post-generation upscaling options.
- Save Image — Save as PNG/JPEG.
- GUI: Tkinter + ttk in picture_ai/app.py.
- Backend: StableDiffusionXLPipeline (text2img) and StableDiffusionXLImg2ImgPipeline (img2img, sharing GPU weights) from diffusers, managed in picture_ai/pipeline_manager.py.
- The pipeline:
  - Uses fp16 precision by default.
  - Picks cuda if available, else cpu.
  - Applies attention slicing, VAE slicing/tiling, xFormers (if available), and DPMSolver++ (Karras).
  - Loads and fuses LoRA adapters when configured.
- LoRA presets are pre-fetched on app launch in background threads.
- Settings are persisted to config/settings.json.
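The pipeline setup described above could look roughly like the following with diffusers. This is a sketch, not the contents of picture_ai/pipeline_manager.py: `pick_device` and `build_pipelines` are hypothetical names, and falling back to fp32 on CPU is an assumption of this example. The heavy imports sit inside the function so the module can be imported without a GPU stack.

```python
def pick_device(cuda_available: bool) -> str:
    """Mirror the 'cuda if available, else cpu' rule."""
    return "cuda" if cuda_available else "cpu"


def build_pipelines(model_id: str = "stabilityai/stable-diffusion-xl-base-1.0"):
    """Load SDXL text2img plus an img2img pipeline that reuses the same weights."""
    import torch
    from diffusers import (
        DPMSolverMultistepScheduler,
        StableDiffusionXLImg2ImgPipeline,
        StableDiffusionXLPipeline,
    )

    device = pick_device(torch.cuda.is_available())
    # Assumption: fp16 on GPU, fp32 on CPU where half precision is poorly supported.
    dtype = torch.float16 if device == "cuda" else torch.float32

    txt2img = StableDiffusionXLPipeline.from_pretrained(
        model_id, torch_dtype=dtype, use_safetensors=True
    )
    # DPMSolver++ multistep scheduler with Karras sigmas.
    txt2img.scheduler = DPMSolverMultistepScheduler.from_config(
        txt2img.scheduler.config, use_karras_sigmas=True
    )
    txt2img.enable_attention_slicing()
    txt2img.enable_vae_slicing()
    txt2img.enable_vae_tiling()
    txt2img.to(device)
    try:
        # Optional: only works when the xformers package is installed.
        txt2img.enable_xformers_memory_efficient_attention()
    except Exception:
        pass

    # Reuse the already-loaded components so no second copy lands in VRAM.
    img2img = StableDiffusionXLImg2ImgPipeline(**txt2img.components)
    return txt2img, img2img
```

Building the img2img pipeline from `txt2img.components` is one way to get the weight sharing mentioned above; both pipelines then point at the same UNet, VAE, and text encoders.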
- Built-in presets: Fabricated Reality, Objective Reality, Face Helper XL.
- Add custom presets via config/lora_presets.json:

      [
        {
          "label": "My Favorite LoRA",
          "source": "username/my-awesome-lora",
          "weight_name": null,
          "scale": 0.8
        }
      ]

- Presets are downloaded into models_cache/loras/<safe-name>. The downloader uses token.txt, HUGGINGFACE_HUB_TOKEN, HF_TOKEN, or config/hf_token.txt.
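Reading the preset file and locating a Hugging Face token could be sketched as below. The helper names `load_lora_presets` and `find_hf_token` are hypothetical. The example assumes that `label` and `source` are required, that `scale` defaults to 1.0, and that the token sources are tried in the order listed above; none of these is confirmed by the project.

```python
import json
import os
from pathlib import Path

REQUIRED_KEYS = ("label", "source")  # assumed to be mandatory


def load_lora_presets(path):
    """Read custom presets from a JSON list; return [] if the file is absent."""
    path = Path(path)
    if not path.exists():
        return []
    presets = []
    for entry in json.loads(path.read_text(encoding="utf-8")):
        missing = [key for key in REQUIRED_KEYS if not entry.get(key)]
        if missing:
            raise ValueError(f"preset is missing {missing}: {entry!r}")
        presets.append({
            "label": entry["label"],
            "source": entry["source"],
            "weight_name": entry.get("weight_name"),
            "scale": float(entry.get("scale", 1.0)),  # assumed default
        })
    return presets


def find_hf_token(root=".", env=None):
    """Return the first token found in the assumed lookup order, or None."""
    env = os.environ if env is None else env
    root = Path(root)

    def read_file(relative):
        candidate = root / relative
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8").strip() or None
        return None

    return (
        read_file("token.txt")
        or env.get("HUGGINGFACE_HUB_TOKEN")
        or env.get("HF_TOKEN")
        or read_file("config/hf_token.txt")
    )
```

A loaded preset's `source`, `weight_name`, and `scale` map naturally onto diffusers' `load_lora_weights` and `fuse_lora` calls, though how the app wires them together is not shown here.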
- Models are stored under models_cache/. Each model id gets its own subfolder.
- On Windows you might see a warning about symlinks. This does not affect functionality; suppress with HF_HUB_DISABLE_SYMLINKS_WARNING=1.
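The storage layout can be illustrated with a short sketch. The exact rule that turns a model id into a folder name is not documented here, so `safe_model_dir` below is a hypothetical stand-in that replaces any character outside letters, digits, dot, underscore, and hyphen with an underscore. `download_model` is also hypothetical and uses `snapshot_download` from huggingface_hub; the real scripts/download_model.py may work differently.

```python
import os
import re
from pathlib import Path

# Must be set before huggingface_hub is imported to silence the symlink warning.
os.environ.setdefault("HF_HUB_DISABLE_SYMLINKS_WARNING", "1")


def safe_model_dir(model_id: str, cache_root: str = "models_cache") -> Path:
    """Map a model id to its own subfolder (naming rule is an assumption)."""
    safe = re.sub(r"[^A-Za-z0-9._-]+", "_", model_id.strip())
    return Path(cache_root) / safe


def download_model(model_id: str, cache_root: str = "models_cache") -> Path:
    """Fetch all files of a model repository into its subfolder."""
    from huggingface_hub import snapshot_download  # imported lazily

    target = safe_model_dir(model_id, cache_root)
    snapshot_download(repo_id=model_id, local_dir=target)
    return target


if __name__ == "__main__":
    print(safe_model_dir("stabilityai/stable-diffusion-xl-base-1.0").as_posix())
    # models_cache/stabilityai_stable-diffusion-xl-base-1.0
```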
- Out of memory (CUDA OOM)
  - Reduce width/height (e.g. 768×768).
  - Reduce steps.
  - Close other GPU-intensive applications.
- LoRA download fails
  - Ensure internet access and a valid HF token for private repos.
  - Place a token in token.txt or config/hf_token.txt.
  - Manually place weights in models_cache/loras/<safe-name>.
- Very slow generation
  - Ensure CUDA PyTorch is installed (not CPU-only): python -c "import torch; print(torch.cuda.is_available())".
  - Install xFormers: pip install xformers.
  - Lower steps or resolution.
- ImportError for torch, diffusers, PIL, etc.
  - Make sure your virtual environment is active.
  - Reinstall: pip install -r requirements.txt
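Several of the checks above can be combined into one small diagnostic script. This is a hedged sketch rather than something shipped with the project: the function names and the list of modules to probe are assumptions made for this example.

```python
import importlib.util

# Assumed list of top-level modules the app needs at import time.
REQUIRED_MODULES = ("torch", "diffusers", "transformers", "PIL")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def cuda_status() -> str:
    """Describe whether the installed torch build can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if not torch.cuda.is_available():
        return "CPU-only: torch cannot see a CUDA device"
    return f"CUDA OK: {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print("missing:", missing_modules() or "none")
    print(cuda_status())
```

Run it inside the activated virtual environment; a non-empty "missing" list points to the ImportError case, and a CPU-only status points to the slow-generation case.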