Exhibited at IT Studio Academis, Berlin (March 2026) as part of the Pixels2GenAI project.
This installation builds on two modules I authored for the Pixels2GenAI thesis project:
- Module 12.3.2 — ControlNet Guided Generation — the diffusion + ControlNet pipeline that drives the anime generation here.
- Module 11.2.3 — Face Detection — the MediaPipe face-landmark work that feeds the hybrid edge map into ControlNet.
The wider curriculum is available at Pixels2GenAI.
## Contents

- How it works
- Requirements
- Installation
- Usage
- Controls
- Flags
- Repository layout
- Offline renderer
- Known quirks
- Privacy
- License
## How it works

The piece alternates two cycles:
- Cycle A — self → anime. A webcam snapshot is progressively noised toward pure Gaussian noise, then the reverse diffusion is driven by Stable Diffusion + ControlNet (conditioned on a hybrid edge map: Canny + MediaPipe face mesh + pose skeleton + hand landmarks) into an anime illustration of the visitor.
- Cycle B — anime → self. The anime image dissolves back into noise and crystallizes into the live webcam view, closing the loop.
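The hybrid edge map that conditions Cycle A is a union of several single-channel maps. A minimal sketch of that combination step, assuming the individual maps are already rasterized (the real script derives them with OpenCV and MediaPipe; `combine_conditioning` is an illustrative name, not the script's API):

```python
import numpy as np

def combine_conditioning(*edge_maps: np.ndarray) -> np.ndarray:
    """Merge single-channel edge/landmark maps (H x W, uint8, 0 or 255)
    into one ControlNet conditioning image via a per-pixel max, then
    replicate to three channels as diffusion pipelines expect RGB input."""
    combined = np.maximum.reduce(list(edge_maps))
    return np.repeat(combined[..., None], 3, axis=2)  # H x W x 3
```

The per-pixel max keeps every stroke from every detector without double-brightening overlaps, which is why it is a common way to stack Canny edges with landmark drawings.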
Forward diffusion uses the DDPM formula
x_t = sqrt(ᾱ_t) · x_0 + sqrt(1 - ᾱ_t) · ε
with selectable linear, cosine, or quadratic noise schedules.
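The formula and the three schedules can be sketched in NumPy. The linear/quadratic beta ranges below use the common DDPM defaults (1e-4 to 0.02) and the cosine variant follows the Nichol & Dhariwal parameterization; the exact constants in `dissolution.py` may differ:

```python
import numpy as np

def alpha_bar(T: int, schedule: str = "linear") -> np.ndarray:
    """Cumulative product alpha-bar_t for t = 1..T under a noise schedule."""
    t = np.arange(1, T + 1)
    if schedule == "linear":
        betas = np.linspace(1e-4, 0.02, T)
    elif schedule == "quadratic":
        betas = np.linspace(1e-4 ** 0.5, 0.02 ** 0.5, T) ** 2
    elif schedule == "cosine":
        f = np.cos((t / T + 0.008) / 1.008 * np.pi / 2) ** 2
        f0 = np.cos((0.008 / 1.008) * np.pi / 2) ** 2
        return f / f0
    else:
        raise ValueError(f"unknown schedule: {schedule}")
    return np.cumprod(1.0 - betas)

def forward_diffuse(x0: np.ndarray, t: int, abar: np.ndarray,
                    rng=None) -> np.ndarray:
    """x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps  (DDPM forward)."""
    rng = rng or np.random.default_rng(0)
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(abar[t - 1]) * x0 + np.sqrt(1.0 - abar[t - 1]) * eps
```

At small t the output stays close to `x0`; as t approaches T, alpha-bar approaches 0 and the image dissolves into pure Gaussian noise, which is exactly the visual arc of Cycle A.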
Generation runs on a background thread (~3 s per image on a modern GPU) while the display loop stays at 30 FPS on the main thread. A side panel shows the live webcam, detection overlays, the combined-edges ControlNet input, the noise schedule curve, and the current state, so viewers can see the math driving the output.
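The worker/display split can be sketched with the standard library: a background thread owns the slow generation while the main loop keeps rendering and adopts finished images when they appear. Function and queue names below are illustrative, not the script's actual API:

```python
import queue
import threading

def generator_worker(jobs: queue.Queue, results: queue.Queue, generate):
    """Background thread: pull a snapshot, run the slow pipeline, push the result."""
    while True:
        snapshot = jobs.get()
        if snapshot is None:          # sentinel -> shut down
            break
        results.put(generate(snapshot))

def run_display_loop(frames, generate):
    jobs, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=generator_worker,
                              args=(jobs, results, generate), daemon=True)
    worker.start()
    jobs.put(frames[0])               # kick off the first generation
    latest = None
    for frame in frames:
        try:
            latest = results.get_nowait()   # adopt a finished image, if any
        except queue.Empty:
            pass                            # GPU still busy: keep the old one
        # ... render `frame` (and `latest`, if set) at 30 FPS here ...
    jobs.put(None)                    # stop the worker
    worker.join()
    while not results.empty():        # drain anything finished late
        latest = results.get()
    return latest
```

Because `get_nowait` never blocks, the render loop's frame time is independent of how long the GPU takes; that is what keeps the display at 30 FPS during a ~3 s generation.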
## Requirements

- Python 3.11 or newer
- CUDA-capable GPU (for real-time generation); CPU-only machines can run the `--no-controlnet` test mode
- A webcam
- ~5 GB of free disk space for the Stable Diffusion + ControlNet weights (auto-downloaded from Hugging Face on first run)
## Installation

```shell
git clone https://github.com/<your-user>/dissolution.git
cd dissolution
pip install -r requirements.txt
```

MediaPipe task files and the Stable Diffusion / ControlNet checkpoints are fetched automatically on first launch. No manual downloads are required.
## Usage

```shell
# Full interactive mode
python dissolution_live.py

# Test mode (no GPU, fallback images)
python dissolution_live.py --no-controlnet

# Pick a different webcam
python dissolution_live.py --camera 1

# Record the main display
python dissolution_live.py --record out.mp4
```

Run `python dissolution_live.py --help` for the full list of options.
## Controls

| Key | Action |
|---|---|
| `SPACE` | Force-trigger a dissolve during MIRROR / IDLE |
| `1` / `2` / `3` | Switch edge-detection preview (Canny / face mesh / pose) |
| `V` | Cycle through prompt presets |
| `0` | Toggle sound reactivity |
| `A` | Toggle ambient audio |
| `F` | Toggle fullscreen |
| `P` | Toggle process panel |
| `Q` / `ESC` | Quit |
## Flags

| Flag | Purpose |
|---|---|
| `--no-controlnet` | Disable GPU generation; use fallback images |
| `--no-webcam` | Use bundled images instead of the webcam |
| `--camera N` | Webcam device index |
| `--resolution WxH` | Override the display resolution |
| `--fullscreen` | Start fullscreen |
| `--record FILE.mp4` | Record the main display to MP4 |
| `--prompt "..."` | Override the generation prompt |
| `--conditioning F` | ControlNet conditioning scale (0.0–1.0) |
| `--guidance F` | Classifier-free guidance scale |
| `--steps N` | Number of inference steps |
| `--sound-reactive` | Microphone-driven speed / brightness modulation |
| `--ambient-audio` | Ambient audio synthesis tracking the diffusion state |
| `--exhibition` | Left-half layout for a 3440×1440 ultrawide |
| `--watchdog` | Auto-restart on crash |
| `--max-runtime H` | Graceful shutdown after H hours |
## Repository layout

```
.
├── dissolution_live.py   # Interactive exhibition piece
├── dissolution.py        # Offline diffusion-loop renderer
├── requirements.txt
├── LICENSE
└── README.md
```
At runtime the script creates two directories automatically:
- `mediapipe_models/` — auto-downloaded face/pose/hand task files (~42 MB)
- `dissolution_gallery/` — local archive of generated images and source snapshots
Both are gitignored.
## Offline renderer

`dissolution.py` is a standalone, non-interactive script that renders a seamless palindrome loop between two still images (source → noise → target → noise → source). It is the mathematical reference for the diffusion process that `dissolution_live.py` animates in real time.
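The palindrome structure can be sketched with the forward-diffusion mix and a simple linear alpha-bar ramp, using one fixed noise field so the two halves meet at the same pure-noise frame. A sketch only; the script's real schedules and frame counts differ:

```python
import numpy as np

def palindrome_frames(source, target, steps=30, seed=7):
    """source -> noise -> target -> noise -> source, as a closed loop."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(source.shape)   # one shared noise field
    down = np.linspace(1.0, 0.0, steps)       # alpha-bar: clean -> pure noise
    mix = lambda x0, a: np.sqrt(a) * x0 + np.sqrt(1 - a) * eps
    frames  = [mix(source, a) for a in down]        # source -> noise
    frames += [mix(target, a) for a in down[::-1]]  # noise -> target
    frames += [mix(target, a) for a in down]        # target -> noise
    frames += [mix(source, a) for a in down[::-1]]  # noise -> source
    return frames
```

Sharing `eps` across all four legs is what makes the loop seamless: each dissolve bottoms out at the identical noise image before climbing back out.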
```shell
python dissolution.py                                  # full render
python dissolution.py --preview                        # quick low-res preview
python dissolution.py --schedule quadratic --seed 99   # experiment with schedules
```

## Known quirks

- First launch is slow. Stable Diffusion, the ControlNet checkpoint, and the MediaPipe task files all download on the first run (~5 GB). Subsequent launches start in seconds.
- CUDA is effectively required. On CPU, generation takes minutes per image, not seconds — the piece does not work as intended. Use `--no-controlnet` for a UI-only test mode.
- `--exhibition` is hardcoded for a specific monitor. It targets the left half of a 3440×1440 ultrawide (a 1720×1440 window at x=0). On any other display, use `--resolution WxH` and `--fullscreen` instead.
- Webcam index is platform-dependent. If `--camera 0` fails, try `1` or `2`. On Linux/macOS, OpenCV sometimes needs `cv2.CAP_V4L2` or `cv2.CAP_AVFOUNDATION` — the script uses the default backend.
- `DISSOLUTION_LEGACY_ASSETS` env var. The code contains optional hooks for LoRA weights and a fabric-image fallback directory from an earlier iteration of the project. They are disabled by default; set the env var to an absolute path to re-enable them. Leave it unset for the anime pipeline.
- Image ID counter persists across runs. `dissolution_gallery/counter.txt` tracks the last-issued `DISS-XXXX` id so restarts don't collide with previous captures. Delete the file to reset.
- Generation is asynchronous. The display keeps running at 30 FPS even while the GPU is busy; a new anime image appears at the end of each cycle, not in lock-step with the dissolve animation.
- Watchdog preserves the pipeline across crashes. `--watchdog` re-launches the display loop but keeps the already-loaded ControlNet model in memory to avoid the ~30 s reload cost.
- No Apple Silicon path. The code targets CUDA. MPS may work with diffusers but has not been tested.
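The persistent ID counter behaves roughly like the sketch below. The file format and the zero-padding are assumptions based on the `DISS-XXXX` pattern, not the script's exact implementation:

```python
from pathlib import Path

def next_image_id(counter_file: Path) -> str:
    """Issue the next DISS-XXXX id, surviving restarts via a counter file."""
    counter_file.parent.mkdir(parents=True, exist_ok=True)
    n = int(counter_file.read_text()) if counter_file.exists() else 0
    n += 1
    counter_file.write_text(str(n))       # persist for the next run
    return f"DISS-{n:04d}"
```

Deleting the file resets the sequence, matching the behavior described above.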
## Privacy

During the exhibition, generated anime images and the corresponding source webcam captures were saved to `dissolution_gallery/` for archival purposes. That folder is not included in this repository because the source captures contain identifiable faces of visitors who did not consent to publication. Anyone running the piece locally will regenerate the folder as they use it.
## License

MIT.