StreamMirror is a lightweight real-time portrait stylization system.
A webcam feed is continuously transformed into anime / ink-wash / fantasy-style video at ~10–20 FPS on a consumer GPU (RTX 4070 Ti), streamed live to any browser — PC or mobile.
Built as an interactive digital-art installation for an undergraduate thesis in Digital Media Art.
| Camera input | Stylized output (DreamShaper 8) |
|---|---|
| (your face) | (anime character) |
The MJPEG stream is accessible at `http://localhost:5003` (PC) or `http://<LAN-IP>:5003/display` (iPad / mobile).
- StreamDiffusion img2img — ~2-step inference, ~50–100 ms per frame
- Temporal blending — `output_t = 0.85 × SD(t) + 0.15 × output_{t-1}` suppresses flicker (see the sketch after this list)
- VRM 3D face mask — anime head model rendered via pyrender, blended over the camera frame before SD inference for structural guidance
- MediaPipe gesture control — point finger up/down to switch style models hands-free
- Hot model reload — swap SD 1.5 `.safetensors` models at runtime with VRAM cleanup
- Cross-device streaming — Flask + MJPEG, works in any browser including iPad Safari
- TAESD — optional lightweight VAE decoder (~5 ms vs ~50 ms per decode) for extra speed
- Fully offline — `HF_HUB_OFFLINE=1`, no network calls during inference
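The blending step is effectively an exponential moving average over output frames. A minimal sketch, assuming float32 frames in [0, 1] (the function and variable names are illustrative, not the app's actual API; `TEMPORAL_ALPHA` is the constant referenced in the reproduction steps below):

```python
import numpy as np

TEMPORAL_ALPHA = 0.15  # weight of the previous output frame

def blend(sd_frame: np.ndarray, prev_output: np.ndarray | None) -> np.ndarray:
    """output_t = (1 - alpha) * SD(t) + alpha * output_{t-1}"""
    if prev_output is None:          # first frame: nothing to blend with
        return sd_frame
    return (1.0 - TEMPORAL_ALPHA) * sd_frame + TEMPORAL_ALPHA * prev_output
```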
| Component | Minimum | Tested |
|---|---|---|
| GPU | NVIDIA 8 GB VRAM | RTX 4070 Ti 12 GB |
| CUDA | 11.8+ | 12.1 |
| Python | 3.10 | 3.10 (conda) |
| OS | Windows 10/11 | Windows 10 22H2 |
Linux should work too — remove the `DSHOW` camera backend flag in `stream_app.py` if needed.
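Concretely, assuming `stream_app.py` opens the camera with OpenCV at index 0, the change looks like this:

```python
import cv2

# Windows: the DirectShow backend avoids slow MSMF startup
cap = cv2.VideoCapture(0, cv2.CAP_DSHOW)

# Linux / macOS: drop the flag and let OpenCV pick the default backend
# cap = cv2.VideoCapture(0)
```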
```bash
conda create -n streamproject python=3.10 -y
conda activate streamproject

pip install torch==2.1.2+cu121 torchvision==0.16.2+cu121 --index-url https://download.pytorch.org/whl/cu121
pip install diffusers==0.24.0 transformers accelerate safetensors
pip install streamdiffusion
pip install flask opencv-python pillow numpy
pip install mediapipe
pip install pyrender trimesh pygltflib
```

Place files in the directories below (paths are configurable at the top of `stream_app.py`):
| File | Destination | Source |
|---|---|---|
| Any SD 1.5 `.safetensors` | `models/` | CivitAI / HuggingFace |
| `hand_landmarker.task` | `models/` | MediaPipe Models |
| `face_landmarker.task` | `models/` | MediaPipe Models |
| TAESD `config.json` + `diffusion_pytorch_model.safetensors` | `taesd/` | madebyollin/taesd |
| Any VRM head model | `reference/animeface.vrm` | VRoid Hub or custom |
VRM and TAESD are optional. The system runs without them (falls back gracefully).
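If you do use TAESD, diffusers provides `AutoencoderTiny`, which loads the two files above directly from the local directory. A sketch of the likely wiring (an assumption; the exact integration in `stream_app.py` may differ):

```python
import torch
from diffusers import AutoencoderTiny

# Load TAESD from the local taesd/ directory (config.json +
# diffusion_pytorch_model.safetensors); its decoder runs in ~5 ms
# versus ~50 ms for the full SD 1.5 VAE.
taesd = AutoencoderTiny.from_pretrained("taesd", torch_dtype=torch.float16).to("cuda")
```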
At the top of `stream_app.py`, update the hardcoded `H:/streamproject2/...` paths to match your machine:

```python
SD_MODEL_PATH = "your/path/to/model.safetensors"
TAESD_PATH = "your/path/to/taesd"
MODELS_DIR = "your/path/to/models"
# ... etc.
```

Then launch:

```bash
# Windows
run_stream.bat

# Or directly
python stream_app.py
```

Open `http://localhost:5003` in a browser.
On another device (same LAN): http://<your-PC-ip>:5003/display for the full-screen display view.
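Under the hood, both views are served with the standard Flask MJPEG pattern: a generator yields JPEG-encoded frames inside a `multipart/x-mixed-replace` response. A minimal standalone sketch (the raw webcam here stands in for the app's stylized-frame source; route and function names are illustrative):

```python
import cv2
from flask import Flask, Response

app = Flask(__name__)
cap = cv2.VideoCapture(0)  # stand-in for the stylized-frame buffer

def mjpeg_frames():
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        ok, jpg = cv2.imencode(".jpg", frame)
        if not ok:
            continue
        # Each multipart chunk replaces the previous image in the browser
        yield (b"--frame\r\n"
               b"Content-Type: image/jpeg\r\n\r\n" + jpg.tobytes() + b"\r\n")

@app.route("/display")
def display():
    return Response(mjpeg_frames(),
                    mimetype="multipart/x-mixed-replace; boundary=frame")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5003)
```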
Gesture controls (requires webcam):
- Point index finger up (other fingers folded) → next model
- Point index finger down → previous model
- Max 4 switches per session; click Stop in the UI to reset
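Detection of this kind typically reduces to comparing the index fingertip against its middle joint in MediaPipe's landmark output. A hedged sketch using the MediaPipe Tasks API (the 0.05 threshold is an assumption, and the folded-fingers check is omitted; see `stream_app.py` for the actual logic):

```python
import mediapipe as mp
from mediapipe.tasks import python as mp_tasks
from mediapipe.tasks.python import vision

# Load the hand landmarker (same .task file listed in the table above)
options = vision.HandLandmarkerOptions(
    base_options=mp_tasks.BaseOptions(model_asset_path="models/hand_landmarker.task"),
    num_hands=1,
)
landmarker = vision.HandLandmarker.create_from_options(options)

def pointing_direction(rgb_frame) -> str | None:
    """Return 'up', 'down', or None; rgb_frame is a uint8 RGB numpy array."""
    image = mp.Image(image_format=mp.ImageFormat.SRGB, data=rgb_frame)
    result = landmarker.detect(image)
    if not result.hand_landmarks:
        return None
    hand = result.hand_landmarks[0]
    tip, pip_joint = hand[8], hand[6]  # index fingertip and its middle joint
    if tip.y < pip_joint.y - 0.05:     # normalized y grows downward
        return "up"
    if tip.y > pip_joint.y + 0.05:
        return "down"
    return None
```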
```
streamproject2/
├── stream_app.py      # Main app — inference loop, Flask server, gesture thread
├── vrm_face.py        # VRM 3D face renderer (pyrender + MediaPipe FaceLandmarker)
├── plot_thesis.py     # Generate thesis figures from frame_log.csv
├── run_stream.bat     # Windows launcher
├── taesd/             # TAESD lightweight VAE (not included, download separately)
│   ├── config.json
│   └── diffusion_pytorch_model.safetensors
├── models/            # SD 1.5 .safetensors + MediaPipe .task files (not included)
└── reference/         # VRM model file (not included)
    └── animeface.vrm
```
If you want to reproduce the frame-diff and FPS charts:
- Run with `TEMPORAL_ALPHA = 0.15` for ~60 s → rename `frame_log.csv` → `frame_log_smooth.csv`
- Set `TEMPORAL_ALPHA = 0.0`, run again → rename → `frame_log_nosmooth.csv`
- Run `python plot_thesis.py` → outputs `fig_temporal.png` and `fig_fps.png`
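For orientation, one plausible form of the per-frame flicker metric behind `fig_temporal.png` is a mean absolute difference between consecutive output frames (an assumption; `plot_thesis.py` defines the actual metric):

```python
import numpy as np

def frame_diff(prev: np.ndarray, curr: np.ndarray) -> float:
    """Mean absolute pixel difference between consecutive uint8 output frames.
    Lower values with TEMPORAL_ALPHA = 0.15 indicate suppressed flicker."""
    return float(np.mean(np.abs(curr.astype(np.int16) - prev.astype(np.int16))))
```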
- Paths are hardcoded (H: drive) — edit the constants block at the top of `stream_app.py`
- Single-file architecture (~1300 lines) — prototype quality, not production-ready
- VRM face mask uses pixel blending, not a true ControlNet constraint
- Background stylization is uncontrolled (no segmentation)
- StreamDiffusion — Kodaira et al. 2023
- Stable Diffusion 1.5 — Rombach et al. 2022
- MediaPipe — Google
- pyrender — offscreen rendering
- Model weights from CivitAI community
MIT — see LICENSE.
Model weights have their own licenses (check each source).