StreamMirror — Real-time Portrait Stylization via StreamDiffusion

StreamMirror is a lightweight real-time portrait stylization system.
A webcam feed is continuously transformed into anime / ink-wash / fantasy-style video at ~10–20 FPS on a consumer GPU (RTX 4070 Ti), streamed live to any browser — PC or mobile.

Built as an interactive digital-art installation for an undergraduate thesis in Digital Media Art.


Demo

| Camera input | Stylized output (DreamShaper 8) |
| --- | --- |
| (your face) | (anime character) |

MJPEG stream accessible at http://localhost:5003 (PC) or http://<LAN-IP>:5003/display (iPad / mobile).


Features

  • StreamDiffusion img2img — ~2-step inference, ~50–100 ms per frame
  • Temporal blending — output_t = 0.85 × SD(t) + 0.15 × output_{t-1} suppresses flicker
  • VRM 3D face mask — anime head model rendered via pyrender, blended over the camera frame before SD inference for structural guidance
  • MediaPipe gesture control — point finger up/down to switch style models hands-free
  • Hot model reload — swap SD 1.5 .safetensors models at runtime with VRAM cleanup
  • Cross-device streaming — Flask + MJPEG, works in any browser including iPad Safari
  • TAESD — optional lightweight VAE decoder (~5 ms vs ~50 ms per decode) for extra speed
  • Fully offline — HF_HUB_OFFLINE=1, no network calls during inference
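The temporal blending rule above can be sketched as a small helper (blend_frames is a hypothetical name for illustration; the actual loop lives in stream_app.py):

```python
import numpy as np

def blend_frames(sd_frame, prev_output, alpha=0.85):
    """Exponential temporal blend: output_t = alpha * SD(t) + (1 - alpha) * output_{t-1}.

    alpha is the weight of the fresh StreamDiffusion frame; on the first
    frame there is no previous output, so the SD frame passes through.
    """
    if prev_output is None:
        return sd_frame.astype(np.float32)
    return alpha * sd_frame.astype(np.float32) + (1.0 - alpha) * prev_output
```

With alpha = 0.85 this matches the formula in the feature list; lowering alpha smooths more at the cost of ghosting on fast motion.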

System Requirements

| Component | Minimum | Tested |
| --- | --- | --- |
| GPU | NVIDIA, 8 GB VRAM | RTX 4070 Ti 12 GB |
| CUDA | 11.8+ | 12.1 |
| Python | 3.10 | 3.10 (conda) |
| OS | Windows 10/11 | Windows 10 22H2 |

Linux should work too — remove the DSHOW camera backend flag in stream_app.py if needed.


Installation

1. Create conda environment

conda create -n streamproject python=3.10 -y
conda activate streamproject

2. Install PyTorch (CUDA 12.1)

pip install torch==2.1.2+cu121 torchvision==0.16.2+cu121 --index-url https://download.pytorch.org/whl/cu121

3. Install dependencies

pip install diffusers==0.24.0 transformers accelerate safetensors
pip install streamdiffusion
pip install flask opencv-python pillow numpy
pip install mediapipe
pip install pyrender trimesh pygltflib

4. Download model files

Place files in the directories below (paths are configurable at the top of stream_app.py):

| File | Destination | Source |
| --- | --- | --- |
| Any SD 1.5 .safetensors | models/ | CivitAI / HuggingFace |
| hand_landmarker.task | models/ | MediaPipe Models |
| face_landmarker.task | models/ | MediaPipe Models |
| TAESD config.json + diffusion_pytorch_model.safetensors | taesd/ | madebyollin/taesd |
| Any VRM head model | reference/animeface.vrm | VRoid Hub or custom |

VRM and TAESD are optional. The system runs without them (falls back gracefully).

5. Edit paths in stream_app.py

At the top of the file, update the hardcoded H:/streamproject2/... paths to match your machine:

SD_MODEL_PATH = "your/path/to/model.safetensors"
TAESD_PATH    = "your/path/to/taesd"
MODELS_DIR    = "your/path/to/models"
# ... etc.

Usage

# Windows
run_stream.bat

# Or directly
python stream_app.py

Open http://localhost:5003 in a browser.
On another device (same LAN): http://<your-PC-ip>:5003/display for the full-screen display view.

Gesture controls (requires webcam):

  • Point index finger up (other fingers folded) → next model
  • Point index finger down → previous model
  • Max 4 switches per session; click Stop in the UI to reset
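The direction check can be sketched from MediaPipe's 21-point hand-landmark indexing (wrist = 0, index tip = 8, index PIP = 6, and so on). This is an assumed reconstruction — pointing_direction is a hypothetical name and the real handler lives in stream_app.py:

```python
import math

# MediaPipe hand-landmark indices (21 points per hand)
WRIST = 0
INDEX_TIP, INDEX_PIP = 8, 6
OTHER_FINGERS = [(12, 10), (16, 14), (20, 18)]  # middle, ring, pinky (tip, PIP)

def _dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def pointing_direction(lm):
    """lm: 21 (x, y) landmark tuples in image coordinates (y grows downward).

    Returns 'up', 'down', or None. A finger counts as folded when its tip
    is closer to the wrist than its PIP joint, which is orientation-invariant.
    """
    wrist = lm[WRIST]
    index_extended = _dist(lm[INDEX_TIP], wrist) > _dist(lm[INDEX_PIP], wrist)
    others_folded = all(_dist(lm[t], wrist) < _dist(lm[p], wrist)
                        for t, p in OTHER_FINGERS)
    if not (index_extended and others_folded):
        return None
    return "up" if lm[INDEX_TIP][1] < wrist[1] else "down"
```

In the app, 'up' would advance to the next model and 'down' to the previous one, debounced so one gesture triggers one switch.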

Project Structure

streamproject2/
├── stream_app.py          # Main app — inference loop, Flask server, gesture thread
├── vrm_face.py            # VRM 3D face renderer (pyrender + MediaPipe FaceLandmarker)
├── plot_thesis.py         # Generate thesis figures from frame_log.csv
├── run_stream.bat         # Windows launcher
├── taesd/                 # TAESD lightweight VAE (not included, download separately)
│   ├── config.json
│   └── diffusion_pytorch_model.safetensors
├── models/                # SD 1.5 .safetensors + MediaPipe .task files (not included)
└── reference/             # VRM model file (not included)
    └── animeface.vrm

Generating Thesis Figures

If you want to reproduce the frame-diff and FPS charts:

  1. Run with TEMPORAL_ALPHA = 0.15 for ~60 s → rename frame_log.csv → frame_log_smooth.csv
  2. Set TEMPORAL_ALPHA = 0.0, run again → rename → frame_log_nosmooth.csv
  3. Run python plot_thesis.py → outputs fig_temporal.png and fig_fps.png
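The flicker metric behind fig_temporal.png can be sketched as the mean absolute pixel difference between consecutive output frames — an assumed metric here; check plot_thesis.py for the exact computation:

```python
import numpy as np

def mean_frame_diff(frames):
    """Average per-pixel absolute difference between consecutive frames.

    Lower values on the TEMPORAL_ALPHA = 0.15 run versus the 0.0 run
    indicate that blending is suppressing frame-to-frame flicker.
    """
    diffs = [np.abs(a.astype(np.float32) - b.astype(np.float32)).mean()
             for a, b in zip(frames[:-1], frames[1:])]
    return float(np.mean(diffs))
```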

Known Limitations

  • Paths are hardcoded (H: drive) — edit the constants block at the top of stream_app.py
  • Single-file architecture (~1300 lines) — prototype quality, not production-ready
  • VRM face mask uses pixel blending, not a true ControlNet constraint
  • Background stylization is uncontrolled (no segmentation)

License

MIT — see LICENSE.
Model weights have their own licenses (check each source).
