A three-step character LoRA builder:
- Extract crops of one or more characters from a video using reference images.
- Auto-tag each crop with WD14 danbooru tags and natural-language captions, then reorganize the dataset from the UI.
- Train a LoRA per character on Anima with the parameters already wired in.
The extractor and tagger are model-agnostic and produce output sized for kohya-ss / OneTrainer / sd-scripts on SDXL-class anime models (Pony, Illustrious, NoobAI, vanilla SDXL). The trainer targets Anima.
For each video:
- PySceneDetect splits it into shots.
- DeepGHS YOLO (via `imgutils`) detects characters per frame.
- ByteTrack links detections into per-shot tracklets.
- CCIP matches tracklets to each character's reference images and assigns tracklets to whichever character scores best (or rejects them if no character matches).
- 1–3 frames per kept tracklet are picked by sharpness, visibility, and aspect ratio.
- Each pick is cropped at longest-side 1024 with the original background.
- CCIP runs over the kept crops a second time to drop near-duplicates inside a sliding window.
- WD14 EVA02-Large v3 writes a kohya-style `.txt` next to each `.png`.
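
The identification step in the list above (assign each tracklet to the best-scoring character, or reject it) can be sketched as follows. This is an illustration only: it assumes the matcher yields a distance where lower means more similar, and the function name, input shape, and `max_distance` value are hypothetical rather than taken from the project.

```python
def assign_tracklet(distances, max_distance=0.2):
    """Pick the character whose closest reference image is nearest.

    distances: {character_name: [distance to each reference image]}
    Returns the character name, or None when nothing is close enough.
    max_distance is a hypothetical threshold, not a project default.
    """
    best_name, best_score = None, None
    for name, dists in distances.items():
        if not dists:
            continue  # character has no reference images
        score = min(dists)  # closest reference decides the score
        if best_score is None or score < best_score:
            best_name, best_score = name, score
    if best_score is None or best_score > max_distance:
        return None  # reject: no character matches well enough
    return best_name
```

A tracklet that is 0.12 away from one of Megumin's references and 0.45 from Aqua's would go to Megumin; a tracklet whose best distance exceeds the threshold is rejected.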
Detections and tracklets are cached so threshold re-runs skip the slow stages.
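
The near-duplicate pass from the list above can be pictured as a sliding-window filter: a crop is kept only if it is far enough from the last few crops that were kept. The sketch below is a minimal version under stated assumptions: `distance` stands in for whatever similarity measure is used, and `window` and `min_distance` are hypothetical parameters.

```python
from collections import deque


def drop_near_duplicates(items, distance, window=5, min_distance=0.1):
    """Keep an item only if it differs enough from the last `window` kept items.

    distance(a, b) returns a non-negative number, lower meaning more similar.
    window and min_distance are illustrative values, not project defaults.
    """
    kept = []
    recent = deque(maxlen=window)  # only the most recently kept items are compared
    for item in items:
        if any(distance(item, prev) < min_distance for prev in recent):
            continue  # too close to something recently kept
        kept.append(item)
        recent.append(item)
    return kept
```

With plain numbers and absolute difference as the distance, `drop_near_duplicates([0.0, 0.05, 0.5, 0.52, 1.0], lambda a, b: abs(a - b))` keeps `[0.0, 0.5, 1.0]`. Because the window is bounded, a pose that reappears much later in the episode is not compared against its earlier occurrence.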
System packages (`install_and_run.sh` installs these via apt on Debian/Ubuntu/WSL2; install manually on other distros):
- `ffmpeg` (used by the UI for video thumbnails, segment previews, and probing)
- `git` (used to clone the trainer)
For the extractor/tagger:
- NVIDIA GPU, 4 GB VRAM minimum, 8 GB comfortable
For the trainer:
- Linux / WSL2 with CUDA 12.4+
- NVIDIA GPU, 6 GB VRAM minimum, 18 GB for full res LoRA

One-click install:

    git clone https://github.com/negaga53/neme-anima.git
    cd neme-anima
    bash install_and_run.sh

The script installs uv and Node.js if they aren't already on the system, syncs the Python deps, builds the frontend, clones tdrussell/diffusion-pipe into `~/diffusion-pipe` and sets up its Python 3.12 venv, downloads the three Anima training weights (~14 GB) from Hugging Face, prefills the four trainer paths in the UI's Settings tab, and starts the server.
Re-running it is safe: it skips anything already in place. To update an existing one-click install, run `git pull --ff-only` and then `bash install_and_run.sh` again; the script rebuilds the web UI.
Useful environment overrides:
| Variable | Default | What it does |
|---|---|---|
| `DIFFUSION_PIPE_DIR` | `~/diffusion-pipe` | Where to clone diffusion-pipe |
| `DIFFUSION_PIPE_PYTHON` | `3.12` | Python interpreter/version for diffusion-pipe's venv |
| `MODELS_DIR` | `~/.cache/neme-anima/models` | Where to put the downloaded weights |
| `SKIP_MODELS=1` | off | Skip the 14 GB weight download |
| `SKIP_LAUNCH=1` | off | Install everything, but don't start the UI at the end |
Linux / WSL2 only. On Linux, Node.js is installed through apt (with sudo) when available, and through nvm otherwise.

    uv sync --group gpu

First run downloads ~2.8 GB of weights (anime YOLOv8 person + face, CCIP, isnetis/anime-seg, WD14 with embeddings, CLIP base) to `~/.cache/huggingface/hub/`.

    uv run neme-anima project create ~/neme-projects/megumin --name megumin
    uv run neme-anima project add-ref ~/neme-projects/megumin /path/to/portrait.png
    uv run neme-anima project add-video ~/neme-projects/megumin /path/to/ep01.mkv
    uv run neme-anima project add-video ~/neme-projects/megumin /path/to/ep02.mkv
    uv run neme-anima project extract ~/neme-projects/megumin

Project folder layout:

    ~/neme-projects/megumin/
    project.json
    refs/
    output/
    kept/            ep01__s003_t012_f000847.png + .txt
    rejected/
    metadata.jsonl
    cache/<stem>/    scenes.parquet, tracklets.parquet
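
For scripting against the kept crops, the example file name in the layout above can be split into its parts. The sketch below assumes the name encodes the video stem, then shot (`s`), tracklet (`t`), and frame (`f`) numbers; that reading is inferred from the example name alone, and the helper names are hypothetical.

```python
import re
from pathlib import Path

# Assumed pattern: <video stem>__s<shot>_t<tracklet>_f<frame>
CROP_NAME = re.compile(
    r"^(?P<video>.+)__s(?P<scene>\d+)_t(?P<tracklet>\d+)_f(?P<frame>\d+)$"
)


def parse_crop_name(path):
    """Split a crop file name into its (assumed) components, or return None."""
    m = CROP_NAME.match(Path(path).stem)
    if m is None:
        return None
    return {
        "video": m["video"],
        "scene": int(m["scene"]),
        "tracklet": int(m["tracklet"]),
        "frame": int(m["frame"]),
    }


def caption_path(image_path):
    """Return the sibling .txt caption path for a crop image."""
    return Path(image_path).with_suffix(".txt")
```

For `ep01__s003_t012_f000847.png` this yields video `ep01`, scene 3, tracklet 12, frame 847, and `caption_path` points at `ep01__s003_t012_f000847.txt` in the same folder.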
Re-run with new thresholds (skips detection + tracking):

    uv run neme-anima project rerun ~/neme-projects/megumin --video ep01

After cloning the repository, or after `git pull` when updating, install/update the frontend dependencies and rebuild the static UI bundle:

    cd frontend && npm install && npm run build && cd ..

`git pull` alone can leave you running the old UI because the built files in `src/neme_anima/server/static/` are generated locally and not committed.
Then start the server:

    uv run neme-anima ui

Binds to `127.0.0.1:<random-port>` and opens the SPA. Tabs: Sources, Frames, Training, Settings.
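
As background on the random port: binding to port 0 asks the operating system for any free port, which is one common way to pick a random local port. Whether neme-anima does it this way is an assumption; the function below is a hypothetical standalone sketch, not the project's code.

```python
import socket


def pick_free_port(host="127.0.0.1"):
    """Ask the OS for a currently free TCP port on the given host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))  # port 0 lets the OS choose a free port
        return s.getsockname()[1]
```

The port is released when the socket closes, so another process could claim it before a server binds to it. Servers that need to avoid that race usually bind once and keep the socket open.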
Sources tab: add MKV/MP4 videos and reference images.

Frames tab:
- Add or remove images from the dataset (drag and drop).
- Edit tags inline by clicking a pill; edit the natural-language description in the same panel.
- Search across the dataset by tag.
- Bulk-edit tags with regex replace, with live preview.
- Re-crop any image.
- Filter by character with the chips at the top of the tab. Move a frame to a different character or also-assign it to a second character.
Selection: shift-click ranges, ctrl-click multi-toggle, A select all, D / Esc deselect. Hover a thumbnail for the tag overlay.
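
The regex bulk edit with live preview can be approximated offline as a dry run over comma-separated captions. The sketch below assumes kohya-style captions (tags separated by commas) and applies the pattern to each tag separately; the function name and return shape are hypothetical and do not describe the UI's internals.

```python
import re


def preview_tag_replace(captions, pattern, replacement):
    """Dry-run a regex replace over each tag; return only captions that change.

    captions: {file_name: "tag a, tag b, ..."}
    Returns {file_name: (old_caption, new_caption)} without writing anything.
    """
    rx = re.compile(pattern)
    changes = {}
    for name, caption in captions.items():
        tags = [t.strip() for t in caption.split(",") if t.strip()]
        new_tags = []
        for tag in tags:
            new = rx.sub(replacement, tag).strip()
            if new and new not in new_tags:  # drop emptied tags and duplicates
                new_tags.append(new)
        new_caption = ", ".join(new_tags)
        if new_caption != ", ".join(tags):
            changes[name] = (caption, new_caption)
    return changes
```

Replacing a tag with an empty string removes it, so the pattern `^smile$` with an empty replacement strips `smile` from every caption that has it while leaving tags such as `smile_with_teeth` untouched.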
Training tab: LoRA training with stop/resume and checkpoint retention. Targets Anima. One LoRA per character, queued sequentially.
Two per-character knobs sit alongside the standard rank/alpha/lr settings:
- Core-tag pruning. Compute the tags that show up in more than X% of a character's frames (default 35%) and drop them from the captions at staging time. Turns "long hair, blue eyes, school uniform, smile, ..." into just "smile, ..." for a character whose hair, eyes, and outfit are constants. The LoRA learns those from the visuals; the caption only adds noise. Off by default; opt in once you've reviewed the suggested list.
- Repeat multiplier. Over- or under-sample a character's frames in the dataset. `0.0` is auto, computed from relative frame counts so a 50-frame character isn't drowned out by a 500-frame one. A positive value pins it manually.
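
Core-tag pruning as described above comes down to counting how many captions each tag appears in and dropping the tags above the cutoff. The following is a minimal sketch of that idea, not the project's implementation: the function names are hypothetical, and only the 35% default comes from the description.

```python
from collections import Counter


def split_tags(caption):
    """Split a comma-separated caption into stripped, non-empty tags."""
    return [t.strip() for t in caption.split(",") if t.strip()]


def core_tags(captions, threshold=0.35):
    """Return tags present in more than `threshold` of the captions."""
    counts = Counter()
    for caption in captions:
        counts.update(set(split_tags(caption)))  # count each tag once per frame
    total = len(captions)
    return {tag for tag, n in counts.items() if total and n / total > threshold}


def prune_caption(caption, core):
    """Drop core tags from one caption, keeping the remaining tag order."""
    return ", ".join(t for t in split_tags(caption) if t not in core)
```

For three captions that all contain `long hair` and `blue eyes` but differ in the third tag, the core set is those two tags, and `long hair, blue eyes, smile` is reduced to `smile`.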

Training is run through tdrussell/diffusion-pipe, which has to be set up separately:

    git clone https://github.com/tdrussell/diffusion-pipe ~/diffusion-pipe
    cd ~/diffusion-pipe
    uv venv --python 3.12
    uv pip install --python .venv/bin/python -r requirements.txt

Then in the Settings tab, point `diffusion_pipe_dir` at that clone and set the Anima DiT, Qwen VAE, and Qwen 3 0.6B text encoder paths (the weights are a separate download from Hugging Face).
Settings tab: per-project threshold overrides (frame stride, identification distance, crop padding, etc.).
Project state lives in the project folder. The only server-side file is ~/.neme-anima/db.sqlite (project registry).



