AnimTOON: Token-Efficient Vector Animation Generation

3.7-6.9x fewer output tokens than OmniLottie (CVPR 2026) on the same prompts when generating Lottie animations, running on a single consumer GPU.

Model (v3): huggingface.co/srk0102200/AnimTOON-3B (now with character animation support)

New in v3: Multi-part SVG animation, coordinated 14-layer character idle/walk cycles, trained on Spine + DragonBones skeletal data

AnimTOON is a compact, plain-text animation format designed for LLMs to generate Lottie animations with minimal tokens. Unlike existing approaches that require custom tokenizers and large GPU clusters, AnimTOON works with any LLM and runs on consumer hardware.

What's New in v3

Character Animation: 14-layer coordinated walk/idle cycles from text
Multi-Part SVG: Animate individual parts of complex SVGs (47-part crab demo)
Spine/DragonBones Training: Model understands skeletal hierarchy (arms, legs, head, torso)
Per-Part Anchor Points: Each SVG part rotates around its own center (no more flying parts)

Demo: AnimTOON vs OmniLottie (Same Prompt)

Crab Animation (47-part SVG)

AnimTOON (Ours)	OmniLottie (CVPR 2026)
1024 tokens / 82s / 30fps	2001 tokens / 55s+ / 8fps
Real SVG + per-part animation	AI-generated (not a crab)

Apple Logo Animation

AnimTOON (Ours)	OmniLottie (CVPR 2026)
166 tokens / 26s / 30fps	4095 tokens / 55s+ / 8fps
Real SVG shape + AI animation	AI-generated shape (incorrect)

Why the difference? OmniLottie generates shapes and animation with one model, which leads to hallucinated shapes and token bloat. AnimTOON separates the two concerns: the SVG supplies the exact shapes, and the model spends its entire output on animation.

Benchmark Results

Token Efficiency (Measured, Same Prompt)

Metric	AnimTOON	OmniLottie	Raw Lottie JSON
Avg Output Tokens	166-597	616-4095	18,202
Token Reduction	98.8% vs JSON	~97% vs JSON	baseline
AnimTOON vs OmniLottie	3.7-6.9x fewer (616/166 simple, 4095/597 complex)	baseline	-

Side-by-Side Comparison

Metric	AnimTOON	OmniLottie
Output Tokens (simple)	166	616
Output Tokens (complex)	597	4095
Shape + Animation Tokens	207 (41+166)	1113
Generation Time	13-38s	55-120s+
Frame Rate	30 fps	8 fps
VRAM (inference)	~5 GB	~15.2 GB
Model Size	3B (LoRA)	4B (full fine-tune)
Training Hardware	1x RTX 5060 Ti 16GB	Multi-GPU cluster
Custom Tokenizer	No (plain text)	Yes (40k custom tokens)
Accepts SVG Input	Yes	No
Generates Shapes	No (uses SVG)	Yes (limited quality)
Format	Plain text (any LLM)	Custom parameterized tokens
File Size	1-4 KB	20-175 KB
Format Success Rate	100% (converter guarantees)	88.3%
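The token ratios implied by the table can be checked with a few lines of Python. This is only arithmetic on the numbers listed above; the dictionary layout and function name are ours, not part of the repository.

```python
# Token counts copied from the side-by-side table above.
measurements = {
    "simple": {"animtoon": 166, "omnilottie": 616},
    "complex": {"animtoon": 597, "omnilottie": 4095},
    "shape_plus_animation": {"animtoon": 207, "omnilottie": 1113},
}


def token_ratio(row):
    """How many OmniLottie tokens are spent per AnimTOON token."""
    return row["omnilottie"] / row["animtoon"]


ratios = {name: round(token_ratio(row), 1) for name, row in measurements.items()}
print(ratios)
# {'simple': 3.7, 'complex': 6.9, 'shape_plus_animation': 5.4}
```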

What is AnimTOON?

AnimTOON is a human-readable, token-efficient text format that describes Lottie animations:

anim fr=30 dur=120

layer Logo shape
  fill #000000
  path sh x2
  pos [0.5,0.5]
  rot 0.0->-67 0.04->46 0.14->-31 0.28->0 ease=bounce
  scale 0.0->[0,0] 0.14->[90,90] 0.28->[100,100] ease=smooth
  opacity 0.0->0 0.14->100 ease=fade

This 166-token output produces a complete animated .lottie file with bounce entrance, rotation wobble, and fade-in.

The same animation in raw Lottie JSON would be 18,000+ tokens.
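To make the keyframe syntax concrete, here is a minimal parsing sketch for a single property line. It is not the repository's converter (that lives in src/toon_animator.py); the function names and the returned dictionary shape are assumptions made for illustration, and the grammar is inferred only from the example above.

```python
import re

# Hypothetical helper, not part of the repository's toon_animator module.
# A keyframe token looks like "<time>-><value>", e.g. "0.04->46" or "0.14->[90,90]".
KEYFRAME = re.compile(r"^(-?\d+(?:\.\d+)?)->(.+)$")


def parse_value(raw):
    """Parse a scalar like '-67' or a vector like '[90,90]'."""
    if raw.startswith("[") and raw.endswith("]"):
        return [float(part) for part in raw[1:-1].split(",")]
    return float(raw)


def parse_property_line(line):
    """Split a line such as 'rot 0.0->-67 0.04->46 ease=bounce' into its parts."""
    name, *tokens = line.split()
    keyframes, ease = [], None
    for token in tokens:
        if token.startswith("ease="):
            ease = token[len("ease="):]
            continue
        match = KEYFRAME.match(token)
        if match is None:
            raise ValueError(f"unrecognized token: {token!r}")
        keyframes.append((float(match.group(1)), parse_value(match.group(2))))
    return {"name": name, "keyframes": keyframes, "ease": ease}


parsed = parse_property_line("rot 0.0->-67 0.04->46 0.14->-31 0.28->0 ease=bounce")
print(parsed["keyframes"])
# [(0.0, -67.0), (0.04, 46.0), (0.14, -31.0), (0.28, 0.0)]
```

Static properties such as fill or pos carry no "->" tokens, so a real parser would need a separate branch for them; this sketch raises ValueError on such lines.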

How It Works

                    AnimTOON Pipeline
    +--------+     +---------+     +-----------+     +--------+
    |  SVG   | --> | Prompt  | --> | AnimTOON  | --> |.lottie |
    |  File  |     | Builder |     |  Model    |     |  File  |
    +--------+     +---------+     +-----------+     +--------+
                                        |
                                   Generates only
                                   animation text
                                   (166 tokens)
                                        |
                                        v
                                  +-----------+
                                  | Converter |
                                  | (toon_    |
                                  | animator) |
                                  +-----------+
                                        |
                              Combines SVG paths +
                              model animations
                              into .lottie file

Key insight: The model only generates animation keyframes (166 tokens for the logo example). Shapes come from the SVG file, and the converter deterministically builds valid Lottie JSON from both. This separation is why we achieve a 98.8% token reduction versus raw Lottie JSON.
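The separation of concerns can be pictured as a merge step keyed by layer name. The sketch below is purely illustrative: the dictionary shapes and the combine function are hypothetical and do not mirror the actual data structures in toon_animator.py.

```python
def combine(svg_shapes, animations):
    """Attach model-generated keyframes to shapes taken from the SVG.

    svg_shapes: {layer_name: path_data} extracted from the SVG (assumed layout)
    animations: {layer_name: {property: keyframes}} parsed from model output (assumed layout)
    """
    layers = []
    for name, path_data in svg_shapes.items():
        layers.append({
            "name": name,
            "path": path_data,                      # geometry never comes from the model
            "animation": animations.get(name, {}),  # parts without keyframes stay static
        })
    return layers


layers = combine(
    {"claw": "M0 0 L10 0", "shell": "M5 5 L9 9"},
    {"claw": {"rot": [(0.0, 0), (1.0, 15)]}},
)
print([layer["name"] for layer in layers if layer["animation"]])
# ['claw']
```

Because the geometry is copied through unchanged, a malformed or missing animation for one layer cannot corrupt its shape; in this sketch it simply leaves that layer static.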

Architecture

Why AnimTOON is Different

Approach	What Model Outputs	Tokens	Quality
Raw JSON	Full Lottie JSON with shapes + metadata	18,000+	Fragile, many format errors
OmniLottie	Custom parameterized tokens (shapes + animation)	486-4095	Good but needs custom tokenizer
AnimTOON	Plain text animation keyframes only	166-597	100% valid (converter guarantees)

The AnimTOON Format

anim fr=30 dur=120          # framerate, duration

layer Body shape             # layer name + type
  fill #4A90D9               # color
  path ellipse w=0.2 h=0.2  # shape (or 'sh' for complex bezier)
  pos [0.5,0.5]             # position (normalized 0-1)
  rot 0.0->0 1.0->360 ease=linear        # rotation keyframes
  scale 0.0->[0,0] 0.2->[100,100]        # scale keyframes
  opacity 0.0->0 0.3->100 ease=fade      # opacity keyframes

Supported properties: position (pos), rotation (rot), scale, opacity, fill, stroke, path (ellipse/rect/sh)

Easing types: smooth, linear, bounce, fade
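The following sketch shows how normalized values could map onto Lottie's frame-based keyframes. It assumes keyframe times are fractions of dur (as the 0.0 to 1.0 rotation example suggests) and uses a hypothetical 512x512 canvas; the helper names are ours. Real Lottie keyframes also carry easing handles, which are omitted here.

```python
def to_frame(t_norm, duration_frames):
    """Map a normalized time (0-1) to a frame index; assumes times are fractions of dur."""
    return round(t_norm * duration_frames)


def to_pixels(pos_norm, width, height):
    """Map a normalized position (0-1) to canvas coordinates."""
    x, y = pos_norm
    return [x * width, y * height]


def rotation_property(keyframes, duration_frames):
    """Build a Lottie-style animated property: "a": 1 marks it animated, "k" holds keyframes."""
    return {
        "a": 1,
        "k": [{"t": to_frame(t, duration_frames), "s": [value]} for t, value in keyframes],
    }


# 'rot 0.0->0 1.0->360' with 'anim fr=30 dur=120'
rot = rotation_property([(0.0, 0), (1.0, 360)], duration_frames=120)
# 'pos [0.5,0.5]' on a hypothetical 512x512 canvas
center = to_pixels([0.5, 0.5], width=512, height=512)
print(rot["k"], center)
# [{'t': 0, 's': [0]}, {'t': 120, 's': [360]}] [256.0, 256.0]
```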

Current Status

This is an early research release. The model is approximately 60% through training. A full model release will follow after extended training on cloud GPUs.

What Works Now (v3)

AnimTOON format specification (complete)
Bidirectional converter: AnimTOON <-> Lottie JSON <-> .lottie (100% reliable)
SVG -> animated .lottie pipeline (python-lottie + model + converter)
Simple icon/logo animations: pulse, bounce, spin, fade, wobble, scale
Character animations: 14-layer coordinated walk/idle cycles
Multi-part SVG animation: 47-part crab with per-part rotation
Per-shape-group anchor point calculation (BBox centroid)
Correct color matching from text descriptions
98.8% token reduction vs raw Lottie JSON
Trained on ~100k MMLottie-2M samples + 10k layer-aware samples + 984 Spine/DragonBones character samples
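The per-shape-group anchor point listed above is the center of the part's bounding box. A minimal sketch, assuming each part is available as a list of (x, y) points; the function name and the sample coordinates are hypothetical, and using raw path points only approximates the true extents of curved segments.

```python
def bbox_center(points):
    """Return the center of the axis-aligned bounding box around the given points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


# Hypothetical outline of one SVG part (e.g. a claw).
claw = [(10, 20), (30, 20), (30, 60), (10, 60), (20, 70)]
anchor = bbox_center(claw)
print(anchor)
# (20.0, 45.0)
```

Rotating each part around its own anchor, rather than around the canvas origin, is what keeps parts from swinging away from the body.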

Limitations (v3)

No shape generation (requires SVG input)
Model output varies between runs (temperature-dependent)
Position animation on shape groups still breaks layout; only rotation and scale are reliable on shape groups
Complex multi-character scenes not yet supported
Not yet trained on facial expressions (blink/smile/talk)

Roadmap

v1.0: Icon/logo animation from text + SVG input
v2.0: Layer-aware animation (understands SVG structure)
v3.0 (Current): Character animation with Spine/DragonBones skeletal data
v4.0 (Next): Live2D facial expressions + anime character animation
v5.0: Full manga-to-anime pipeline (panel → animated scene)

Quick Start

Installation

# Clone
git clone https://github.com/srk0102/svg-animator.git
cd svg-animator

# Setup (Windows)
setup.bat

# Setup (Mac/Linux)
chmod +x setup.sh && ./setup.sh

Generate Animation from SVG

python test_svg_pipeline.py inputs/apple.svg
# Output: outputs/apple.lottie
# Preview at: https://lottiefiles.com/preview

Generate Animation from Text

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("srk0102200/AnimTOON-3B")
model = AutoModelForCausalLM.from_pretrained(
    "srk0102200/AnimTOON-3B",
    dtype=torch.float16,  # older transformers releases name this argument torch_dtype
    device_map="cuda"
)

prompt = "a red circle pulsing in the center with a smooth bounce"
messages = [{"role": "user", "content": f"Generate AnimTOON animation: {prompt}"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to("cuda")

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
result = tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(result)
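Because sampled output varies between runs, it can help to sanity-check the generated text before handing it to the converter. The checker below is a hypothetical sketch: it only knows the keywords that appear in this README (plus an assumed stroke keyword) and is not the validation the converter itself performs.

```python
# Keywords seen in this README's examples; 'stroke' is assumed from the property list.
KNOWN_PROPERTIES = {"fill", "stroke", "path", "pos", "rot", "scale", "opacity"}


def check_animtoon(text):
    """Return a list of problems; an empty list means the text looks well formed."""
    problems = []
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines or not lines[0].startswith("anim "):
        problems.append("missing 'anim' header")
    in_layer = False
    # Numbers refer to non-blank lines, counted from 1.
    for number, line in enumerate(lines[1:], start=2):
        keyword = line.split()[0]
        if keyword == "layer":
            in_layer = True
        elif keyword in KNOWN_PROPERTIES:
            if not in_layer:
                problems.append(f"line {number}: property before any layer")
        else:
            problems.append(f"line {number}: unknown keyword {keyword!r}")
    return problems


sample_output = """anim fr=30 dur=90
layer Circle shape
  fill #FF4F59
  pos [0.5,0.5]
  scale 0.0->[0,0] 0.3->[100,100] ease=smooth
"""
print(check_animtoon(sample_output))
# []
```

If the list is non-empty, regenerating with a different seed or a lower temperature is one simple way to recover.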

Convert AnimTOON to .lottie

from src.toon_animator import animtoon_to_dotlottie_full

animtoon_text = """anim fr=30 dur=90
layer Circle shape
  fill #FF4F59
  path ellipse w=0.2 h=0.2
  pos [0.5,0.5]
  scale 0.0->[0,0] 0.15->[120,120] 0.3->[100,100] ease=smooth
  opacity 0.0->0 0.1->100 ease=fade
"""

animtoon_to_dotlottie_full(animtoon_text, "output.lottie")
# Preview at https://lottiefiles.com/preview
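A .lottie file is a ZIP-based container (dotLottie), so the output can be inspected with the standard library. The sketch below builds a tiny stand-in archive so it runs without the converter; the manifest.json plus animations/ layout is an assumption about the container and may differ from what animtoon_to_dotlottie_full writes.

```python
import io
import json
import zipfile


def list_animations(dotlottie_bytes):
    """Return the JSON animation entries stored in a .lottie archive."""
    with zipfile.ZipFile(io.BytesIO(dotlottie_bytes)) as archive:
        return [
            name for name in archive.namelist()
            if name.endswith(".json") and name != "manifest.json"
        ]


# Build a stand-in archive in memory (assumed layout, not real converter output).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("manifest.json", json.dumps({"animations": [{"id": "demo"}]}))
    archive.writestr("animations/demo.json", json.dumps({"fr": 30, "ip": 0, "op": 90}))

print(list_animations(buffer.getvalue()))
# ['animations/demo.json']
```

To inspect a real file, read it with open("output.lottie", "rb").read() and pass the bytes to list_animations.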

Training Your Own Model

Data Generation

# Download training data from MMLottie-2M (100k samples)
python src/dataset_pipeline.py --limit 100000 --output data/animtoon_train.jsonl

# Generate layer-aware training data
python src/gen_layer_data.py

Training with Unsloth

# Setup Unsloth environment
setup_unsloth.bat  # Windows
# or
./setup_unsloth.sh  # Mac/Linux

# Train
python src/train_unsloth.py \
  --data data/animtoon_train.jsonl \
  --model Qwen/Qwen2.5-3B-Instruct \
  --output models/animtoon-3b \
  --epochs 3 \
  --lora-dropout 0

Merge LoRA for Distribution

python merge_lora.py
# Creates models/animtoon-3b-v2-merged/

Project Structure

svg-animator/
  src/
    toon_animator.py      # AnimTOON <-> Lottie converter (core)
    dataset_pipeline.py   # MMLottie-2M data download + conversion
    train_unsloth.py      # LoRA training with Unsloth
    test_inference.py     # Model inference + .lottie generation
    prompt_builder.py     # SVG -> structured prompt
    gen_layer_data.py     # Generate layer-aware training data
    svg_animate.py        # SVG + AnimTOON -> animated Lottie
  test_svg_pipeline.py    # Full SVG animation pipeline
  benchmark_compare.py    # AnimTOON vs OmniLottie benchmark
  merge_lora.py           # Merge LoRA into base model
  inputs/                 # Test SVG files
  outputs/                # Generated .lottie files
  models/                 # Trained model checkpoints
  data/                   # Training data

Technical Details

Training Configuration

Parameter	Value
Base Model	Qwen/Qwen2.5-3B-Instruct
Method	LoRA (r=16, alpha=32)
Training Data	99,650 (MMLottie-2M) + 10,000 (layer-aware) + 984 (Spine/DragonBones)
Epochs	~2 (100k) + 3 (10k layers) + 3 (984 character)
Hardware	1x NVIDIA RTX 5060 Ti (16GB VRAM)
Framework	Unsloth + Transformers
Max Length	1024 tokens
Batch Size	1 x 16 (gradient accumulation)
Final Loss	0.47 (100k run) / 0.24 (layer-aware run)
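The batch settings above imply the following optimizer step count for the main run. This is plain arithmetic on the table values; whether the trainer rounds the last partial batch up or drops it is an assumption here.

```python
import math

samples = 99_650          # MMLottie-2M subset size from the table
micro_batch = 1           # per-device batch size
grad_accumulation = 16    # gradient accumulation steps

effective_batch = micro_batch * grad_accumulation
# Assumes the final partial batch still counts as one optimizer step.
steps_per_epoch = math.ceil(samples / effective_batch)
print(effective_batch, steps_per_epoch)
# 16 6229
```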

Token Reduction Analysis

From 99,650 training samples (MMLottie-2M):

Average original Lottie JSON: 18,202 tokens
Average AnimTOON output: 222 tokens
Average token reduction: 98.8%
Average layers per animation: 6.2
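The reduction figure follows directly from the two averages above, as this short check shows (variable names are ours):

```python
avg_json_tokens = 18_202      # average raw Lottie JSON length
avg_animtoon_tokens = 222     # average AnimTOON length

reduction = 1 - avg_animtoon_tokens / avg_json_tokens
compression = avg_json_tokens / avg_animtoon_tokens
print(f"{reduction:.1%} reduction, about {compression:.0f}x shorter")
# 98.8% reduction, about 82x shorter
```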

Comparison with Related Work

Method	Year	Tokens	Shapes	Animation	Custom Tokenizer	Hardware
Raw Lottie JSON	-	18,000+	Yes	Yes	No	Any
OmniLottie	CVPR 2026	486-4095	Yes	Yes	Yes (40k tokens)	Multi-GPU
LLM4SVG	CVPR 2025	~500-2000	Yes	No	No	Multi-GPU
SVGDreamer	CVPR 2024	N/A	Yes	No	N/A	GPU
AnimTOON	2026	166-597	Via SVG	Yes	No (plain text)	1x Consumer GPU

Vision: Anime Character Animation

The AnimTOON format is designed to scale to complex character animations:

# Future: anime character blink animation
anim fr=24 dur=24

layer left_eye shape
  scale 0.0->[100,100] 0.4->[100,10] 0.5->[100,100] ease=smooth

layer right_eye shape
  scale 0.0->[100,100] 0.4->[100,10] 0.5->[100,100] ease=smooth

layer hair shape
  rot 0.0->-2 0.5->2 1.0->-2 ease=smooth

Further training on Spine/Live2D rigging data is intended to teach the model joint constraints, coordinated multi-layer motion, and character-specific animation patterns.
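To see what the blink keyframes describe, the sketch below samples the eye's scale between keyframes. It uses plain linear interpolation and ignores the ease=smooth curve, so it only approximates what a Lottie player would render; the function name is hypothetical.

```python
def sample(keyframes, t):
    """Linearly interpolate a keyframe list [(time, [values]), ...] at normalized time t."""
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    for (t0, v0), (t1, v1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)
            return [a + (b - a) * w for a, b in zip(v0, v1)]
    return keyframes[-1][1]  # hold the last value after the final keyframe


# 'scale 0.0->[100,100] 0.4->[100,10] 0.5->[100,100]' from the blink example
blink = [(0.0, [100, 100]), (0.4, [100, 10]), (0.5, [100, 100])]
half_closed = sample(blink, 0.2)
print(half_closed)
# [100.0, 55.0]
```

The eye keeps its full width while its height shrinks to 10% at t=0.4 and reopens by t=0.5; after that the last keyframe is held for the rest of the loop.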

License

MIT License

Citation

@misc{sivaramakrishna2026animtoon,
  title={AnimTOON: Token-Efficient Vector Animation Generation via Compact Text Format},
  author={Siva RamaKrishna},
  year={2026},
  url={https://github.com/srk0102/svg-animator}
}

Acknowledgments

OmniLottie - MMLottie-2M dataset and benchmark inspiration
Unsloth - Fast LoRA training
python-lottie - SVG to Lottie conversion
LottieFiles - Lottie preview and community
