A Claude Code skill that generates short videos with Replicate's bytedance/seedance-2.0 model. Supports text-to-video and reference-image conditioning (image-to-video, multi-reference style/subject control).
When invoked, Claude will:
- Read `SKILL.md` to learn the model defaults, prompting tips, and when to fall back to other tools.
- Run `scripts/replicate_video.py generate ...` with the right flags.
- Save the MP4 to `./output_videos/seedance_<ts>.mp4` and return a `RESULT: { ... }` JSON line with the local path and Replicate URL.
Clone into your project's skills folder:
```bash
# Anthropic's convention: skills live in .claude/skills (or ~/.claude/skills for user-wide)
git clone https://github.com/Bomx/replicate-video-skill .claude/skills/replicate-video
pip install -r .claude/skills/replicate-video/requirements.txt
```

Add your Replicate token to the project `.env`:

```
REPLICATE_API_TOKEN=r8_...
```
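Before burning a paid run, it can be worth verifying that the token actually loads. A minimal sketch, assuming `python-dotenv` is installed (it is not necessarily in this skill's `requirements.txt`):

```python
# Illustrative token check -- assumes python-dotenv, which may not be
# part of this skill's requirements.
import os

from dotenv import load_dotenv

load_dotenv()  # pulls REPLICATE_API_TOKEN from ./.env into the environment

token = os.environ.get("REPLICATE_API_TOKEN", "")
if not token.startswith("r8_"):
    raise SystemExit("REPLICATE_API_TOKEN missing or malformed -- check your .env")
print("Replicate token loaded.")
```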
After restarting your session, type:
```
/replicate-video make a 7s cinematic drone pull-back over a misty pine forest, 1080p
```
Or just ask in plain English — the skill auto-triggers on phrases like "make a video", "animate this image", "text-to-video", etc.
```bash
# Text-to-video
python scripts/replicate_video.py generate \
  --prompt "Cinematic aerial drone pulling back over a misty pine forest at sunrise, 35mm" \
  --duration 7 --resolution 1080p --aspect-ratio 16:9

# Image-to-video with multiple references
python scripts/replicate_video.py generate \
  --prompt "Aerial drone descends and enters through the front door of this inspection center" \
  --reference-image ./storefront.jpg \
  --reference-image ./interior_1.jpg \
  --reference-image https://example.com/google_earth.png \
  --duration 7 --resolution 1080p --seed 24353

# Vertical (TikTok / Reels / Shorts)
python scripts/replicate_video.py generate \
  --prompt "..." --aspect-ratio 9:16 --duration 5

# With audio reference + generated audio
python scripts/replicate_video.py generate \
  --prompt "..." --generate-audio --reference-audio ./voiceover.mp3
```

The last stdout line is always machine-parseable:

```
RESULT: {"status":"succeeded","output_url":"https://replicate.delivery/.../tmp.mp4","local_path":"output_videos/seedance_1714857600.mp4","duration_s":7,"resolution":"1080p","aspect_ratio":"16:9","model":"bytedance/seedance-2.0","elapsed_s":110.0}
```
| Flag | Default | Notes |
|---|---|---|
| `--aspect-ratio` | `16:9` | also `9:16`, `1:1`, `4:3`, `3:4`, `21:9` |
| `--duration` | `7` | seconds, typical range 3–10 |
| `--resolution` | `1080p` | also `720p`, `480p` |
| `--generate-audio` | off | flip on for ambient/SFX |
| `--seed` | random | pin for reproducibility |
| `--model` | `bytedance/seedance-2.0` | pass `owner/name:hash` to pin a version |
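If you would rather bypass the CLI, the same model is reachable through Replicate's official Python client. A rough sketch only: the input keys below merely mirror the CLI flags and are assumptions, so check the model's schema on replicate.com before relying on them.

```python
import replicate  # pip install replicate; reads REPLICATE_API_TOKEN from the env

# NOTE: these input keys are guesses mirroring the CLI flags above,
# not a verified schema for bytedance/seedance-2.0.
output = replicate.run(
    "bytedance/seedance-2.0",
    input={
        "prompt": "Cinematic aerial drone pulling back over a misty pine forest",
        "duration": 7,
        "resolution": "1080p",
        "aspect_ratio": "16:9",
    },
)
print(output)  # typically a URL or file handle for the generated MP4
```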
Seedance 2.0 generations take ~1.5–4 minutes depending on resolution and length. Each run costs real money (per Replicate's pricing). The CLI polls until done — Ctrl+C cancels.
Prompting tips:

- Lead with the camera move: "aerial drone pulling back", "slow dolly in", "static wide", "handheld follow".
- Then subject + environment + lighting/lens: "35mm cinematic, golden hour, shallow depth of field".
- For interior reveals from exterior, name the transition: "camera approaches and enters through the front door; interior matches reference image 1".
- Tell the model what each reference is for: "reference 1 is the storefront, references 2–4 are interior, references 5+ are Google Earth overheads".
- Keep prompts under ~80 words; Seedance ignores very long prose. (A small prompt-builder sketch follows below.)
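One way to keep prompts in that shape and under the length cap is to assemble them from labeled parts. An illustrative helper, with field names that are mine rather than the skill's:

```python
def build_prompt(camera: str, subject: str, look: str, extra: str = "") -> str:
    """Assemble a Seedance prompt: camera move first, then subject, then look."""
    parts = [camera, subject, look, extra]
    prompt = ", ".join(p for p in parts if p)
    assert len(prompt.split()) <= 80, "Seedance ignores very long prose"
    return prompt

print(build_prompt(
    camera="aerial drone pulling back",
    subject="misty pine forest at sunrise",
    look="35mm cinematic, golden hour, shallow depth of field",
))
```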
MIT — see LICENSE.