A Vofy-powered skill for agents to use image and video generation models reliably.
This repository gives agents (Claude Code, Codex, OpenCode, Cursor, OpenClaw, and others) an easy way to generate images and videos.
- Works across agent environments including Codex, Claude Code, Cursor, OpenCode, and similar tools.
- Builds on vofy-cli, so agents can access 20+ modern image and video models through one consistent workflow.
- Reduces common agent mistakes with clear rules for authentication, non-interactive commands, downloads, and task handling.
- Helps agents go from natural-language requests to real generated outputs with less setup and less guesswork.
User ❯ Create a cinematic 6-second video of a paper airplane
flying through neon-lit streets in Tokyo at night,
use seedance 2.0
Agent ❯ Generated with seedance-2.0
as a 6-second 16:9 720p text-to-video render.
Local file: video_hMQpifnYS9yMTXzJ.mp4.
npm install -g vofy-cli@0.1.3
vofy login
Remember to run vofy login to finish setting up vofy-cli.
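Before handing work to an agent, it can help to confirm that the CLI actually landed on PATH. The following is a minimal sketch and not part of vofy-cli: `have_cmd` and `check_vofy_setup` are hypothetical helper names, and the check only tests that a `vofy` executable exists, not that `vofy login` succeeded.

```shell
# Hypothetical helper: succeed if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: report whether vofy-cli looks installed.
# It does not verify authentication; run `vofy login` separately.
check_vofy_setup() {
  if have_cmd vofy; then
    echo "vofy-cli found"
  else
    echo "vofy-cli missing: run npm install -g vofy-cli@0.1.3"
    return 1
  fi
}
```

Calling `check_vofy_setup` before the first generation request gives a clear message instead of a command-not-found error.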
npx -y skills add WhiteTowerAI/imagine-skill --skill '*' -y
Install for specific agents:
# Codex
npx -y skills add WhiteTowerAI/imagine-skill --skill '*' --agent codex -y
# Claude Code
npx -y skills add WhiteTowerAI/imagine-skill --skill '*' --agent claude-code -y
# Cursor
npx -y skills add WhiteTowerAI/imagine-skill --skill '*' --agent cursor -y
# OpenCode
npx -y skills add WhiteTowerAI/imagine-skill --skill '*' --agent opencode -y
Now tell your AI: /imagine create <what-you-want-to-generate>
You can also use /imagine for quick reference, imagine-models for model guides, and imagine-tasks to track past runs.
Note
Also works with one-line shell install, manual install, and agent install. See more installation options.
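The per-agent install commands above differ only in the `--agent` value, so they can be generated from a list. This is a sketch under that assumption: `install_cmd_for` is a hypothetical helper name, and the loop only prints the commands so they can be reviewed before anything is run.

```shell
# Hypothetical helper: print the install command for one agent.
# The command text is copied from the per-agent examples in this README.
install_cmd_for() {
  printf "npx -y skills add WhiteTowerAI/imagine-skill --skill '*' --agent %s -y\n" "$1"
}

# Dry run: print one install command per agent listed in this README.
for agent in codex claude-code cursor opencode; do
  install_cmd_for "$agent"
done
```

To execute the commands rather than print them, the output can be piped to `sh` once it looks right.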
vofy-cli is the execution layer. It talks to Vofy models, submits generation jobs, and returns files or result URLs.
imagine-skill is the agent layer. It teaches your AI agent when to use vofy image create, vofy video create, vofy tasks --plain, and other commands safely and consistently.
In practice, you install both:
- Install vofy-cli so media generation commands are available.
- Install imagine-skill for your AI tool so the agent knows the correct workflow.
- Ask your agent for an image or video in natural language, and the skill guides the CLI usage behind the scenes.
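The split between the two layers can be pictured as a lookup from the kind of request to the CLI entry point. This is an illustrative sketch only: `vofy_subcommand` is a hypothetical name, it is not how imagine-skill is implemented, and it prints just the command names mentioned in this README because their arguments are not documented here.

```shell
# Hypothetical helper: map a request kind to the vofy command named in this README.
# Arguments and flags beyond those shown in the README are intentionally omitted.
vofy_subcommand() {
  case "$1" in
    image) echo "vofy image create" ;;
    video) echo "vofy video create" ;;
    tasks) echo "vofy tasks --plain" ;;
    *)
      echo "unknown request kind: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `vofy_subcommand video` prints `vofy video create`, and an unrecognized kind returns a non-zero status so a caller can stop early.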
| Skill | Purpose |
|---|---|
| imagine | CLI overview, command quick-reference, getting started |
| imagine-create | Media creation workflow, from prompt to downloaded result |
| imagine-models | Model selection guide and detailed capability reference |
| imagine-tasks | Task listing, detail view, result download |
Imagine-skill supports a growing set of image and video models across major providers.
Representative models:
| Provider | Image Models | Video Models |
|---|---|---|
| Google | gemini-3.1-flash-image-preview, gemini-3-pro-image-preview | veo-3.1, veo-3.1-fast, veo-3.1-lite |
| OpenAI | gpt-image-1.5 | sora-2, sora-2-pro |
| xAI | grok-imagine-image, grok-imagine-image-pro | grok-imagine-video |
| ByteDance | seedream-4.5, seedream-5.0-lite | seedance-1.5-pro, seedance-2.0, seedance-2.0-fast |
| Kling | - | kling-2.6, kling-3.0, kling-motion-control, kling-3.0-motion-control |
Note
All models available on Vofy can be accessed through imagine-skill, and pricing is currently the same as on the Vofy website.
MIT