nb — one CLI for image and video generation across Gemini, DashScope (wan), Volcengine Ark (Seedream / Seedance), and OpenAI gpt-image-2. Switch backends with --model, keep one history, no SDK juggling.
pip install nbcraft
nb config set api_key $GEMINI_API_KEY
nb gen -p "a tiny astronaut surfing a banana wave, neon dusk" --ar 16:9 --res 2K

Each provider ships a different SDK, request shape, polling protocol, and quirk. nbcraft papers over that:
- One `nb gen` for 8 image models: pick by speed, price, region, or text-rendering strength.
- One `nb video gen` for Seedance 2.0, with submit/poll/download in a single call.
- One history (`~/.nbcraft/history.json`) tracks prompts, models, and outputs across all backends.
- No accounts you don't already have: bring whichever API keys you have; unused backends stay dormant.

| Alias | Backend | Model | Notes |
|---|---|---|---|
| `pro` (default) | Gemini | `gemini-3-pro-image-preview` | High quality |
| `flash` | Gemini | `gemini-3.1-flash-image-preview` | Faster, cheaper |
| `wan-pro` | DashScope 阿里云 | `wan2.7-image-pro` | T2I 4K, edit ≤2K, 9 refs |
| `wan` | DashScope 阿里云 | `wan2.7-image` | ≤2K, faster |
| `seedream` | Volcengine Ark 火山方舟 | `doubao-seedream-4-5-251128` | 4K, 14 refs, top text rendering |
| `seedream-lite` | Volcengine Ark 火山方舟 | `doubao-seedream-5.0-lite` | Latest 5.0, faster/cheaper |
| `seedream-legacy` | Volcengine Ark 火山方舟 | `doubao-seedream-4-0-250828` | 4.0 fallback |
| `gpt` | OpenAI | `gpt-image-2` | 4K, ~99% character-level text accuracy |

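If you script around `nb`, it can help to keep the image-model table above as data. The following is a minimal sketch: the alias, backend, and model values are copied from the table, while the dict layout and the helper names `resolve_alias` and `aliases_for` are hypothetical and are not part of nbcraft's API.

```python
# Alias table copied from the image-model table in this README.
# The dict layout and helper names are illustrative, not nbcraft internals.
IMAGE_MODELS = {
    "pro": ("Gemini", "gemini-3-pro-image-preview"),
    "flash": ("Gemini", "gemini-3.1-flash-image-preview"),
    "wan-pro": ("DashScope", "wan2.7-image-pro"),
    "wan": ("DashScope", "wan2.7-image"),
    "seedream": ("Volcengine Ark", "doubao-seedream-4-5-251128"),
    "seedream-lite": ("Volcengine Ark", "doubao-seedream-5.0-lite"),
    "seedream-legacy": ("Volcengine Ark", "doubao-seedream-4-0-250828"),
    "gpt": ("OpenAI", "gpt-image-2"),
}
DEFAULT_ALIAS = "pro"  # marked "(default)" in the table


def resolve_alias(alias=None):
    """Return (backend, model_id) for an alias, falling back to the default."""
    name = alias or DEFAULT_ALIAS
    if name not in IMAGE_MODELS:
        raise ValueError(f"unknown alias {name!r}; choose from {sorted(IMAGE_MODELS)}")
    return IMAGE_MODELS[name]


def aliases_for(backend):
    """List the aliases served by one backend, in table order."""
    return [alias for alias, (b, _model) in IMAGE_MODELS.items() if b == backend]
```

A script could call `aliases_for("Volcengine Ark")` to decide which `--model` values are worth trying when only an Ark key is configured.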

| Alias | Backend | Model | Notes |
|---|---|---|---|
| `seedance` (default) | Volcengine Ark | `doubao-seedance-2-0-260128` | High quality, ~2-3 min/clip |
| `seedance-fast` | Volcengine Ark | `doubao-seedance-2-0-fast-260128` | ~30-60s/clip, ~36% cheaper |

# Stable
pip install nbcraft
# Or from source
git clone https://github.com/jieyefriic/nbcraft
cd nbcraft
pip install -e .

Requires Python 3.10+.
nbcraft reads keys from ~/.nbcraft/config.json first, then environment variables. Configure only the backends you actually use.
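The lookup order described above (config file first, then environment) can be sketched roughly as follows. This is an illustration, not nbcraft's actual code: it assumes `config.json` is a flat JSON object keyed by the same names used with `nb config set`, and the helpers `load_config` and `resolve_key` are hypothetical. The config-key and environment-variable names are the ones listed in this README.

```python
import json
import os
from pathlib import Path

# Config key -> environment variable, as listed in this README.
ENV_FALLBACK = {
    "api_key": "GEMINI_API_KEY",
    "dashscope_api_key": "DASHSCOPE_API_KEY",
    "ark_api_key": "ARK_API_KEY",
    "openai_api_key": "OPENAI_API_KEY",
}


def load_config(path=None):
    """Read the config file; a missing file is treated as empty (assumption)."""
    if path is None:
        path = Path.home() / ".nbcraft" / "config.json"
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}


def resolve_key(name, config, env=os.environ):
    """Config file value wins; otherwise fall back to the environment variable."""
    value = config.get(name)
    if value:
        return value
    return env.get(ENV_FALLBACK[name])
```

Under this sketch, a backend whose key resolves to `None` is simply one you have not configured.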
# Gemini (Google AI Studio: https://aistudio.google.com/app/apikey)
nb config set api_key <key> # or export GEMINI_API_KEY
# DashScope 阿里云 (https://dashscope.console.aliyun.com)
nb config set dashscope_api_key <key> # or export DASHSCOPE_API_KEY
# Volcengine Ark 火山方舟 (https://console.volcengine.com/ark)
nb config set ark_api_key <key> # or export ARK_API_KEY
# OpenAI (https://platform.openai.com/api-keys)
nb config set openai_api_key <key> # or export OPENAI_API_KEY

For Volcengine Ark, an API key alone is not enough. In the Ark console:
1. Activate the model service: 系统管理 (System Management) → 开通管理 (Activation Management) → 视觉大模型 (Vision Models) → activate Doubao-Seedream (and Seedance for video).
2. (Optional) Create an inference endpoint if you prefer endpoint IDs: 在线推理 (Online Inference) → 创建推理接入点 (Create Inference Endpoint) → copy `ep-xxx`, then run `nb config set ark_endpoint_id ep-xxx`.
3. Generate an API key: API Key 管理 (API Key Management) → 创建 (Create).

If you skip step 1, the first call fails with `InvalidEndpointOrModel.NotFound`.

To point the OpenAI backend at a custom base URL (for example, a proxy):
nb config set openai_base_url https://your-proxy/v1

# Gemini default
nb gen -p "a watercolor portrait of a librarian cat"
# Reference image + aspect ratio + resolution
nb gen -p "make this scene at sunset" -i photo.jpg --ar 16:9 --res 2K -o sunset.png
# Seedream 4.5 — best for posters with embedded Chinese text
nb gen -p "promotional poster, headline 春日上新, soft pastel" --model seedream --ar 3:4 --res 4K
# GPT Image 2 — strong character-level text + agentic reasoning
nb gen -p "minimalist logo with the words Stay Curious" --model gpt --ar 1:1 --res 2K
# Multi-reference fusion (Seedream up to 14)
nb gen -p "outfit from img 1, pose from img 2" -i a.png -i b.png --model seedream
# Generate 4 variations in parallel
nb gen -p "a quiet study room" --repeat 4 -o studies.png
# → studies_1.png, studies_2.png, studies_3.png, studies_4.png
# Prompt from a file
nb gen -p @prompt.txt --model wan-pro --res 4K

# Text-to-video, 5s, 720p, no audio (default)
nb video gen -p "a shiba inu turning under cherry blossoms, slow motion" --ar 16:9
# Image-to-video with first-frame
nb video gen -p "camera dollies in, then orbits the product" -i product.png --duration 8
# Cinema-grade 1080p with synced audio
nb video gen -p "a barista narrates over morning espresso" --res 1080p --duration 10 --audio
# Fast iteration draft
nb video gen -p "rough concept" --model seedance-fast --duration 4
# Submit only (don't wait); check later
nb video gen -p "..." --no-poll
nb video status <task_id>
nb video download <task_id> -o final.mp4

With `--poll` (default), finished videos are downloaded immediately; with `--no-poll`, run `nb video download` before the download window closes.

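If you submit with `--no-poll` and want to wait from your own script, the polling step can be sketched as below. This is a generic loop, not nbcraft's implementation: `wait_for_task` is a hypothetical helper, the state names `"running"`, `"succeeded"`, and `"failed"` are assumptions, and the default interval and timeout are arbitrary. How you obtain the state (for instance by wrapping `nb video status`) is left to the caller, since this README does not specify that command's output format.

```python
import time


def wait_for_task(task_id, get_status, interval=20.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status(task_id) until it reports a terminal state.

    get_status is a caller-supplied callable returning one of the
    hypothetical states "running", "succeeded", or "failed".
    Returns True on success, False on failure, raises TimeoutError otherwise.
    """
    waited = 0.0
    while True:
        state = get_status(task_id)
        if state == "succeeded":
            return True
        if state == "failed":
            return False
        if waited >= timeout:
            raise TimeoutError(f"task {task_id} still {state} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

Once the helper returns `True`, the script can go on to run `nb video download <task_id>`.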
# tasks.jsonl
# {"key": "img1", "prompt": "...", "images": ["ref.png"], "ar": "3:4", "res": "2K"}
# {"key": "img2", "prompt": "..."}
nb batch submit -f tasks.jsonl --poll -o ./output/
nb batch list
nb batch status <job_name>
nb batch download <job_name> -o ./output/

# Infer HTML structure from a UI screenshot
nb rev struct -i screenshot.png -o structure.md
# Describe an image
nb rev desc -i photo.png -p "analyze the color palette"

nb history # last 20
nb history -s "poster" # filter by prompt
nb history <record_id> # full JSON detail
nb stats # totals, success/fail, monthly/daily

| Command | What it does |
|---|---|
| `nb gen` | Generate image(s) from a text prompt |
| `nb video gen` | Generate video (Seedance 2.0): submit + poll + download |
| `nb video status <id>` | Check Seedance task status |
| `nb video download <id>` | Download a finished Seedance video |
| `nb batch submit/status/download/list` | Gemini Batch API |
| `nb rev struct` | UI screenshot → inferred HTML structure (Gemini) |
| `nb rev desc` | General image description (Gemini) |
| `nb history` | Browse past generations |
| `nb config show / set` | Manage configuration |
| `nb stats` | Usage statistics |

Run `nb <command> --help` for full options.

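For `nb batch submit`, the task file shown earlier is one JSON object per line. A small sketch of generating it from Python follows. The field names (`key`, `prompt`, `images`, `ar`, `res`) come from the example in this README; the helper `write_tasks`, and the checks that `key` and `prompt` are present and that keys are unique, are assumptions rather than documented nbcraft rules.

```python
import json
from pathlib import Path


def write_tasks(tasks, path):
    """Write one JSON object per line and return the number of tasks written."""
    lines = []
    seen = set()
    for task in tasks:
        # Assumed minimum: the README's shortest example has only these two fields.
        if "key" not in task or "prompt" not in task:
            raise ValueError(f"task needs 'key' and 'prompt': {task!r}")
        if task["key"] in seen:
            raise ValueError(f"duplicate key {task['key']!r}")
        seen.add(task["key"])
        # ensure_ascii=False keeps non-ASCII prompt text readable in the file.
        lines.append(json.dumps(task, ensure_ascii=False))
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

The resulting file can then be passed to `nb batch submit -f tasks.jsonl`.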
nb config set default_model flash # Picked when --model is omitted
nb config set default_ar 3:4
nb config set default_res 2K
nb config set output_dir ./out
nb config set poll_interval 20 # Batch poll seconds

- Reference image limits: Gemini 14, DashScope `wan*` 9, Seedream 14, OpenAI `gpt` 16
- `nb batch` is Gemini-only (other providers don't expose comparable batch APIs)
- `nb rev` is Gemini-only (the image-gen models don't do image→text)
- 4K: native on Gemini, `wan-pro` (T2I only), all Seedream, and `gpt`; `wan` downgrades 4K to 2K
- Negative prompt: native on DashScope, folded into the main prompt for Seedream/`gpt`, ignored by Gemini
- Volcengine Ark: an API key alone is not enough; activate the model service in the console first

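The limits in the notes above can be checked before calling `nb`, which may save a failed request. The sketch below only encodes what the notes state: the reference-image caps and the `wan` 4K-to-2K downgrade. The helpers `check_refs` and `effective_res` are hypothetical, and the behavior of `wan-pro` at 4K with reference images is left unmodeled because the notes do not say whether it downgrades or fails.

```python
# Reference-image caps copied from the notes in this README, per backend family.
MAX_REFS = {"gemini": 14, "wan": 9, "seedream": 14, "gpt": 16}

# Alias -> family; the grouping is illustrative.
FAMILY = {
    "pro": "gemini",
    "flash": "gemini",
    "wan-pro": "wan",
    "wan": "wan",
    "seedream": "seedream",
    "seedream-lite": "seedream",
    "seedream-legacy": "seedream",
    "gpt": "gpt",
}


def check_refs(alias, ref_count):
    """Raise ValueError if ref_count exceeds the documented cap for the alias."""
    limit = MAX_REFS[FAMILY[alias]]
    if ref_count > limit:
        raise ValueError(f"{alias} accepts at most {limit} reference images, got {ref_count}")
    return ref_count


def effective_res(alias, res):
    """Return the resolution the notes say you will get for this alias."""
    if alias == "wan" and res == "4K":
        return "2K"  # documented: wan downgrades 4K to 2K
    return res
```

For example, `effective_res("wan", "4K")` evaluates to `"2K"`, while the same request for `wan-pro` is returned unchanged.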
This repo also ships SKILL.md, a Claude Code skill description that lets Claude invoke nb automatically when you ask it to generate images. Install the skill once and Claude will pick the right model and arguments based on the request.
git clone https://github.com/jieyefriic/nbcraft
cd nbcraft
pip install -e ".[dev]"
pytest # run tests
ruff check . # lint
ruff format . # format

See CONTRIBUTING.md for the full workflow.