Provider Google
Modes: image · video · audio · Models: 13
Vendor: Google AI for Developers · Vertex AI · Official API docs: Image · Video (Veo) · Music (Lyria)
Google contributes across all three modes: the Veo video family, the Nano Banana (Gemini Image) family, Gemini TTS, and Lyria music.
| id | Name | Mode | Input |
|---|---|---|---|
| `veo-3.1` | Veo 3.1 | video | t2v |
| `veo-3.1-fast` | Veo 3.1 Fast | video | t2v |
| `veo-3.1-lite` | Veo 3.1 Lite | video | t2v |
| `gemini-3-pro-image` | Nano Banana Pro | image | t2i |
| `gemini-3.1-flash-image` | Nano Banana 2 | image | t2i |
| `gemini-2.5-flash-image` | Nano Banana | image | t2i |
| `gemini-2.5-flash-tts` | Gemini 2.5 Flash TTS | audio | tts |
| `gemini-2.5-pro-tts` | Gemini 2.5 Pro TTS | audio | tts |
| `lyria-3-clip` | Lyria 3 Clip | audio | music |
| `lyria-3-pro` | Lyria 3 Pro | audio | music |

`gen-ai models --provider google` lists the current set (13 models).

Video (CLI):

    gen-ai generate -m veo-3.1 -p "a drone shot over a snowy ridge at golden hour" \
      --ar 16:9 -r 1080p -d 8 --audio-gen

Video (MCP):

    { "name": "picsart_generate",
      "arguments": { "model": "veo-3.1", "prompt": "a drone shot over a snowy ridge", "duration": 8, "resolution": "1080p", "generateAudio": true } }

Veo clips are chainable with `gen-ai extend` (+7s per segment). Full params for every Veo / Gemini / Imagen / Lyria model are in Parameters below.

Image:

    gen-ai generate -m gemini-3-pro-image -p "a cinematic product render of a smart speaker" --ar 16:9 -r 4K

Audio:

    gen-ai generate -m gemini-2.5-pro-tts -p "Here is your daily briefing."   # speech
    gen-ai generate -m lyria-3-pro -p "uplifting cinematic orchestral score"  # music

Parameters: the full parameter surface for every model, sourced from `gen-ai models info <id> --json`. CLI flags show the primary short form; the canonical `--kebab-case` long form always works too.
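
The CLI invocation and the `picsart_generate` tool call shown earlier carry the same parameters in two shapes. A minimal Python sketch of that mapping follows. The helper names `to_cli` and `to_tool_call` are hypothetical, the flag table is copied from the Veo rows on this page, and the sketch assumes that a bare `--audio-gen` switch means `generateAudio: true` and that the tool call accepts the camelCase parameter names unchanged.

```python
import json
import shlex

# Param -> CLI flag, copied from the Veo parameter rows on this page.
VALUE_FLAGS = {
    "prompt": "-p",
    "aspectRatio": "--ar",
    "resolution": "-r",
    "duration": "-d",
}


def to_cli(model: str, params: dict) -> list[str]:
    """Build an argv list for `gen-ai generate` (hypothetical helper)."""
    argv = ["gen-ai", "generate", "-m", model]
    for name, flag in VALUE_FLAGS.items():
        if name in params:
            argv += [flag, str(params[name])]
    # Assumption: the switch is passed bare, as in the CLI example on this page.
    if params.get("generateAudio"):
        argv.append("--audio-gen")
    return argv


def to_tool_call(model: str, params: dict) -> dict:
    """Build a `picsart_generate` payload (assumes camelCase keys pass through)."""
    return {"name": "picsart_generate", "arguments": {"model": model, **params}}


params = {
    "prompt": "a drone shot over a snowy ridge",
    "duration": 8,
    "resolution": "1080p",
    "generateAudio": True,
}
print(shlex.join(to_cli("veo-3.1", params)))
print(json.dumps(to_tool_call("veo-3.1", params)))
```

Keeping one parameter dict and deriving both shapes from it avoids the two examples drifting apart, as the prompts in the samples above already do.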
Try gemini-3.1-flash-image in Playground ↗
Input type: t2i
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `aspectRatio` | `--ar` | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 · 3:2 · 2:3 · 4:5 · 5:4 · 4:1 · 1:4 · 8:1 · 1:8 · 21:9 (default 1:1) |
| `resolution` | `-r` | enum | 0.5K · 1K · 2K · 4K (default 1K) |
| `count` | `-n` | enum | 1 · 2 · 4 · 6 · 8 · 10 (default 1) |
| `thinkingLevel` | `--thinking` | enum | minimal (faster) · high (more reasoning) (default minimal) |
| `imageUrls` | `-i` | file | image (up to 14) |
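
Because most of these parameters are enums with defaults, a request can be checked locally before it is sent. The sketch below copies the allowed values and defaults from the `gemini-3.1-flash-image` table above; the `resolve` helper is hypothetical, and whether the CLI itself applies defaults or rejects invalid values client-side is not stated on this page.

```python
# Allowed values and defaults copied from the gemini-3.1-flash-image table.
SPEC = {
    "aspectRatio": (
        ["1:1", "16:9", "9:16", "3:4", "4:3", "3:2", "2:3",
         "4:5", "5:4", "4:1", "1:4", "8:1", "1:8", "21:9"],
        "1:1",
    ),
    "resolution": (["0.5K", "1K", "2K", "4K"], "1K"),
    "count": ([1, 2, 4, 6, 8, 10], 1),
    "thinkingLevel": (["minimal", "high"], "minimal"),
}
MAX_IMAGES = 14  # "image (up to 14)"


def resolve(params: dict) -> dict:
    """Fill in defaults and reject values outside the documented enums."""
    if not params.get("prompt"):
        raise ValueError("prompt is required")
    resolved = {"prompt": params["prompt"]}
    for name, (allowed, default) in SPEC.items():
        value = params.get(name, default)
        if value not in allowed:
            raise ValueError(f"{name}={value!r} is not one of {allowed}")
        resolved[name] = value
    images = list(params.get("imageUrls", []))
    if len(images) > MAX_IMAGES:
        raise ValueError(f"at most {MAX_IMAGES} input images are allowed")
    if images:
        resolved["imageUrls"] = images
    return resolved


print(resolve({"prompt": "a smart speaker on a desk", "resolution": "2K"}))
```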
Model: veo-3.1
Input type: t2v
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `aspectRatio` | `--ar` | enum | 16:9 · 9:16 (default 16:9) |
| `duration` | `-d` | enum | 4 · 6 · 8 (default 8) |
| `resolution` | `-r` | enum | 720p · 1080p · 4k (default 720p) |
| `imageUrls` | `-i` | file | image (up to 3) |
| `generateAudio` | `--audio-gen` | boolean | true · false (default true) |
| `negativePrompt` | `--neg` | text | free text |
| `startFrame` | `--start-frame` | file | image |
| `endFrame` | `--end-frame` | file | image |
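
The `duration` values above combine with the `gen-ai extend` note earlier on this page (+7s per segment) into simple arithmetic: a clip of base length `d` extended `n` times runs `d + 7n` seconds. A small worked sketch follows; the helper names are hypothetical, and any upper limit on the number of extensions is not stated here, so none is enforced.

```python
import math

BASE_DURATIONS = (4, 6, 8)  # seconds, from the duration row above
EXTEND_SECONDS = 7          # "+7s per segment" from the gen-ai extend note


def total_seconds(base: int, segments: int) -> int:
    """Length of a clip after `segments` extensions."""
    return base + EXTEND_SECONDS * segments


def segments_needed(target_seconds: int, base: int = 8) -> int:
    """Smallest number of extensions that reaches at least `target_seconds`."""
    if base not in BASE_DURATIONS:
        raise ValueError(f"base duration must be one of {BASE_DURATIONS}")
    if target_seconds <= base:
        return 0
    return math.ceil((target_seconds - base) / EXTEND_SECONDS)


n = segments_needed(30)       # 8s base, 30s target
print(n, total_seconds(8, n))
```

For a 30-second target from an 8-second base, this yields 4 extensions and a 36-second result, since the clip length can only grow in 7-second steps.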
Try veo-3.1-fast in Playground ↗
Input type: t2v
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `aspectRatio` | `--ar` | enum | 16:9 · 9:16 (default 16:9) |
| `duration` | `-d` | enum | 4 · 6 · 8 (default 8) |
| `resolution` | `-r` | enum | 720p · 1080p · 4k (default 720p) |
| `imageUrls` | `-i` | file | image (up to 3) |
| `generateAudio` | `--audio-gen` | boolean | true · false (default true) |
| `negativePrompt` | `--neg` | text | free text |
| `startFrame` | `--start-frame` | file | image |
| `endFrame` | `--end-frame` | file | image |
Try gemini-3-pro-image in Playground ↗
Input type: t2i
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `aspectRatio` | `--ar` | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 · 2:3 · 21:9 (default 1:1) |
| `resolution` | `-r` | enum | 1K · 2K · 4K (default 2K) |
| `count` | `-n` | enum | 1 · 2 · 4 · 6 · 8 · 10 (default 1) |
| `thinkingBudget` | `--thinking-budget` | integer | 128–32768, step 128, default 128 |
| `imageUrls` | `-i` | file | image (up to 14) |
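
The `thinkingBudget` range (128–32768 in steps of 128) means an arbitrary number is usually not a valid value. One way to handle that, sketched below, is to round to the nearest step and clamp to the range before passing `--thinking-budget`. The helper name is hypothetical, and whether the service rejects or silently adjusts off-step values is not documented on this page.

```python
MIN_BUDGET = 128    # lower bound and default, from the table above
MAX_BUDGET = 32768  # upper bound
STEP = 128          # step size


def snap_thinking_budget(requested: int) -> int:
    """Round to the nearest multiple of STEP (halves round up), then clamp."""
    snapped = ((requested + STEP // 2) // STEP) * STEP
    return max(MIN_BUDGET, min(MAX_BUDGET, snapped))


for value in (50, 1000, 40000):
    print(value, "->", snap_thinking_budget(value))
```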
Try gemini-2.5-flash-image in Playground ↗
Input type: t2i
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `aspectRatio` | `--ar` | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 · 2:3 · 21:9 (default 16:9) |
| `count` | `-n` | enum | 1 · 2 · 4 · 6 · 8 · 10 (default 1) |
| `imageUrls` | `-i` | file | image (up to 14) |
Try veo-3.1-lite in Playground ↗
Input type: t2v
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `aspectRatio` | `--ar` | enum | 16:9 · 9:16 (default 16:9) |
| `duration` | `-d` | enum | 4 · 6 · 8 (default 8) |
| `resolution` | `-r` | enum | 720p · 1080p (default 720p) |
| `startFrame` | `--start-frame` | file | image |
Try gemini-2.5-flash-tts in Playground ↗
Input type: tts
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `language` | `--language` | text | free text |
| `accent` | `--accent` | text | free text |
| `prompt` | `-p` | text | required (≤5000 chars) |
| `voiceId` | `--voice` | enum | Aoede · Charon · Fenrir · Kore · Leda · Orus · Puck · Zephyr · Achernar · Achird · Algenib · Algieba · Alnilam · Autonoe · Despina · Enceladus · Erinome · Gacrux · Iapetus · Laomedeia · Pulcherrima · Rasalgethi · Sadachbia · Sadaltager · Schedar · Sulafat · Umbriel · Vindemiatrix · Zubenelgenubi (default Kore) |
Try gemini-2.5-pro-tts in Playground ↗
Input type: tts
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `language` | `--language` | text | free text |
| `accent` | `--accent` | text | free text |
| `prompt` | `-p` | text | required (≤5000 chars) |
| `voiceId` | `--voice` | enum | Aoede · Charon · Fenrir · Kore · Leda · Orus · Puck · Zephyr · Achernar · Achird · Algenib · Algieba · Alnilam · Autonoe · Despina · Enceladus · Erinome · Gacrux · Iapetus · Laomedeia · Pulcherrima · Rasalgethi · Sadachbia · Sadaltager · Schedar · Sulafat · Umbriel · Vindemiatrix · Zubenelgenubi (default Kore) |
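
Both TTS models cap `prompt` at 5000 characters, so longer scripts need to be split into several requests. The sketch below packs whitespace-separated words greedily into chunks that stay within the limit. The `chunk_text` helper is hypothetical, it assumes the limit counts Python string characters, and it collapses runs of whitespace, which may or may not matter for speech pacing.

```python
def chunk_text(text: str, limit: int = 5000) -> list[str]:
    """Split text into chunks of at most `limit` characters on word boundaries."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single token longer than the limit is hard-split.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


script = "Here is your daily briefing. " * 400  # about 11,600 characters
parts = chunk_text(script)
print(len(parts), [len(p) for p in parts])
```

Each chunk can then be sent as its own `gen-ai generate -m gemini-2.5-pro-tts -p "..."` call with the same `--voice`, so the segments sound consistent when joined.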
Try imagen-4.0 in Playground ↗
Input type: t2i
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `aspectRatio` | `--ar` | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 (default 1:1) |
| `count` | `-n` | enum | 1 · 2 · 4 (default 1) |
| `enhancePrompt` | `--enhance-prompt` | boolean | true · false (default true) |
| `negativePrompt` | `--neg` | text | free text |
Try imagen-4.0-ultra in Playground ↗
Input type: t2i
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `aspectRatio` | `--ar` | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 (default 1:1) |
| `count` | `-n` | enum | 1 · 2 · 4 (default 1) |
| `enhancePrompt` | `--enhance-prompt` | boolean | true · false (default true) |
| `negativePrompt` | `--neg` | text | free text |
Try imagen-4.0-fast in Playground ↗
Input type: t2i
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `aspectRatio` | `--ar` | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 (default 1:1) |
| `count` | `-n` | enum | 1 · 2 · 4 (default 1) |
| `enhancePrompt` | `--enhance-prompt` | boolean | true · false (default true) |
| `negativePrompt` | `--neg` | text | free text |
Try lyria-3-clip in Playground ↗
Input type: music
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `imageUrls` | `-i` | file | image (up to 1) |
Try lyria-3-pro in Playground ↗
Input type: music
| Param | CLI flag | Type | Values |
|---|---|---|---|
| `prompt` | `-p` | text | required |
| `imageUrls` | `-i` | file | image (up to 1) |

Notes: Veo audio is native (`generateAudio`); Imagen and Gemini image models differ in resolution and reasoning controls (`thinkingLevel` / `thinkingBudget`). TTS `voiceId` values are Gemini voice presets.
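
Since the image models expose different optional controls, it can help to check a request against the target model before sending it. The sketch below encodes only what the parameter tables on this page list for each image model; the `unsupported` helper is hypothetical, and how the CLI actually treats a parameter a model does not list (error or silent drop) is not stated here.

```python
# Optional controls per image model, transcribed from the parameter tables.
IMAGE_CONTROLS = {
    "gemini-3.1-flash-image": {"resolution", "thinkingLevel"},
    "gemini-3-pro-image": {"resolution", "thinkingBudget"},
    "gemini-2.5-flash-image": set(),
    "imagen-4.0": {"enhancePrompt", "negativePrompt"},
    "imagen-4.0-ultra": {"enhancePrompt", "negativePrompt"},
    "imagen-4.0-fast": {"enhancePrompt", "negativePrompt"},
}
ALL_CONTROLS = set().union(*IMAGE_CONTROLS.values())


def unsupported(model: str, params: dict) -> list[str]:
    """Return the model-specific controls in `params` that `model` does not list."""
    allowed = IMAGE_CONTROLS[model]
    return sorted(k for k in params if k in ALL_CONTROLS and k not in allowed)


request = {"prompt": "a smart speaker", "thinkingLevel": "high", "thinkingBudget": 256}
print(unsupported("gemini-3-pro-image", request))
```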
Picsart CLI & MCP · Repo · AI Playground app