Provider Google

hovhannes@picsart.com edited this page Jun 19, 2026 · 3 revisions

Google

Modes: image · video · audio · Models: 13

Vendor: Google AI for Developers · Vertex AI · Official API docs: Image · Video (Veo) · Music (Lyria)

Google contributes across all three modes: the Veo video family, the Nano Banana (Gemini Image) and Imagen 4.0 image families, Gemini TTS, and Lyria music.

Models

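| id | Name | Mode | Input |
| --- | --- | --- | --- |
| veo-3.1 | Veo 3.1 | video | t2v |
| veo-3.1-fast | Veo 3.1 Fast | video | t2v |
| veo-3.1-lite | Veo 3.1 Lite | video | t2v |
| gemini-3-pro-image | Nano Banana Pro | image | t2i |
| gemini-3.1-flash-image | Nano Banana 2 | image | t2i |
| gemini-2.5-flash-image | Nano Banana | image | t2i |
| imagen-4.0 | Imagen 4.0 | image | t2i |
| imagen-4.0-ultra | Imagen 4.0 Ultra | image | t2i |
| imagen-4.0-fast | Imagen 4.0 Fast | image | t2i |
| gemini-2.5-flash-tts | Gemini 2.5 Flash TTS | audio | tts |
| gemini-2.5-pro-tts | Gemini 2.5 Pro TTS | audio | tts |
| lyria-3-clip | Lyria 3 Clip | audio | music |
| lyria-3-pro | Lyria 3 Pro | audio | music |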

gen-ai models --provider google lists the current set (13 models).

Veo 3.1 (video)

gen-ai generate -m veo-3.1 -p "a drone shot over a snowy ridge at golden hour" \
  --ar 16:9 -r 1080p -d 8 --audio-gen

{
  "name": "picsart_generate",
  "arguments": {
    "model": "veo-3.1",
    "prompt": "a drone shot over a snowy ridge",
    "duration": 8,
    "resolution": "1080p",
    "generateAudio": true
  }
}
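
The CLI call and the JSON tool call above carry the same parameters under different names. As a rough illustration of that correspondence, the sketch below renders a tool-call `arguments` object as a command line. The flag names are taken from the examples and tables on this page; the `to_cli` helper and its mapping table are hypothetical and not part of gen-ai.

```python
import shlex

# Flag names are copied from this page; the mapping itself is an
# illustrative helper, not something gen-ai ships.
FLAG_FOR_PARAM = {
    "model": "-m",
    "prompt": "-p",
    "aspectRatio": "--ar",
    "resolution": "-r",
    "duration": "-d",
}


def to_cli(arguments: dict) -> str:
    """Render tool-call arguments as an equivalent gen-ai command line."""
    parts = ["gen-ai", "generate"]
    for param, flag in FLAG_FOR_PARAM.items():
        if param in arguments:
            parts += [flag, str(arguments[param])]
    # --audio-gen appears above as a bare switch. How to pass "false" on
    # the command line is not documented here, so the switch is only
    # emitted when the value is true.
    if arguments.get("generateAudio") is True:
        parts.append("--audio-gen")
    return " ".join(shlex.quote(part) for part in parts)


call = {
    "model": "veo-3.1",
    "prompt": "a drone shot over a snowy ridge",
    "duration": 8,
    "resolution": "1080p",
    "generateAudio": True,
}
command = to_cli(call)
print(command)
```

Quoting with `shlex.quote` keeps prompts that contain spaces or shell metacharacters intact when the string is pasted into a terminal.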

Veo clips are chainable with gen-ai extend (+7s per segment). Full params for every Veo / Gemini / Imagen / Lyria model are in Parameters below.
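
For planning a longer clip, the "+7s per segment" rule reduces to simple arithmetic. The sketch below assumes every extend call adds exactly 7 seconds and that no upper limit applies; neither assumption is confirmed on this page, and both function names are made up for illustration.

```python
EXTEND_SECONDS = 7  # "+7s per segment", as stated above


def chained_duration(base_seconds: int, extensions: int) -> int:
    """Total clip length after chaining `extensions` extend calls."""
    if extensions < 0:
        raise ValueError("extensions must be non-negative")
    return base_seconds + EXTEND_SECONDS * extensions


def extensions_needed(base_seconds: int, target_seconds: int) -> int:
    """Smallest number of extend calls that reaches the target length."""
    remaining = max(0, target_seconds - base_seconds)
    return -(-remaining // EXTEND_SECONDS)  # ceiling division


print(chained_duration(8, 3))    # 8 + 3 * 7 = 29
print(extensions_needed(8, 30))  # 22 more seconds needs 4 segments
```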

Nano Banana Pro (image)

gen-ai generate -m gemini-3-pro-image -p "a cinematic product render of a smart speaker" --ar 16:9 -r 4K

Gemini TTS & Lyria (audio)

gen-ai generate -m gemini-2.5-pro-tts -p "Here is your daily briefing."   # speech
gen-ai generate -m lyria-3-pro -p "uplifting cinematic orchestral score"  # music

Parameters

Full parameter surface for every model, sourced from gen-ai models info <id> --json. CLI flags show the primary short form; the canonical --kebab-case long form always works too.
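
One plausible reading of "canonical --kebab-case long form" is that each camelCase parameter name maps mechanically to a long flag. The sketch below shows that conversion. It reproduces the long flags that do appear in the tables (--thinking-budget, --enhance-prompt, --start-frame, --end-frame), but whether every parameter follows the same rule is an assumption, and `long_flag` is a hypothetical helper.

```python
import re


def long_flag(param: str) -> str:
    """Convert a camelCase parameter name to a --kebab-case flag."""
    # Insert a hyphen wherever a lowercase letter or digit is followed
    # by an uppercase letter, then lowercase the whole name.
    kebab = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", param).lower()
    return f"--{kebab}"


for name in ("thinkingBudget", "enhancePrompt", "startFrame", "aspectRatio"):
    print(name, "->", long_flag(name))
```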

gemini-3.1-flash-image — Nano Banana 2

Try gemini-3.1-flash-image in Playground ↗

Input type: t2i

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| aspectRatio | --ar | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 · 3:2 · 2:3 · 4:5 · 5:4 · 4:1 · 1:4 · 8:1 · 1:8 · 21:9 (default 1:1) |
| resolution | -r | enum | 0.5K · 1K · 2K · 4K (default 1K) |
| count | -n | enum | 1 · 2 · 4 · 6 · 8 · 10 (default 1) |
| thinkingLevel | --thinking | enum | minimal (faster) · high (more reasoning) (default minimal) |
| imageUrls | -i | file | image (up to 14) |
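
With fourteen aspect ratios available on this model, choosing one for an existing asset is mostly arithmetic. The sketch below picks the listed ratio closest to a given pixel size, comparing in log space so that wide and tall mismatches are treated symmetrically. The ratio list is copied from the aspectRatio row; the `closest_aspect_ratio` helper is hypothetical.

```python
import math

# Copied from the aspectRatio row for gemini-3.1-flash-image.
SUPPORTED_RATIOS = [
    "1:1", "16:9", "9:16", "3:4", "4:3", "3:2", "2:3",
    "4:5", "5:4", "4:1", "1:4", "8:1", "1:8", "21:9",
]


def ratio_value(label: str) -> float:
    """Turn a label such as "16:9" into its width/height quotient."""
    width, height = label.split(":")
    return int(width) / int(height)


def closest_aspect_ratio(width: int, height: int) -> str:
    """Return the supported label nearest to width:height."""
    target = math.log(width / height)
    return min(
        SUPPORTED_RATIOS,
        key=lambda label: abs(math.log(ratio_value(label)) - target),
    )


print(closest_aspect_ratio(1920, 1080))
print(closest_aspect_ratio(2560, 1080))
```

The returned label can be passed to --ar as-is.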

veo-3.1 — Veo 3.1

Try veo-3.1 in Playground ↗

Input type: t2v

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| aspectRatio | --ar | enum | 16:9 · 9:16 (default 16:9) |
| duration | -d | enum | 4 · 6 · 8 (default 8) |
| resolution | -r | enum | 720p · 1080p · 4k (default 720p) |
| imageUrls | -i | file | image (up to 3) |
| generateAudio | --audio-gen | boolean | true · false (default true) |
| negativePrompt | --neg | text | free text |
| startFrame | --start-frame | file | image |
| endFrame | --end-frame | file | image |

veo-3.1-fast — Veo 3.1 Fast

Try veo-3.1-fast in Playground ↗

Input type: t2v

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| aspectRatio | --ar | enum | 16:9 · 9:16 (default 16:9) |
| duration | -d | enum | 4 · 6 · 8 (default 8) |
| resolution | -r | enum | 720p · 1080p · 4k (default 720p) |
| imageUrls | -i | file | image (up to 3) |
| generateAudio | --audio-gen | boolean | true · false (default true) |
| negativePrompt | --neg | text | free text |
| startFrame | --start-frame | file | image |
| endFrame | --end-frame | file | image |

gemini-3-pro-image — Nano Banana Pro

Try gemini-3-pro-image in Playground ↗

Input type: t2i

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| aspectRatio | --ar | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 · 2:3 · 21:9 (default 1:1) |
| resolution | -r | enum | 1K · 2K · 4K (default 2K) |
| count | -n | enum | 1 · 2 · 4 · 6 · 8 · 10 (default 1) |
| thinkingBudget | --thinking-budget | integer | 128–32768, step 128, default 128 |
| imageUrls | -i | file | image (up to 14) |
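
Because thinkingBudget is an integer with a step, an arbitrary request may need to be snapped to a valid value first. The sketch below assumes the range in the table reads 128 to 32768 in steps of 128. The `snap_thinking_budget` helper is hypothetical, and how gen-ai itself treats off-step values is not documented here.

```python
BUDGET_MIN = 128
BUDGET_MAX = 32768
BUDGET_STEP = 128


def snap_thinking_budget(requested: int) -> int:
    """Clamp to [BUDGET_MIN, BUDGET_MAX] and round to the nearest step."""
    clamped = min(max(requested, BUDGET_MIN), BUDGET_MAX)
    # Integer rounding (half rounds up) avoids float surprises.
    steps = (clamped - BUDGET_MIN + BUDGET_STEP // 2) // BUDGET_STEP
    return BUDGET_MIN + steps * BUDGET_STEP


print(snap_thinking_budget(1000))    # nearest multiple of 128 is 1024
print(snap_thinking_budget(100000))  # clamped to the maximum
```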

gemini-2.5-flash-image — Nano Banana

Try gemini-2.5-flash-image in Playground ↗

Input type: t2i

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| aspectRatio | --ar | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 · 2:3 · 21:9 (default 16:9) |
| count | -n | enum | 1 · 2 · 4 · 6 · 8 · 10 (default 1) |
| imageUrls | -i | file | image (up to 14) |

veo-3.1-lite — Veo 3.1 Lite

Try veo-3.1-lite in Playground ↗

Input type: t2v

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| aspectRatio | --ar | enum | 16:9 · 9:16 (default 16:9) |
| duration | -d | enum | 4 · 6 · 8 (default 8) |
| resolution | -r | enum | 720p · 1080p (default 720p) |
| startFrame | --start-frame | file | image |
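
The three Veo variants share their aspect ratio and duration options but differ in resolution and in which optional inputs they list. The sketch below encodes those differences from the tables on this page and reports which parts of a request a given variant would not accept. The `check_veo_request` helper and its messages are hypothetical; the real CLI may validate differently.

```python
ASPECT_RATIOS = {"16:9", "9:16"}
DURATIONS = {4, 6, 8}
SHARED_PARAMS = {"prompt", "aspectRatio", "duration", "resolution"}
FULL_OPTIONAL = {
    "imageUrls", "generateAudio", "negativePrompt", "startFrame", "endFrame",
}

# Per-model differences, copied from the parameter tables.
VEO_MODELS = {
    "veo-3.1": {"resolutions": {"720p", "1080p", "4k"}, "optional": FULL_OPTIONAL},
    "veo-3.1-fast": {"resolutions": {"720p", "1080p", "4k"}, "optional": FULL_OPTIONAL},
    "veo-3.1-lite": {"resolutions": {"720p", "1080p"}, "optional": {"startFrame"}},
}


def check_veo_request(model: str, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the request fits."""
    spec = VEO_MODELS[model]
    problems = []
    if "prompt" not in args:
        problems.append("prompt is required")
    if args.get("aspectRatio", "16:9") not in ASPECT_RATIOS:
        problems.append(f"aspectRatio {args['aspectRatio']} is not supported")
    if args.get("duration", 8) not in DURATIONS:
        problems.append(f"duration {args['duration']} is not one of 4, 6, 8")
    if args.get("resolution", "720p") not in spec["resolutions"]:
        problems.append(f"resolution {args['resolution']} is not available on {model}")
    for name in sorted(args):
        if name not in SHARED_PARAMS and name not in spec["optional"]:
            problems.append(f"{name} is not listed for {model}")
    return problems


request = {"prompt": "a drone shot", "resolution": "4k", "endFrame": "end.png"}
print(check_veo_request("veo-3.1", request))
print(check_veo_request("veo-3.1-lite", request))
```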

gemini-2.5-flash-tts — Gemini 2.5 Flash TTS

Try gemini-2.5-flash-tts in Playground ↗

Input type: tts

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| language | --language | text | free text |
| accent | --accent | text | free text |
| prompt | -p | text | required (≤5000 chars) |
| voiceId | --voice | enum | Aoede · Charon · Fenrir · Kore · Leda · Orus · Puck · Zephyr · Achernar · Achird · Algenib · Algieba · Alnilam · Autonoe · Despina · Enceladus · Erinome · Gacrux · Iapetus · Laomedeia · Pulcherrima · Rasalgethi · Sadachbia · Sadaltager · Schedar · Sulafat · Umbriel · Vindemiatrix · Zubenelgenubi (default Kore) |

gemini-2.5-pro-tts — Gemini 2.5 Pro TTS

Try gemini-2.5-pro-tts in Playground ↗

Input type: tts

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| language | --language | text | free text |
| accent | --accent | text | free text |
| prompt | -p | text | required (≤5000 chars) |
| voiceId | --voice | enum | Aoede · Charon · Fenrir · Kore · Leda · Orus · Puck · Zephyr · Achernar · Achird · Algenib · Algieba · Alnilam · Autonoe · Despina · Enceladus · Erinome · Gacrux · Iapetus · Laomedeia · Pulcherrima · Rasalgethi · Sadachbia · Sadaltager · Schedar · Sulafat · Umbriel · Vindemiatrix · Zubenelgenubi (default Kore) |
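
Both TTS models cap the prompt at 5000 characters, so longer scripts have to be split before synthesis. The sketch below breaks text at sentence boundaries and keeps every chunk within the limit. It assumes the limit counts Python string characters, which this page does not specify, and `split_for_tts` is a hypothetical helper rather than a gen-ai feature.

```python
import re

MAX_PROMPT_CHARS = 5000  # "≤5000 chars" from the prompt row


def split_for_tts(text: str, limit: int = MAX_PROMPT_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut mid-sentence.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


script = "Hello world. " * 1000
parts = split_for_tts(script)
print(len(parts), [len(part) for part in parts])
```

Each chunk can then be sent as its own -p prompt, with the resulting audio files concatenated afterwards.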

imagen-4.0 — Imagen 4.0

Try imagen-4.0 in Playground ↗

Input type: t2i

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| aspectRatio | --ar | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 (default 1:1) |
| count | -n | enum | 1 · 2 · 4 (default 1) |
| enhancePrompt | --enhance-prompt | boolean | true · false (default true) |
| negativePrompt | --neg | text | free text |

imagen-4.0-ultra — Imagen 4.0 Ultra

Try imagen-4.0-ultra in Playground ↗

Input type: t2i

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| aspectRatio | --ar | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 (default 1:1) |
| count | -n | enum | 1 · 2 · 4 (default 1) |
| enhancePrompt | --enhance-prompt | boolean | true · false (default true) |
| negativePrompt | --neg | text | free text |

imagen-4.0-fast — Imagen 4.0 Fast

Try imagen-4.0-fast in Playground ↗

Input type: t2i

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| aspectRatio | --ar | enum | 1:1 · 16:9 · 9:16 · 3:4 · 4:3 (default 1:1) |
| count | -n | enum | 1 · 2 · 4 (default 1) |
| enhancePrompt | --enhance-prompt | boolean | true · false (default true) |
| negativePrompt | --neg | text | free text |

lyria-3-clip — Lyria 3 Clip

Try lyria-3-clip in Playground ↗

Input type: music

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| imageUrls | -i | file | image (up to 1) |

lyria-3-pro — Lyria 3 Pro

Try lyria-3-pro in Playground ↗

Input type: music

| Param | CLI flag | Type | Values |
| --- | --- | --- | --- |
| prompt | -p | text | required |
| imageUrls | -i | file | image (up to 1) |

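Notes: Veo audio is native (generateAudio). Among the image models, only gemini-3.1-flash-image and gemini-3-pro-image accept resolution, and they use different reasoning controls (thinkingLevel and thinkingBudget respectively); gemini-2.5-flash-image and the Imagen 4.0 models expose neither. The Imagen models instead accept enhancePrompt and negativePrompt. TTS voiceId values are Gemini voice presets.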
