
Gen3 Model Guides Orpheus V1 English

GT AI OS Release edited this page Jun 18, 2026 · 3 revisions

Orpheus v1 English — Vocal Directions

Start Here

  1. Confirm that operators have enabled Text-to-speech (TTS) under Control Panel → Models → Platform Multimodal Settings and have set a default TTS model (baseline free-tier deployments use Groq canopylabs/orpheus-v1-english).
  2. On your agent → Speech Settings, select Orpheus v1 English (or the deployment default TTS model) and pick a voice when the dropdown lists Orpheus personas.
  3. In GT Chat, ask the agent to include bracketed vocal directions in the reply text when you want expressive delivery—for example: "Answer in two short sentences and prefix the spoken line with [cheerful]."
  4. After the assistant message finishes streaming, use Read aloud on that message (or enable Hands-Free Mode for automatic reply audio). GT AI OS synthesizes the visible message text, including any [direction] tokens you asked the agent to write.

Why this matters

Canopy Labs Orpheus v1 English is an expressive Groq text-to-speech model. It interprets vocal directions—short bracketed hints such as [cheerful], [whisper], or [professionally]—to control pacing, tone, and performance.

GT AI OS does not add those brackets automatically. Read aloud and Hands-Free reply audio send the assistant message body to the TTS provider as written. If the chat model omits bracketed directions, Orpheus speaks in a natural conversational cadence (which is often what you want for support-style agents).

Details

How vocal directions work (Groq / Orpheus)

Per Groq's Orpheus documentation:

| Pattern | Effect |
| --- | --- |
| No brackets | Natural, conversational delivery; good for FAQs and everyday assistant replies |
| One- or two-word directions in [brackets] | Subtle style control; examples: [cheerful], [whisper], [dramatically], [professionally] |
| Multiple directions in one line | More expressive, acted delivery; use sparingly for character or narration |

Best practices from the provider:

  • Prefer 1–2 word adjectives or adverbs inside brackets.
  • Omit directions entirely when you want neutral, human-like cadence.
  • Keep each TTS input under 200 characters on the Groq side (see Length limits below).
  • Separate the digits of a number with hyphens when you need it read digit by digit (for example 2-0-3 instead of 203).
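
The provider guidance above can be checked mechanically before text reaches TTS. The following is a minimal sketch, not part of GT AI OS or the Groq API: the names `spell_digits`, `lint_tts_input`, and `MAX_CHARS` are hypothetical, and the check assumes that any `[...]` span in the text is meant as a vocal direction.

```python
import re

# Assumption: 200 characters is the per-request limit cited in this guide.
MAX_CHARS = 200

# Matches a bracketed span such as [cheerful]; nested brackets are not handled.
DIRECTION = re.compile(r"\[([^\[\]]+)\]")


def spell_digits(number: str) -> str:
    """Join the digits of a number with hyphens so each digit is read separately."""
    return "-".join(ch for ch in number if ch.isdigit())


def lint_tts_input(text: str) -> list[str]:
    """Return warnings for text that ignores the guidance; empty list means no issue found."""
    warnings = []
    if len(text) > MAX_CHARS:
        warnings.append(f"input is {len(text)} characters; limit is {MAX_CHARS}")
    for match in DIRECTION.finditer(text):
        words = match.group(1).split()
        if len(words) > 2:
            warnings.append(f"direction [{match.group(1)}] has more than two words")
    return warnings


if __name__ == "__main__":
    print(spell_digits("203"))
    print(lint_tts_input("[very very cheerful] Backups completed."))
```

A check like this only flags text; it does not rewrite it. Markdown links also use square brackets, so a real deployment would likely need a stricter pattern.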

Using vocal directions in GT AI OS

GT Chat Read aloud (and Hands-Free reply playback) uses the agent's configured TTS model to synthesize message.content—the same text you see in the transcript. Markdown formatting is not spoken as markup; the synthesis path uses the message's text content.

Because the chat LLM generates that text, you control vocal directions by prompting the agent:

  1. Per message (recommended for trying directions): Before or with your question, tell the agent how to format spoken output. Example prompts:

    • "Reply in one sentence. Start the sentence with [whisper] then explain the backup window."
    • "Give a short status update. Use [professionally] at the beginning of the spoken line."
    • "For this answer only, write the final line as: [cheerful] followed by the summary—no other bracket tags."
  2. Agent system prompt (recommended for recurring tone): Under Building Agents → Speech Settings, or in the main instructions field, add guidance that applies when this agent is used with TTS. For example: "When the user may use Read aloud, prefix concise spoken answers with one Orpheus vocal direction in square brackets (for example [cheerful] or [professionally]). Keep bracket tags to one or two words. Do not explain the bracket syntax unless asked."

  3. Verify before listening: Read the assistant message. If you do not see the [direction] tokens you expected, ask the agent to revise the reply with the brackets in the visible text before clicking Read aloud.

Important: Ask the agent to put bracketed directions in the message text you will hear. GT AI OS does not inject vocal directions between chat generation and TTS.
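
The "verify before listening" step can also be expressed as a small check on the visible message text. This is an illustrative sketch only: `directions_in` and `missing_directions` are hypothetical helper names, not GT AI OS functions, and the pattern assumes every bracketed span is a vocal direction.

```python
import re

# Matches a bracketed span such as [whisper]; nested brackets are not handled.
DIRECTION = re.compile(r"\[([^\[\]]+)\]")


def directions_in(message: str) -> list[str]:
    """List the bracketed directions found in the message, lowercased, in order."""
    return [d.strip().lower() for d in DIRECTION.findall(message)]


def missing_directions(message: str, expected: list[str]) -> list[str]:
    """Return the expected directions that do not appear in the message."""
    found = set(directions_in(message))
    return [d for d in expected if d.lower() not in found]


if __name__ == "__main__":
    reply = "[cheerful] Backups completed successfully at 02:15 UTC with no errors."
    print(directions_in(reply))
    print(missing_directions("Backups completed.", ["cheerful"]))
```

If `missing_directions` returns anything, the reply lacks a tag you asked for, and asking the agent to revise the message is the fix described in step 3.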

Speech configuration checklist

| Layer | Where | What to set |
| --- | --- | --- |
| Platform | Control Panel → Models → Platform Multimodal Settings | Text-to-speech (TTS) enabled |
| Defaults | Control Panel → Default Models | Default TTS → Orpheus v1 English (or your chosen TTS model) |
| Agent | Agent editor → Speech Settings | TTS model, voice, optional format |
| Chat | GT Chat | Read aloud on a completed assistant message, or Hands-Free Mode when TTS is available |

See Accessibility for microphone permissions, Hands-Free behavior, and troubleshooting when controls are missing.

Length limits and long answers

Groq documents a 200-character maximum per Orpheus TTS request. GT AI OS may split longer message text into chunks for synthesis before sending audio to your browser. Very long Read aloud runs can fail or sound uneven if a chunk exceeds provider limits.

When you need reliable expressive delivery:

  • Ask the agent for a short spoken summary (one or two sentences) with directions inline.
  • Use Read aloud on that concise message rather than a long markdown report.
  • For long answers, read the text visually and use TTS only on a follow-up message you ask the agent to keep brief.
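
To make the chunking concern concrete, here is one way long text could be split to fit a 200-character limit. It is a sketch under stated assumptions, not GT AI OS's actual splitting logic, which this guide does not describe: `split_for_tts` is a hypothetical name, sentences are detected naively by `.`, `!`, or `?` followed by whitespace, and oversized sentences are wrapped at word boundaries.

```python
import re

# Assumption: 200 characters is the per-request limit cited in this guide.
MAX_CHARS = 200


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Greedily pack sentences into chunks no longer than `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # A single sentence longer than the limit is wrapped at the last space
        # that fits, or cut hard when it contains no space at all.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            cut = sentence.rfind(" ", 0, limit + 1)
            if cut <= 0:
                cut = limit
            chunks.append(sentence[:cut].rstrip())
            sentence = sentence[cut:].lstrip()
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    long_text = "A" * 150 + ". " + "B" * 150 + "."
    for chunk in split_for_tts(long_text):
        print(len(chunk))
```

In this sketch a direction such as [cheerful] stays in whichever chunk contains it, so later chunks carry no tag. That is one more reason to prefer a short spoken summary when the delivery matters.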

Example workflow

  1. Select an agent with Orpheus configured on Speech Settings.
  2. Send: "In one sentence, tell me whether backups finished overnight. Start with [cheerful]."
  3. Assistant replies: [cheerful] Backups completed successfully at 02:15 UTC with no errors.
  4. Click Read aloud on that message—the opening direction shapes Orpheus delivery.

For a neutral tone, leave the direction request out of the prompt in step 2; without brackets, Orpheus uses its conversational cadence.
