creative_template storyboard: add audio variant (TTS / mix / master pattern) #4015

@bokelley

Gap

The creative_template storyboard's build_creative step is hardcoded to display assets:

sample_request:
  creative_manifest:
    format_id: { agent_url: ..., id: 'display_300x250' }
    assets:
      image: { asset_type: 'image', url: '...hero-300x250.jpg', width: 300, height: 250 }
      headline: { asset_type: 'text', content: 'Summer Sale — 40% Off' }
      click_url: { asset_type: 'url', url: '...' }

This means audio creative-template platforms (AudioStack, ElevenLabs, Resemble, Speechmatics) cannot pass the storyboard: they don't accept image inputs or produce display tags. Audio adopters today validate via a manual round-trip plus the SDK's three-gate test contract, but they lack the compliance signal that display/video adopters get from the storyboard grader.

Proposal

Add an audio variant phase to creative_template/index.yaml — same shape as the existing build_creative step, different assets:

- id: build_audio_creative
  task: build_creative
  sample_request:
    creative_manifest:
      format_id: { agent_url: ..., id: 'audio_30s' }
      assets:
        script: { asset_type: 'text', content: 'Built for the trail. Acme Outdoor — premium gear for every adventure.' }
        voice: { asset_type: 'text', content: 'narrator-warm' }
        click_through: { asset_type: 'url', url: '...' }

Test-kit additions (acme-outdoor.yaml):

assets:
  audio_script:
    content: "Built for the trail. Acme Outdoor — premium gear for every adventure."
  audio_voice:
    content: "narrator-warm"

The audio variant should:

  1. Be optional (skip when the agent doesn't declare audio support in list_creative_formats)
  2. Accept either sync return (small platforms render fast) OR task-envelope return (TTS pipelines run minutes — exercise the polling path)
  3. Validate that the response carries an audio asset in creative_manifest.assets with asset_type: 'audio' and a resolvable url
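
The three rules above could be sketched as a grading function. This is a minimal illustration, not the storyboard grader's actual code: the `BuildCreativeResponse` shape, the `task_id` field used to detect a task-envelope return, the `Verdict` values, and the helper names are all assumptions. Only `creative_manifest.assets`, `asset_type: 'audio'`, `url`, and the `audio_30s` format id come from this proposal. The URL check is syntactic only; a real grader would presumably also fetch the URL.

```typescript
// Hypothetical shapes for illustration; not the AdCP schema.
type Asset = { asset_type: string; url?: string; content?: string };
type BuildCreativeResponse = {
  creative_manifest?: { assets?: Record<string, Asset> };
  task_id?: string; // assumed marker of a task-envelope (async) return
};
type Verdict = "skipped" | "pending" | "passed" | "failed";

// Rule 1: the step applies only when the agent declares audio support.
// `formatIds` stands in for the ids returned by list_creative_formats.
function declaresAudio(formatIds: string[]): boolean {
  return formatIds.includes("audio_30s");
}

// Rule 3: at least one asset has asset_type 'audio' and a parseable http(s) URL.
function hasAudioAsset(res: BuildCreativeResponse): boolean {
  const assets = Object.values(res.creative_manifest?.assets ?? {});
  return assets.some((a) => {
    if (a.asset_type !== "audio" || !a.url) return false;
    try {
      const u = new URL(a.url);
      return u.protocol === "https:" || u.protocol === "http:";
    } catch (_e) {
      return false; // not a well-formed URL
    }
  });
}

function gradeAudioStep(formatIds: string[], res: BuildCreativeResponse): Verdict {
  if (!declaresAudio(formatIds)) return "skipped";
  // Rule 2: a task envelope with no manifest yet means the caller keeps polling.
  if (res.task_id && !res.creative_manifest) return "pending";
  return hasAudioAsset(res) ? "passed" : "failed";
}
```

A caller would re-invoke `gradeAudioStep` with each polled response until the verdict is no longer `"pending"`, so sync and task-envelope agents go through the same final check.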

Why now

@adcp/sdk shipped a worked audio-template path in a follow-up to PR #1496 (audio template seeded in mock-server/creative-template, audio_url output projected via audioAsset() in hello_creative_adapter_template.ts). The pattern is in production-ready code; the storyboard is the last gap.

Reference implementation

Adopter audience

Major audio-creative platforms in scope:

  • AudioStack
  • ElevenLabs (voice generation)
  • Resemble.ai (voice cloning)
  • Speechmatics (transcription + voice)
  • Adobe Podcast (audio enhancement + voice)
  • WellSaid Labs

All would benefit from a storyboard signal aligned with display/video.
