Inworld: Invalid voice for context #5274

@joshiayush

Bug Description

Hi @ianbbqzy,

We created a voice clone in Inworld and tested it both in the Inworld playground and with the playground script Inworld provides; it worked in both. However, when the same voice is used inside a LiveKit agent, the Inworld API returns the following error:

agent-remote-1  |                  WARNI… livekit.…s.inworld Received error from Inworld
agent-remote-1  |                                            {"context_id": "2fb8ff0d54af",
agent-remote-1  | "error_code": 5, "error_message": "Context 2fb8ff0d54af not found during
agent-remote-1  | closeContext", "context_state": "unknown", "context_known": false, "interview":
agent-remote-1  | "69a9692736b631a8c43e634f", "pid": 282, "job_id": "AJ_8ypRrogysSRL", "room_id":
agent-remote-1  | "RM_E8Bp3iuGzpmi"}
agent-remote-1  |     06:32:45.146 ERROR… livekit.agents     Error in _tts_inference_task
agent-remote-1  |                                            Traceback (most recent call last):
agent-remote-1  |                                              File
agent-remote-1  |                                            "/app/.venv/lib/python3.13/site-pa…
agent-remote-1  |                                            line 647, in __anext__
agent-remote-1  |                                                val = await
agent-remote-1  |                                            self._event_aiter.__anext__()
agent-remote-1  |                                                      ^^^^^^^^^^^^^^^^^^^^^^^^…
agent-remote-1  |                                            StopAsyncIteration
agent-remote-1  |
agent-remote-1  |                                            During handling of the above
agent-remote-1  |                                            exception, another exception
agent-remote-1  |                                            occurred:
agent-remote-1  |
agent-remote-1  |                                            Traceback (most recent call last):
agent-remote-1  |                                              File
agent-remote-1  |                                            "/app/.venv/lib/python3.13/site-pa…
agent-remote-1  |                                            line 17, in async_fn_logs
agent-remote-1  |                                                return await fn(*args,
agent-remote-1  |                                            **kwargs)
agent-remote-1  |                                                       ^^^^^^^^^^^^^^^^^^^^^^^…
agent-remote-1  |                                              File
agent-remote-1  |                                            "/app/.venv/lib/python3.13/site-pa…
agent-remote-1  |                                            line 324, in _tts_inference_task
agent-remote-1  |                                                pushed_duration += await
agent-remote-1  |                                            _tts_node_inference(input_segment,
agent-remote-1  |                                            pushed_duration)
agent-remote-1  |                                                                   ^^^^^^^^^^^…
agent-remote-1  |                                              File
agent-remote-1  |                                            "/app/.venv/lib/python3.13/site-pa…
agent-remote-1  |                                            line 71, in async_wrapper
agent-remote-1  |                                                return await func(*args,
agent-remote-1  |                                            **kwargs)  # type: ignore
agent-remote-1  |                                                       ^^^^^^^^^^^^^^^^^^^^^^^…
agent-remote-1  |                                              File
agent-remote-1  |                                            "/app/.venv/lib/python3.13/site-pa…
agent-remote-1  |                                            line 280, in _tts_node_inference
agent-remote-1  |                                                async for audio_frame in
agent-remote-1  |                                            tts_node:
agent-remote-1  |                                                ...<15 lines>...
agent-remote-1  |                                                    audio_duration +=
agent-remote-1  |                                            audio_frame.duration
agent-remote-1  |                                              File "/app/charlie/assistant.py",
agent-remote-1  |                                            line 119, in tts_node
agent-remote-1  |                                                async for frame in
agent-remote-1  |                                            Agent.default.tts_node(
agent-remote-1  |                                                ...<2 lines>...
agent-remote-1  |                                                    yield frame
agent-remote-1  |                                              File
agent-remote-1  |                                            "/app/.venv/lib/python3.13/site-pa…
agent-remote-1  |                                            line 490, in tts_node
agent-remote-1  |                                                async for ev in stream:
agent-remote-1  |                                                    yield ev.frame
agent-remote-1  |                                              File
agent-remote-1  |                                            "/app/.venv/lib/python3.13/site-pa…
agent-remote-1  |                                            line 650, in __anext__
agent-remote-1  |                                                raise exc  # noqa: B904
agent-remote-1  |                                                ^^^^^^^^^
agent-remote-1  |                                              File
agent-remote-1  |                                            "/app/.venv/lib/python3.13/site-pa…
agent-remote-1  |                                            line 429, in _traceable_main_task
agent-remote-1  |                                                await self._main_task()
agent-remote-1  |                                              File
agent-remote-1  |                                            "/app/.venv/lib/python3.13/site-pa…
agent-remote-1  |                                            line 469, in _main_task
agent-remote-1  |                                                await self._run(output_emitter)
agent-remote-1  |                                              File
agent-remote-1  |                                            "/app/.venv/lib/python3.13/site-pa…
agent-remote-1  |                                            line 1252, in _run
agent-remote-1  |                                                await asyncio.wait_for(waiter,
agent-remote-1  |                                            timeout=self._conn_options.timeout
agent-remote-1  |                                            + 60)
agent-remote-1  |                                              File
agent-remote-1  |                                            "/usr/local/lib/python3.13/asyncio…
agent-remote-1  |                                            line 507, in wait_for
agent-remote-1  |                                                return await fut
agent-remote-1  |                                                       ^^^^^^^^^
agent-remote-1  |                                            livekit.agents._exceptions.APIErro…
agent-remote-1  |                                            Inworld error: Invalid voice
agent-remote-1  |                                            default-_ad9jkgpt85qjyuwfzlepa__as…
agent-remote-1  |                                            for context: 2fb8ff0d54af
agent-remote-1  |                                          {"interview":
agent-remote-1  | "69a9692736b631a8c43e634f", "pid": 282, "job_id": "AJ_8ypRrogysSRL", "room_id":
agent-remote-1  | "RM_E8Bp3iuGzpmi"}
agent-remote-1  | Traceback (most recent call last):
agent-remote-1  |   File "/app/.venv/lib/python3.13/site-packages/livekit/agents/tts/tts.py", line
agent-remote-1  | 647, in __anext__
agent-remote-1  |     val = await self._event_aiter.__anext__()
agent-remote-1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
agent-remote-1  | StopAsyncIteration
agent-remote-1  |
agent-remote-1  | During handling of the above exception, another exception occurred:
agent-remote-1  |
agent-remote-1  | Traceback (most recent call last):
agent-remote-1  |   File "/app/.venv/lib/python3.13/site-packages/livekit/agents/utils/log.py",
agent-remote-1  | line 17, in async_fn_logs
agent-remote-1  |     return await fn(*args, **kwargs)
agent-remote-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^
agent-remote-1  |   File
agent-remote-1  | "/app/.venv/lib/python3.13/site-packages/livekit/agents/voice/generation.py",
agent-remote-1  | line 324, in _tts_inference_task
agent-remote-1  |     pushed_duration += await _tts_node_inference(input_segment, pushed_duration)
agent-remote-1  |                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
agent-remote-1  |   File
agent-remote-1  | "/app/.venv/lib/python3.13/site-packages/opentelemetry/util/_decorator.py", line
agent-remote-1  | 71, in async_wrapper
agent-remote-1  |     return await func(*args, **kwargs)  # type: ignore
agent-remote-1  |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
agent-remote-1  |   File
agent-remote-1  | "/app/.venv/lib/python3.13/site-packages/livekit/agents/voice/generation.py",
agent-remote-1  | line 280, in _tts_node_inference
agent-remote-1  |     async for audio_frame in tts_node:
agent-remote-1  |     ...<15 lines>...
agent-remote-1  |         audio_duration += audio_frame.duration
agent-remote-1  |   File "/app/charlie/assistant.py", line 119, in tts_node
agent-remote-1  |     async for frame in Agent.default.tts_node(
agent-remote-1  |     ...<2 lines>...
agent-remote-1  |         yield frame
agent-remote-1  |   File "/app/.venv/lib/python3.13/site-packages/livekit/agents/voice/agent.py",
agent-remote-1  | line 490, in tts_node
agent-remote-1  |     async for ev in stream:
agent-remote-1  |         yield ev.frame
agent-remote-1  |   File "/app/.venv/lib/python3.13/site-packages/livekit/agents/tts/tts.py", line
agent-remote-1  | 650, in __anext__
agent-remote-1  |     raise exc  # noqa: B904
agent-remote-1  |     ^^^^^^^^^
agent-remote-1  |   File "/app/.venv/lib/python3.13/site-packages/livekit/agents/tts/tts.py", line
agent-remote-1  | 429, in _traceable_main_task
agent-remote-1  |     await self._main_task()
agent-remote-1  |   File "/app/.venv/lib/python3.13/site-packages/livekit/agents/tts/tts.py", line
agent-remote-1  | 469, in _main_task
agent-remote-1  |     await self._run(output_emitter)
agent-remote-1  |   File "/app/.venv/lib/python3.13/site-packages/livekit/plugins/inworld/tts.py",
agent-remote-1  | line 1252, in _run
agent-remote-1  |     await asyncio.wait_for(waiter, timeout=self._conn_options.timeout + 60)
agent-remote-1  |   File "/usr/local/lib/python3.13/asyncio/tasks.py", line 507, in wait_for
agent-remote-1  |     return await fut
agent-remote-1  |            ^^^^^^^^^
agent-remote-1  | livekit.agents._exceptions.APIError: Inworld error: Invalid voice
agent-remote-1  | default-_ad9jkgpt85qjyuwfzlepa__ashish for context: 2fb8ff0d54af

The voice name is ashish, the workspace ID is default-_ad9jkgpt85qjyuwfzlepa, and the API key comes from that same workspace.
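The ID rejected in the log looks like the workspace ID and the voice name joined by a double underscore. Below is a minimal sketch of that composition, which may help confirm that the value reaching `inworld.TTS` is character-for-character the one the REST script sends. The helper names `compose_voice_id` and `split_voice_id` are hypothetical, and the `__` separator is inferred from the ID in this report, not taken from Inworld documentation.

```python
# Assumption: the separator is inferred from the voice ID shown in this issue.
SEPARATOR = "__"


def compose_voice_id(workspace_id: str, voice_name: str) -> str:
    """Join a workspace ID and a voice name into a full voice ID."""
    return f"{workspace_id}{SEPARATOR}{voice_name}"


def split_voice_id(voice_id: str) -> tuple[str, str]:
    """Split a full voice ID on its last separator.

    Raises ValueError if either part is missing, which would catch
    a bare voice name or a bare workspace ID being passed by mistake.
    """
    workspace_id, sep, voice_name = voice_id.rpartition(SEPARATOR)
    if not sep or not workspace_id or not voice_name:
        raise ValueError(f"not a workspace-qualified voice ID: {voice_id!r}")
    return workspace_id, voice_name


if __name__ == "__main__":
    full_id = compose_voice_id("default-_ad9jkgpt85qjyuwfzlepa", "ashish")
    print(full_id)
    print(split_voice_id(full_id))
```

Splitting on the last separator is a choice made for this sketch; it would mis-split a voice name that itself contains a double underscore.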

Expected Behavior

I expect Inworld to generate audio bytes in LiveKit, as it does in the following script:

import requests

url = "https://api.inworld.ai/tts/v1/voice:stream"

headers = {
    "Authorization": "Basic Y2lY...Q==",
    "Content-Type": "application/json",
}

payload = {
    "text": "Hey there! Wanna know a secret? This is what most people miss. By changing just a few simple habits, you'll see a *huge* increase in your productivity.",
    "voice_id": "default-_ad9jkgpt85qjyuwfzlepa__ashish",
    "audio_config": {"audio_encoding": "MP3", "speaking_rate": 1},
    "temperature": 1,
    "model_id": "inworld-tts-1.5-mini",
}

with requests.post(url, json=payload, headers=headers, stream=True) as response:
    response.raise_for_status()
    for chunk in response.iter_lines(decode_unicode=True):
        if chunk:
            print(chunk)

This script prints the chunks returned by the Inworld API.
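To go from printed chunks to actual audio bytes, the streamed lines can be decoded. The following is a rough sketch only: it assumes each non-empty line is a JSON object carrying base64 audio under `result.audioContent`. That field path is an assumption, not something shown in this report, so it is exposed as a parameter, and `iter_audio_bytes` is a hypothetical helper name.

```python
import base64
import json
from typing import Iterable, Iterator


def iter_audio_bytes(
    lines: Iterable[str],
    # Assumption: where the base64 audio lives in each JSON line.
    field_path: tuple[str, ...] = ("result", "audioContent"),
) -> Iterator[bytes]:
    """Yield decoded audio bytes from newline-delimited JSON chunks.

    Lines that are empty, or that lack the expected field (for
    example an error object), are skipped rather than raising.
    """
    for line in lines:
        if not line:
            continue
        node = json.loads(line)
        # Walk down the nested keys, stopping if any level is missing.
        for key in field_path:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, str):
            yield base64.b64decode(node)
```

In the script above, the loop body could then write each item from `iter_audio_bytes(response.iter_lines(decode_unicode=True))` to a file instead of printing the raw chunk.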

Reproduction Steps

1. Clone a voice in Inworld.
2. Use that voice in the following code snippet:

    return inworld.TTS(
        model=_settings.inworld_model,  # inworld-tts-1.5-mini
        voice=config.persona.tts.voice,  # default-_ad9jkgpt85qjyuwfzlepa__ashish
        text_normalization=_settings.inworld_text_normalization,  # ON
    )

Operating System

macOS, Ubuntu 24.04

Models Used

inworld-tts-1.5-mini

Package Versions

livekit-agents[inworld]~=1.5.1

Session/Room/Call IDs

No response

Proposed Solution

Additional Context

No response

Screenshots and Recordings

No response

Labels

bug
