message="Unknown parameter: 'session.output_modalities'" while using the gpt-realtime API #2638

@ANYMS-A

Confirm this is an issue with the Python library and not an underlying OpenAI API

  • This is an issue with the Python library

Describe the bug

Hey, I am using openai==1.107.3 and following this example script to build a Realtime API chatbot: https://github.com/openai/openai-python/blob/main/examples/realtime/azure_realtime.py

In the example code, line 34 and line 36 set two parameters that each lead to an error event:
"output_modalities": ["text"],
"type": "realtime",

If I set either of those two parameters in the session payload, I get:

RealtimeErrorEvent(error=RealtimeError(message="Unknown parameter: 'session.output_modalities'.", type='invalid_request_error', code='unknown_parameter', event_id=None, param='session.output_modalities'), event_id='event_CGz0MgUoIe6epd4eHp4C2', type='error')
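
When several session parameters are sent at once, it helps to reduce the error event to the name of the rejected field. The sketch below is a minimal illustration that relies only on the attributes visible in the event printed above (`type`, `error.code`, `error.param`). `rejected_param` is a hypothetical helper, and the event is mimicked with `SimpleNamespace` so the snippet runs without a connection.

```python
from types import SimpleNamespace
from typing import Any, Optional


def rejected_param(event: Any) -> Optional[str]:
    """Return the parameter name the server rejected, or None for any other event."""
    if getattr(event, "type", None) != "error":
        return None
    error = getattr(event, "error", None)
    if error is None or getattr(error, "code", None) != "unknown_parameter":
        return None
    return getattr(error, "param", None)


# Stand-in for the RealtimeErrorEvent shown above (same field values).
sample_event = SimpleNamespace(
    type="error",
    event_id="event_CGz0MgUoIe6epd4eHp4C2",
    error=SimpleNamespace(
        message="Unknown parameter: 'session.output_modalities'.",
        type="invalid_request_error",
        code="unknown_parameter",
        param="session.output_modalities",
        event_id=None,
    ),
)

print(rejected_param(sample_event))
```

The same function could be called on each event received from the connection, assuming the real event objects expose these attributes as the printed repr suggests.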

BTW, after checking the RealtimeSessionCreateRequestParam implementation in .venv\Lib\site-packages\openai\types\realtime\realtime_session_create_request_param.py, I thought the correct way to set the voice was:

{
    "audio": {"output": {"voice": "marin"}}
}

But this leads to the same kind of RealtimeErrorEvent:

RealtimeErrorEvent(error=RealtimeError(message="Unknown parameter: 'session.audio'.", type='invalid_request_error', code='unknown_parameter', event_id=None, param='session.audio'), event_id='event_CGz9MD6lksIcT4iljdr1r', type='error')

If I set voice directly, it works:

{
    "voice": "marin"
}
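
The attempts above differ only in where the voice and modality settings sit inside the `session` payload. The sketch below writes both shapes out as plain dicts so they can be swapped in one place while testing. `build_session` and its `flat` flag are hypothetical names, and which shape a given deployment or `api_version` accepts is an assumption to verify, inferred only from the errors reported here.

```python
from typing import Any, Dict


def build_session(voice: str, instructions: str, flat: bool) -> Dict[str, Any]:
    """Build a session payload in either the flat or the nested shape."""
    if flat:
        # Top-level "voice" is the form reported to work in this issue.
        # Including "instructions" alongside it is an assumption.
        return {"instructions": instructions, "voice": voice}
    # Nested form taken from the example script and the typed params;
    # each of "type", "output_modalities" and "audio" was rejected in this issue.
    return {
        "type": "realtime",
        "output_modalities": ["text"],
        "instructions": instructions,
        "audio": {"output": {"voice": voice}},
    }


flat_session = build_session("marin", "Speak with Singapore accent.", flat=True)
nested_session = build_session("marin", "Speak with Singapore accent.", flat=False)
print(flat_session)
print(nested_session)
```

Either dict could then be passed as the `session` argument of `connection.session.update(...)`, which makes it quick to check which form the server accepts.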

To Reproduce

Use openai==1.107.3.
Create your async Azure client and run the code below.

Code snippets

import asyncio

from loguru import logger
from openai import AsyncAzureOpenAI

# Reporter's own helper module that builds the Azure client (not part of openai).
from azure_service_man.azure_service import AzureService


async def main() -> None:
    """The following example demonstrates how to configure Azure OpenAI to use the Realtime API.
    For an audio example, see push_to_talk_app.py and update the client and model parameter accordingly.

    When prompted for user input, type a message and hit enter to send it to the model.
    Enter "q" to quit the conversation.
    """

    az_service = AzureService(api_version="2024-10-01-preview")
    client: AsyncAzureOpenAI = az_service.client
    async with client.realtime.connect(
        model="GenAI_SG_AT_NonProdUSE201_gpt-realtime",  # deployment name for your model
    ) as connection:
        await connection.session.update(
            session={
                "output_modalities": ["text"],
                "model": "gpt-realtime",
                "type": "realtime",
                "instructions": "Speak with Singapore accent.",
                "audio": {"output": {"voice": "marin"}},
            }
        )
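        # Log every server event; the RealtimeErrorEvent shows up here.
        # This loop runs until the connection closes, so the input loop
        # below is not reached while it is in place.
        async for event in connection:
            logger.info(event)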

        while True:
            user_input = input("Enter a message: ")
            if user_input == "q":
                break

            await connection.conversation.item.create(
                item={
                    "type": "message",
                    "role": "user",
                    "content": [{"type": "input_text", "text": user_input}],
                }
            )
            await connection.response.create()
            async for event in connection:
                if event.type == "response.output_text.delta":
                    print(event.delta, flush=True, end="")
                elif event.type == "response.output_text.done":
                    print()
                elif event.type == "response.done":
                    break

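    # The original example closes an Azure credential here. This reproduction
    # gets its client from AzureService, so there is no `credential` to close.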


asyncio.run(main())
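
In the snippet above, the first `async for event in connection` loop keeps running until the connection closes, so the input prompt is never shown. One way to keep the logging and still move on is to stop at the first `session.updated` or `error` event. The sketch below shows that pattern against a fake event stream so it runs offline. `wait_for_session_result` and `fake_connection` are hypothetical names, and it assumes the server answers `session.update` with one of those two event types.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, AsyncIterator


async def wait_for_session_result(events: AsyncIterator[Any]) -> Any:
    """Consume events until the session update is confirmed or rejected."""
    async for event in events:
        if event.type in ("session.updated", "error"):
            return event
    raise RuntimeError("connection closed before the session update was answered")


async def fake_connection() -> AsyncIterator[Any]:
    """Offline stand-in for the realtime connection's event stream."""
    yield SimpleNamespace(type="session.created")
    yield SimpleNamespace(type="error", error=SimpleNamespace(param="session.audio"))
    yield SimpleNamespace(type="never.reached")


result = asyncio.run(wait_for_session_result(fake_connection()))
print(result.type, result.error.param)
```

In the reproduction this would replace the logging loop, for example `event = await wait_for_session_result(connection)`, assuming the connection object can be iterated again afterwards as the original snippet already does.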

OS

Windows

Python version

3.13.2

Library version

v1.107.3

Labels

bug (Something isn't working)
