
Conversation

@pgrayy (Collaborator) commented Dec 2, 2025

Description

Isolate the model inference configs so they can be extracted more easily from provider_configs.

Testing

  • I ran hatch run bidi:prepare.
  • Updated unit tests.
  • I ran the following test script:
import asyncio
import json

from strands.experimental.bidi import BidiAgent
from strands.experimental.bidi.io import BidiAudioIO, BidiTextIO
from strands.experimental.bidi.models import BidiNovaSonicModel
from strands.experimental.bidi.models.gemini_live import BidiGeminiLiveModel
from strands.experimental.bidi.models.openai_realtime import BidiOpenAIRealtimeModel
from strands.experimental.bidi.tools import stop_conversation

from strands_tools import calculator


async def main() -> None:
    model = BidiNovaSonicModel(
        model_id="amazon.nova-sonic-v1:0",
        provider_config={
            "audio": {
                "voice": "matthew",
            },
            "inference": {
                "max_tokens": 600,
            },
        },
        client_config={"region": "us-east-1"},
    )
    # model = BidiGeminiLiveModel(
    #     model_id="gemini-2.5-flash-native-audio-preview-09-2025",
    #     provider_config={
    #         "audio": {
    #             "voice": "Charon",
    #         },
    #         "inference": {
    #             "max_output_tokens": 600,
    #         },
    #     },
    #     client_config={"api_key": "..."},
    # )
    # model = BidiOpenAIRealtimeModel(
    #     model_id="gpt-realtime",
    #     provider_config={
    #         "audio": {
    #             "voice": "coral",
    #         },
    #         "inference": {
    #             "max_output_tokens": 700,
    #         },
    #     },
    #     client_config={"api_key": "..."},
    # )
    agent = BidiAgent(model=model, tools=[calculator, stop_conversation])

    audio_io = BidiAudioIO()
    text_io = BidiTextIO()
    await agent.run(inputs=[audio_io.input()], outputs=[audio_io.output(), text_io.output()])

    print(f"MAIN - stopping agent: {json.dumps(agent.messages, indent=2)}")


if __name__ == "__main__":
    asyncio.run(main())
  • Model responded as expected.
  • Toggling the max tokens setting did affect output size.
  • Stop conversation tool worked as expected.
  • Interruption worked.

def _resolve_provider_config(self, config: dict[str, Any]) -> dict[str, Any]:
    """Merge user config with defaults (user takes precedence)."""
    # Extract voice from provider-specific speech_config.voice_config.prebuilt_voice_config.voice_name if present
    provider_voice = None

Voice is passed in through the "audio" config.
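A minimal sketch of the kind of defaults merge _resolve_provider_config performs, where voice lives under the "audio" key of the merged config. The helper name, defaults, and values here are illustrative assumptions, not the library's actual implementation:

```python
from typing import Any

# Illustrative defaults; the real defaults live in each model class.
DEFAULT_CONFIG: dict[str, Any] = {
    "audio": {"voice": "matthew", "format": "pcm"},
    "inference": {},
}


def merge_config(defaults: dict[str, Any], user: dict[str, Any]) -> dict[str, Any]:
    """Recursively merge the user config over defaults (user takes precedence)."""
    merged = dict(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


resolved = merge_config(
    DEFAULT_CONFIG,
    {"audio": {"voice": "tiffany"}, "inference": {"max_tokens": 600}},
)
# "voice" is overridden while the default "format" is preserved.
```

With a recursive merge, a user who sets only `audio.voice` still inherits the other audio defaults instead of replacing the whole "audio" dict.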

logger = logging.getLogger(__name__)

# Nova Sonic configuration constants
NOVA_INFERENCE_CONFIG = {"maxTokens": 1024, "topP": 0.9, "temperature": 0.7}

No need to explicitly provide defaults. Nova already has implicit defaults for these that we can rely on.
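One way to rely on those implicit defaults is to include the inference settings in the session event only when the user actually set something. This is a sketch under the assumption that the Nova Sonic session event carries an inferenceConfiguration block; the helper name is hypothetical:

```python
def nova_session_config(user_inference: dict) -> dict:
    """Build a session config fragment, omitting inferenceConfiguration entirely
    when the user set nothing so Nova's implicit server-side defaults apply."""
    config: dict = {}
    if user_inference:
        config["inferenceConfiguration"] = user_inference
    return config


nova_session_config({})  # no explicit defaults sent; Nova picks its own
nova_session_config({"maxTokens": 600})  # only the user-specified override is sent
```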

def _resolve_provider_config(self, config: dict[str, Any]) -> dict[str, Any]:
    """Merge user config with defaults (user takes precedence)."""
    # Extract voice from provider-specific audio.output.voice if present
    provider_voice = None
@pgrayy commented Dec 2, 2025


Voice is provided through the "audio" config of type AudioConfig.

"input_audio_format",
"output_audio_format",
"input_audio_transcription",
"turn_detection",

  • type always has to be realtime and is already set by us.
  • instructions is set by us through system prompt.
  • voice is set by us through "audio" config.
  • tools is set by us through the passed in tools param.
  • input_audio_format, output_audio_format, input_audio_transcription, and turn_detection are not top-level configs and so would lead to exceptions if setting.

For more details on supported settings, see https://platform.openai.com/docs/api-reference/realtime-client-events/session/update#realtime_client_events-session-update-session.
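The rejection of these keys could be sketched as a simple validation pass over the user-supplied session config. The helper name and error message are illustrative, not the library's actual API:

```python
# Keys the agent manages itself (type, instructions, voice, tools) or that are
# not valid top-level session fields (the audio/transcription/turn settings).
UNSUPPORTED_SESSION_KEYS = {
    "type",
    "instructions",
    "voice",
    "tools",
    "input_audio_format",
    "output_audio_format",
    "input_audio_transcription",
    "turn_detection",
}


def validate_session_config(user_config: dict) -> dict:
    """Raise if the user tries to set a key the agent owns or the API rejects."""
    bad = UNSUPPORTED_SESSION_KEYS & user_config.keys()
    if bad:
        raise ValueError(f"unsupported session config keys: {sorted(bad)}")
    return user_config


validate_session_config({"max_output_tokens": 700})  # passes through
```

Failing fast here surfaces a clear error to the user instead of an opaque exception from the OpenAI Realtime session.update call.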

@pgrayy marked this pull request as ready for review December 2, 2025 01:15
@pgrayy force-pushed the model-inference-config branch from d469584 to 8d1461c on December 2, 2025 01:25
@github-actions bot added and removed the size/m label Dec 2, 2025
"max_tokens": "maxTokens",
"temperature": "temperature",
"top_p": "topP",
}

Using this mapping to promote consistency. We use snake_case everywhere else.
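Applying the mapping shown in the diff is a one-line translation from the SDK's snake_case inference keys to Nova's camelCase names; the helper name here is illustrative:

```python
# Mapping from the diff: SDK snake_case keys to Nova Sonic camelCase keys.
NOVA_INFERENCE_KEY_MAP = {
    "max_tokens": "maxTokens",
    "temperature": "temperature",
    "top_p": "topP",
}


def to_nova_keys(inference: dict) -> dict:
    """Translate snake_case inference keys to Nova's camelCase field names."""
    return {NOVA_INFERENCE_KEY_MAP[k]: v for k, v in inference.items()}


to_nova_keys({"max_tokens": 600, "top_p": 0.9})
```

This keeps the user-facing config snake_case across all providers while each model adapter translates to its wire format.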

"input_rate": GEMINI_INPUT_SAMPLE_RATE,
"output_rate": GEMINI_OUTPUT_SAMPLE_RATE,
"channels": GEMINI_CHANNELS,
"format": "pcm",
Owner

I think we should have a default voice here.

@pgrayy replied Dec 2, 2025


It does work without specifying. I tested that on all the models actually. With that said, we could remove the default voice setting on all configs but I didn't want to make too many changes here.

@pgrayy merged commit a46828d into main Dec 2, 2025
11 of 13 checks passed
@pgrayy deleted the model-inference-config branch December 2, 2025 01:36