isolate model inference configs #100
Conversation
```python
def _resolve_provider_config(self, config: dict[str, Any]) -> dict[str, Any]:
    """Merge user config with defaults (user takes precedence)."""
    # Extract voice from provider-specific speech_config.voice_config.prebuilt_voice_config.voice_name if present
    provider_voice = None
```
Voice is passed in through the "audio" config.
Force-pushed from 0580606 to d469584.
```python
logger = logging.getLogger(__name__)


# Nova Sonic configuration constants
NOVA_INFERENCE_CONFIG = {"maxTokens": 1024, "topP": 0.9, "temperature": 0.7}
```
No need to explicitly provide defaults. Nova already has implicit defaults for these that we can rely on.
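One way to act on this suggestion: rather than shipping a constant of hard-coded defaults, forward only the inference keys the user actually set, so omitted keys fall back to Nova's own implicit defaults. A sketch under that assumption (the helper name is hypothetical):

```python
from typing import Any


def build_inference_config(user_config: dict[str, Any]) -> dict[str, Any]:
    """Include only user-provided inference values; anything omitted
    falls back to the model's implicit defaults."""
    keys = ("max_tokens", "top_p", "temperature")
    return {k: user_config[k] for k in keys if k in user_config}


print(build_inference_config({"temperature": 0.5}))  # -> {'temperature': 0.5}
print(build_inference_config({}))  # -> {}
```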
```python
def _resolve_provider_config(self, config: dict[str, Any]) -> dict[str, Any]:
    """Merge user config with defaults (user takes precedence)."""
    # Extract voice from provider-specific audio.output.voice if present
    provider_voice = None
```
Voice is provided through the "audio" config of type AudioConfig.
| "input_audio_format", | ||
| "output_audio_format", | ||
| "input_audio_transcription", | ||
| "turn_detection", |
- `type` always has to be realtime and is already set by us.
- `instructions` is set by us through the system prompt.
- `voice` is set by us through the "audio" config.
- `tools` is set by us through the passed-in `tools` param.
- `input_audio_format`, `output_audio_format`, `input_audio_transcription`, and `turn_detection` are not top-level configs and so would lead to exceptions if set.

For more details on supported settings, see https://platform.openai.com/docs/api-reference/realtime-client-events/session/update#realtime_client_events-session-update-session.
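The list above implies a filtering step: drop keys the SDK sets itself or that the session.update payload would reject. A minimal sketch (the key names come from the comment above; the helper itself is hypothetical):

```python
from typing import Any

# Keys the SDK sets itself, or that session.update rejects as top-level fields
_DISALLOWED_SESSION_KEYS = {
    "type", "instructions", "voice", "tools",
    "input_audio_format", "output_audio_format",
    "input_audio_transcription", "turn_detection",
}


def filter_session_config(config: dict[str, Any]) -> dict[str, Any]:
    """Keep only session fields the user is allowed to set directly."""
    return {k: v for k, v in config.items() if k not in _DISALLOWED_SESSION_KEYS}


print(filter_session_config({"voice": "alloy", "modalities": ["audio"]}))
# -> {'modalities': ['audio']}
```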
Force-pushed from d469584 to 8d1461c.
| "max_tokens": "maxTokens", | ||
| "temperature": "temperature", | ||
| "top_p": "topP", | ||
| } |
Using this to promote consistency. We use snake_case everywhere else.
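The mapping lets callers stay in snake_case while the provider payload uses its native camelCase names. A minimal translation sketch (the function name is an assumption, not from the PR):

```python
from typing import Any

# snake_case keys exposed to users, mapped to the provider's camelCase names
_KEY_MAP = {"max_tokens": "maxTokens", "temperature": "temperature", "top_p": "topP"}


def to_provider_keys(config: dict[str, Any]) -> dict[str, Any]:
    """Rename snake_case keys to their camelCase equivalents; unknown keys pass through."""
    return {_KEY_MAP.get(k, k): v for k, v in config.items()}


print(to_provider_keys({"max_tokens": 512, "top_p": 0.9}))
# -> {'maxTokens': 512, 'topP': 0.9}
```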
| "input_rate": GEMINI_INPUT_SAMPLE_RATE, | ||
| "output_rate": GEMINI_OUTPUT_SAMPLE_RATE, | ||
| "channels": GEMINI_CHANNELS, | ||
| "format": "pcm", |
I think we should have a default voice here.
It does work without specifying a voice. I tested that on all the models. That said, we could remove the default voice setting on all configs, but I didn't want to make too many changes here.
Description
Isolating the model inference configs to more easily extract them from `provider_configs`.

Testing
- `hatch run bidi:prepare`
- Updated unit tests