.Net: Bring Support for Azure OpenAI gpt-4o audio responses #11720

Closed
@Cobra86

Description

Describe the bug
When using gpt-4o-audio-preview with Semantic Kernel, the audio response isn't returned. The code runs without errors, but the AI response contains only text and no audio.

To Reproduce
Steps to reproduce the behavior:

  1. Create a new .NET 9 project
  2. Install Microsoft.SemanticKernel v1.47.0
  3. Configure the SDK with Azure OpenAI and the gpt-4o-audio-preview model
  4. Set up audio input from microphone using NAudio
  5. Request both text and audio responses with ChatResponseModalities.Text | ChatResponseModalities.Audio
  6. Observe that the response contains only text and no audio.
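
The steps above can be sketched roughly as follows. This is a minimal illustration rather than the exact code from the report: it assumes the execution-settings shape used in the linked blog post (`OpenAIPromptExecutionSettings` with `Modalities` and `Audio`), the endpoint, key, and deployment name are placeholders, and a hypothetical `question.wav` file stands in for the NAudio microphone capture. The warning IDs in the `#pragma` are also an assumption and may differ between package versions.

```csharp
// Sketch only: placeholders and a hypothetical input file, not the reporter's actual code.
#pragma warning disable SKEXP0001, SKEXP0010, OPENAI001 // experimental APIs; IDs are assumed

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using OpenAI.Chat;

// Step 3: configure Semantic Kernel against an Azure OpenAI deployment (placeholder values).
var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        deploymentName: "gpt-4o-audio-preview",
        endpoint: "https://<your-resource>.openai.azure.com/",
        apiKey: "<your-api-key>")
    .Build();

var chat = kernel.GetRequiredService<IChatCompletionService>();

// Step 5: request both text and audio output.
var settings = new OpenAIPromptExecutionSettings
{
    Modalities = ChatResponseModalities.Text | ChatResponseModalities.Audio,
    Audio = new ChatAudioOptions(ChatOutputAudioVoice.Alloy, ChatOutputAudioFormat.Wav)
};

// Step 4 stand-in: read recorded audio from a file instead of capturing it with NAudio.
byte[] audioBytes = await File.ReadAllBytesAsync("question.wav");
var history = new ChatHistory();
history.AddUserMessage([new AudioContent(audioBytes, "audio/wav")]);

var reply = await chat.GetChatMessageContentAsync(history, settings);

// Step 6: inspect what came back.
Console.WriteLine($"Text: {reply.Content}");
var audio = reply.Items.OfType<AudioContent>().FirstOrDefault();
if (audio?.Data is { } data)
{
    Console.WriteLine($"Audio bytes received: {data.Length}");
}
else
{
    Console.WriteLine("No audio item in the response.");
}
```

With the behavior described in this issue, the last branch would be the one that runs: the text is printed, but the response carries no `AudioContent` item.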

Expected behavior
The AI should respond with both text (which works) and audio (which is not returned).

Screenshots
Not applicable.

Platform

  • Language: C#
  • Source: NuGet package Microsoft.SemanticKernel version 1.47.0
  • AI model: Azure OpenAI gpt-4o-audio-preview
  • IDE: Visual Studio
  • OS: Windows

Additional context
I followed the example from https://devblogs.microsoft.com/semantic-kernel/using-openais-audio-preview-model-with-semantic-kernel/ and implemented microphone input using NAudio. The text response works correctly.

Metadata

Assignees

Labels

  • .NET: Issue or Pull requests regarding .NET code
  • ai connector: Anything related to AI connectors

Projects

Status

Backlog: Planned

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
