Skip to content

Transcribe audio using AssemblyAI with Semantic Kernel plugins.

License

Notifications You must be signed in to change notification settings

AssemblyAI/assemblyai-semantic-kernel

Repository files navigation

AssemblyAI logo


Nuget (with prereleases) CI Build GitHub License AssemblyAI Twitter AssemblyAI YouTube Discord

AssemblyAI integration for Semantic Kernel

Transcribe audio using AssemblyAI with Semantic Kernel plugins.

Get started

Add the AssemblyAI.SemanticKernel NuGet package to your project.

dotnet add package AssemblyAI.SemanticKernel

Next, register the AssemblyAI plugin into your kernel:

using AssemblyAI.SemanticKernel;
using Microsoft.SemanticKernel;

// Build your kernel
var kernelBuilder = Kernel.CreateBuilder();

// add services like LLMs etc.

// Get AssemblyAI API key from env variables, or much better, from .NET configuration
string apiKey = Environment.GetEnvironmentVariable("ASSEMBLYAI_API_KEY")
  ?? throw new Exception("ASSEMBLYAI_API_KEY env variable not configured.");
kernelBuilder.AddAssemblyAIPlugin(new AssemblyAIPluginOptions
    {
        ApiKey = apiKey,
        PluginName = null,
        AllowFileSystemAccess = false
    });

var kernel = kernelBuilder.Build();

You can configure three options:

  • ApiKey: Configure the AssemblyAI API key
  • PluginName: Configure the name of the plugin inside of Semantic Kernel. Defaults to "AssemblyAIPlugin".
  • AllowFileSystemAccess: Allow the plugin to read files from the file system to upload audio files for transcriptions. Defaults to false.

kernelBuilder.AddAssemblyAIPlugin has overloads to configure the plugin using configuration and through a lambda.

Usage

Get the Transcribe function from the transcript plugin and invoke it with the context variables.

var result = await kernel.InvokeAsync<string>(
    nameof(AssemblyAIPlugin),
    AssemblyAIPlugin.TranscribeFunctionName,
    new KernelArguments
    {
        ["INPUT"] = "https://storage.googleapis.com/aai-docs-samples/espn.m4a"
    }
);
Console.WriteLine(result);

You can also upload local audio and video file. To do this:

  • Set the AssemblyAIPluginOptions.AllowFileSystemAccess to true.
  • Configure the INPUT variable with a local file path.
kernelBuilder.AddAssemblyAIPlugin(new AssemblyAIPluginOptions
    {
        ApiKey = apiKey,
        AllowFileSystemAccess = true
    });

...

var result = await kernel.InvokeAsync<string>(
    nameof(AssemblyAIPlugin), 
    AssemblyAIPlugin.TranscribeFunctionName, 
    new KernelArguments
    {
        ["INPUT"] = "./espn.m4a"
    }
);
Console.WriteLine(result);

You can also invoke the function from within a semantic function like this.

const string prompt = """
                      Here is a transcript:
                      {{AssemblyAIPlugin.Transcribe "https://storage.googleapis.com/aai-docs-samples/espn.m4a"}}
                      ---
                      Summarize the transcript.
                      """;
var result = await kernel.InvokePromptAsync<string>(prompt);
Console.WriteLine(result);

All the code above explicitly invokes the transcript plugin, but it can also be invoked as part of a plan. Check out the Sample project) which uses a plan to transcribe an audio file in addition to explicit invocation.

Notes

  • The AssemblyAI integration only supports Semantic Kernel with .NET at this moment. If there's demand, we will extend support to other platforms, so let us know!
  • Feel free to file an issue in case of bugs or feature requests.