# Enhancing Whisper transcriptions: pre- & post-processing techniques

This notebook offers a guide to improving Whisper's transcriptions. We'll streamline the audio data through trimming and segmentation, which improves Whisper's transcription quality. After transcription, we'll refine the output by adding punctuation, adjusting product terminology (e.g., 'five two nine' to '529'), and mitigating Unicode issues. These strategies will help improve the clarity of your transcriptions, but remember that customization for your specific use case may be beneficial.

## Installation
Install the Azure OpenAI SDK using the command below.

In [1]:
#r "nuget: Azure.AI.OpenAI, *-*"

In [2]:
#i "nuget:https://pkgs.dev.azure.com/dnceng/public/_packaging/dotnet-tools/nuget/v3/index.json"

In [3]:
#r "nuget:Microsoft.DotNet.Interactive.AIUtilities, *-*"

using Microsoft.DotNet.Interactive;
using Microsoft.DotNet.Interactive.AIUtilities;

In [4]:
var azureOpenAIKey = await Kernel.GetPasswordAsync("Provide your OPEN_AI_KEY");

// Your endpoint should look like the following https://YOUR_OPEN_AI_RESOURCE_NAME.openai.azure.com/
var azureOpenAIEndpoint = await Kernel.GetInputAsync("Provide the OPEN_AI_ENDPOINT");

// Enter the deployment name you chose when you deployed the model.
var deployment = await Kernel.GetInputAsync("Provide deployment name");
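
Because the endpoint is typed in by hand, a quick format check before the client is constructed can catch a mistyped value early. The following is a minimal sketch, not part of the Azure SDK: `LooksLikeAzureOpenAIEndpoint` is a hypothetical helper, and it assumes the endpoint follows the `https://YOUR_OPEN_AI_RESOURCE_NAME.openai.azure.com/` pattern shown in the comment above. Resources that use a different host pattern would need a looser check.

```csharp
using System;

// Hypothetical helper (an assumption for illustration, not an SDK call):
// returns true when the value parses as an absolute HTTPS URI whose host
// ends with ".openai.azure.com", the pattern described in the cell above.
bool LooksLikeAzureOpenAIEndpoint(string value) =>
    Uri.TryCreate(value, UriKind.Absolute, out var uri)
    && uri.Scheme == Uri.UriSchemeHttps
    && uri.Host.EndsWith(".openai.azure.com", StringComparison.OrdinalIgnoreCase);
```

In the notebook, this could be called as `LooksLikeAzureOpenAIEndpoint(azureOpenAIEndpoint)` right after the input prompt, re-prompting when it returns `false`.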

### Import namespaces and create an instance of `OpenAIClient` using the `azureOpenAIEndpoint` and the `azureOpenAIKey`

In [4]:
using Azure;
using Azure.AI.OpenAI;

In [6]:
OpenAIClient client = new (new Uri(azureOpenAIEndpoint), new AzureKeyCredential(azureOpenAIKey.GetClearTextPassword()));

## Setup
To get started, let's import a few different libraries:

 - [NAudio](https://github.com/naudio/NAudio) is a simple and easy-to-use library for audio processing tasks such as slicing, concatenating, and exporting audio files.

 - For our audio file, we'll use a fictional earnings call written by ChatGPT and read aloud by the author. This audio file is relatively short, but it should give you an illustrative idea of how these pre- and post-processing steps can be applied to any audio file.

In [4]:
using System.Net.Http;
using System.IO;

// set download paths
var earningsCallUrl = "https://cdn.openai.com/API/examples/data/EarningsCall.wav";

//set local save locations
var earningsCallFilepath = "./EarningsCall.wav";

// download the file
var httpClient = new HttpClient();
using (var stream = await httpClient.GetStreamAsync(earningsCallUrl))
{
    // FileMode.Create overwrites an existing file, so this cell can be re-run safely
    using (var fileStream = new FileStream(earningsCallFilepath, FileMode.Create))
    {
        await stream.CopyToAsync(fileStream);
    }
}
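
A download can succeed and still save something that is not audio, for example an error page. As a rough sanity check, the first 12 bytes of a WAV file contain the ASCII tags `RIFF` (bytes 0 to 3) and `WAVE` (bytes 8 to 11). The sketch below tests only for those tags; `HasWavHeader` is a hypothetical helper introduced for illustration, and it does not validate the rest of the file.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: returns true when the stream starts with a RIFF/WAVE header.
bool HasWavHeader(Stream stream)
{
    var header = new byte[12];
    int read = 0;

    // Stream.Read may return fewer bytes than requested, so loop until 12 are read.
    while (read < header.Length)
    {
        int n = stream.Read(header, read, header.Length - read);
        if (n == 0) return false; // stream ended before a full header was read
        read += n;
    }

    return Encoding.ASCII.GetString(header, 0, 4) == "RIFF"
        && Encoding.ASCII.GetString(header, 8, 4) == "WAVE";
}

// Example usage, assuming the file path from the download cell:
// using (var file = File.OpenRead(earningsCallFilepath))
// {
//     Console.WriteLine(HasWavHeader(file));
// }
```

If the check returns `false`, the download URL and the saved file are the first things to inspect before moving on to audio processing.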

In [1]:
#r "nuget: NAudio, 2.2.1"