<a href="https://colab.research.google.com/github/AndreDalwin/Whisper2Summarize/blob/main/Whisper2Summarize_Colab_Edition.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Whisper2Summarize: Colab Edition

This is a modified version of [Whisper2Summarize](https://github.com/AndreDalwin/Whisper2Summarize) that works in Google Colab.

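To begin, install the requirements. The summarization step below calls `openai.ChatCompletion`, which was removed in version 1.0 of the `openai` package, so the install pins an earlier release.

In [None]:
pip install git+https://github.com/openai/whisper.git "openai<1"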

Next, configure the program. Enter your audio file name, choose the Whisper model to use, and provide your OpenAI API key. (Don't worry: Google Colab doesn't save your API key if you are in Playground Mode.)



In [None]:
import torch  
import whisper 
import openai
import tqdm
import sys

audio = "audio.mp3" #Make sure you upload the audio file (mp3,wav,m4a) into the session storage!
model = "base" #possible options are 'tiny', 'base', 'small', 'medium', and 'large'
apikey = "INSERT API KEY HERE"
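
If you would rather not paste the key into the cell, one option is to read it from an environment variable or a hidden prompt. The sketch below is only an illustration: `load_api_key` and its parameters are hypothetical names that are not part of Whisper2Summarize, and `OPENAI_API_KEY` is assumed here as the variable name.

```python
import os
from getpass import getpass


def load_api_key(env_var="OPENAI_API_KEY", prompt=getpass):
    """Return the API key from the environment, or ask for it without echoing.

    Hypothetical helper for illustration; not part of Whisper2Summarize.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # getpass hides the typed key, so it does not appear in the cell output
        key = prompt("OpenAI API key: ").strip()
    if not key:
        raise ValueError("No API key provided")
    return key
```

With this helper defined, the configuration line could become `apikey = load_api_key()`.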

Now, we will be setting up the transcribe() and gpt_process() functions.

In [None]:
def transcribe(audio,model_type):
    class _CustomProgressBar(tqdm.tqdm):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self._current = self.n  
            
        def update(self, n=1):
            super().update(n)
            self._current += n
            # Format the integer counters; concatenating them to a str raises TypeError
            print(f"Audio Transcribe Progress: {self._current}/{self.total}")
            
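    # Use the GPU when the Colab runtime has one, otherwise fall back to the CPU
    devices = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 
    model = whisper.load_model(model_type, device = devices)
    # whisper.transcribe resolves to the function, so fetch the module via sys.modules
    transcribe_module = sys.modules['whisper.transcribe']
    # Swap in the custom bar; this replaces tqdm.tqdm for the rest of the session
    transcribe_module.tqdm.tqdm = _CustomProgressBar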

    print("Beginning Transcribing Process...")
    result = model.transcribe(audio, verbose=None, fp16=False)
    transcribed = result["text"]
    with open("Transcript.txt", "w",encoding='utf-8') as text_file:
        text_file.write(transcribed)
        print("Saved Transcript to Transcript.txt")
    return transcribed

def gpt_process(transcript):
    openai.api_key = apikey
    print("Processing Transcript with GPT...")
    # Split the transcript into snippets of at most n words so that each
    # request stays within the model's token limit
    n = 1300
    words = transcript.split()
    snippets = [' '.join(words[i:i+n]) for i in range(0, len(words), n)]
    summary = ""
    previous = ""
    for i, snippet in enumerate(snippets):
        print("Summarizing Transcribed Snippet {} of {}".format(i+1, len(snippets)))
        # openai.ChatCompletion is the pre-1.0 interface of the openai package
        gpt_response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": "\"" + snippet + "\"\n Rewrite the transcript above into notes. Do not summarize and keep every information. For additional context here is the previous rewritten message: \n " + previous }],
            temperature = 0.6,
        )
        # Carry the latest notes forward as context for the next snippet
        previous = gpt_response['choices'][0]['message']['content']
        # Separate the notes for consecutive snippets with a newline
        summary += previous + "\n"

    with open("Summary.txt", "w",encoding='utf-8') as text_file:
        text_file.write(summary)
        print("Summarizing Completed.")
        print("Saved Summary to Summary.txt")
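
To see how `gpt_process()` splits the transcript and carries context forward without spending API calls, here is a minimal offline sketch. The names `chunk_words`, `rewrite_with_context`, and `fake_rewrite` are hypothetical and exist only for this illustration; `fake_rewrite` stands in for the real ChatCompletion request.

```python
def chunk_words(text, max_words):
    """Split text into pieces of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def rewrite_with_context(chunks, rewrite):
    """Call rewrite(chunk, previous) on each chunk, feeding each result forward."""
    notes = []
    previous = ""
    for chunk in chunks:
        previous = rewrite(chunk, previous)
        notes.append(previous)
    return notes


def fake_rewrite(chunk, previous):
    # Stand-in for the API call: tags the chunk and records whether context was passed
    return "[{}|ctx={}]".format(chunk.upper(), "yes" if previous else "no")


chunks = chunk_words("one two three four five", max_words=2)
print(chunks)  # ['one two', 'three four', 'five']
print(rewrite_with_context(chunks, fake_rewrite))
# ['[ONE TWO|ctx=no]', '[THREE FOUR|ctx=yes]', '[FIVE|ctx=yes]']
```

Only the first snippet is rewritten without context; every later request receives the notes produced for the snippet before it.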


Finally, run the code. This writes two files to the session storage: Transcript.txt and Summary.txt.

In [None]:
text = transcribe(audio,model)
gpt_process(text)
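
After the run, a quick check that both files were written can be sketched as follows. `describe_outputs` is a hypothetical helper, and the current working directory is assumed to be where the files were saved.

```python
from pathlib import Path


def describe_outputs(names=("Transcript.txt", "Summary.txt"), folder="."):
    """Map each expected output file to its size in bytes, or None if it is missing."""
    report = {}
    for name in names:
        path = Path(folder) / name
        report[name] = path.stat().st_size if path.is_file() else None
    return report


for name, size in describe_outputs().items():
    print(name, "missing" if size is None else "{} bytes".format(size))
```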