<a href="https://colab.research.google.com/github/lasquires/Transcription/blob/main/Transcribe.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Free Audio Transcriber**

**This tutorial is written for people with no programming experience.**
## 1. Save this notebook to your Google Drive
- This works from any device, but it's much easier from a laptop if you're not familiar with Colab
- Press `File -> Save a copy in Drive` so that you can upload your files
- Open the copy to continue

## 2. Start runtime
- Press `Runtime -> Change runtime type -> T4 GPU -> Save`

*If you are forced to use `CPU`, transcription will take much longer, so you will probably have to settle for a smaller, less accurate model. Don't even try to use `large`.*
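
If you are not sure which runtime you ended up with, the optional check below is one way to find out. It is only a sketch: the helper name `gpu_available` is made up for this example, and it assumes that a GPU runtime ships the standard `nvidia-smi` tool.

```python
import shutil
import subprocess

def gpu_available() -> bool:
    """Return True if the `nvidia-smi` tool exists and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False  # no NVIDIA driver tools found, so most likely a CPU runtime
    # `nvidia-smi -L` prints one line per GPU, e.g. "GPU 0: ..."
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    return result.returncode == 0 and "GPU" in result.stdout

if gpu_available():
    print("GPU runtime detected")
else:
    print("No GPU detected; stick to a smaller model")
```

If it reports no GPU, repeat step 2 before moving on.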

## 3. Upload your audio file to the folder on the left
- **Click** the folder icon `🗀` to the left
- **Drag and drop** your audio file into the folder or press the upload button (paper-with-up-arrow icon)
- **Right click** the file you uploaded and press `Copy path`
- **Paste** the path in `file_path` *(make sure to include quotes around it!)*
- **Press** the 'play' button on the left of the code cell below

In [1]:
file_path = "/content/example.mp3"
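
A mistyped path is the most common problem at this step. The optional sketch below checks that a path points at a real file before you start transcribing. The helper name `describe_upload` is hypothetical and only uses Python's standard library.

```python
import os

def describe_upload(path: str) -> str:
    """Report whether `path` points at an existing file, and how large it is."""
    if not os.path.isfile(path):
        return f"Not found: {path} (re-copy the path from the file browser)"
    size_mb = os.path.getsize(path) / 1_000_000  # bytes -> megabytes
    return f"Found {os.path.basename(path)} ({size_mb:.1f} MB)"

# Replace this with the same path you pasted into `file_path`
print(describe_upload("/content/example.mp3"))
```

Note that the bundled `example.mp3` is only downloaded when the step 5 cell runs, so it will show as "Not found" until then. Your own uploaded file should be found right away.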

## 4. Choose the model you want

You have several options. *The smaller the model, the faster it will be:*
- `"tiny"`  (fastest/least accurate)
- `"base"`
- `"small"` (recommended)
- `"medium"`
- `"large"` (slowest/not recommended)

Set `language` to the language spoken in the audio.

**Optional:** Use `context` to describe the audio and list any key words or phrases. If you have none, leave it as an empty string (`''`).

Press the 'play' button

In [None]:
model = "small"
language = 'english'
context = 'love song about never giving up'
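
For the curious: the transcription cell in step 5 adjusts the model name before loading it. The sketch below mirrors that rule so you can see which model will actually be loaded. The helper name `resolve_model_name` is made up for this example; the logic is the same as in the `Transcribe` function.

```python
def resolve_model_name(model: str, language: str) -> str:
    """Mirror the notebook's rule: English audio gets the '.en' model, except for 'large'."""
    if language.lower() == "english" and model != "large":
        return model + ".en"
    return model

for size in ["tiny", "base", "small", "medium", "large"]:
    print(size, "->", resolve_model_name(size, "english"))
```

This is why the progress output in step 5 shows `small.en` even though `model` is set to `"small"`.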

## 5. Start transcribing
- **Optional**: When you run the cell, a **Google Drive pop-up** will ask to link your account. Accept it if you would like the transcription to **AUTOMATICALLY SAVE** to your Drive. Later, you can **look for it under** `Recent` **in Google Drive**
- **Press** the play button. *Expand the cell and scroll down to see progress*




In [17]:
try:  # Try to attach Google Drive (optional)
  from google.colab import drive
  drive.mount('/content/drive')
  drive_save_path = '/content/drive/My Drive/'
except Exception:
  drive_save_path = None  # Drive was not linked

from IPython.utils.capture import capture_output
if 'installed' not in globals():
  with capture_output() as captured:
    !pip install -U openai-whisper --quiet
    !pip install git+https://github.com/openai/whisper.git --quiet
    !pip install setuptools-rust --quiet
    !pip install nltk --quiet
    !pip install pydub --quiet
    !apt-get install ffmpeg --quiet

    from nltk.tokenize import sent_tokenize
    import nltk
    import whisper
    import gdown
    from datetime import datetime
    from pydub import AudioSegment
    nltk.download('punkt')
    nltk.download('punkt_tab')  # newer NLTK releases look for this resource instead of 'punkt'

    !gdown 1GwAd-1MekWG-MolZgPpOJiySklT9exxX -O example.mp3 # example file
    installed = True # So we don't try to reinstall every time someone runs this cell

def Transcribe(path, model="base", drive_save_path = None, context=None, language = 'english', stats = False):
    # Adding right formatting to model name according to Whisper docs
    language = language.lower()
    if language == 'english' and model!='large': # (Large doesn't have a .en)
        model+='.en'

    # Extracts filename
    filename = path.split('/')[-1].rsplit('.', 1)[0]
    print(f"Loading model '{model}'...")
    model = whisper.load_model(model)
    print("DONE")

    if stats:
        # Load audio file and calculate its duration in seconds
        audio = AudioSegment.from_file(path)
        duration_seconds = len(audio) / 1000  # pydub reports length in milliseconds
        # Split into whole hours/minutes/seconds so the printout has no stray decimals
        duration_hours, remainder = divmod(int(duration_seconds), 3600)
        duration_minutes, duration_secs = divmod(remainder, 60)
        print(f"Audio Duration: {duration_hours}hr:{duration_minutes}min:{duration_secs}sec")
        t1 = datetime.now()

    # The real work done here. Thanks OpenAI!
    print("Transcribing...")
    result = model.transcribe(audio = path,
                              verbose=False,
                              initial_prompt = context,
                              )
    print('\n')
    if stats:
        t2 = datetime.now()
        # Calculate transcription time (total_seconds() also covers runs longer than a day)
        elapsed_seconds = (t2 - t1).total_seconds()
        hours, remainder = divmod(int(elapsed_seconds), 3600)
        minutes, seconds = divmod(remainder, 60)
        print(f"Time taken: {hours}hr:{minutes}min:{seconds}sec")

        # Calculate the transcription speed (minutes of audio per minute of real time)
        transcription_speed = duration_seconds / elapsed_seconds if elapsed_seconds > 0 else 0
        print(f"Transcription speed: {transcription_speed:.2f} minutes of audio per minute of real time")

    # Converting the raw text into readable sentences
    print("Formatting sentences...")
    sentences = sent_tokenize(text=result['text'], language=language)

    # Write the sentences with proper formatting to .txt file
    # (falls back to the current folder when Google Drive is not connected)
    print("Saving file...")
    out_path = f"{drive_save_path or ''}text_{filename}.txt"
    with open(out_path, "w", encoding="utf-8") as out_file:
        for sentence in sentences:
            out_file.write(sentence)
            out_file.write("\n")
    print(f"Saved as '{out_path}'")

    return sentences

transcription = Transcribe(file_path, model, drive_save_path, context, language)

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Loading model 'small.en'...
DONE
Transcribing...


100%|██████████| 20811/20811 [04:09<00:00, 83.36frames/s]



Formatting sentences...
Saving file...
Saved as '/content/drive/My Drive/text_example.txt'
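
The progress bar above can be turned into the same figures the `stats` option reports. The sketch below works through the arithmetic for this run. It assumes each Whisper "frame" covers 10 ms of audio (100 frames per second), and the helper name `format_hms` is made up for this example.

```python
FRAMES_PER_SECOND = 100  # assumption: one Whisper spectrogram frame per 10 ms of audio

def format_hms(total_seconds: float) -> str:
    """Format a number of seconds as whole hours, minutes and seconds."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}hr:{minutes}min:{seconds}sec"

audio_seconds = 20811 / FRAMES_PER_SECOND  # frame count shown in the progress bar
elapsed_seconds = 4 * 60 + 9               # 04:09 shown in the progress bar

print("Audio duration:", format_hms(audio_seconds))
print("Time taken:", format_hms(elapsed_seconds))
speed = audio_seconds / elapsed_seconds
print(f"Transcription speed: {speed:.2f} minutes of audio per minute of real time")
```

Under that assumption, this run handled roughly 0.84 minutes of audio per minute of real time. To have the notebook print these figures itself, pass `stats=True` when calling `Transcribe`.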





## 6. Download the new file to your device


Go to `Folder_icon`(`🗀`)-> `text_{your_file}.txt` -> `⋮` -> `Download`

**Note: If you chose to save automatically to Google Drive, look in Google Drive -> Recent!**
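
If you prefer code to menus, Colab can also start the download for you. This is an optional sketch: the helper name `download_transcript` is made up for this example, and the path is the one printed by step 5 in the run above, so swap in your own.

```python
def download_transcript(path: str) -> bool:
    """Ask Colab to download `path` through the browser; return False outside Colab."""
    try:
        from google.colab import files  # only importable inside a Colab runtime
    except ImportError:
        print("Not running in Colab; open the file directly:", path)
        return False
    files.download(path)  # opens the browser's download prompt
    return True

# Use the path printed after "Saved as" in step 5
download_transcript("/content/drive/My Drive/text_example.txt")
```

Outside Colab the function only prints a reminder, so it is safe to run anywhere.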




---

### If you do not see the file, press the **refresh-folder icon** ↻ to the right of the file-upload icon



In [18]:
# You can also view the text here:
for sentence in transcription:
  print(sentence)

 music We're no strangers to love You know the rules and so do I I feel commitments, what I'm thinking of You wouldn't get this from any other guy I just wanna tell you how I'm feeling Gotta make you understand Never gonna give you up Never gonna let you down Never gonna run around and desert you Never gonna make you cry Never gonna say goodbye Never gonna tell a lie and hurt you We've known each other for so long Your heart's been aching but you're too shy to say it Inside we both know what's been going on We know the game and we're gonna play it And if you ask me how I'm feeling Don't tell me you're too blind to see Never gonna give you up Never gonna let you down Never gonna run around and desert you Never gonna make you cry Never gonna say goodbye Never gonna tell a lie and hurt you Never gonna give you up Never gonna let you down Never gonna run around and desert you Never gonna make you cry We've known each other for so long Your heart's been aching but You're too shy to say it I
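
Audio without clear sentence breaks, such as song lyrics, can come out as one very long line like the one above. The optional sketch below re-wraps the text to a fixed width so it is easier to read on screen. The helper name `wrap_transcript` and the width of 40 are arbitrary choices for this example.

```python
import textwrap

def wrap_transcript(sentences, width=80):
    """Join the sentence list and re-wrap it to lines of at most `width` characters."""
    text = " ".join(s.strip() for s in sentences)
    return textwrap.fill(text, width=width)

# A short sample so this cell runs on its own
sample = ["We're no strangers to love You know the rules and so do I"]
print(wrap_transcript(sample, width=40))
# In the notebook, pass the real result instead: print(wrap_transcript(transcription))
```

This only changes how the text is displayed; the saved `.txt` file is left as it is.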

##### To see how this works, check out [Whisper's GitHub](https://github.com/openai/whisper)