# Install dependencies for Whisper

In [57]:
pip install -U openai-whisper
pip install setuptools-rust

You should consider upgrading via the '/Users/bear/.pyenv/versions/3.9.14/bin/python3.9 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


# (Optional) Installing the latest version of PyTorch for M1 Mac
Recent PyTorch releases added support for the M1's GPU.
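
Before relying on the GPU, it can help to check what PyTorch actually sees. The following is a minimal sketch, not part of the original notebook: `pick_device` is a hypothetical helper name, and it assumes a PyTorch build that exposes `torch.backends.mps`. It falls back to the CPU when PyTorch is missing or no GPU backend is reported. Whether Whisper runs well on the returned device is a separate question that this check does not answer.

```python
def pick_device():
    """Return "mps", "cuda", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so only the CPU is an option.
        return "cpu"

    # torch.backends.mps only exists in newer builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "cuda"
    return "cpu"


device = pick_device()
print(device)
```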


In [5]:
pip install --pre --force-reinstall torch --index-url https://download.pytorch.org/whl/nightly/cpu

Looking in indexes: https://download.pytorch.org/whl/nightly/cpu
Collecting torch
  Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.0.0.dev20230220-cp39-none-macosx_11_0_arm64.whl (56.0 MB)
[2K     [38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m56.0/56.0 MB[0m [31m11.8 MB/s[0m eta [36m0:00:00[0mm eta [36m0:00:01[0m[36m0:00:01[0m
[?25hCollecting filelock
  Downloading https://download.pytorch.org/whl/nightly/filelock-3.9.0-py3-none-any.whl (9.7 kB)
Collecting typing-extensions
  Downloading https://download.pytorch.org/whl/nightly/typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Collecting sympy
  Downloading https://download.pytorch.org/whl/nightly/sympy-1.11.1-py3-none-any.whl (6.5 MB)
[2K     [38;2;114;156;31m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m6.5/6.5 MB[0m [31m10.4 MB/s[0m eta [36m0:00:00[0m MB/s[0m eta [36m0:00:01[0m:01[0m
[?25hCollecting networkx
  Downloading https://download.pytorch.org/whl/nightly/networ

### Simple usage of Whisper

In [7]:
import whisper

model = whisper.load_model("small")
result = model.transcribe("process_audio.mp3")
print(result["text"])

100%|███████████████████████████████████████| 461M/461M [00:47<00:00, 10.1MiB/s]


 Hey I'm just gonna record this so when I get up in the morning I like to brush my teeth and then I'll take Bear out to potty but before that I like to make a pot of coffee and I use about 30 grams of ground Colombian coffee to then brew my coffee. After that I'm gonna go into my office and open up my computer and then I check my email and from my email I then move to my tickets to just kind of update them and get them going. Yeah that's about it.


You can add additional files to the playbook directory and run them through this example. Just replace `"process_audio.mp3"` with the name of your file.

### Testing Glob

In [14]:
import glob
glob.glob("process_*.*")

['process_audio.mp3',
 'process_audio.wav',
 'process_audio.flac',
 'process_audio.m4a']

Here we check that `glob` finds our audio files.
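
Because the pattern `process_*.*` matches any extension, it would also pick up non-audio files such as a hypothetical `process_notes.txt`. Here is a small sketch of one way to guard against that. `AUDIO_EXTENSIONS` and `find_audio_files` are illustrative names, not something from the notebook, and the extension set is an assumption based on the files listed above.

```python
import glob
import os

# Assumed set of extensions; adjust to the formats you actually use.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a"}


def find_audio_files(pattern="process_*.*"):
    """Return sorted glob matches whose extension looks like audio."""
    return sorted(
        path
        for path in glob.glob(pattern)
        if os.path.splitext(path)[1].lower() in AUDIO_EXTENSIONS
    )


print(find_audio_files())
```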

### Benchmark Section

In [None]:
import timeit
import whisper
import glob
print("Timeit running...")
for x in glob.glob("process_*.*"):
    result = timeit.timeit(f'import whisper; model = whisper.load_model("base.en"); output = model.transcribe("{x}", fp16=False)', number=20)
    print(x, result)

In the section above we use `timeit`, `whisper` and `glob` for a small and simple benchmark.
`timeit` runs our source 20 times for each file in the collection that `glob` has found. Since `timeit` takes the source we want to run as a string, we can use string formatting to "inject" each file from `glob` to be transcribed. Note that the timed string also imports Whisper and loads the model, so each result includes model loading as well as transcription.
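
`timeit.timeit` also accepts a callable in place of a source string, which avoids building code with string formatting and makes it possible to load the model once, outside the timed call. The sketch below shows that variant with a stand-in function so it runs without Whisper. `time_per_file` and `fake_transcribe` are hypothetical names. It measures something different from the benchmark cell, which reloads the model on every run.

```python
import timeit


def time_per_file(transcribe, paths, number=20):
    """Time `number` calls of transcribe(path) for each path.

    Returns a dict mapping each path to the total elapsed seconds.
    """
    timings = {}
    for path in paths:
        # The lambda is what timeit calls; path is bound as a default argument.
        timings[path] = timeit.timeit(lambda p=path: transcribe(p), number=number)
    return timings


def fake_transcribe(path):
    # Stand-in for model.transcribe(path, fp16=False).
    return {"text": path.upper()}


for path, seconds in time_per_file(fake_transcribe, ["a.mp3", "b.wav"], number=5).items():
    print(path, seconds)
```

With Whisper, the first argument could be something like `lambda p: model.transcribe(p, fp16=False)` after a single `whisper.load_model("base.en")`.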

Whisper checks whether it can use fp16 (16-bit floating point) or has to fall back to fp32 (32-bit floating point). If fp16 is not available, Whisper displays a warning each time we call it. That could get a little annoying, so I've added `fp16=False` for my setup to suppress the warnings.

### Chunking the audio file

In [58]:
pip install pydub

You should consider upgrading via the '/Users/bear/.pyenv/versions/3.9.14/bin/python3.9 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


In [39]:
from pydub import AudioSegment
from pydub.silence import split_on_silence

def split(filepath):
    sound = AudioSegment.from_wav(filepath)
    return split_on_silence(
        sound,
        min_silence_len = 500,
        silence_thresh = sound.dBFS - 16,
        keep_silence = 250, # optional
    )

chunks = split("process_audio.wav")

for i, chunk in enumerate(chunks):
    print(f"exporting chunk {i}")
    chunk.export(
        f"chunk_{i}.mp3",
        bitrate = "192k",
        format = "mp3"
    )
    

exporting chunk 0
exporting chunk 1
exporting chunk 2
exporting chunk 3
exporting chunk 4
exporting chunk 5
exporting chunk 6
exporting chunk 7
exporting chunk 8
exporting chunk 9
exporting chunk 10
exporting chunk 11
exporting chunk 12
exporting chunk 13


In this section we're going to break the audio file into chunks based on sections of silence. We'll need pydub, a library used for audio manipulation.

Here we have a function `split` that takes in a file path. It creates an audio segment from the file and returns the chunks. Because it uses `AudioSegment.from_wav`, it expects a WAV file.


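`split_on_silence` looks for stretches of silence to split the audio file on. `min_silence_len` is the minimum length, in milliseconds, that a quiet stretch must last to count as a split point (500 ms here). `silence_thresh` is the loudness threshold, in dBFS, below which audio is considered silence. `sound.dBFS` is the loudness of the whole clip measured relative to the maximum possible loudness, so `sound.dBFS - 16` sets the threshold 16 dB below the clip's overall loudness.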


`keep_silence` is the amount of silence, in milliseconds, that is kept at the beginning and end of each chunk. We'll want to use this to make sure we don't clip anything.
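
To make the interaction between `min_silence_len` and `silence_thresh` concrete, here is a toy sketch that finds silent runs in a list of loudness readings. It only illustrates the idea and is not pydub's implementation: `find_silent_runs` is a hypothetical helper, the readings are made-up dBFS values, and the minimum length is counted in readings rather than milliseconds.

```python
def find_silent_runs(levels_db, min_silence_len, silence_thresh):
    """Return (start, end) index pairs for runs of readings at or below
    silence_thresh that last at least min_silence_len readings.
    The end index is exclusive."""
    runs = []
    start = None
    for i, level in enumerate(levels_db):
        if level <= silence_thresh:
            if start is None:
                start = i  # a quiet run begins here
        else:
            # A loud reading ends the current run; keep it only if long enough.
            if start is not None and i - start >= min_silence_len:
                runs.append((start, i))
            start = None
    # Handle a quiet run that reaches the end of the list.
    if start is not None and len(levels_db) - start >= min_silence_len:
        runs.append((start, len(levels_db)))
    return runs


# Made-up readings: speech around -20 dBFS, pauses around -50 dBFS.
levels = [-20, -22, -50, -55, -52, -21, -48, -20, -60, -60, -60, -60, -60, -19]
runs = find_silent_runs(levels, min_silence_len=3, silence_thresh=-36)
print(runs)  # [(2, 5), (8, 13)]
```

The single quiet reading at index 6 is below the threshold but too short to count, which is the job `min_silence_len` does in the real call.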


So we've split the audio file. Our final step with chunking is exporting the newly created chunks. This block loops over the chunks, pulling one off each iteration, and prints a message letting us know what's happening. Each file is named chunk_X.mp3, with X being `i`, the index of our loop. Bitrate is set to 192k, which is an OK bitrate for an mp3. And of course it exports the chunk as an mp3.

### What is Bitrate
Bitrate refers to the amount of data an audio file uses per second of sound, usually measured in kilobits per second (kbps). It tells you how much detail and quality you can expect from the sound. A higher bitrate means more information and better sound quality, while a lower bitrate means less information and lower quality sound.
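
As a rough worked example, bitrate translates directly into file size. The sketch below assumes a constant bitrate and ignores metadata and container overhead, so real files will differ somewhat. `estimated_size_bytes` is a hypothetical helper name.

```python
def estimated_size_bytes(bitrate_kbps, seconds):
    """Estimate audio size: kilobits per second -> bits -> bytes."""
    bits = bitrate_kbps * 1000 * seconds
    return bits // 8


# One minute of audio at the 192k bitrate used in the export.
print(estimated_size_bytes(192, 60))  # 1440000 bytes, about 1.44 MB
```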

### Running the Chunks through Whisper!

In [56]:
import glob
import re

import whisper

model = whisper.load_model("base.en")
chunks = glob.glob("chunk_*.mp3")
# Sort by the number in the file name so chunk_10 comes after chunk_9.
for chunk in sorted(chunks, key=lambda s: int(re.search(r'\d+', s).group())):
    results = model.transcribe(chunk, fp16=False)
    print(chunk, results["text"])
    

chunk_0.mp3  Hey, I'm just going to record this. So when I get up in the morning, I like to brush my teeth and then
chunk_1.mp3  I'll take bear out to potty.
chunk_2.mp3  But before that I like to make a pot of coffee
chunk_3.mp3  um
chunk_4.mp3  And I use about 30 grams.
chunk_5.mp3  of ground Colombian coffee.
chunk_6.mp3  to then brew my coffee.
chunk_7.mp3  After that
chunk_8.mp3  I'm gonna go into my office.
chunk_9.mp3  and open up my computer and then I check my email.
chunk_10.mp3  and from my email
chunk_11.mp3  I then move to my tickets to just kind of update them and get them going
chunk_12.mp3  Um
chunk_13.mp3  Yeah, that's about it.


Ahhhh, we've finally made it to this point! Now we take our chunks and feed them to Whisper. Out of the box, `sorted` does a decent job of sorting the file names returned from `glob`, but it needs a little help: it compares names character by character, so `chunk_10.mp3` would land before `chunk_2.mp3`. `key` takes a function (or any callable), which is called on each element in the list, and the results are what get compared for the sort.

Example: with `key=str.lower`, each word is lowercased before it is compared, so capitalized and lowercase words sort together alphabetically.
```python
>>> sorted("This is a test string from Andrew".split(), key=str.lower)
['a', 'Andrew', 'from', 'is', 'string', 'test', 'This']
```

```python
lambda s: int(re.search(r'\d+', s).group())
```
`lambda` is shorthand for defining a small function in Python. `s` is the parameter, like `name` in `def yell(name)`. So `s` will be the name of the file, e.g. `chunk_0.mp3`. `re.search(r'\d+', s).group()` uses a regex to find the first run of digits in the file name and returns it as a string, which `int` then converts to an integer. `sorted` uses that result to place each item in the proper order.
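
For comparison, the same key can be written as a named function. This is only a sketch: `chunk_index` is a hypothetical name, and the explicit `ValueError` is an addition, since the lambda would instead fail with an `AttributeError` on a name that contains no digits.

```python
import re


def chunk_index(name):
    """Return the first run of digits in name as an int."""
    match = re.search(r"\d+", name)
    if match is None:
        # re.search returns None when nothing matches; raise a clearer error.
        raise ValueError(f"no number found in {name!r}")
    return int(match.group())


# The search stops at the first run of digits, so the 3 in ".mp3" is ignored.
print(chunk_index("chunk_12.mp3"))  # 12
```

It can be passed to `sorted` the same way as the lambda: `sorted(chunks, key=chunk_index)`.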

Here's a little example to help break down the regex:
```python
import re
l = ["chunk_0.mp3", "chunk_10.mp3", "chunk_5.mp3"]
for x in l:
  y = int(re.search(r'\d+', x).group())
  print(y)
```
Output:
```
0
10
5
```
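
Putting the key function together with `sorted`, this short sketch compares the default ordering with the numeric one. The list of names is made up for illustration.

```python
import re

names = ["chunk_10.mp3", "chunk_2.mp3", "chunk_0.mp3", "chunk_1.mp3"]

# Default sort compares the names character by character.
default_order = sorted(names)

# Numeric sort compares the extracted chunk numbers instead.
numeric_order = sorted(names, key=lambda s: int(re.search(r"\d+", s).group()))

print(default_order)  # ['chunk_0.mp3', 'chunk_1.mp3', 'chunk_10.mp3', 'chunk_2.mp3']
print(numeric_order)  # ['chunk_0.mp3', 'chunk_1.mp3', 'chunk_2.mp3', 'chunk_10.mp3']
```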

Once we have the chunks sorted, it's just a matter of running each chunk through Whisper.

**Results**
```
chunk_0.mp3  Hey, I'm just going to record this. So when I get up in the morning, I like to brush my teeth and then
chunk_1.mp3  I'll take bear out to potty.
chunk_2.mp3  But before that I like to make a pot of coffee
chunk_3.mp3  um
```
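
If a single transcript is more useful than one line per chunk, the per-chunk texts can be stitched back together in order. This is a minimal sketch under the assumption that the texts were collected into a list while looping. `join_transcripts` is a hypothetical helper, and the sample strings are the first few results shown above.

```python
def join_transcripts(texts):
    """Join per-chunk texts into one string, dropping empty pieces."""
    cleaned = (text.strip() for text in texts)
    return " ".join(text for text in cleaned if text)


chunk_texts = [
    " Hey, I'm just going to record this. So when I get up in the morning, I like to brush my teeth and then",
    " I'll take bear out to potty.",
    " But before that I like to make a pot of coffee",
    " um",
]
print(join_transcripts(chunk_texts))
```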
