<a href="https://colab.research.google.com/github/zsj1zsj/OpenAI_Whisper_ASR_Demo/blob/main/OpenAI_Whisper_ASR_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Web App Demonstrating OpenAI's Whisper Speech Recognition Model

This is a Colab notebook that allows you to record or upload audio files to [OpenAI's free Whisper speech recognition model](https://openai.com/blog/whisper/). This was based on [an original notebook by @amrrs](https://github.com/amrrs/openai-whisper-webapp), with added documentation and test files by [Pete Warden](https://twitter.com/petewarden).

To use it, choose `Runtime->Run All` from the Colab menu. If you're viewing this notebook on GitHub, follow [this link](https://colab.research.google.com/github/petewarden/openai-whisper-webapp/blob/main/OpenAI_Whisper_ASR_Demo.ipynb) to open it in Colab first. After about a minute, you should see a button at the bottom of the page labeled `Record from microphone`. Click it, grant permission to access your mic when asked, and then speak for up to 30 seconds. Once you're done, press `Stop recording`, and a transcript of the first 30 seconds of your speech should soon appear in the box to the right of the recording button. To transcribe more speech, click `Clear` in the left box and start over.

You can also upload your own audio samples using the folder icon on the left of this page. That gives you access to a file system you can upload to by dragging files into it. You can see examples of how to run the transcription in a couple of the cells below.
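
Once the model has been loaded by the cells below, transcribing an uploaded file comes down to a single call. The following is a minimal sketch rather than part of the original notebook: `transcribe_file` is a hypothetical helper, and the path in the usage comment is only an assumption about where an upload might land. It relies on `model.transcribe(path)` returning a dictionary with a `text` key, which is how the batch cell in this notebook uses it.

```python
import os


def transcribe_file(model, path):
    """Return the transcript text for one audio file.

    `model` is expected to be the object returned by whisper.load_model().
    This helper is hypothetical and not part of the Whisper package.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No audio file at {path!r}; upload it first")
    result = model.transcribe(path)  # dict with at least a "text" key
    return result["text"].strip()


# Usage in the notebook (the path is a made-up example):
# print(transcribe_file(model, "/content/my_recording.mp3"))
```

Passing the model in as an argument keeps the helper independent of which model size was loaded.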

## Install the Whisper Code

In [1]:
! pip install git+https://github.com/openai/whisper.git -q

  Building wheel for whisper (setup.py) ... done


## Load the ML Model

In [2]:
import whisper

model = whisper.load_model("large")


100%|█████████████████████████████████████| 2.87G/2.87G [00:53<00:00, 57.3MiB/s]


## Check we have a GPU

You should see the output `device(type='cuda', index=0)` below. If you don't, you may be on a CPU-only Colab instance, which will run more slowly. To fix this, go to `Runtime->Change Runtime Type`, select a GPU, and run the notebook again.

In [3]:
model.device

device(type='cuda', index=0)
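
If you would rather have the notebook decide for itself, the same check can be written as code. This is a sketch under assumptions: `pick_device` is a hypothetical helper, and it falls back to `"cpu"` when PyTorch cannot be found at all. `whisper.load_model` accepts a `device` argument, so the result can be passed straight to it.

```python
import importlib.util


def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so there is no GPU path
    import torch
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Whisper will run on: {device}")
# In the notebook: model = whisper.load_model("large", device=device)
```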

In [4]:
!pip install ipython-autotime
%load_ext autotime

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting ipython-autotime
  Downloading ipython_autotime-0.3.1-py2.py3-none-any.whl (6.8 kB)
Collecting jedi>=0.10
  Downloading jedi-0.18.1-py2.py3-none-any.whl (1.6 MB)
Installing collected packages: jedi, ipython-autotime
Successfully installed ipython-autotime-0.3.1 jedi-0.18.1
time: 560 µs (started: 2022-10-29 09:59:47 +00:00)


In [6]:
# Create the input (audio) and output (transcript) folders.
# -p avoids an error when they already exist.
!mkdir -p ming srt

# Optional: mount Google Drive to read audio from it.
# from google.colab import drive
# drive.mount('/content/drive')

import glob
import os


def sortfilebysize(dir_name):
  """Return the regular files in dir_name, smallest first.

  dir_name must end with a path separator, e.g. '/content/ming/'.
  """
  list_of_files = filter(os.path.isfile, glob.glob(dir_name + '*'))
  return sorted(list_of_files, key=lambda x: os.stat(x).st_size)

sortfilebysize('/content/ming/')



['/content/ming/36-【迷離公路】ep159 馬來西亞 雲頂靈異傳聞 第二節 (廣東話).mp3',
 '/content/ming/40-【迷離公路】ep157 韓國三豐百貨靈異傳聞 第一節 (廣東話).mp3',
 '/content/ming/39-【迷離公路】ep158 解構都市傳說 男朋友之死 第二節 (廣東話).mp3',
 '/content/ming/35-【迷離公路】ep159 馬來西亞 雲頂靈異傳聞 第一節 (廣東話).mp3',
 '/content/ming/38-【迷離公路】ep158 解構都市傳說 男朋友之死 第一節 (廣東話).mp3',
 "/content/ming/32-【迷離公路】ep160 Silent Hill 2 與 電影 Jacob's Ladder 第一節 (廣東話).mp3",
 "/content/ming/33-【迷離公路】ep160 Silent Hill 2 與 電影 Jacob's Ladder 第二節 (廣東話).mp3",
 '/content/ming/31-【迷離公路】港澳台靈異傳聞合集 (廣東話).mp3',
 '/content/ming/37-【迷離公路】日本靈異傳聞傳說 重製版 (廣東話).mp3',
 '/content/ming/34-【迷離公路】佛迪五夜驚魂系列合集 (廣東話).mp3']

time: 521 ms (started: 2022-10-29 10:00:16 +00:00)


In [7]:
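# !pip install tqdm
import glob
import os
import tqdm
from datetime import datetime, timezone, timedelta

# Show progress-bar timestamps in UTC+8.
tz = timezone(timedelta(hours=+8))

# Process the files in order of size, smallest first.
# files = sorted(glob.glob('/content/ming/*'))
files = sortfilebysize('/content/ming/')

pbar = tqdm.tqdm(files)
for file in pbar:
    name = os.path.basename(file)  # file name without '/content/ming/'
    nowstr = datetime.now(tz).strftime("%H:%M")
    # Drop the last 8 characters (tail of the file name) to shorten the label.
    pbar.set_description(nowstr + ' ' + name[:-8])

    # transcribe() returns a dict; 'text' holds the full transcript.
    test_txt = model.transcribe(file)

    # Despite the folder name, this writes plain text, not SRT subtitles.
    srtPath = '/content/srt/' + name + '.txt'
    with open(srtPath, mode='w', encoding='utf-8') as f:
        f.write(test_txt['text'])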

19:39 34-【迷離公路】佛迪五夜驚魂系列合集 (: 100%|██████████| 10/10 [2:00:08<00:00, 720.85s/it] 

time: 2h 8s (started: 2022-10-29 10:00:22 +00:00)
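
The loop above saves only the plain transcript, even though the output folder is named `srt`. If subtitle files are what you want, the result returned by `model.transcribe` also contains a `segments` list whose entries carry `start` and `end` times in seconds plus the segment `text`. The sketch below turns such a list into SubRip text. `srt_timestamp` and `segments_to_srt` are hypothetical helper names, not part of Whisper, and the sample segments are made up for illustration.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Build SRT text from dicts with "start", "end" and "text" keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


# Made-up segments in the shape found under result["segments"].
sample = [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 3725.04, "text": " Goodbye."},
]
print(segments_to_srt(sample))
```

Inside the loop, writing `segments_to_srt(test_txt['segments'])` to a file ending in `.srt` would produce a subtitle file next to each `.txt` transcript.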





In [8]:
!zip -r /content/file.zip /content/srt/

  adding: content/srt/ (stored 0%)
  adding: content/srt/32-【迷離公路】ep160 Silent Hill 2 與 電影 Jacob's Ladder 第一節 (廣東話).mp3.txt (deflated 97%)
  adding: content/srt/36-【迷離公路】ep159 馬來西亞 雲頂靈異傳聞 第二節 (廣東話).mp3.txt (deflated 54%)
  adding: content/srt/31-【迷離公路】港澳台靈異傳聞合集 (廣東話).mp3.txt (deflated 59%)
  adding: content/srt/37-【迷離公路】日本靈異傳聞傳說 重製版 (廣東話).mp3.txt (deflated 63%)
  adding: content/srt/35-【迷離公路】ep159 馬來西亞 雲頂靈異傳聞 第一節 (廣東話).mp3.txt (deflated 56%)
  adding: content/srt/34-【迷離公路】佛迪五夜驚魂系列合集 (廣東話).mp3.txt (deflated 59%)
  adding: content/srt/33-【迷離公路】ep160 Silent Hill 2 與 電影 Jacob's Ladder 第二節 (廣東話).mp3.txt (deflated 56%)
  adding: content/srt/40-【迷離公路】ep157 韓國三豐百貨靈異傳聞 第一節 (廣東話).mp3.txt (deflated 51%)
  adding: content/srt/38-【迷離公路】ep158 解構都市傳說 男朋友之死 第一節 (廣東話).mp3.txt (deflated 53%)
  adding: content/srt/39-【迷離公路】ep158 解構都市傳說 男朋友之死 第二節 (廣東話).mp3.txt (deflated 57%)
time: 364 ms (started: 2022-10-29 12:00:30 +00:00)


In [9]:
from google.colab import files
files.download("/content/file.zip")


time: 7.78 ms (started: 2022-10-29 12:00:31 +00:00)


In [13]:
# import glob

# files = glob.glob('/content/ming/*')
# print(files)
# for f in files:
#     os.remove(f)

# files = glob.glob('/content/srt/*')
# for f in files:
#     os.remove(f)

# os.remove('/content/file.zip')

['/content/ming/40-【迷離公路】ep157 韓國三豐百貨靈異傳聞 第一節 (廣東話).mp3', '/content/ming/39-【迷離公路】ep158 解構都市傳說 男朋友之死 第二節 (廣東話).mp3', '/content/ming/36-【迷離公路】ep159 馬來西亞 雲頂靈異傳聞 第二節 (廣東話).mp3', "/content/ming/33-【迷離公路】ep160 Silent Hill 2 與 電影 Jacob's Ladder 第二節 (廣東話).mp3", "/content/ming/32-【迷離公路】ep160 Silent Hill 2 與 電影 Jacob's Ladder 第一節 (廣東話).mp3", '/content/ming/37-【迷離公路】日本靈異傳聞傳說 重製版 (廣東話).mp3', '/content/ming/34-【迷離公路】佛迪五夜驚魂系列合集 (廣東話).mp3', '/content/ming/38-【迷離公路】ep158 解構都市傳說 男朋友之死 第一節 (廣東話).mp3', '/content/ming/31-【迷離公路】港澳台靈異傳聞合集 (廣東話).mp3', '/content/ming/35-【迷離公路】ep159 馬來西亞 雲頂靈異傳聞 第一節 (廣東話).mp3']
time: 47.5 ms (started: 2022-10-29 12:11:24 +00:00)


In [11]:
# !cp ./sample_data/anscombe.json ./drive/MyDrive/temp/

time: 442 µs (started: 2022-10-29 12:00:31 +00:00)


In [12]:
# from google.colab import drive
# drive.mount('/content/drive')

time: 519 µs (started: 2022-10-29 12:00:31 +00:00)
