<a href="https://colab.research.google.com/github/zsj1zsj/OpenAI_Whisper_ASR_Demo/blob/main/OpenAI_Whisper_ASR_Demo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Web App Demonstrating OpenAI's Whisper Speech Recognition Model

This is a Colab notebook that allows you to record or upload audio files to [OpenAI's free Whisper speech recognition model](https://openai.com/blog/whisper/). This was based on [an original notebook by @amrrs](https://github.com/amrrs/openai-whisper-webapp), with added documentation and test files by [Pete Warden](https://twitter.com/petewarden).

To use it, choose `Runtime->Run All` from the Colab menu. If you're viewing this notebook on GitHub, follow [this link](https://colab.research.google.com/github/petewarden/openai-whisper-webapp/blob/main/OpenAI_Whisper_ASR_Demo.ipynb) to open it in Colab first. After about a minute, you should see a button at the bottom of the page with a `Record from microphone` link. Click it, grant permission to access your microphone when asked, and then speak for up to 30 seconds. Once you're done, press `Stop recording`, and a transcript of the first 30 seconds of your speech should soon appear in the box to the right of the recording button. To transcribe more speech, click `Clear` in the left box and start over.

You can also upload your own audio samples using the folder icon on the left of this page. That gives you access to a file system you can upload to by dragging files into it. You can see examples of how to run the transcription in a couple of the cells below.
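
As a minimal sketch of transcribing a single uploaded file, assuming the model has already been loaded as in the "Load the ML Model" section: the helper name `transcribe_file` and the path `/content/my_audio.mp3` are placeholders for illustration, not part of Whisper or of this notebook.

```python
def transcribe_file(model, audio_path):
    """Return the transcript text for one audio file.

    `model` is expected to be the object returned by whisper.load_model();
    its transcribe() method returns a dict whose "text" key holds the transcript.
    """
    result = model.transcribe(audio_path)
    return result["text"].strip()


# Usage in the notebook, once `model` exists (the path is a placeholder):
# print(transcribe_file(model, "/content/my_audio.mp3"))
```

Passing the model in as an argument keeps the helper independent of which model size was loaded.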

## Install the Whisper Code

In [1]:
! pip install git+https://github.com/openai/whisper.git -q



## Load the ML Model

In [None]:
import whisper

model = whisper.load_model("medium")



## Check we have a GPU

You should see the output `device(type='cuda', index=0)` below. If you don't, you may be on a CPU-only Colab instance which will run more slowly. Go to `Runtime->Change Runtime Type` to fix this.

In [None]:
model.device
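
The same check can be made before loading the model. The sketch below assumes PyTorch is importable (it is installed as a Whisper dependency) and falls back to `"cpu"` if it is not; `pick_device` is a hypothetical helper name.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # installed alongside Whisper
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Whisper would run on: {device}")

# To load the model on that device explicitly:
# model = whisper.load_model("medium", device=device)
```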

In [None]:
!pip install ipython-autotime
%load_ext autotime

In [None]:
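# Create the input (audio) and output (transcript) directories.
# -p avoids an error if they already exist when the cell is re-run.
!mkdir -p ming
!mkdir -p srt

# from google.colab import drive
# drive.mount('/content/drive')

import glob
import os


def sortfilebysize(dir_name):
    """Return the regular files in dir_name, sorted from smallest to largest."""
    # os.path.join works whether or not dir_name ends with a slash.
    list_of_files = filter(os.path.isfile, glob.glob(os.path.join(dir_name, '*')))
    return sorted(list_of_files, key=lambda x: os.stat(x).st_size)


# List the audio files placed in /content/ming/, smallest first.
sortfilebysize('/content/ming/')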
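
For comparison, here is a sketch of the same size-ordered listing written with `pathlib`. The name `files_by_size` and the `pattern` parameter are illustrative additions, not part of the original notebook.

```python
from pathlib import Path


def files_by_size(dir_name, pattern="*"):
    """Return regular files in dir_name as strings, smallest first."""
    directory = Path(dir_name)
    # Skip subdirectories; only regular files are transcription candidates.
    files = [p for p in directory.glob(pattern) if p.is_file()]
    return [str(p) for p in sorted(files, key=lambda p: p.stat().st_size)]


# Usage (paths as used elsewhere in this notebook):
# files_by_size("/content/ming")
# files_by_size("/content/ming", pattern="*.mp3")
```

Processing the smallest files first means the first transcripts are ready sooner, which helps when checking that the pipeline works before the long files run.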

In [None]:
# !pip install tqdm
import glob
import os
from datetime import datetime, timezone, timedelta

import tqdm

# Timestamps shown on the progress bar use UTC+8.
tz = timezone(timedelta(hours=+8))

# files = sorted(glob.glob('/content/ming/*'))
files = sortfilebysize('/content/ming/')

pbar = tqdm.tqdm(files)
for file in pbar:
    name = os.path.basename(file)
    # Show the current time and the file name (minus its last 8 characters).
    nowstr = datetime.now(tz).strftime("%H:%M")
    pbar.set_description(nowstr + ' ' + name[:-8])

    test_txt = model.transcribe(file)

    # Save the plain-text transcript as /content/srt/<audio file name>.txt
    srtPath = '/content/srt/' + name + '.txt'
    with open(srtPath, mode='w', encoding='utf-8') as f:
        f.write(test_txt['text'])

20:44 41-【迷離公路】ep157 韓國三豐百貨靈異傳聞 第二節 (:  70%|███████   | 7/10 [25:15<11:14, 224.99s/it]
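
The loop above writes plain text into the `srt` directory. If timestamped subtitles are wanted instead, the result of `model.transcribe` also carries a `"segments"` list whose entries have `start`, `end` (in seconds) and `text` fields. The sketch below turns such a list into SubRip (SRT) text; `format_timestamp` and `segments_to_srt` are hypothetical helper names, and this is one possible formatting rather than the notebook author's method.

```python
def format_timestamp(seconds):
    """Convert a time in seconds to the SRT form HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Build SRT text from an iterable of dicts with start, end and text keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Blank line between consecutive subtitle blocks.
    return "\n".join(blocks)


# Usage inside the loop (assumes `test_txt` and `name` from the cell above):
# with open('/content/srt/' + name + '.srt', mode='w', encoding='utf-8') as f:
#     f.write(segments_to_srt(test_txt['segments']))
```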

In [None]:
!zip -r /content/file.zip /content/srt/
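
If the `zip` command is unavailable, a similar archive can be built from Python with the standard library. This is a sketch using `shutil.make_archive`; `zip_directory` is a hypothetical helper name. Unlike `zip -r /content/file.zip /content/srt/`, it stores paths relative to the source directory rather than with the `content/srt/` prefix.

```python
import shutil


def zip_directory(src_dir, archive_base):
    """Zip the contents of src_dir into archive_base + ".zip".

    Returns the path of the archive that was written.
    """
    return shutil.make_archive(archive_base, "zip", root_dir=src_dir)


# Usage with the paths from this notebook (would create /content/file.zip):
# zip_directory("/content/srt", "/content/file")
```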

In [None]:
from google.colab import files
files.download("/content/file.zip")

In [None]:
# import glob

# files = glob.glob('/content/ming/*')
# print(files)
# for f in files:
#     os.remove(f)

# files = glob.glob('/content/srt/*')
# for f in files:
#     os.remove(f)

# os.remove('/content/file.zip')

In [None]:
# !cp ./sample_data/anscombe.json ./drive/MyDrive/temp/

In [None]:
# from google.colab import drive
# drive.mount('/content/drive')