# Whisper-Powered Automatic Audio Processor (W-PAAP)
Notebook hackily created for Paperspace by NekoCitrus.

Adapted from the OpenAI Whisper and PPP TalkNet notebooks.

**Do not use this notebook alone to transcribe data.** Please verify and check your transcriptions before using them in AI datasets or submitting them.

## Initial Setup

### GPU Check

Use this cell to show information about your graphics card.

In [None]:
!nvidia-smi -L
!nvidia-smi

### Dependencies
Install dependencies for OpenAI Whisper and other assorted elements.

In [None]:
!apt-get install -y unzip
!python -m pip install git+https://github.com/openai/whisper.git

### Path Configuration
Configure the path to your audio ZIP file.

Upload your ZIP file to the root of the Paperspace file list (/notebooks). Make sure that all your audio files are in the same format.
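
If you want to confirm that every file in the archive really shares one format before extracting it, something like the following could be run first. This is only a sketch: `count_extensions` and `has_single_format` are hypothetical helper names, not part of this notebook, and the check looks at file extensions only, not at the actual audio encoding.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def count_extensions(zip_path):
    """Count the (lowercased) extensions of the regular files inside a ZIP archive."""
    with zipfile.ZipFile(zip_path) as zf:
        return Counter(
            PurePosixPath(info.filename).suffix.lower()
            for info in zf.infolist()
            if not info.is_dir()  # skip directory entries
        )


def has_single_format(zip_path):
    """Return True when every file in the archive shares one extension."""
    return len(count_extensions(zip_path)) == 1
```

For example, `count_extensions("/notebooks/your_zip.zip")` would list each extension with its file count, which makes stray files easy to spot.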

In [None]:
import os
import shutil

dataset = "your_zip.zip"
character_folder_name = "z"

pspaceName = "/notebooks/"+dataset
assert os.path.exists(pspaceName), "Cannot find your ZIP file."

# Extract data into a fresh /notebooks/wavs directory.
os.chdir('/notebooks')
if os.path.exists("wavs"):
    shutil.rmtree("wavs")
os.mkdir("wavs")
#shutil.copyfile(pspaceName, "/notebooks/wavs/"+dataset)
os.chdir("wavs")

if dataset.endswith(".zip"):
    !unzip -q "{pspaceName}"
#elif dataset.endswith(".tar"):
#    !tar -xf "{dataset}"
else:
    raise ValueError("Unknown extension for dataset.")

print("ZIP file successfully loaded.")

## Running

### Transcribe
Select your model and start transcription work. You should only need to run the first cell once, regardless of how much data you are feeding into the notebook.

Note that although the notebook downloads the multilingual version of the model by default, most relevant voice AI solutions (15.ai, TalkNet, etc.) only support English. If you are *positive* that all the speech in your audio files is in English, feel free to use the English-Only models by appending ".en" to the model name (for example, "small.en").
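
As a small illustration of the naming rule, the openai/whisper package names its English-only checkpoints with a ".en" suffix and ships them for every size except large. The helper below is a hypothetical sketch (`pick_model_name` and `ENGLISH_ONLY_SIZES` are not part of this notebook or of Whisper) that builds the string to assign to `mod`.

```python
# Sizes for which the openai/whisper package ships an English-only ".en" checkpoint.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def pick_model_name(size, english_only=False):
    """Return the model name for the requested size (hypothetical helper)."""
    if english_only and size in ENGLISH_ONLY_SIZES:
        return size + ".en"
    # "large" has no English-only variant, so fall back to the multilingual name.
    return size
```

With this, `mod = pick_model_name("small", english_only=True)` would select "small.en".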

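Set the "Delete Useless" value to True if you wish to automatically delete files that are known to be unusable by 15.ai: clips not detected as English, and clips whose transcription is empty, contains non-ASCII characters, or is shorter than five characters once punctuation is removed. A file listing all deleted files will also be generated. The language check relies on Whisper's language detection, which needs a multilingual model, so set the value to False when using an English-Only model.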
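
The text-based part of that filter can be tried out on its own, without loading a model. The function below is a sketch that mirrors the check in the transcription cell; `is_usable` and `min_chars` are hypothetical names introduced here for illustration, and the separate language-detection step is not covered.

```python
import re


def is_usable(transcript, min_chars=5):
    """Return True when a transcription would survive the text filter."""
    # Drop punctuation, keeping word characters and whitespace.
    stripped = re.sub(r"[^\w\s]", "", transcript)
    # Reject non-ASCII text and anything shorter than min_chars characters.
    return stripped.isascii() and len(stripped) >= min_chars
```

Running a few sample strings through it is a quick way to see which clips the notebook would delete before pointing it at real data.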

In [None]:
import whisper

#Possible model options:
# tiny - 39M parameters - requires 1GB VRAM
# base - 74M parameters - requires 1GB VRAM
# small - 244M parameters - requires 2GB VRAM
# medium - 769M parameters - requires 5GB VRAM
# large - 1550M parameters - requires 10GB VRAM (paid tier only, no English-Only model available)
mod = "small" 

print("Loading Whisper model into memory...")
model = whisper.load_model(mod)
options = whisper.DecodingOptions()

In [None]:
import os
import re
import shutil
from tqdm.notebook import tqdm
import whisper

delete_useless = True

print(f'Model "{mod}" chosen.')

# Start from a clean slate: remove output files left over from earlier runs.
filename = "transcription_for_" + character_folder_name + ".txt"
if os.path.exists(filename):
    os.remove(filename)

if delete_useless and os.path.exists("deleted_files.txt"):
    os.remove("deleted_files.txt")

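def inference(audio):
    if delete_useless:
        # Language-gated path: only the first 30 seconds of audio are examined.
        aud = whisper.load_audio(audio)
        aud = whisper.pad_or_trim(aud)
        # Match the number of mel bins to the loaded model.
        mel = whisper.log_mel_spectrogram(aud, n_mels=model.dims.n_mels).to(model.device)
        _, probs = model.detect_language(mel)
        lang = max(probs, key=probs.get)
        # Non-English clips return an empty string so the caller deletes them.
        if lang != "en":
            return ""
        result = whisper.decode(model, mel, options)
        return result.text
    # Otherwise transcribe the whole file.
    result = model.transcribe(audio)
    return result["text"]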

def destroy_file(path):
    # Log the file name, then delete the file itself.
    with open("deleted_files.txt", 'a') as d:
        d.write(os.path.basename(path) + "\n")
    os.remove(path)

print("Writing transcriptions...")

for r, _, f in os.walk("/notebooks/wavs"):
    for name in tqdm(sorted(f)):
        path = os.path.join(r, name)
        scrip = inference(path)

        if delete_useless:
            # Strip punctuation, then drop empty, non-ASCII, or very short transcriptions.
            nopu = re.sub(r'[^\w\s]', '', scrip)
            if nopu == "" or (not nopu.isascii()) or len(nopu) < 5:
                # Pass the full path so files in subfolders are deleted correctly.
                destroy_file(path)
                continue
        finScript = character_folder_name + "/" + name + "|" + scrip.lstrip()
        with open(filename, 'a') as w:
            w.write(finScript + "\n")

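# Move the results next to the notebook, replacing any copies from earlier runs.
shutil.move("/notebooks/wavs/"+filename, "/notebooks/"+filename)
# deleted_files.txt only exists if at least one file was deleted.
if delete_useless and os.path.exists("/notebooks/wavs/deleted_files.txt"):
    shutil.move("/notebooks/wavs/deleted_files.txt", "/notebooks/deleted_files.txt")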

print("All done! Be sure to verify!")
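
To help with the verification step, the transcription file can be read back into Python and reviewed line by line. Each line written by the cell above has the form `folder/file|text`. The functions below are a minimal sketch; `parse_transcription_line` and `load_transcriptions` are hypothetical names, and the file is assumed to be UTF-8 encoded.

```python
def parse_transcription_line(line):
    """Split one "folder/file|text" line into (audio_path, text)."""
    # Split at the first "|" only, so a "|" inside the text is preserved.
    audio_path, sep, text = line.rstrip("\n").partition("|")
    if not sep:
        raise ValueError("Missing '|' separator: " + repr(line))
    return audio_path, text


def load_transcriptions(path):
    """Read a transcription file into a list of (audio_path, text) pairs."""
    with open(path, encoding="utf-8") as f:
        return [parse_transcription_line(line) for line in f if line.strip()]
```

For instance, looping over `load_transcriptions("/notebooks/transcription_for_z.txt")` and printing each pair makes it easier to compare the text against the audio by ear.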