In [1]:
!pip install moviepy pydub git+https://github.com/openai/whisper.git

Collecting git+https://github.com/openai/whisper.git
  Cloning https://github.com/openai/whisper.git to /tmp/pip-req-build-0nwqmp1z
  Running command git clone --filter=blob:none --quiet https://github.com/openai/whisper.git /tmp/pip-req-build-0nwqmp1z
  Resolved https://github.com/openai/whisper.git to commit ba3f3cd54b0e5b8ce1ab3de13e32122d0d5f98ab
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting pydub
  Downloading pydub-0.25.1-py2.py3-none-any.whl.metadata (1.4 kB)
Collecting tiktoken (from openai-whisper==20231117)
  Downloading tiktoken-0.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch->openai-whisper==20231117)
  Using cached nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torc

In [3]:

import os
import whisper
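try:
    from moviepy.editor import VideoFileClip  # moviepy 1.x
except ImportError:
    from moviepy import VideoFileClip  # moviepy 2.x removed moviepy.editor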
from pydub import AudioSegment

# Function to extract audio from video
def extract_audio(video_file, output_audio_file):
    print("Extracting audio from video...")
    video = VideoFileClip(video_file)
    try:
        if video.audio is None:
            raise ValueError(f"No audio track found in {video_file}")
        video.audio.write_audiofile(output_audio_file)
    finally:
        video.close()  # Release the underlying ffmpeg reader
    print(f"Audio extracted to {output_audio_file}")

# Function to convert audio to WAV format mono PCM with 16kHz sample rate
def convert_audio(input_audio_file, output_audio_file):
    print("Converting audio to WAV format mono PCM with 16kHz sample rate...")
    audio = AudioSegment.from_file(input_audio_file)
    audio = audio.set_channels(1)  # Convert to mono
    audio = audio.set_frame_rate(16000)  # Set sample rate to 16kHz
    audio.export(output_audio_file, format="wav")
    print(f"Audio converted to {output_audio_file}")

# Function to transcribe audio using Whisper
def transcribe_audio_whisper(audio_file):
    model = whisper.load_model("base")
    result = model.transcribe(audio_file)
    return result['text']

# Function to add line breaks to transcript
def format_transcript(transcript, max_length=80):
    words = transcript.split()
    formatted_transcript = ""
    current_line = ""

    for word in words:
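        # Only break when the line already has content, so an over-long
        # first word does not produce a leading blank line
        if current_line and len(current_line) + len(word) + 1 > max_length: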
            formatted_transcript += current_line + "\n"
            current_line = word
        else:
            if current_line:
                current_line += " " + word
            else:
                current_line = word

    if current_line:
        formatted_transcript += current_line

    return formatted_transcript

# Main function to generate transcript
def generate_transcript(video_file):
    temp_audio_file = "temp_audio.mp3"
    final_audio_file = "final_audio.wav"

    try:
        # Extract audio from video
        extract_audio(video_file, temp_audio_file)
        # Convert audio to WAV format mono PCM with 16kHz sample rate
        convert_audio(temp_audio_file, final_audio_file)

        # Transcribe the entire audio file using Whisper
        print("Transcribing audio file...")
        transcript = transcribe_audio_whisper(final_audio_file)
    finally:
        # Clean up temporary audio files, even if an earlier step failed
        for path in (temp_audio_file, final_audio_file):
            if os.path.exists(path):
                os.remove(path)
        print("Temporary files cleaned up.")

    # Format transcript with line breaks
    formatted_transcript = format_transcript(transcript)

    return formatted_transcript

# Example usage in Jupyter Notebook or Google Colab
video_file = "/content/260 - Sorting Strings.mp4"  # Replace with your video file path

print("Generating transcript...")
transcript = generate_transcript(video_file)
print("Transcript generated:")
print(transcript)

# Optionally save the transcript to a file
with open("transcript.txt", "w", encoding="utf-8") as f:
    f.write(transcript)
print("Transcript saved to transcript.txt")
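
The hand-written loop in format_transcript is a greedy word wrap. As a minimal sketch, and assuming that collapsing all whitespace to single spaces is acceptable, the standard library's textwrap module can do roughly the same job. The name wrap_transcript is hypothetical and not part of the notebook above.

```python
import textwrap


def wrap_transcript(transcript, max_length=80):
    """Greedy word wrap using the standard library (hypothetical helper)."""
    # Collapse runs of whitespace first, mirroring transcript.split()
    normalized = " ".join(transcript.split())
    return textwrap.fill(
        normalized,
        width=max_length,
        break_long_words=False,   # keep over-long words intact on their own line
        break_on_hyphens=False,   # only break at spaces, like the manual loop
    )


if __name__ == "__main__":
    sample = "The next thing we're going to take a look at is how we can sort strings."
    print(wrap_transcript(sample, max_length=30))
```

If this behaviour is suitable, the call to format_transcript(transcript) in generate_transcript could be swapped for wrap_transcript(transcript) without changing the rest of the pipeline.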


Generating transcript...
Extracting audio from video...
MoviePy - Writing audio in temp_audio.mp3
MoviePy - Done.
Audio extracted to temp_audio.mp3
Converting audio to WAV format mono PCM with 16kHz sample rate...
Audio converted to final_audio.wav
Transcribing audio file...
Temporary files cleaned up.
Transcript generated:
The next thing we're going to take a look at is how we can sort an array of
strings. So I'm going to update my data array. I'm going to change it to T,
capital A, lowercase A, capital B and lowercase B. Just a little bit ago before
we introduced this comparator function, I told you that by default JavaScript is
going to take all the elements inside an array, turn them into strings, and then
compare them. So that might lead you to believe that JavaScript can sort an
array of strings really well. Once again, not quite the case. So let me
demonstrate that to you very quickly. I'm going to remove the comparator
function because this one is really only designed to work with numbers. So I'm
going to try calling data.sort and I'm going to see what I get back. So I'm
going to run that and I get back something that definitely looks like it has
changed, but it's not quite what I think I would be looking for if I was sorting
an array of strings. I