### References

1. [Whisper](https://blog.devgenius.io/transcribing-youtube-videos-using-openais-whisper-%EF%B8%8F-%EF%B8%8F-a29d264d6fb1)
2. [Langchain and LLama](https://www.youtube.com/watch?v=k_1pOF1mj8k)

### Basic Imports

In [1]:
import yt_dlp


In [2]:
def download(video_id: str) -> str:
    """Download the audio track of a YouTube video as M4A and return its local path."""
    video_url = f'https://www.youtube.com/watch?v={video_id}'
    ydl_opts = {
        'format': 'm4a/bestaudio/best',
        'paths': {'home': 'audio/'},
        'outtmpl': {'default': '%(id)s.%(ext)s'},
        'postprocessors': [{
            'key': 'FFmpegExtractAudio',
            'preferredcodec': 'm4a',
        }]
    }
    with yt_dlp.YoutubeDL(ydl_opts) as ydl:
        error_code = ydl.download([video_url])
        if error_code != 0:
            raise RuntimeError(f'Failed to download video {video_id}')

    return f'audio/{video_id}.m4a'

In [3]:
download('CuBzyh4Xmvk')

[youtube] Extracting URL: https://www.youtube.com/watch?v=CuBzyh4Xmvk
[youtube] CuBzyh4Xmvk: Downloading webpage
[youtube] CuBzyh4Xmvk: Downloading ios player API JSON
[youtube] CuBzyh4Xmvk: Downloading android player API JSON
[youtube] CuBzyh4Xmvk: Downloading m3u8 information
[info] CuBzyh4Xmvk: Downloading 1 format(s): 140
[download] audio/CuBzyh4Xmvk.m4a has already been downloaded
[download] 100% of   72.26MiB
[ExtractAudio] Not converting audio audio/CuBzyh4Xmvk.m4a; file is already in target format m4a


'audio/CuBzyh4Xmvk.m4a'
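
`download` expects a bare video ID rather than a full link. As a minimal sketch, assuming only the two common URL shapes (`youtube.com/watch?v=...` and `youtu.be/...`), a hypothetical helper `extract_video_id` (not part of yt-dlp or the original notebook) could pull the ID out of a pasted URL:

```python
from urllib.parse import parse_qs, urlparse

# Hosts this sketch recognizes; other YouTube URL shapes are not handled.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> str:
    """Return the video ID from a YouTube watch URL or a youtu.be short link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path component.
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in YOUTUBE_HOSTS:
        # Watch URLs carry the ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"Could not find a video ID in {url!r}")
    return video_id


print(extract_video_id("https://www.youtube.com/watch?v=CuBzyh4Xmvk"))
```

The result can be passed straight to `download`.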

In [4]:
import whisper

In [5]:
whisper_model = whisper.load_model("base.en")


In [6]:
transcription = whisper_model.transcribe("audio/CuBzyh4Xmvk.m4a", fp16=True, verbose=False)

100%|██████████| 468481/468481 [02:05<00:00, 3720.37frames/s]


In [9]:
transcription["text"][:100]

" Please look at the code mentioned above and please sign up on the Google Cloud. We've already start"

In [7]:
transcription.keys()

dict_keys(['text', 'segments', 'language'])
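
The `segments` entry holds one dictionary per transcribed span. The sketch below relies only on the `start`, `end`, and `text` keys that the SRT code in this notebook reads. `preview_segments` is a hypothetical helper, and `sample` is a hand-written stand-in that mimics the result's shape so the snippet runs without a model:

```python
def preview_segments(result: dict, limit: int = 3) -> list[str]:
    """Format the first few segments as '[start -> end] text' lines."""
    lines = []
    for segment in result["segments"][:limit]:
        start, end = segment["start"], segment["end"]
        # Whisper segment text usually begins with a space, so strip it.
        lines.append(f"[{start:.2f}s -> {end:.2f}s] {segment['text'].strip()}")
    return lines


# Stand-in data shaped like a transcription result (an assumption for illustration).
sample = {
    "text": " Please look at the code mentioned above. We've already started.",
    "segments": [
        {"start": 0.0, "end": 5.0, "text": " Please look at the code mentioned above."},
        {"start": 5.0, "end": 8.0, "text": " We've already started."},
    ],
    "language": "en",
}

for line in preview_segments(sample):
    print(line)
```

With the real result, `preview_segments(transcription)` gives a quick look at the timing data before writing subtitles.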

In [10]:
def create_srt_from_transcription(transcription_objects, srt_file_path):
    with open(srt_file_path, 'w', encoding='utf-8') as srt_file:
        index = 1  # SRT format starts with index 1

        for entry in transcription_objects['segments']:
            start_time = entry['start']
            end_time = entry['end']
            text = entry['text']

            # Convert time to SRT format
            start_time_str = format_time(start_time)
            end_time_str = format_time(end_time)

            # Write entry to SRT file
            srt_file.write(f"{index}\n")
            srt_file.write(f"{start_time_str} --> {end_time_str}\n")
            srt_file.write(f"{text}\n\n")

            index += 1

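def format_time(time_seconds):
    # Whole seconds only: the fractional part is dropped and the
    # millisecond field is always written as 000.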
    minutes, seconds = divmod(time_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{int(hours):02d}:{int(minutes):02d}:{int(seconds):02d},000"
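
`format_time` writes `,000` for every timestamp, so any sub-second timing in the segments is lost. A minimal sketch of a millisecond-preserving variant follows. The name `format_time_ms` is hypothetical, and it assumes non-negative times given in seconds:

```python
def format_time_ms(time_seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    # Work in integer milliseconds to avoid floating-point drift in divmod.
    total_ms = int(round(time_seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    seconds, milliseconds = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{milliseconds:03d}"


print(format_time_ms(5.28))    # 00:00:05,280
print(format_time_ms(3661.5))  # 01:01:01,500
```

Swapping this in for `format_time` inside `create_srt_from_transcription` would keep the fractional part of each segment boundary.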


In [11]:
create_srt_from_transcription(transcription, "audio/CuBzyh4Xmvk.srt")

In [12]:
!head audio/CuBzyh4Xmvk.srt

1
00:00:00,000 --> 00:00:05,000
 Please look at the code mentioned above and please sign up on the Google Cloud.

2
00:00:05,000 --> 00:00:08,000
 We've already started making some announcements.

3
00:00:08,000 --> 00:00:14,000


In [13]:
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler


In [14]:
llm = Ollama(model="mistral", 
             callback_manager = CallbackManager([StreamingStdOutCallbackHandler()]))
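
A local model has a finite context window, and the transcript of a long recording may not fit in a single prompt. As a rough sketch, assuming a plain word count is an acceptable proxy for tokens, a hypothetical helper `chunk_words` could split the text into pieces that are prompted separately. The default budget of 1500 words is an arbitrary illustration, not a documented limit of Ollama or Mistral:

```python
def chunk_words(text: str, max_words: int = 1500) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    # Step through the word list in strides of max_words.
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


print(chunk_words("one two three four five", max_words=2))
```

Each chunk could then be sent to `llm` with the same instruction, and the partial outputs combined afterwards. Word-based splitting can cut sentences in half, so splitting on segment boundaries may give cleaner chunks.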

In [15]:
prompt_qs = ["Please provide a bullet-point summary for the given text:",
             "Summarize the following in Markdown bullets:",
             "Highlight the important topics and subtopics in the given lecture:",
             "Give us some question for a quiz based on the following text:"]

prompts = [q + "\n" + transcription["text"] for q in prompt_qs]

for prompt, question in zip(prompts, prompt_qs):
    print(question, end="\n\n")
    output = llm(prompt)
    print(output, end="\n\n")
    print("=="*50, end="\n\n")

Please provide a bullet-point summary for the given text:

 * The text asks for attention to the code and signing up on Google Cloud, as well as an announcement of an extra lecture
* Machine learning definition: ability for computers to learn without being explicitly programmed
* The example task is to write a program to recognize digits from a dataset
* Rules for recognizing digits include vertical and horizontal lines, similar height of vertical lines, and no star or other mark on the digit
* Slides and videos from first lecture have been put on Google Cloud
* Previous lecture covered definition of machine learning and difference between explicit and implicit programming
* Decision trees will be used to predict whether a day is good for playing tennis based on weather conditions
* Decision trees involve creating rules based on attributes (in this case, outlook and humidity) and choosing the attribute that gives the best performance gain
* Entropy is a measure of disorder or uncertain