## Parse video id

In [60]:
urlparse('https://www.youtube.com/watch?v=XfpMkf4rD6E&t=477s')

ParseResult(scheme='https', netloc='www.youtube.com', path='/watch', params='', query='v=XfpMkf4rD6E&t=477s', fragment='')

In [61]:
parse_qs('v=XfpMkf4rD6E&t=477s')

{'v': ['XfpMkf4rD6E'], 't': ['477s']}
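
The two calls above can be combined into one small helper that pulls both the video id and the start offset out of a watch URL. This is only a minimal sketch: `extract_id_and_offset` is a hypothetical name, and it assumes the `t` parameter is either plain seconds or seconds followed by `s`, as in the URL above.

```python
from urllib.parse import parse_qs, urlparse


def extract_id_and_offset(url):
    """Return (video_id, start_seconds) from a watch URL; either may be None."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values, so take the first one
    video_id = query.get("v", [None])[0]
    raw_offset = query.get("t", [None])[0]
    start_seconds = None
    if raw_offset is not None:
        digits = raw_offset.rstrip("s")
        # Only accept plain digits; other formats are left as None
        if digits.isdigit():
            start_seconds = int(digits)
    return video_id, start_seconds


print(extract_id_and_offset("https://www.youtube.com/watch?v=XfpMkf4rD6E&t=477s"))
```

For the URL above this prints `('XfpMkf4rD6E', 477)`.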

In [1]:
from urllib.parse import parse_qs, urlparse



ALLOWED_SCHEMAS = {"http", "https"}
ALLOWED_NETLOCK = {
    "youtu.be",
    "m.youtube.com",
    "youtube.com",
    "www.youtube.com",
    "www.youtube-nocookie.com",
    "vid.plus",
}


def _parse_video_id(url: str):
    """Parse a youtube url and return the video id if valid, otherwise None."""
    parsed_url = urlparse(url)

    if parsed_url.scheme not in ALLOWED_SCHEMAS:
        return None

    if parsed_url.netloc not in ALLOWED_NETLOCK:
        return None

    path = parsed_url.path

    if path.endswith("/watch"):
        query = parsed_url.query
        parsed_query = parse_qs(query)
        if "v" in parsed_query:
            ids = parsed_query["v"]
            video_id = ids if isinstance(ids, str) else ids[0]
        else:
            return None
    else:
        path = parsed_url.path.lstrip("/")
        video_id = path.split("/")[-1]

    if len(video_id) != 11:  # Video IDs are 11 characters long
        return None

    return video_id

In [2]:
_parse_video_id('https://youtu.be/XfpMkf4rD6E?si=XAcOaguDIW95Ou2S')

'XfpMkf4rD6E'

In [5]:
from youtube_transcript_api import YouTubeTranscriptApi

# Source video: https://www.youtube.com/watch?v=XfpMkf4rD6E&t=477s
video_id = "XfpMkf4rD6E"

ans = YouTubeTranscriptApi.get_transcript(video_id)

In [7]:
from datetime import timedelta

def format_time(seconds):
    """Format a duration given in seconds as HH:MM:SS."""
    # Use the total number of whole seconds so that durations of
    # 24 hours or more do not wrap around to zero
    total_seconds = int(timedelta(seconds=seconds).total_seconds())
    # Extract hours, minutes, and seconds
    hours, remainder = divmod(total_seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    # Format the time as HH:MM:SS
    return '{:02}:{:02}:{:02}'.format(hours, minutes, secs)

def combine_transcript(transcript):
    combined_transcript = []
    current_interval_start = transcript[0]['start']
    current_interval_text = ""

    for item in transcript:
        if item['start'] - current_interval_start > 900:  # Start a new interval once 15 minutes (900 s) have passed
            duration = item['start'] - current_interval_start
            combined_transcript.append({
                'text': current_interval_text.strip(),
                'start': format_time(current_interval_start),
                'end': format_time(current_interval_start + duration),
                'duration': format_time(duration)
            })
            current_interval_start = item['start']
            current_interval_text = ""

        current_interval_text += item['text'] + " "

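    # Append the remaining text as the last interval; include the last
    # snippet's own duration (if present) so the end time covers it
    last_item = transcript[-1]
    duration = last_item['start'] + last_item.get('duration', 0) - current_interval_start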
    combined_transcript.append({
        'text': current_interval_text.strip(),
        'start': format_time(current_interval_start),
        'end': format_time(current_interval_start + duration),
        'duration': format_time(duration)
    })

    return combined_transcript


In [8]:
combined_transcript = combine_transcript(ans)


In [9]:
len(combined_transcript)

5
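
To see how the grouping behaves without calling the API, the same idea can be tried on a synthetic transcript. This is a simplified stand-in, not the `combine_transcript` function above: `group_by_window` is a hypothetical helper that returns only the text of each window, and the snippet dictionaries merely mimic the `text` and `start` keys used above.

```python
def group_by_window(snippets, window_seconds=900):
    """Concatenate snippet texts, starting a new group once window_seconds have elapsed."""
    groups = []
    if not snippets:
        return groups
    window_start = snippets[0]["start"]
    current = []
    for snippet in snippets:
        # A snippet that starts more than window_seconds after the window
        # start closes the current group and opens a new one
        if snippet["start"] - window_start > window_seconds:
            groups.append(" ".join(current))
            window_start = snippet["start"]
            current = []
        current.append(snippet["text"])
    # Whatever is left forms the final group
    groups.append(" ".join(current))
    return groups


# Synthetic transcript: one snippet every 5 minutes, from 0 to 40 minutes
fake_transcript = [{"text": f"s{i}", "start": i * 300.0} for i in range(9)]
print(group_by_window(fake_transcript))
```

Because the comparison is strict (`>`), a snippet starting exactly 900 seconds after the window start still belongs to the current group.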

In [7]:
import tiktoken
# Encodings are looked up by name here; tiktoken.encoding_for_model(model_name)
# looks one up by model name instead.

def num_tokens_from_string(string: str, encoding_name: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

In [58]:
num_tokens_from_string(combined_transcript[1]['text'], "cl100k_base")

2224

In [43]:
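enc = tiktoken.get_encoding("cl100k_base")  # assumption: `enc` was defined in a cell not shown here
len(enc.encode('hello worlds'))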

2

In [8]:
combined_transcript[0]['text']

"hi everyone welcome to cs25 Transformers United video this was a course that was solid Stanford in the winter 2023 this course is not about robots that can transform into cars as this picture might suggest rather it's about deep learning models that have taken the World by the storm and have revolutionized the field of AI and others starting from natural language processing Transformers have been applied all over from computer vision enforcement learning biology robotics Etc we have an exciting set of videos lined up for you with some truly fascinating speakers skip talks presenting how they're applying Transformers to the research in different fields and areas we hope you'll enjoy and learn from these videos so without any further Ado let's get started this is a purely introductory lecture and we'll go into the building blocks of Transformers so first let's start with introducing the instructors uh so for me I'm currently on a temporary data Pro from the PHD program and I'm leading A

In [10]:
from openai import AsyncOpenAI

# The API key defaults to os.environ.get("OPENAI_API_KEY")
client = AsyncOpenAI()



user_message = f""" TRANSCRIPT: {combined_transcript[0]['text']}.

You need to perform these tasks step by step.

1. Read all the transcript provided above.
2. Generate the important titles, and for each title, you must describe them in bullet points.


IMPORTANT POINTS TO CONSIDER
1. Your response should be well structured, informative, in depth and comprehensive, with facts and numbers if available.
2. You should strive to write as long a response as you can, using all relevant and necessary information provided.
3. DO NOT include a conclusion in your response.
4. Please give your response in markdown syntax.
5. Please do your best, this is very important to my career.
6. DO NOT include any other text in your response besides what's described above.
7. DO NOT generate numbered titles.

Your response format should be in this format

``
## Title 1
- Explanation 1
- Explanation 2

## Title 2
- Explanation 1
- Explanation 2

``



Please do your best, this is very important to my career. 
"""


messages = [
    {"role": "system", "content": "You are a youtube assistant that assists students in explaining topics from a youtube transcript."},
    {"role": "user", "content": user_message},
]

chat_completion = await client.chat.completions.create(
    messages=messages, model='gpt-3.5-turbo-0125'
)

final_ans = chat_completion.choices[0].message.content

In [11]:
print(final_ans)


## Introduction to CS25 Transformers United Course
- The course was taught at Stanford in winter 2023.
- Focuses on deep learning models known as Transformers which have revolutionized AI.
- Transformers applied in various fields like natural language processing, computer vision, and robotics.

## Instructors' Introduction and Background
- One instructor works in AI robotics startup and is passionate about Robotics and building learning algorithms.
- Another instructor has a background in NLP and computer vision, also shares interests in music and martial arts.
- Third instructor shares excitement about teaching Transformers again and mentions involvement in previous course offerings.

## Evolution of Transformers: A Historical Timeline
- Attention mechanisms in AI evolved from RNNs and LSTMs to Transformers starting in 2017.
- Transformers gained popularity in NLP, leading to a surge of applications in various fields.
- By 2021, notable advancements included long sequence problem-sol

In [12]:
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def generate_notes(transcript):
    user_message = f""" TRANSCRIPT: {transcript}.

    You need to perform these tasks step by step.

    1. Read all the transcript provided above.
    2. Generate the important titles, and for each title, you must describe them in bullet points.


    IMPORTANT POINTS TO CONSIDER
    1. Your response should be well structured, informative, in depth and comprehensive, with facts and numbers if available.
    2. You should strive to write as long a response as you can, using all relevant and necessary information provided.
    3. DO NOT include a conclusion in your response.
    4. Please give your response in markdown syntax.
    5. Please do your best, this is very important to my career.
    6. DO NOT include any other text in your response besides what's described above.
    7. DO NOT generate numbered titles.

    Your response format should be in this format:

    ## Title 1
    - Explanation 1
    - Explanation 2

    ## Title 2
    - Explanation 1
    - Explanation 2




    Please do your best, this is very important to my career. 
    """


    messages = [
        {"role": "system", "content": "You are a youtube assistant that assists students in explaining topics from a youtube transcript."},
        {"role": "user", "content": user_message},
    ]
    chat_completion = await client.chat.completions.create(messages=messages, model='gpt-3.5-turbo-0125')
    return chat_completion.choices[0].message.content

async def generate_all():
    tasks = [generate_notes(transcript) for transcript in combined_transcript]
    results = await asyncio.gather(*tasks)
    return results

# Jupyter already runs an event loop, so the coroutine is awaited directly.
# In a standalone script, use instead: results = asyncio.run(generate_all())

results = await generate_all()


In [13]:
len(results)

5
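
`asyncio.gather` starts every request at once, which is fine for five chunks but may run into rate limits on longer videos. One common way to cap the number of requests in flight is an `asyncio.Semaphore`. The sketch below uses a stand-in coroutine, `fake_generate_notes`, in place of the OpenAI call; that name, `generate_all_bounded`, and the default `max_concurrency=2` are all assumptions made for illustration.

```python
import asyncio


async def fake_generate_notes(chunk):
    """Stand-in for the API call: waits briefly and returns a placeholder string."""
    await asyncio.sleep(0.01)
    return f"notes for {chunk}"


async def generate_all_bounded(chunks, max_concurrency=2):
    """Process all chunks with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(chunk):
        # Each task waits here until one of the semaphore slots is free
        async with semaphore:
            return await fake_generate_notes(chunk)

    # gather returns results in input order, regardless of completion order
    return await asyncio.gather(*(bounded(chunk) for chunk in chunks))


# In a script; inside Jupyter, use `await generate_all_bounded([...])` instead
bounded_results = asyncio.run(generate_all_bounded(["a", "b", "c"]))
print(bounded_results)
```

Swapping `fake_generate_notes` for the real `generate_notes` would keep the result order identical to the chunk order, so the notes can still be concatenated in sequence.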

In [20]:
# Join all notes, keeping a blank line after each one
combined_results = ''.join(note + '\n\n' for note in results)

In [21]:
print(combined_results)


## Topic: Introduction to Transformers in AI
- Transformers are deep learning models that have revolutionized the field of AI.
- They have been widely applied in various domains such as natural language processing, computer vision, reinforcement learning, biology, and robotics.
- The lecture serves as an introductory session to discuss the basics of Transformers and their impact on the AI landscape.
- The instructors are professionals with expertise in AI research, robotics, NLP, and computer vision.

## Topic: Evolution of Attention Mechanisms
- The timeline of attention mechanisms started with the 2017 paper on Transformer architecture, paving the way for a new era of deep learning models.
- Before Transformers, models like RNNs and LSTMs had limitations in handling long sequences and context prediction, which Transformers addressed effectively.
- Attention mechanisms in Transformers excel in context prediction and determining word connections, leading to better performance in tasks

In [15]:
for i in results:
    print('------------------------------------------------')
    print(i)

------------------------------------------------

## Topic: Introduction to Transformers in AI
- Transformers are deep learning models that have revolutionized the field of AI.
- They have been widely applied in various domains such as natural language processing, computer vision, reinforcement learning, biology, and robotics.
- The lecture serves as an introductory session to discuss the basics of Transformers and their impact on the AI landscape.
- The instructors are professionals with expertise in AI research, robotics, NLP, and computer vision.

## Topic: Evolution of Attention Mechanisms
- The timeline of attention mechanisms started with the 2017 paper on Transformer architecture, paving the way for a new era of deep learning models.
- Before Transformers, models like RNNs and LSTMs had limitations in handling long sequences and context prediction, which Transformers addressed effectively.
- Attention mechanisms in Transformers excel in context prediction and determining word co