# YouTube Video Summary with LLM

In this project, I use a large language model to summarize the transcript of a YouTube video. For the transcript, I'm using an episode from a YouTube commentary channel that watches "Princess Charming", a dating show.

## Install packages
To begin, we install the required packages.

In [None]:
!pip install python-dotenv
!pip install --upgrade --quiet langchain
!pip install --quiet langchain-community
!pip install --upgrade --quiet langchain-together
!pip install youtube_transcript_api
!pip install pytube
# !pip install openai


Collecting youtube_transcript_api
  Downloading youtube_transcript_api-1.0.3-py3-none-any.whl.metadata (23 kB)
Downloading youtube_transcript_api-1.0.3-py3-none-any.whl (2.2 MB)
Installing collected packages: youtube_transcript_api
Successfully installed youtube_transcript_api-1.0.3
Collecting pytube
  Downloading pytube-15.0.0-py3-none-any.whl.metadata (5.0 kB)
Downloading pytube-15.0.0-py3-none-any.whl (57 kB)
Installing collected packages: pytube
Successfully installed pytube-15.0.0


## Import packages
Now we import the packages needed for the task.

In [None]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
import os
from dotenv import load_dotenv
from langchain_together import ChatTogether

# youtube loader from langchain to get transcript
from langchain_community.document_loaders import YoutubeLoader


In [None]:
API_KEY = '***'
load_dotenv(override=True)
# llm = ChatTogether(api_key=os.getenv('TOGETHERAI_API_KEY'), temperature=0.0, model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo")
llm = ChatTogether(api_key=API_KEY, temperature=0.0, model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo")
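The commented-out line above reads the key from the `TOGETHERAI_API_KEY` environment variable instead of hard-coding it. A minimal sketch of that approach, assuming the key lives in the environment or in a `.env` file loaded by `load_dotenv`; `get_api_key` is a hypothetical helper, not part of LangChain or Together:

```python
import os


def get_api_key(var_name: str = "TOGETHERAI_API_KEY") -> str:
    """Return the API key stored in the environment variable `var_name`.

    Hypothetical helper: it fails early with a readable message instead of
    letting the first LLM call fail with an authentication error.
    """
    key = os.getenv(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Add it to your environment or .env file."
        )
    return key


# Possible usage in the cell above (same model and temperature as before):
# llm = ChatTogether(api_key=get_api_key(), temperature=0.0,
#                    model="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo")
```

This keeps the secret out of the notebook itself, which matters if the notebook is shared or committed.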

In [None]:
video_url = 'https://www.youtube.com/watch?v=ZcVgPkFy2RQ'
loader = YoutubeLoader.from_youtube_url(video_url, add_video_info=False)
data = loader.load()
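The loader takes the full video URL, but the transcript is looked up by the video ID (here `ZcVgPkFy2RQ`). The sketch below shows one way to pull that ID out of a URL with the standard library. `extract_video_id` is a hypothetical helper for illustration, not the function LangChain uses internally, and it only covers the two URL shapes listed in the comments:

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hostnames handled by this sketch (an assumption, not an exhaustive list).
WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if it is not recognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in WATCH_HOSTS:
        # Watch links: https://www.youtube.com/watch?v=<id>
        return parse_qs(parsed.query).get("v", [None])[0]
    return None


print(extract_video_id("https://www.youtube.com/watch?v=ZcVgPkFy2RQ"))
```

Checking the ID before calling the loader makes it easier to tell a malformed URL apart from a video that simply has no transcript.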


In [None]:
# show extracted page content
data[0].page_content

"beer and pizza i [\xa0__\xa0] love tuesdays hi everyone welcome back i'm glad you're here again and i'm very happy to present to you the second episode of princess charming now i have to tell you in advance that um i've had a little bit of trouble recording this and i started drinking um like you know just a little beer when i first started recording um so this my it's not exactly my first try and i'm not exactly still at my first beer um so [Music] that's enough about me let's get back to the actual episode now before we do because it's been at least a week let's talk about what happened in the first episode everything was fine everyone got along great it was fun um kind of wholesome it was cute it was a cute first episode a great start until two people decided to have a little more the combat fight and um for which they got disqualified for um which is why princess arena did not have to send anyone home during the first episode now though um we are left with 18. contestants and um i

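The raw transcript contains caption artifacts such as `[\xa0__\xa0]` (a censored word) and `[Music]`. These do not stop the model from summarizing, but they add noise. A minimal cleanup sketch, assuming we are happy to drop these markers entirely; `clean_transcript` and the list of marker words are my own choices, not part of the loader:

```python
import re

# Bracketed caption markers to drop; extend this list as needed (assumption).
MARKER_PATTERN = re.compile(
    r"\[\s*(?:__|music|applause|laughter)\s*\]", flags=re.IGNORECASE
)


def clean_transcript(text: str) -> str:
    """Remove caption markers and normalize whitespace in a transcript."""
    text = text.replace("\xa0", " ")        # non-breaking spaces -> plain spaces
    text = MARKER_PATTERN.sub(" ", text)    # drop [ __ ], [Music], ...
    return re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace


sample = "beer and pizza i [\xa0__\xa0] love tuesdays um so [Music] that's enough"
print(clean_transcript(sample))
# beer and pizza i love tuesdays um so that's enough
```

The cleaned text could then be passed to the model in place of `data[0].page_content`.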
## Summarize text
Here, we use the LLM to summarize and extract key points from the transcript

In [None]:
# defining system and human messages for the llm to take action
messages = [
    (
        'system',
        """Read through the entire transcript carefully.
           Provide a concise summary of the video's main topic points from the transcript.
           Extract and list the five most interesting or important points from the transcript.
           For each point: State the key idea in a clear and concise manner.

        - Ensure your summary and key points capture the essence of the video without including unnecessary details.
        - Use clear, engaging language that is accessible to a general audience.
        - If the transcript includes any statistical data, expert opinions, or unique insights, prioritize including these in your summary or key points.""",

    ),
    ('human', data[0].page_content),
]

ai_msg = llm.invoke(messages)
ai_msg

AIMessage(content='**Summary:**\n\nThe second episode of "Princess Charming" focuses on the contestants\' interactions and relationships with each other, particularly with the show\'s host, Irina. The episode starts with a lunch date between Irina and several contestants, where they discuss their preferences and interests. The conversation leads to some of the contestants feeling worried about being Irina\'s type, as she seems to be attracted to more tomboyish women. The episode also features a single date between Irina and Jana, which surprisingly shows chemistry between them. However, not all interactions between Irina and the contestants result in chemistry, and some are even awkward. The episode ends with Irina having to choose someone to leave the show, which she does, and the remaining contestants are left to wonder who will be next.\n\n**Five Most Interesting or Important Points:**\n\n1. **Irina\'s Type:** The episode suggests that Irina may have a type, as she seems to be attra

## Prompt template in Langchain
Now we build the same prompt with a LangChain prompt template instead of a hand-written message list.

In [None]:
# set up a prompt template for summarizing a video transcript using LangChain
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# define a prompt template for summarizing video transcripts
# the template includes instructions for the AI model on how to process the transcript
summariser_template = PromptTemplate(
    input_variables=["video_transcript"],
    template="""Read through the entire transcript carefully.
           Provide a concise summary of the video's main topic and purpose.
           Extract and list the five most interesting or important points from the transcript.
           For each point: State the key idea in a clear and concise manner.

        - Ensure your summary and key points capture the essence of the video without including unnecessary details.
        - Use clear, engaging language that is accessible to a general audience.
        - If the transcript includes any statistical data, expert opinions, or unique insights, prioritize including these in your summary or key points.

        Video transcript: {video_transcript}""",
)
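Conceptually, the template above is a string with a named placeholder, `{video_transcript}`, that gets filled in when the chain runs. The sketch below imitates that step with plain Python string formatting so the mechanics are visible. It is an illustration under that assumption, not LangChain's implementation, and `placeholders` and `render` are hypothetical helper names:

```python
from string import Formatter
from typing import List


def placeholders(template: str) -> List[str]:
    """List the named placeholders found in a format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def render(template: str, **values: str) -> str:
    """Fill the template, complaining clearly if a placeholder has no value."""
    missing = [name for name in placeholders(template) if name not in values]
    if missing:
        raise KeyError(f"Missing template variables: {missing}")
    return template.format(**values)


demo_template = "Summarize the transcript.\n\nVideo transcript: {video_transcript}"
print(placeholders(demo_template))
print(render(demo_template, video_transcript="hi everyone welcome back"))
```

The names returned by `placeholders` correspond to what we pass as `input_variables` in the `PromptTemplate` above.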



## Using LLMChain for Summarization

Now, we want to accomplish the following tasks:
*   Create an LLMChain with the custom prompt template
*   Generate a summary of the video transcript using the chain



In [None]:
# create the chain from the LLM and the summarizer prompt template
chain = LLMChain(llm=llm, prompt=summariser_template)

# run the chain on the video transcript
summary = chain.invoke({
    "video_transcript": data[0].page_content})



In [None]:
summary['text']

'**Summary:**\nThe video is the second episode of a reality TV show called "Princess Charming," where a group of women live together and compete for the affection of a princess, Irina. The episode focuses on the first single date between Irina and Jana, and the interactions between Irina and the other contestants, including a lunch date with Elsa, Prita, and others. The episode explores themes of identity, relationships, and chemistry between the contestants.\n\n**Five Most Interesting or Important Points:**\n\n1. **Irina\'s Type:** The episode suggests that Irina may have a type, as she invites mostly tomboyish-looking women on dates, leaving some of the more feminine-looking women worried about their chances.\n2. **Elsa and Prita\'s Chemistry:** There appears to be a strong connection between Elsa and Prita, with Elsa being very affectionate and Prita having a crush on her. However, Irina doesn\'t seem to reciprocate the feelings.\n3. **Jana\'s Date with Irina:** The first single dat

In [None]:
from IPython.display import display, Markdown
display(Markdown(summary['text']))

**Summary:**
The video is the second episode of a reality TV show called "Princess Charming," where a group of women live together and compete for the affection of a princess, Irina. The episode focuses on the first single date between Irina and Jana, and the interactions between Irina and the other contestants, including a lunch date with Elsa, Prita, and others. The episode explores themes of identity, relationships, and chemistry between the contestants.

**Five Most Interesting or Important Points:**

1. **Irina's Type:** The episode suggests that Irina may have a type, as she invites mostly tomboyish-looking women on dates, leaving some of the more feminine-looking women worried about their chances.
2. **Elsa and Prita's Chemistry:** There appears to be a strong connection between Elsa and Prita, with Elsa being very affectionate and Prita having a crush on her. However, Irina doesn't seem to reciprocate the feelings.
3. **Jana's Date with Irina:** The first single date between Irina and Jana is a surprise, and they seem to have a good time, laughing and having chemistry. However, it's unclear if this is a genuine connection or just a fun experience.
4. **Leah's Conversation with Irina:** Leah's conversation with Irina doesn't seem to have any chemistry, and it's suggested that they're on different levels. Leah references drinking and partying, which may not be Irina's style.
5. **Johanna's Struggles:** Johanna, who seemed confident in the first episode, is struggling this time around, and it's unclear if she'll be able to connect with Irina or make it through the competition.

**Key Ideas:**

1. Irina may have a type, and it's unclear if she's open to relationships with women who don't fit that mold.
2. Elsa and Prita have a strong connection, but it's unclear if Irina feels the same way.
3. Jana and Irina have chemistry, but it's unclear if this is a genuine connection or just a fun experience.
4. Leah and Irina don't seem to have any chemistry, and it's suggested that they're on different levels.
5. Johanna is struggling to connect with Irina and may be in danger of being eliminated.
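Since the model returns its key points as a numbered Markdown list of the form `1. **Title:** detail`, we can also pull them into a Python structure for further processing. This is a minimal sketch that assumes the output keeps that exact shape, which an LLM does not guarantee; `extract_points` is a hypothetical helper:

```python
import re
from typing import List, Tuple

# Matches lines such as: 1. **Irina's Type:** The episode suggests ...
POINT_PATTERN = re.compile(r"^\s*\d+\.\s+\*\*(.+?):\*\*\s*(.+)$", re.MULTILINE)


def extract_points(markdown_text: str) -> List[Tuple[str, str]]:
    """Return (title, detail) pairs for each bolded numbered point."""
    return [
        (match.group(1).strip(), match.group(2).strip())
        for match in POINT_PATTERN.finditer(markdown_text)
    ]


# Possible usage with the chain output from above:
# for title, detail in extract_points(summary['text']):
#     print(f"{title}: {detail}")
```

Numbered lines without a bolded title, such as the "Key Ideas" list, are skipped by this pattern.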