# YouTube - summary generator
Load a YouTube video's transcript and summarize it.

In [7]:
! pip install youtube-transcript-api
! pip install pytube

Collecting youtube-transcript-api
  Obtaining dependency information for youtube-transcript-api from https://files.pythonhosted.org/packages/33/c1/18e32c7cd693802056f385c3ee78825102566be94a811b6556f17783c743/youtube_transcript_api-0.6.1-py3-none-any.whl.metadata
  Downloading youtube_transcript_api-0.6.1-py3-none-any.whl.metadata (14 kB)
Downloading youtube_transcript_api-0.6.1-py3-none-any.whl (24 kB)
Installing collected packages: youtube-transcript-api
Successfully installed youtube-transcript-api-0.6.1
Collecting pytube
  Downloading pytube-15.0.0-py3-none-any.whl (57 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.6/57.6 kB 9.8 MB/s eta 0:00:00
Installing collected packages: pytube
Successfully installed pytube-15.0.0


In [2]:
import os
from dotenv import load_dotenv
from langchain.document_loaders import YoutubeLoader
from langchain.llms import OpenAI
from langchain.chains.summarize import load_summarize_chain

In [4]:
load_dotenv() # Load environment variables from .env file
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") # Get API key from environment variable
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
LANGCHAIN_API_KEY = os.getenv("LANGCHAIN_API_KEY") # Expected in .env alongside OPENAI_API_KEY; never hard-code secrets in the notebook
os.environ["LANGCHAIN_PROJECT"] = "youtube_template"

### 1. Simple Video

In [5]:
loader = YoutubeLoader.from_youtube_url("https://www.youtube.com/watch?v=pNcQ5XXMgH4", add_video_info=True)

In [8]:
result = loader.load()

In [10]:
print(type(result))
print(f"Found video from {result[0].metadata['author']} that is {result[0].metadata['length']} seconds long")
print(result[0].metadata)
print("")
print(result)

<class 'list'>
Found video from Greg Kamradt (Data Indy) that is 668 seconds long
{'source': 'pNcQ5XXMgH4', 'title': 'LangChain 101: YouTube Transcripts + OpenAI', 'description': 'Unknown', 'view_count': 17761, 'thumbnail_url': 'https://i.ytimg.com/vi/pNcQ5XXMgH4/hqdefault.jpg?sqp=-oaymwEXCJADEOABSFryq4qpAwkIARUAAIhCGAE=&rs=AOn4CLCmP9TXvB4nm22ZX7b5Tl0AagEU3A', 'publish_date': '2023-02-23 00:00:00', 'length': 668, 'author': 'Greg Kamradt (Data Indy)'}

[Document(page_content="what is going on good people again right now we have a super exciting tutorial because we are going to take YouTube transcripts and we're going to pass them to open Ai and the way that we're going to do that is via a library called Lang chain which is what this entire series is about now before we jumped into it I wanted to show a diagram again I think these diagrams are helpful but you have to let me know so just let me know in the comments here so I wanted to do an overview about what we're actually going to be w

In [11]:
llm = OpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)

In [12]:
# Summarize
chain = load_summarize_chain(llm, chain_type="stuff", verbose=False)
chain.run(result)

' This tutorial explains how to use the Lang Chain library to take YouTube transcripts and pass them to Open AI to generate a summary. It also explains how to use the recursive character splitter to split up long transcripts into smaller chunks, and how to use the mapreduce method to generate a summary of multiple videos. Finally, it explains how to use the summarize scan to generate a summary of multiple videos.'

### 2. Long Video (use map-reduce)

When a video is too long, its transcript will not fit into the model's context window. Split the transcript into chunks and summarize them with a map-reduce chain.
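
The map-reduce idea can be sketched without any LLM call: summarize each chunk on its own (map), then summarize the joined partial summaries (reduce). This is only an illustration of the pattern, not LangChain's implementation; `map_reduce_summarize` and `first_words` are hypothetical names, and `first_words` is a stand-in for the model that just keeps the first few words.

```python
from typing import Callable, List


def map_reduce_summarize(chunks: List[str], summarize: Callable[[str], str]) -> str:
    """Summarize each chunk (map), then summarize the joined partial summaries (reduce)."""
    partial_summaries = [summarize(chunk) for chunk in chunks]  # map step: one call per chunk
    combined = "\n".join(partial_summaries)
    return summarize(combined)  # reduce step: one final call


def first_words(text: str, limit: int = 5) -> str:
    """Hypothetical stand-in for an LLM call: keep only the first `limit` words."""
    return " ".join(text.split()[:limit])


chunks = [
    "one two three four five six seven",
    "alpha beta gamma delta epsilon zeta",
]
summary = map_reduce_summarize(chunks, first_words)
print(summary)
```

With N chunks this pattern makes N + 1 summarization calls, and each call only needs to fit one chunk (or the joined partial summaries) into the context window.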

In [19]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
texts = text_splitter.split_documents(result)

In [20]:
len(texts)

14
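
To see how `chunk_size` and `chunk_overlap` interact, here is a simplified fixed-window splitter. It is an assumption-laden sketch: `RecursiveCharacterTextSplitter` prefers to break on separators such as paragraph breaks, newlines, and spaces, so its real chunks will differ. `split_fixed` is a hypothetical helper that only shows the sliding-window arithmetic.

```python
from typing import List


def split_fixed(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of chunk_size characters; neighbours share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):  # the window already reached the end
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = split_fixed(sample, chunk_size=10, chunk_overlap=2)
print(pieces)  # ['abcdefghij', 'ijklmnopqr', 'qrstuvwxyz']
```

A small overlap repeats the tail of one chunk at the head of the next, which helps keep a sentence that straddles a boundary visible in both chunks.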

In [21]:
#print(texts)
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)
chain.run(texts)



> Entering new MapReduceDocumentsChain chain...


> Entering new LLMChain chain...
Prompt after formatting:
Write a concise summary of the following:


"what is going on good people again right now we have a super exciting tutorial because we are going to take YouTube transcripts and we're going to pass them to open Ai and the way that we're going to do that is via a library called Lang chain which is what this entire series is about now before we jumped into it I wanted to show a diagram again I think these diagrams are helpful but you have to let me know so just let me know in the comments here so I wanted to do an overview about what we're actually going to be writing out in code because I think it's a little easier to see in pictures first so the way this is going to work is we're going to have a video a YouTube video we're going to pass it we're going to pass it a URL and then what Lang chain is going to help us do is it's going to help us load this 

' This tutorial explains how to use Open AI and Lang Chain to generate summaries of multiple documents, such as YouTube videos. It provides a diagram to help visualize the code and explains how to use the mapreduce method to combine documents into one concise summary. The speaker also encourages viewers to leave comments about how the videos can be improved and about their own business problems.'

### 3. Multiple Videos

In [22]:
youtube_url_list = ["https://www.youtube.com/watch?v=UfL7hqGBLAQ", "https://www.youtube.com/watch?v=9z7p28FhoEc"]

texts = []

text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=0)

for url in youtube_url_list:
    loader = YoutubeLoader.from_youtube_url(url, add_video_info=True)
    result = loader.load()
    
    texts.extend(text_splitter.split_documents(result))
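
Because `texts` pools chunks from both videos, a single map-reduce run over it yields one blended summary. If a separate summary per video is wanted, one option is to group the chunks by their source first. The sketch below uses plain dicts shaped like the loader's documents (a `page_content` string plus a `metadata` dict whose `source` is the video ID, as in the metadata printed earlier); `group_by_source` is a hypothetical helper and the chunk texts are made up.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_source(chunks: List[dict]) -> Dict[str, List[str]]:
    """Collect chunk texts under the video ID stored in metadata['source']."""
    grouped = defaultdict(list)
    for chunk in chunks:
        grouped[chunk["metadata"]["source"]].append(chunk["page_content"])
    return dict(grouped)


# Placeholder chunks; real ones would come from text_splitter.split_documents(result).
chunks = [
    {"page_content": "first chunk of video A", "metadata": {"source": "UfL7hqGBLAQ"}},
    {"page_content": "second chunk of video A", "metadata": {"source": "UfL7hqGBLAQ"}},
    {"page_content": "first chunk of video B", "metadata": {"source": "9z7p28FhoEc"}},
]
by_video = group_by_source(chunks)
for source, parts in by_video.items():
    print(source, len(parts))
```

With real `Document` objects the same grouping would read `doc.metadata["source"]` and keep the documents themselves, so that each group can be passed to the summarize chain on its own.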

In [24]:
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=False)
chain.run(texts)

" Congressman Elise Stefanik of New York's 21st Congressional District is supportive of the impeachment inquiry into President Biden and is committed to helping re-elect President Trump in 2024. A search and rescue operation is underway in Morocco following a devastating earthquake that killed 2,900 people and left hundreds of thousands homeless. International search and rescue teams have been invited by the Moroccan military to the town of Amiz, and volunteers, family, friends, local charities, and NOS are helping people in the mountains. There has been criticism that other international agencies and countries have not been able to get into Morocco."