# video.summrazer.ipynb

Download the video from YouTube
First of all, we need a way to download the video from YouTube. Actually, we don’t need the whole video, only the audio, so we will extract the audio track and save just that.

The library I used to interact with YouTube is youtube_dl, which you can learn more about on GitHub.

So we install the library with pip (FFmpeg must also be available on the system, since the audio extraction step relies on it) and download the audio from YouTube in the following way.

In [None]:
from __future__ import unicode_literals
import youtube_dl


ydl_opts = {
    'format': 'bestaudio/best',  # pick the best available audio stream
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',
        'preferredcodec': 'wav',  # output audio format
        'preferredquality': '192',
    }],
    'outtmpl': './video.%(ext)s',  # output file name; youtube_dl fills in %(ext)s
}
with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=Irbx9HJtexI'])  # link to the video

absolute_path = "video.wav"  # file name of your downloaded audio

Note that the 'preferredcodec' option is set to wav, but mp3 or other formats are also fine if you prefer.
The URL passed to ydl.download, on the other hand, must be the link to the video you want.
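If you expect to switch formats or output folders often, the options can be wrapped in a small helper. This is only a sketch: `build_ydl_opts` and its parameters are hypothetical names and not part of youtube_dl, while the dictionary keys are the ones used in the cell above.

```python
import os


def build_ydl_opts(codec="wav", out_dir=".", basename="video"):
    """Return a youtube_dl options dict that keeps only the audio.

    Hypothetical helper: the keys mirror the ydl_opts dictionary above.
    """
    return {
        "format": "bestaudio/best",
        "postprocessors": [{
            "key": "FFmpegExtractAudio",
            "preferredcodec": codec,
            "preferredquality": "192",
        }],
        # %(ext)s is left in the template for youtube_dl to fill in
        "outtmpl": os.path.join(out_dir, basename + ".%(ext)s"),
    }


opts = build_ydl_opts(codec="mp3")
print(opts["postprocessors"][0]["preferredcodec"])
print(opts["outtmpl"])
```

The resulting dictionary can be passed to `youtube_dl.YoutubeDL(...)` in the same way as `ydl_opts`.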

Listen to the Audio
Did we download the audio correctly? Let’s check by playing the audio directly from the notebook.

In [None]:

from IPython.display import Audio
import librosa

sampling_rate = 16_000  # resample to 16 kHz when loading
speech, rate = librosa.load("video.wav", sr=sampling_rate)

Audio(speech, rate=rate)  # renders an audio player in the notebook

![alt text](image.png)
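Besides listening, a quick look at the file header can confirm that the download produced a sensible file. The sketch below uses Python's standard `wave` module and assumes the file is a plain PCM WAV; `wav_info` is a hypothetical helper name.

```python
import os
import wave


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        duration = wav_file.getnframes() / sample_rate
    return sample_rate, channels, duration


# Only inspect the file if the download step has already produced it.
if os.path.exists("video.wav"):
    print(wav_info("video.wav"))
```

A duration close to the length of the video suggests that the whole audio track was saved.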

Audio to Text
The next step is to convert the audio file into text, hoping for a low word error rate. This is useful because we can then run a summarization model directly on the text.

You can read more about the model we will use for speech-to-text conversion here.
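For reference, the word error rate is the number of word substitutions, deletions and insertions needed to turn the transcript into a reference text, divided by the number of words in the reference. The sketch below is a minimal illustration of that definition; measuring it for real requires a hand-made reference transcript, which this notebook does not have, so the sentence pair is made up and `word_error_rate` is a hypothetical helper.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] holds the edit distance between ref[:i-1] and hyp[:j]
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# Made-up example: one substituted word and one missing word out of six.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```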

In [None]:

%%capture
!pip install transformers
from transformers import pipeline

model = "facebook/wav2vec2-large-960h-lv60-self"  # speech-to-text model

# speech to text, processing the audio in 10-second chunks
pipe = pipeline("automatic-speech-recognition", model=model)
text = pipe(absolute_path, chunk_length_s=10)

# save text
with open("original_text.txt", "w") as text_file:
    text_file.write(text["text"])

# read the transcript back and count its words
with open("original_text.txt", "r") as text_file:
    text_article = text_file.read()
print(len(text_article.split()))
text_article
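
The `chunk_length_s=10` argument asks the pipeline to process the recording in 10-second pieces instead of all at once. The sketch below only illustrates the idea of cutting a recording into fixed windows. It is a simplification and not the pipeline's internal code, which also handles overlap between neighbouring chunks; `chunk_bounds` is a hypothetical helper.

```python
def chunk_bounds(duration_s, chunk_length_s=10.0):
    """Return (start, end) times in seconds for consecutive windows."""
    if chunk_length_s <= 0:
        raise ValueError("chunk_length_s must be positive")
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_length_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds


print(chunk_bounds(25.0))
# [(0.0, 10.0), (10.0, 20.0), (20.0, 25.0)]
```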

Text Summarization
Now what’s left for us to do is to take the text we extracted from the video and summarize it.
There are hundreds of summarization models; all you have to do is go to Hugging Face, filter by the summarization task, and choose the one that best suits your case.

For this project, I will use the google/pegasus-xsum model. You can read the details of this model here (in future articles I will also explain the theory behind these summarization algorithms).

Using these pre-trained models from Hugging Face is really simple; look at how I run summarization in a few lines of code.

In [None]:

!pip install sentencepiece

summarizer = pipeline("summarization", "google/pegasus-xsum")
tokenizer_kwargs = {'truncation': True, 'max_length': 512}
text_summarization = summarizer(text_article, min_length=30, do_sample=False, **tokenizer_kwargs)

print(text_summarization)

![alt text](image-1.png)
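
With `truncation` set to `True`, text beyond the model's input limit is expected to be dropped, so a long video may end up summarized from its opening part only. One possible workaround, which this notebook does not use, is to split the transcript into smaller pieces and summarize each one. The sketch below assumes a simple word-based split: `split_into_chunks` and `summarize_long_text` are hypothetical helpers, the 300-word budget is an arbitrary value rather than a tuned one, and `summarize_fn` stands for any callable that maps a string to a summary string.

```python
def split_into_chunks(text, max_words=300):
    """Split text into consecutive pieces of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def summarize_long_text(text, summarize_fn, max_words=300):
    """Summarize each chunk with summarize_fn and join the partial summaries."""
    chunks = split_into_chunks(text, max_words)
    return " ".join(summarize_fn(chunk) for chunk in chunks)


# Stand-in summarizer so the example runs without downloading a model:
# it simply keeps the first three words of each chunk.
def first_three_words(chunk):
    return " ".join(chunk.split()[:3])


demo_text = "one two three four five six seven eight nine ten"
print(split_into_chunks(demo_text, max_words=4))
print(summarize_long_text(demo_text, first_three_words, max_words=4))
```

With the pipeline from the cell above, `summarize_fn` could be something like `lambda chunk: summarizer(chunk, truncation=True)[0]["summary_text"]`.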