# Summarize YouTube videos into text

### Installing necessary libraries


In [None]:
%pip install pytube
%pip install playsound
%pip install openai

### Downloading audio clip from the YouTube video

In [82]:
from pytube import YouTube
import os
import subprocess

In [36]:
def download_youtube_audio(url, destination):
    # Create a YouTube object
    yt = YouTube(url)

    # Select the first audio-only stream (None if the video has none)
    audio_stream = yt.streams.filter(only_audio=True).first()
    if audio_stream is None:
        raise ValueError(f"No audio-only stream found for {url}")

    # Download the audio stream
    out_file = audio_stream.download(output_path=destination)

    # Set up new filename
    base, _ = os.path.splitext(out_file)
    new_file = base + '.mp3'

    # Convert file to mp3; -y overwrites an existing output file,
    # and check=True raises CalledProcessError if ffmpeg fails
    subprocess.run(['ffmpeg', '-y', '-i', out_file, new_file], check=True)

    # Remove the original file
    os.remove(out_file)

    print(f"Downloaded and converted to MP3: {new_file}")
    return new_file
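
The conversion step above shells out to `ffmpeg`, so it fails if that program is not installed. A minimal sketch of a pre-flight check follows; the helper name `ffmpeg_available` is hypothetical and not part of the original notebook.

```python
import shutil


def ffmpeg_available(executable="ffmpeg"):
    """Return True if the given executable can be found on PATH."""
    # shutil.which returns the full path to the executable, or None if it is missing
    return shutil.which(executable) is not None


if not ffmpeg_available():
    print("ffmpeg was not found on PATH; install it before running the download step.")
```

Running this check before `download_youtube_audio` gives a clearer message than the `FileNotFoundError` that `subprocess.run` would otherwise raise.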

### Input YouTube link

In [37]:
url = input("Enter the YouTube URL: ")

# url = 'https://www.youtube.com/watch?v=reUZRyXxUs4' # as test

Enter the YouTube URL: https://www.youtube.com/watch?v=reUZRyXxUs4


In [41]:
# Set the destination path for the download
destination = "audiofiles/"

file_path = download_youtube_audio(url, destination)

Downloaded and converted to MP3: /content/audiofiles/How AI Could Empower Any Business  Andrew Ng  TED.mp3


### Converting the audio into text using **Whisper 1**

In [77]:
def write_text_to_file(text, filename="transcribed_text.txt"):
    # Write the text to the file, using UTF-8 so non-ASCII characters are preserved
    with open(filename, "w", encoding="utf-8") as file:
        file.write(text)

In [None]:
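from openai import OpenAI

# Read the API key from the OPENAI_API_KEY environment variable
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Transcribe the downloaded audio; the context manager closes the file afterwards
with open(file_path, 'rb') as audio_file:
    response = client.audio.transcriptions.create(
        file=audio_file,
        model="whisper-1"
    )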
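
The transcription endpoint rejects uploads above a size limit, so it can help to check the MP3 before sending it. The sketch below assumes a 25 MB limit, which should be verified against the current API documentation; `MAX_UPLOAD_BYTES` and `is_within_upload_limit` are hypothetical names introduced for illustration.

```python
import os

# Assumed upload limit (25 MB); confirm the current value in the API documentation
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def is_within_upload_limit(path, limit=MAX_UPLOAD_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

For example, calling `is_within_upload_limit(file_path)` before the transcription request shows whether a long video needs to be split or re-encoded at a lower bitrate first.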

In [None]:
# response.text holds the transcription; str(response) would write the object's repr instead
write_text_to_file(response.text)
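
A long video can produce a transcript that is too large to send to a chat model in one request. One possible workaround is to split the transcript into smaller pieces and summarize each piece separately. The sketch below is an illustration only: `split_into_chunks` is a hypothetical helper, and the default `max_chars=8000` is an arbitrary assumption, since model limits are measured in tokens rather than characters.

```python
def split_into_chunks(text, max_chars=8000):
    """Split text into pieces of at most max_chars characters, breaking on whitespace.

    A single word longer than max_chars is kept whole, so that one chunk
    may exceed the limit.
    """
    chunks, current, current_len = [], [], 0
    for word in text.split():
        # Count one extra character for the space joining this word to the previous one
        extra = len(word) + (1 if current else 0)
        if current and current_len + extra > max_chars:
            # The current chunk is full: store it and start a new one with this word
            chunks.append(" ".join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += extra
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be summarized on its own and the partial summaries combined in a final request, at the cost of extra API calls.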

### Converting transcript into summarized text

In [81]:
# System prompt describing the summarization task
preprompt = 'You are a model that gets a transcription of a YouTube video and turns it into a well-structured summary of the overall video, highlighting important details and providing more context when necessary. Be very detailed in the non-trivial parts; the summary should be at least 20-30% of the length of the original text.'

# The user message carries only the transcript text; the instructions go in the system message
prompt = response.text


# Use a separate name so the transcription held in 'response' is not overwritten
completion = client.chat.completions.create(
  model="gpt-3.5-turbo",
  messages=[
    {"role": "system", "content": preprompt},
    {"role": "user", "content": prompt},
  ]
)
# 'completion' contains the model's reply; print it and save it to a file
summary = completion.choices[0].message.content
print(summary)
write_text_to_file(summary, filename='summary.txt')

The video explores the historical context of literacy and its significance in understanding the current landscape of accessibility in Artificial Intelligence (AI). It draws parallels between the exclusivity of literacy in the past, limited to a select few, and the current concentration of AI expertise in the hands of skilled engineers at major tech companies. The high costs and specialized skills required for AI projects have created barriers to entry for smaller businesses and individuals, hindering their ability to leverage AI technology effectively.

A key point made in the video is the necessity of democratizing AI access, especially for small businesses like local pizza stores or small T-shirt companies. These businesses hold valuable data that, when combined with AI capabilities, could optimize their operations and boost revenue. A contrast is drawn between AI projects that primarily benefit large tech corporations and those that have the potential to empower smaller businesses o