# Building a RAG application for YouTube


Let's start by loading the environment variables.

In [1]:
import os
from dotenv import load_dotenv

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
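
If the `.env` file is missing or the variable name is misspelled, `os.getenv` returns `None` without complaint, and the problem only surfaces later when the model is called. A minimal sketch of an early check, assuming a hypothetical helper named `require_env` (it is not part of `python-dotenv` or LangChain):

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a clear message.

    `require_env` is a hypothetical helper used for illustration only.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; check your .env file."
        )
    return value


# Possible usage, after load_dotenv() has run:
# OPENAI_API_KEY = require_env("OPENAI_API_KEY")
```

Failing at this point keeps the error close to its cause instead of deep inside a later API call.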


## Setting up the model
Define the LLM that we'll use throughout the workflow.

In [2]:
from langchain_openai.chat_models import ChatOpenAI

model = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model="gpt-3.5-turbo")

Test the model with a simple question.

In [3]:
model.invoke("Who won the Tokyo Olympic Badminton Men Single during the COVID-19 pandemic?")

AIMessage(content="In the Tokyo Olympic Games held during the COVID-19 pandemic in 2021, Viktor Axelsen of Denmark won the gold medal in the Men's Singles Badminton event.", response_metadata={'token_usage': {'completion_tokens': 36, 'prompt_tokens': 24, 'total_tokens': 60}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-b9a61582-4c5f-432e-94e7-3e647b70d5b5-0')

The model returns an `AIMessage` object. To extract just the answer as a string, chain the model with a simple `StrOutputParser`.

In [4]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

chain = model | parser
chain.invoke("Who won the Tokyo Olympic Badminton Men Single during the COVID-19 pandemic?")

"During the COVID-19 pandemic, the Tokyo Olympic Badminton Men's Singles event was won by Viktor Axelsen of Denmark. He defeated Chen Long of China in the final to claim the gold medal."

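The `|` operator builds a pipeline in which each step's output becomes the next step's input. The toy class below imitates that behavior in plain Python to make the data flow visible. `Step`, `fake_model`, and `fake_parser` are hypothetical stand-ins, and this is not how LangChain implements its runnables:

```python
class Step:
    """A toy pipeline step: wraps a function and supports `|` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` yields a new step that runs `a`, then feeds its result to `b`.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins: the "model" returns a message-like dict, the "parser" extracts the text.
fake_model = Step(lambda question: {"content": f"Answer to: {question}"})
fake_parser = Step(lambda message: message["content"])

toy_chain = fake_model | fake_parser
result = toy_chain.invoke("Who won?")
print(result)
```

Calling `invoke` on the composed object runs the steps from left to right, which mirrors the reading order of `model | parser`.
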
## Introducing prompt templates

We use prompt templates to give the model the necessary context and the question. [Prompt templates](https://python.langchain.com/v0.1/docs/modules/model_io/prompts/quick_start/) are an efficient way to define and reuse prompts.

In [5]:
from langchain_core.prompts import ChatPromptTemplate

template = """
Answer the question based on the context below. 
If you can't answer the question, reply "I don't know".

Context: {context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
prompt.format(context="Jane's brother is John", question="Who is John's brother?")

'Human: \nAnswer the question based on the context below. \nIf you can\'t answer the question, reply "I don\'t know".\n\nContext: Jane\'s brother is John\n\nQuestion: Who is John\'s brother?\n'
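
The placeholders in curly braces become the template's input variables, which is why `format` expects `context` and `question` as keyword arguments. A rough standard-library illustration of the same idea follows. It only mimics the substitution step and is not LangChain's implementation; `demo_template`, `placeholders`, and `filled` are names chosen for this sketch:

```python
from string import Formatter

demo_template = "Context: {context}\n\nQuestion: {question}\n"

# Collect the placeholder names found in the template string.
placeholders = [field for _, field, _, _ in Formatter().parse(demo_template) if field]
print(placeholders)

# Fill every placeholder with a value.
filled = demo_template.format(
    context="Jane's brother is John",
    question="Who is John's brother?",
)
print(filled)
```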

We can now chain the prompt with the model and the output parser.

In [6]:
chain = prompt | model | parser
chain.invoke({"context": "Jane's brother is John",
              "question": "Who is Jane's Sister?"})

"I don't know."

## Combining chains

Chains can be combined so that the output of one chain feeds a prompt in another. Here, the answer produced by `chain` is passed to a second prompt that translates it into the requested language.

In [31]:
translation_prompt = ChatPromptTemplate.from_template(
    "Translate {answer} to {language}"
    )

In [38]:
from operator import itemgetter

translation_chain = (
    {"answer": chain, "language": itemgetter("language")} | translation_prompt | model | parser
)

translation_chain.invoke(
    {
        "context": "Jane's brother is John. She doesn't have any more siblings.",
        "question": "How many brothers does Jane have?",
        "language": "Chinese",
    }
)

'简有一个名叫约翰的兄弟。'
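
The dictionary at the start of `translation_chain` acts as a fan-out step: every value receives the same input dictionary, and the results are gathered under the corresponding keys before being passed to `translation_prompt`. The plain-Python sketch below mimics that behavior. `run_mapping` and `fake_chain` are hypothetical stand-ins rather than LangChain APIs:

```python
from operator import itemgetter


def run_mapping(steps, inputs):
    """Call every step with the same inputs and collect the results by key."""
    return {key: step(inputs) for key, step in steps.items()}


def fake_chain(inputs):
    # Stand-in for `chain`: a real chain would call the model here.
    return f"Answer based on: {inputs['context']}"


mapped = run_mapping(
    {"answer": fake_chain, "language": itemgetter("language")},
    {
        "context": "Jane's brother is John.",
        "question": "Who is John?",
        "language": "Chinese",
    },
)
print(mapped)
```

The resulting dictionary has exactly the keys that the translation prompt needs, `answer` and `language`, while `itemgetter("language")` simply forwards the value supplied by the caller.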

## Transcribing the YouTube Video

We want to send the context from a YouTube video to the model. Let's download the video and transcribe it using OpenAI's Whisper.

In [30]:
YOUTUBE_VIDEO = "https://www.youtube.com/watch?v=U_g63qlLFHw"
#YOUTUBE_VIDEO = "https://www.youtube.com/watch?v=NcD3RP8vHI4"

import tempfile
import whisper
from pytube import YouTube

if not os.path.exists("transcription.txt"):
    youtube = YouTube(YOUTUBE_VIDEO)
    audio = youtube.streams.filter(only_audio=True).first() 

    # Let's load the base model. This is not the most accurate
    # model but it's fast.
    whisper_model = whisper.load_model("base")

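    with tempfile.TemporaryDirectory() as tmpdir:
        # Download the audio track into a temporary directory and transcribe it.
        audio_file = audio.download(output_path=tmpdir)
        transcription = whisper_model.transcribe(audio_file, fp16=False)["text"].strip()

        # Cache the transcription so later runs can skip the download.
        with open("transcription.txt", "w", encoding="utf-8") as f:
            f.write(transcription)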

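
Note that the cell above only defines `transcription` when `transcription.txt` does not exist yet. A minimal sketch of a cache-style helper that always returns the text, assuming a hypothetical function `load_or_create` and a caller-supplied `produce` callable that would wrap the download and Whisper steps:

```python
import os


def load_or_create(path, produce):
    """Return the cached text at `path`, creating it with `produce()` if missing.

    `load_or_create` is a hypothetical helper; `produce` is any zero-argument
    callable that returns the text to cache (for example, the Whisper step).
    """
    if not os.path.exists(path):
        text = produce()
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
    # Always read from the file so both code paths return the same content.
    with open(path, encoding="utf-8") as f:
        return f.read()
```

With such a helper, a call like `transcription = load_or_create("transcription.txt", transcribe_video)` would work on the first run and on every later run, where `transcribe_video` is an assumed function containing the download and transcription logic.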
