# Building a RAG application from scratch

Here is a high-level overview of the system we want to build:

<img src="images/system1.png" width="1200px">

In [1]:
import os
from dotenv import load_dotenv

load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# YouTube video we're going to use.
YOUTUBE_VIDEO = "https://www.youtube.com/watch?v=cdiD-9MMpb0"
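
A missing or empty key is easier to diagnose if it is caught right after the environment is loaded. The following is a minimal sketch of such a check. `require_env` is a hypothetical helper introduced here for illustration; it is not part of `python-dotenv` or LangChain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error.

    Hypothetical helper for illustration only. It checks that the variable
    is present and non-empty; it cannot tell whether the key is valid.
    """
    value = os.getenv(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it in your shell."
        )
    return value
```

With this helper, the assignment above could be written as `OPENAI_API_KEY = require_env("OPENAI_API_KEY")`, placed after the call to `load_dotenv()`.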

# Setting up the model

Let's define the LLM that we'll use as part of the workflow.

In [2]:
from langchain_openai.chat_models import ChatOpenAI

model = ChatOpenAI(openai_api_key=OPENAI_API_KEY, model="gpt-3.5-turbo")

We can test the model by passing it a simple question:

In [3]:
model.invoke("What MLB team won the World Series during the COVID-19 pandemic?")

AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: dasdadas. You can find your API key at https://platform.openai.com/account/api-keys.', 'type': 'invalid_request_error', 'param': None, 'code': 'invalid_api_key'}}

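The `AuthenticationError` above means OpenAI rejected the API key loaded from the environment, so check the value of `OPENAI_API_KEY` in your `.env` file. With a valid key, the result from the model is an `AIMessage` instance containing the answer. We can extract this answer by chaining the model with an [output parser](https://python.langchain.com/docs/modules/model_io/output_parsers/).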

Here is what chaining the model with an output parser looks like:

<img src="images/chain1.png" width="1200"/>

For this example, we'll use a simple StrOutputParser to extract the answer as a string.

In [None]:
from langchain_core.output_parsers import StrOutputParser

parser = StrOutputParser()

chain = model | parser
chain.invoke("What MLB team won the World Series during the COVID-19 pandemic?")
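
The `|` syntax is possible because Python lets a class overload the operator through `__or__`. The following is a toy sketch of that idea, not LangChain's actual implementation. `Step`, `Message`, `fake_model`, and `fake_parser` are hypothetical names, and no API call is made.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Message:
    """Stand-in for a chat message; only the text content is modeled."""

    content: str


class Step:
    """Wraps a function and supports `a | b` composition (toy illustration)."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # The combined step feeds this step's output into the next one.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Fake "model": returns a Message instead of calling an API.
fake_model = Step(lambda question: Message(content=f"echo: {question}"))
# Fake "parser": pulls the plain string out of the Message.
fake_parser = Step(lambda message: message.content)

toy_chain = fake_model | fake_parser
print(toy_chain.invoke("hello"))  # prints "echo: hello"
```

The point of the sketch is the data flow: the first step produces a message object, and the second step reduces it to a plain string, which is the role the output parser plays in the chain above.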

# Introducing prompt templates

We want to provide the model with some context and the question. [Prompt templates](https://python.langchain.com/docs/modules/model_io/prompts/quick_start) are a simple way to define and reuse prompts.

In [4]:
from langchain_core.prompts import ChatPromptTemplate

template = """
    Answer the question based on the context below. If you can't answer the question, reply 'I don't know'.

    Context: {context}

    Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
prompt.format(context="Mary's sister is Susana", question="Who is Mary's sister?")

"Human: \n    Answer the question based on the context below. If you can't answer the question, reply 'I don't know'.\n\n    Context: Mary's sister is Susana\n\n    Question: Who is Mary's sister?\n"
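
Conceptually, filling the template is placeholder substitution, much like Python's built-in `str.format`. Here is a minimal sketch using only the standard library. It is an analogy, not how `ChatPromptTemplate` is implemented: it does not add the `Human:` role prefix seen in the output above, and `plain_template` is a shortened, hypothetical template.

```python
# A plain string with the same two placeholders as the prompt template.
plain_template = """
    Answer the question based on the context below.

    Context: {context}

    Question: {question}
"""

# str.format replaces each {name} with the matching keyword argument.
filled = plain_template.format(
    context="Mary's sister is Susana",
    question="Who is Mary's sister?",
)
print(filled)

# Leaving out a placeholder raises KeyError, so a missing variable
# is reported immediately instead of producing a half-filled prompt.
try:
    plain_template.format(context="Mary's sister is Susana")
except KeyError as missing:
    print(f"Missing template variable: {missing}")
```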

We can now chain the prompt with the model and the output parser.

<img src="images/chain2.png" width="1200" />

In [None]:
chain = prompt | model | parser
chain.invoke({
    "context": "Mary's sister is Susana",
    "question": "Who is Mary's sister?"
})