# 📚 Lecture 2: Introduction to Retrieval-Augmented Generation (RAG)

In this notebook, we'll explore how **RAG (Retrieval-Augmented Generation)** allows GPT-4 to provide more accurate and grounded answers by retrieving relevant documents before generating a response.

We'll first see how GPT-4 responds without external knowledge, then enhance the response using a document retriever with **LangChain**, **FAISS**, and **OpenAI Embeddings**.

## 🔧 Step 1: Set Up the Model

In [3]:
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
# FAISS lives in langchain_community; the langchain.vectorstores path is deprecated
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from dotenv import load_dotenv
import os

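# Load OPENAI_API_KEY from a local .env file into the environment
load_dotenv()

# temperature=0 keeps the answers as repeatable as possible for the comparison
llm = ChatOpenAI(
    temperature=0,
    model='gpt-4',
    api_key=os.getenv("OPENAI_API_KEY")
)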
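If the `.env` file is missing or the variable is misnamed, `os.getenv` returns `None`, and the problem only shows up later as an error from the client library. The sketch below is one optional guard, assuming you want a clearer message up front. The helper name `require_env` is hypothetical and is not part of LangChain or python-dotenv.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear message.

    Hypothetical helper for illustration; not part of any library used here.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set. "
            "Add it to your .env file or export it in your shell."
        )
    return value


# Possible usage after load_dotenv():
# api_key = require_env("OPENAI_API_KEY")
```

The value returned by `require_env("OPENAI_API_KEY")` could then be passed as `api_key` in place of the bare `os.getenv` call.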

## 🤖 Step 2: Ask GPT-4 Without RAG

Let’s see how GPT-4 answers when we ask a question **without providing any documents**.

In [4]:
# Ask the question with no supporting documents
question = "What is LangChain used for?"
# invoke() returns an AIMessage; print only its text, not the metadata
print(llm.invoke(question).content)

LangChain is used for language translation. It is a decentralized translation solution that uses blockchain technology and artificial intelligence to provide high-quality translation services. It aims to eliminate intermediaries in the translation process, reduce costs, and increase efficiency. The platform also allows translators to be rewarded for their work with LangChain tokens.

## 🧠 Step 3: Add External Knowledge with RAG

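The answer above is a hallucination: without any supporting context, GPT-4 confidently describes a blockchain translation platform. To fix this, we'll index a short document and use **RetrievalQA** to retrieve the most relevant passages and pass them to GPT-4 as context before it answers.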

In [5]:
# Sample docs
docs = [
    "LangChain is an open-source framework that helps developers build applications powered by language models.",
    "It enables agents to interact with tools, memory, and external data sources.",
    "LangChain supports Retrieval-Augmented Generation to improve answer accuracy."
]

embeddings = OpenAIEmbeddings(api_key=os.getenv("OPENAI_API_KEY"))
vectorstore = FAISS.from_texts(docs, embeddings)
retriever = vectorstore.as_retriever()
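To make the retrieval step less of a black box, here is a minimal sketch of what "embed, then find the nearest texts" means. It is a toy, not what `OpenAIEmbeddings` or FAISS do internally: the `embed` function below is a hypothetical bag-of-words stand-in for a learned embedding model, and the brute-force ranking in `retrieve` stands in for the FAISS index. It runs without an API key.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (hypothetical stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, texts, k=2):
    """Return the k texts most similar to the query (brute force, no index)."""
    q = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:k]


docs = [
    "LangChain is an open-source framework that helps developers build applications powered by language models.",
    "It enables agents to interact with tools, memory, and external data sources.",
    "LangChain supports Retrieval-Augmented Generation to improve answer accuracy.",
]

top = retrieve("What is LangChain used for?", docs, k=2)
for text in top:
    print(text)
```

With word counts, only texts that share words with the question score above zero, so the second sentence (which never mentions LangChain) is ranked last. A learned embedding model compares meaning rather than exact words, which is why the real retriever can match paraphrases.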

In [6]:
# Setup RetrievalQA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    chain_type="stuff"
)

# Ask the same question again; invoke() replaces the deprecated run()
response = qa_chain.invoke({"query": question})
print(response["result"])

LangChain is used to help developers build applications that are powered by language models. It supports Retrieval-Augmented Generation to improve answer accuracy and enables agents to interact with tools, memory, and external data sources.
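The `chain_type="stuff"` setting means the retrieved texts are all "stuffed" into a single prompt ahead of the question. The sketch below illustrates that idea only. The function name `build_stuff_prompt` and the prompt wording are assumptions for illustration, not the template LangChain actually uses.

```python
def build_stuff_prompt(question, retrieved_texts):
    """Join every retrieved text into one context block, then append the question.

    Hypothetical illustration of the "stuff" strategy; the wording is not
    LangChain's real prompt template.
    """
    context = "\n\n".join(retrieved_texts)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


retrieved = [
    "LangChain is an open-source framework that helps developers build applications powered by language models.",
    "It enables agents to interact with tools, memory, and external data sources.",
    "LangChain supports Retrieval-Augmented Generation to improve answer accuracy.",
]

prompt = build_stuff_prompt("What is LangChain used for?", retrieved)
print(prompt)
```

Assuming the retriever's default of returning up to four documents (worth checking against your installed version), all three indexed sentences would reach the prompt here, which is consistent with the answer above drawing on every one of them. Stuffing is simple, but it only works while the combined context fits in the model's context window.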


## ✅ Conclusion
With RAG, GPT-4 answers from **your specific knowledge**: the same question that produced a hallucinated answer in Step 2 was answered correctly once the retrieved sentences were added to the prompt.

**RAG = Retrieval + Generation**: a powerful way to make AI context-aware.

In the next lecture, we’ll scale this further using larger document sets, error handling, and advanced search.