# LangChain RAG Lab: Resume Q&A with Robo1! 📄🤖

Welcome to the LangChain RAG (Retrieval-Augmented Generation) Lab! In this notebook, you'll build a chatbot that answers interview questions using your resume. Robo1 is here to help—he loves finding answers in big piles of documents!

<table><tr>
<td><img src="images/robo1.png" alt="Robo1 - RAG Expert" width="120" /></td>
<td style="vertical-align:top; padding-left:20px;">
<b>Robo1 says:</b><br>
<i>"I can search your resume faster than you can say 'curriculum vitae'!"</i><br>
</td>
</tr></table>

Let's get started and see how RAG makes chatbots smarter!

# What is RAG (Retrieval-Augmented Generation)?
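RAG combines search (retrieval) with a language model (generation). Instead of relying only on what the model memorized during training, your chatbot first retrieves the most relevant passages from your documents (here, your resume) and hands them to the model as context for its answer.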

Robo1 loves RAG because it means less guessing and more knowing.
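
To make the retrieve-then-generate idea concrete, here is a minimal toy sketch in plain Python. It is not how LangChain does it: retrieval here is simple word overlap rather than embeddings, the resume snippets are made up, and `echo_model` is a hypothetical stand-in for a real language model.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, chunks, k=1):
    """Return the k chunks that share the most words with the question."""
    q_words = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q_words & tokenize(c)), reverse=True)
    return ranked[:k]

def answer(question, chunks, generate):
    """Retrieve context first, then pass it to the generator."""
    context = "\n".join(retrieve(question, chunks))
    prompt = f"Context: {context}\nQuestion: {question}"
    return generate(prompt)

# Hypothetical resume snippets, made up for this illustration.
chunks = [
    "Skills: Python, Apache Spark, Kubernetes",
    "Education: B.Tech in Computer Science",
]

def echo_model(prompt):
    """Stand-in for a language model: it just returns the prompt it received."""
    return prompt

print(answer("What skills do you have?", chunks, echo_model))
```

The printed prompt contains the skills snippet but not the education one, which is the whole point: the model only sees the passages that look relevant to the question.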

# Import Libraries and Setup
Let's import the necessary libraries and set up our environment for RAG-powered resume Q&A.

In [11]:
import os
from dotenv import load_dotenv
from langchain_openai import OpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.prompts import PromptTemplate
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load environment variables from .env file
load_dotenv()

# Get OpenAI API key from environment
openai_api_key = os.getenv("OPENAI_API_KEY")
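
`os.getenv` returns `None` when a variable is missing, so a forgotten `.env` entry would only surface later as an authentication error. A small guard like the sketch below can fail early instead. The helper name `require_env` and the variable `DEMO_API_KEY` are hypothetical, used only for this illustration.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file or shell environment")
    return value

# Hypothetical variable, set here only so the demo runs on its own.
os.environ["DEMO_API_KEY"] = "sk-demo"
print(require_env("DEMO_API_KEY"))
```

In the lab itself you could call `require_env("OPENAI_API_KEY")` right after `load_dotenv()`.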

# Load and Chunk Your Resume
Let's load your resume PDF and split it into smaller chunks so Robo1 can search it efficiently.

In [None]:
pdf_path = "my_resume.pdf"
loader = PyPDFLoader(pdf_path)
docs = loader.load()

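# Split the resume into chunks of at most 2000 characters;
# neighboring chunks share up to 50 characters so text cut at a boundary is not lost
text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=50)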
split_docs = text_splitter.split_documents(docs)
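
To see what `chunk_size` and `chunk_overlap` mean, here is a simplified fixed-window splitter in plain Python. It is only a sketch of the overlap idea: the real `RecursiveCharacterTextSplitter` first tries to split on separators such as paragraph breaks and spaces, so its chunks end at more natural boundaries. The function name `split_with_overlap` is made up for this example.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

pieces = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(pieces)  # ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last character of the previous one, so a word or phrase that straddles a boundary still appears whole in at least one chunk when the overlap is large enough.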

# Create a Vector Store and Retriever
We'll turn your resume chunks into searchable vectors so Robo1 can find the best answers fast.

In [10]:
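# OpenAIEmbeddings reads OPENAI_API_KEY from the environment loaded above
embeddings = OpenAIEmbeddings()
# Embed every chunk and store the vectors in an in-memory FAISS index
vectorstore = FAISS.from_documents(split_docs, embeddings)
# For each question, return the 4 chunks most similar to it
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})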
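
Under the hood, a retriever with `k` set is doing a nearest-neighbor search over vectors. The sketch below shows that idea in plain Python with tiny 2-dimensional vectors invented for the example. It assumes Euclidean (L2) distance, which to our understanding is what LangChain's FAISS wrapper uses by default; FAISS itself does this far more efficiently.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query_vec, vectors, k):
    """Return the indices of the k vectors closest to query_vec, nearest first."""
    order = sorted(range(len(vectors)), key=lambda i: l2_distance(query_vec, vectors[i]))
    return order[:k]

# Made-up 2-dimensional "embeddings"; real OpenAI embeddings have many more dimensions.
vectors = [[0.0, 1.0], [1.0, 0.0], [0.9, 0.1]]
query = [1.0, 0.0]

print(top_k(query, vectors, k=2))  # [1, 2]
```

The question vector is identical to vector 1 and close to vector 2, so those two are returned and vector 0 is left out.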

In [12]:
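# Display information about the FAISS vector store
# (docstore._dict is a private attribute, so this inspection may break in other LangChain versions)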
print("Number of documents in FAISS vectorstore:", len(vectorstore.docstore._dict))
print("Document IDs:", list(vectorstore.docstore._dict.keys())[:5])  # Show first 5 IDs

# Optionally, show a sample document stored in FAISS
sample_id = list(vectorstore.docstore._dict.keys())[0]
print("Sample document content:\n", vectorstore.docstore._dict[sample_id].page_content)

# Show vector representation details
print("FAISS index shape:", vectorstore.index.ntotal, "vectors,", vectorstore.index.d, "dimensions")
sample_vector = vectorstore.index.reconstruct(0)
print("Sample vector (first 10 values):", sample_vector[:10])

Number of documents in FAISS vectorstore: 11
Document IDs: ['91c8ff07-6649-424d-81b4-d35cd433ce59', '6468db77-4c09-48ce-9c0d-337bc7ab8e00', '1e429873-0d8c-46ea-b1ae-d489794a00b2', '4fc6f51f-deb9-4dc0-877a-39e6900565ae', '1747695d-2f73-4284-bc82-7e8b5e742ab2']
Sample document content:
 K e y  S k i l l s
A b o u t  m e
      + 9 1 - 8 8 0 0 1 7 5 5 4 4
        n a v d e e p 2 2 0 1 6 1 @ g m a i l . c o m
N a v d e e p  K a u r
C o r p o r a t e   T r a i n e r  ( D a t a / C l o u d / M L  E n g i n e e r i n g )
P r e m i u m  I n s t r u c t o r  w i t h  1 4 +  y e a r s  o f  i n d u s t r y
e x p e r i e n c e  i n  d i f f e r e n t  d o m a i n s  &  m u l t i p l e
t e c h n o l o g i e s . S u c c e s s f u l l y  d e s i g n e d  a n d  c o n d u c t e d
o v e r  2 5 0 +  h a n d s - o n ,  r e a l - w o r l d  t r a i n i n g  s e s s i o n s  f o r
r e p u t e d  M N C s ,  f o c u s i n g  o n  B i g  D a t a ,  C l o u d ,  D e v O p s ,
a n d  A I  t e c h n o l o g i e 

# Build the RAG Chatbot Chain
Now we'll connect everything together so Robo1 can answer your interview questions using your resume.

In [13]:
llm = OpenAI(api_key=openai_api_key)

prompt_template = """
You are answering interview questions based on the following resume context.
If the answer is not in the context, say "I don't know based on my resume."
Context: {context}
Question: {question}
Answer as if you are the candidate:
"""

qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=False,
    chain_type_kwargs={"prompt": PromptTemplate.from_template(prompt_template)}
)
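
It can help to see what the model actually receives. The sketch below fills the same prompt text by hand with plain `str.format`, assuming the retrieved chunks are simply joined together and dropped into `{context}` (the "stuff" approach that `RetrievalQA` uses by default). The helper `build_prompt` and the two chunks are made up for this illustration.

```python
prompt_template = """
You are answering interview questions based on the following resume context.
If the answer is not in the context, say "I don't know based on my resume."
Context: {context}
Question: {question}
Answer as if you are the candidate:
"""

def build_prompt(template, chunks, question):
    """Join the retrieved chunks and fill the template's placeholders."""
    context = "\n\n".join(chunks)
    return template.format(context=context, question=question)

# Hypothetical retrieved chunks, invented for the example.
retrieved = ["Skills: Python, Apache Spark", "Certifications: Kubernetes"]

final_prompt = build_prompt(prompt_template, retrieved, "What are your technical skills?")
print(final_prompt)
```

Because every retrieved chunk ends up in one prompt, a larger `k` or `chunk_size` means a longer prompt, which is worth keeping in mind for models with small context windows.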

# Chat with Your Resume!
Ask interview questions and see how Robo1 answers using your resume. If he can't find the answer, he'll let you know!

In [14]:
# Ask your interview question here
question = "What are your technical skills?"
# invoke() returns a dict with "query" and "result" keys, so the whole dict is printed below
answer = qa_chain.invoke({"query": question})
print("Bot:", answer)

Bot: {'query': 'What are your technical skills?', 'result': '\nBased on my resume, I have experience with multiple team and deciding on technologies, helping multiple teams to ramp up on various technologies, handling teams to create robust big data pipelines, and ensuring best and secure practices. I have also migrated teams from on-premise to cloud on Kubernetes. My experience as a Big Data Architect/Manager has allowed me to educate teams on GenAI and AI technologies, as well as introduce them to GenAI fundamentals and practical applications with LangChain and Hugging Face. In terms of technical skills, I have experience with Apache Spark/PySpark, Apache Kafka, Apache Airflow, Hadoop Ecosystem, Apache Hudi/Iceberg, Apache Beam, Kubernetes/Dockers, AWS Cloud, Google Cloud, MLOps/MLE, NoSQL databases, Snowflake/DBT, Starburst/Trino, MinIO, GenAI, Databricks, Apache Flink, and programming languages such as Scala, Python, and Java. I am also certified as a Confluent Certified Administra
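
As the output above shows, the chain returns a dictionary rather than a plain string. A minimal sketch for printing just the answer text follows. The `response` dict here is a shortened, made-up stand-in with the same two keys, and `extract_answer` is a hypothetical helper name.

```python
def extract_answer(response):
    """Pull the answer text out of the chain's response and trim surrounding whitespace."""
    return response["result"].strip()

# Made-up stand-in that mimics the shape of the real response.
response = {
    "query": "What are your technical skills?",
    "result": "\nI have experience with Apache Spark and Python.",
}

print("Bot:", extract_answer(response))
```

In the notebook you could apply the same helper to the `answer` variable from the cell above.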

---

<table><tr>
<td><img src="images/robo1.png" alt="Robo1 - RAG Expert" width="120" /></td>
<td style="vertical-align:top; padding-left:20px;">
<b>Robo1 says:</b><br>
<i>"Congratulations! You just built a RAG-powered chatbot. Now I can ace any interview—unless they ask about my favorite pizza topping!"</i><br>
<i>With RAG, your AI can find answers in documents faster than ever. 📄🤖</i>
</td>
</tr></table>

*Thanks for completing the LangChain RAG Lab!*