# 🤖 HR Chatbot Assignment using LangChain and Organization Documents

In this assignment, you'll build a simple HR chatbot using **LangChain**, powered by **Groq's LLaMA-3 API**, and answer employee queries based on your company documents.

We'll guide you step-by-step, but you will complete the code blocks yourself using the hints provided.

---

## 📌 Learning Objectives
- Use LangChain to build a simple Retrieval-Augmented Generation (RAG) pipeline
- Integrate Groq LLaMA-3 with LangChain
- Load HR documents and retrieve relevant info
- Answer employee queries using context-aware LLMs


## 🛠️ Step 1: Install Required Packages
Install the required packages using pip.

In [39]:
# !pip install langchain_groq

In [40]:
# TODO: Install the required packages
# !pip install langchain faiss-cpu python-dotenv sentence-transformers

## 🔐 Step 2: Load API Key from `.env`
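Make sure your Groq API key is saved in a `.env` file, for example as `GROQ_API_KEY=<your key>`; `ChatGroq` reads this variable from the environment by default.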

In [41]:
# TODO: Load the API key using `load_dotenv()`; the LLM itself is set up in Step 6
from dotenv import load_dotenv
load_dotenv()

True
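
The `True` above only signals that `load_dotenv()` loaded something from `.env`; it does not confirm that the Groq key itself is set. A minimal sketch of an explicit check follows, assuming the key is stored under `GROQ_API_KEY`; `require_env` is a hypothetical helper name, not part of any library.

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value

# Assumption: the key is stored under GROQ_API_KEY.
# Uncomment after load_dotenv() has run:
# api_key = require_env("GROQ_API_KEY")
```

Failing early here gives a clearer error than a failed API call later in the notebook.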

## 📄 Step 3: Load Documents
Use LangChain's `PyPDFLoader` or `TextLoader` to load files from `Data/hr_docs/`.

In [None]:
# TODO: Load documents from PDF or TXT
from langchain.document_loaders import TextLoader
from pathlib import Path

txt_docs_path = Path("Data/hr_docs")
txt_files = list(txt_docs_path.glob("*.txt"))

txt_documents = []
for file in txt_files:
    loader = TextLoader(str(file), encoding="utf-8")
    txt_documents.extend(loader.load())

print(f"Total documents loaded: {len(txt_documents)}")
print(txt_documents[1].page_content[:500])  # Preview the second document (index 1)

Total documents loaded: 7
Employee Benefits - TechNova Solutions

1. Health Insurance: Covered up to ₹5,00,000 annually for employee + dependents.
2. Work From Home: Available up to 3 days per week with manager approval.
3. Internet Reimbursement: ₹1,000 per month for remote workers.
4. Learning & Development: Annual budget of ₹15,000 for courses, certifications.
5. Wellness: Free access to yoga and meditation apps.
6. Food Coupons: Monthly Sodexo coupons worth ₹2,000 for all full-time employees.
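
Note that `Path.glob` does not guarantee any particular order, so which file ends up at `txt_documents[1]` can vary between machines. A small sketch of sorting the paths to make the load order reproducible is shown here; it uses a temporary folder and made-up file names, and `list_text_files` is a hypothetical helper.

```python
import tempfile
from pathlib import Path

def list_text_files(folder: Path) -> list[Path]:
    """Return the .txt files in `folder`, sorted by path for a stable order."""
    return sorted(folder.glob("*.txt"))

# Demo with hypothetical file names in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("leave_policy.txt", "benefits.txt", "notes.md"):
        (root / name).write_text("placeholder", encoding="utf-8")
    demo_names = [p.name for p in list_text_files(root)]

print(demo_names)  # ['benefits.txt', 'leave_policy.txt']
```

In the notebook, the same idea would amount to wrapping the existing `glob` call in `sorted(...)`.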



## ✂️ Step 4: Chunk Documents
Use `RecursiveCharacterTextSplitter` to split documents into manageable pieces.

In [43]:
# TODO: Split documents into smaller chunks
from langchain.text_splitter import RecursiveCharacterTextSplitter

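text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=300,     # maximum characters per chunk
    chunk_overlap=50,   # characters shared between neighbouring chunks
    separators=["\n\n", "\n", ".", " "],  # tried in order, coarsest first
)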

chunks = text_splitter.split_documents(txt_documents)

print(f"Total chunks created: {len(chunks)}")
print(chunks[1].page_content[:300])

Total chunks created: 18
3. Use of Company Assets:
   - Laptops and software licenses must be used only for work purposes.

4. Social Media:
   - Employees should avoid discussing confidential or controversial company topics online.
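
To see what `chunk_size` and `chunk_overlap` mean, here is a deliberately simplified sketch: a fixed-width sliding window over characters. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on the listed separators), and `sliding_chunks` is a hypothetical name used only for illustration.

```python
def sliding_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

sample = "abcdefghijklmnopqrstuvwxyz"
pieces = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(pieces)  # ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each piece starts with the last three characters of the previous one, which is the role `chunk_overlap=50` plays in the cell above: context that straddles a boundary appears in both neighbouring chunks.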


## 🧠 Step 5: Embed and Store in FAISS
Generate embeddings and store them in a FAISS vector store.

In [44]:
# TODO: Generate embeddings and store in FAISS
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceEmbeddings

embedding_model = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Create the FAISS index
vectorstore = FAISS.from_documents(documents=chunks, embedding=embedding_model)

In [45]:
vectorstore.save_local("faiss_index/hr_chatbot")
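
Conceptually, the vector store maps each chunk to a vector and, at query time, returns the chunks whose vectors are closest to the query vector. The toy sketch below uses hand-made 3-dimensional vectors and cosine similarity in plain Python. The real pipeline uses the embedding model's vectors and FAISS's own index and distance metric, so treat the names and numbers here as illustrative assumptions.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec: list[float], store: list[tuple[str, list[float]]], k: int) -> list[str]:
    """Return the texts of the k stored items most similar to query_vec."""
    ranked = sorted(store, key=lambda item: cosine_similarity(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Hypothetical chunk labels and made-up vectors, not real embeddings.
toy_store = [
    ("leave policy chunk", [0.9, 0.1, 0.0]),
    ("benefits chunk", [0.1, 0.9, 0.0]),
    ("code of conduct chunk", [0.0, 0.1, 0.9]),
]
nearest = top_k([0.8, 0.2, 0.0], toy_store, k=2)
print(nearest)  # ['leave policy chunk', 'benefits chunk']
```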

## 🔍 Step 6: Setup RetrievalQA Chain
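Create a chain that retrieves the most relevant chunks and uses them to answer questions. The code below uses `ConversationalRetrievalChain` together with a conversation memory, so follow-up questions can draw on the chat history.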

In [46]:
vectorstore = FAISS.load_local(
    "faiss_index/hr_chatbot",
    embedding_model,
    allow_dangerous_deserialization=True,  # loads a pickle file: only use with an index you created yourself
)

In [47]:
from langchain_groq import ChatGroq

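llm = ChatGroq(
    model="deepseek-r1-distill-llama-70b",  # Llama-based reasoning model served by Groq
    temperature=0,              # minimize randomness for policy answers
    max_tokens=None,            # no explicit cap on response length
    reasoning_format="parsed",  # keep the model's reasoning separate from the answer text
    timeout=None,               # no client-side timeout
    max_retries=2,              # retry failed requests up to twice
)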

In [None]:
# TODO: Create the retrieval chain (here: ConversationalRetrievalChain with memory)
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

qa_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3}),
    memory=memory,
    verbose=False
)
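
At a high level, a retrieval chain passes the retrieved chunks to the LLM as context alongside the question. A rough sketch of that "stuffing" step in plain Python follows; the template wording and the `build_prompt` helper are assumptions for illustration, not LangChain's actual prompt.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved chunks and the user's question into one prompt string."""
    context = "\n\n".join(chunk.strip() for chunk in context_chunks)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Example chunks are placeholders standing in for retriever output.
demo_prompt = build_prompt(
    "How many sick leave days do I get?",
    ["Sick Leave: 12 days", "Casual Leave: 8 days"],
)
print(demo_prompt)
```

With `search_kwargs={"k": 3}`, the chain above would supply three such chunks per question.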


## 💬 Step 7: Ask Your Chatbot
Now ask your chatbot a question!

In [49]:
print("🤖 HR Chatbot is ready! (type 'exit' to quit)\n")

while True:
    user_input = input("You: ")
    if user_input.lower() in ["exit", "quit"]:
        print("👋 Goodbye! Stay compliant!")
        break

    try:
        response = qa_chain.invoke({"question": user_input})
        print("Bot:", response["answer"].strip(), "\n")
    except Exception as e:
        print("⚠️ Oops! Something went wrong:", str(e), "\n")

🤖 HR Chatbot is ready! (type 'exit' to quit)

Bot: The total number of leaves available for a woman is calculated by summing up all applicable leaves:

- **Annual Leave**: 18 days
- **Sick Leave**: 12 days
- **Casual Leave**: 8 days
- **Maternity Leave**: 182 days (26 weeks)

Adding these together: 18 + 12 + 8 + 182 = 220 days.

**Answer:** The total number of leaves available for a woman is 220 days. 

Bot: The total number of leaves available for a man is 48 days, comprising:

- Annual Leave: 18 days
- Sick Leave: 12 days
- Casual Leave: 8 days
- Paternity Leave: 10 days

**Answer:** The total number of leaves for a man is 48 days. 

Bot: The total number of leaves available for men is calculated by summing up all applicable leaves:

- **Annual Leave**: 18 days
- **Sick Leave**: 12 days
- **Casual Leave**: 8 days
- **Paternity Leave**: 10 days
- **Floating Holidays**: 2 days

Adding these together: 18 + 12 + 8 + 10 + 2 = **50 days**.

**Answer:** The total number of leaves for men is
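
The totals in the sample answers above can be checked with a few lines of arithmetic. The figures are copied from the bot's output; the two answers for men differ only in whether the 2 floating holidays are counted.

```python
# Leave entitlements in days, as quoted in the chatbot's answers above.
leave_days = {
    "annual": 18,
    "sick": 12,
    "casual": 8,
    "maternity": 182,  # 26 weeks * 7 days
    "paternity": 10,
    "floating": 2,
}

common = leave_days["annual"] + leave_days["sick"] + leave_days["casual"]
women_total = common + leave_days["maternity"]
men_total = common + leave_days["paternity"]
men_total_with_floating = men_total + leave_days["floating"]

print(women_total, men_total, men_total_with_floating)  # 220 48 50
```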

## ✅ Done!
Test different queries and documents to explore the chatbot's responses.
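
For repeatable testing, it can be handy to run a fixed list of questions instead of typing them interactively. The sketch below assumes the chain exposes `invoke({"question": ...})` and returns a dict with an `"answer"` key, as in Step 7. `ask_all` and `EchoChain` are hypothetical names, and `EchoChain` is only a stand-in so the snippet runs without an API key.

```python
class EchoChain:
    """Stand-in for qa_chain: echoes the question instead of calling an LLM."""

    def invoke(self, inputs: dict) -> dict:
        return {"answer": f"(stub) You asked: {inputs['question']}"}

def ask_all(chain, questions: list[str]) -> dict[str, str]:
    """Send each question to the chain and collect answers (or error messages)."""
    results = {}
    for question in questions:
        try:
            results[question] = chain.invoke({"question": question})["answer"].strip()
        except Exception as exc:  # keep going if one query fails
            results[question] = f"ERROR: {exc}"
    return results

demo_answers = ask_all(EchoChain(), ["What is the WFH policy?", "How many sick leaves do I get?"])
for question, answer in demo_answers.items():
    print(f"Q: {question}\nA: {answer}\n")
```

Replacing `EchoChain()` with `qa_chain` would run the same questions against the real chatbot.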