**AI Developer Task: RAG-based Chatbot (Company Policies)**

Build a Retrieval-Augmented Generation (RAG) chatbot that answers questions about company
policies.

**Task Description**

Create a chatbot that:
1. Ingests and stores a few text or PDF files containing sample company policies (you can use
public HR or IT policy templates if needed).
2. Uses retrieval + generation to answer user queries — e.g.:
○ “What is the company’s leave policy?”
○ “How can employees request remote work?”
3. Retrieves the most relevant document chunks using embeddings + vector search.
4. Generates a natural, concise response based only on retrieved context (no hallucination).
5. Shows the retrieved source text or document name as citation.

# **Creating Data (Company Policies)**

In [None]:
# Create folder named "data"
!mkdir -p data

# Create sample text files

# HR policy
with open('data/hr_policy.txt', 'w') as f:
    f.write('''Company HR Policy:

1. Equal Employment Opportunity:
The company is committed to providing equal employment opportunities to all employees and applicants without regard to race, gender, religion, or disability.

2. Code of Conduct:
Employees are expected to maintain professionalism, integrity, and respect toward colleagues, clients, and company property.

3. Performance Reviews:
Formal performance evaluations are conducted twice a year to assess employee progress and set goals for the upcoming period.

4. Work Hours:
Standard work hours are from 9:00 AM to 6:00 PM, Monday through Friday.
''')

# IT policy
with open('data/it_policy.txt', 'w') as f:
    f.write('''Company IT Policy:

1. Device Usage:
Employees must use company-approved devices for all work-related tasks. Personal devices may only be used with written approval from the IT department.

2. Data Security:
All confidential data must be stored on secure company servers. Sharing credentials or accessing restricted systems without authorization is prohibited.

3. Internet and Email Usage:
Company internet and email services should be used primarily for business purposes. Misuse for personal activities or offensive content is subject to disciplinary action.

4. Software Installation:
Only IT-approved software may be installed on company devices. Unauthorized software installations are not permitted.
''')

# Leave policy
with open('data/leave_policy.txt', 'w') as f:
    f.write('''Company Leave Policy:

1. Annual Leave:
All full-time employees are entitled to 20 working days of paid annual leave per calendar year. Leave must be requested at least 7 days in advance.

2. Sick Leave:
Employees are entitled to 10 days of paid sick leave per year. A medical certificate must be submitted for absences exceeding two days.

3. Maternity and Paternity Leave:
Female employees are entitled to 6 months of paid maternity leave. Male employees are entitled to 15 days of paid paternity leave.

4. Emergency Leave:
Up to 3 days of emergency leave may be granted in special circumstances, subject to managerial approval.

5. Leave Encashment:
Unused annual leave may be carried forward for up to one year or encashed at the end of the calendar year.
''')

# Remote Work policy
with open('data/remote_work_policy.txt', 'w') as f:
    f.write('''Company Remote Work Policy:

1. Eligibility:
Employees who have completed at least 6 months of service are eligible to apply for remote work arrangements.

2. Work Hours:
Remote employees must adhere to standard company work hours and remain available for virtual meetings during this time.

3. Communication:
All remote employees must be reachable through official communication channels such as company email, Teams, or Slack during working hours.

4. Equipment and Security:
Employees are responsible for maintaining company-provided equipment in good condition. VPN access must be used to connect to company systems.

5. Performance Monitoring:
Supervisors will evaluate remote employees based on deliverables, deadlines, and productivity metrics rather than time spent online.
''')

In [None]:
# Verify data
!ls data

hr_policy.txt  it_policy.txt  leave_policy.txt	remote_work_policy.txt


# **Install Dependencies and Libraries**

In [None]:
!pip install langchain langchain-community langchain-core faiss-cpu sentence-transformers transformers fastapi uvicorn pyngrok



In [None]:
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_community.document_loaders import TextLoader, DirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_community.llms import HuggingFacePipeline
from transformers import pipeline
from langchain.memory import ConversationBufferMemory

# **Load the Data**

In [None]:
# Load all files in data/
loader = DirectoryLoader("data", glob="*.txt", loader_cls=TextLoader)
documents = loader.load()

# Split into chunks for better retrieval
text_splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
docs = text_splitter.split_documents(documents)
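
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sketch of a fixed-width sliding-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to break on separators such as paragraph and line breaks); the function name `split_with_overlap` and the toy sizes are assumptions chosen for illustration only.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Toy sizes (10 and 3) instead of the notebook's 800 and 100, so the overlap is visible.
sample = "abcdefghijklmnopqrstuvwxyz"
chunks = split_with_overlap(sample, chunk_size=10, chunk_overlap=3)
print(chunks)
```

Each chunk starts with the last three characters of the previous one. The same idea, at a larger scale, is what the 100-character overlap in the cell above is meant to provide.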


# **Create Embeddings and FAISS Vector Database**

In [None]:
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(docs, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
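
The retriever's `k` setting means "return the two nearest chunks", whether or not the second one is actually relevant. The sketch below shows that behaviour with hand-written three-dimensional vectors and cosine similarity. The vectors, the names in `toy_index`, and the choice of cosine similarity are all assumptions for illustration; the FAISS index built above works on real embeddings from the model and may use a different distance metric.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the names of the k vectors most similar to query_vec, best first."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in indexed.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical embeddings: one made-up vector per policy document.
toy_index = {
    "leave_policy": [0.9, 0.1, 0.0],
    "remote_work_policy": [0.2, 0.8, 0.1],
    "it_policy": [0.0, 0.1, 0.9],
}
query_vec = [1.0, 0.2, 0.0]  # a query that points mostly toward "leave_policy"

print(top_k(query_vec, toy_index, k=2))
```

The second result is returned only because `k=2` asks for two items, even though its similarity is much lower than the first. A score threshold is one way to drop such weak matches, at the cost of sometimes returning nothing.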



# **Load Model (Flan-T5)**

In [None]:
# Load a small instruction-tuned model
model_name = "google/flan-t5-base"
pipe = pipeline("text2text-generation", model=model_name, tokenizer=model_name, max_length=256)

llm = HuggingFacePipeline(pipeline=pipe)


Device set to use cuda:0


# **Build RAG Chain and Memory Setup**

In [None]:
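# Conversation memory setup.
# Note: this object is not passed to the chain below, so the notebook tests are stateless.
# The Streamlit app further down attaches its own memory.
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

# RAG chain: the "stuff" chain type places all retrieved chunks into a single prompt
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
    output_key="result"
)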
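
The `"stuff"` chain type simply places every retrieved chunk into one prompt and sends that to the model. The sketch below shows the idea in plain Python. The function name `build_stuff_prompt` and the prompt wording are assumptions for illustration; this is not LangChain's actual default template.

```python
def build_stuff_prompt(question: str, chunks: list[str]) -> str:
    """Join all retrieved chunks into one context block and append the question."""
    context = "\n\n".join(chunks)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Two chunks quoted from the sample policies, standing in for retriever output.
retrieved = [
    "Female employees are entitled to 6 months of paid maternity leave.",
    "Employees who have completed at least 6 months of service are eligible "
    "to apply for remote work arrangements.",
]
prompt = build_stuff_prompt("How long is paid maternity leave?", retrieved)
print(prompt)
```

Stuffing works only while the chunks and the question together fit into the model's input window, which is one reason to keep `k` and `chunk_size` small with a compact model.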

In [None]:
!pip install -q nest_asyncio

# **Testing**

In [None]:
query = "How many month can a female employee get paid maternity leave?"
result = qa_chain.invoke({"query": query})

print("Answer:", result["result"])
print("\nSources:")
for doc in result["source_documents"]:
    print("-", doc.metadata["source"])


Answer: 6 months

Sources:
- data/leave_policy.txt
- data/remote_work_policy.txt


In [None]:
query = "What about a male employee get paid maternity leave?"
result = qa_chain.invoke({"query": query})

print("Answer:", result["result"])
print("\nSources:")
for doc in result["source_documents"]:
    print("-", doc.metadata["source"])

Answer: Male employees are entitled to 15 days of paid paternity leave.

Sources:
- data/leave_policy.txt
- data/hr_policy.txt
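
If `k` is raised, or a file is long enough to be split into several chunks, the same file can appear more than once in the source list. A small helper like the one below could collapse the citations to unique file names. The name `unique_sources` and the sample metadata are assumptions for illustration; the dictionaries mimic the `doc.metadata` values printed above.

```python
from pathlib import PurePosixPath


def unique_sources(metadatas: list[dict]) -> list[str]:
    """Return file names from 'source' metadata, de-duplicated, in first-seen order."""
    seen = set()
    names = []
    for meta in metadatas:
        # Fall back to "Unknown" when a chunk carries no source path.
        name = PurePosixPath(meta.get("source", "Unknown")).name
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


# Hypothetical retrieval result: two chunks from the same file, one from another,
# and one chunk with no source recorded.
retrieved_metadata = [
    {"source": "data/leave_policy.txt"},
    {"source": "data/leave_policy.txt"},
    {"source": "data/hr_policy.txt"},
    {},
]
print(unique_sources(retrieved_metadata))
```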


# **Integrating Streamlit**

In [None]:
%%writefile streamlit_rag_app.py
# --- Streamlit RAG Chatbot with Memory and Citations ---

import streamlit as st
from langchain.chains import RetrievalQA
from langchain.memory import ConversationBufferMemory
from langchain_community.llms import HuggingFacePipeline
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from transformers import pipeline

# --- Streamlit Page Setup ---
st.set_page_config(page_title="RAG Chatbot", layout="wide")
st.title("RAG Chatbot with Memory and Citations")
st.caption("Powered by FLAN-T5 + FAISS + LangChain + Streamlit")

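# --- Load Local LLM, Embeddings & Vectorstore ---
# Streamlit reruns this script on every interaction, so the heavy objects are
# cached and loaded only once per server process.
@st.cache_resource
def load_llm_and_retriever():
    model_name = "google/flan-t5-base"
    pipe = pipeline("text2text-generation", model=model_name, tokenizer=model_name, max_length=256)
    llm = HuggingFacePipeline(pipeline=pipe)

    embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
    # The saved index includes a pickle file, hence the explicit opt-in flag.
    # Only load an index that you created yourself.
    vectorstore = FAISS.load_local("faiss_index", embeddings, allow_dangerous_deserialization=True)
    return llm, vectorstore.as_retriever(search_kwargs={"k": 2})

llm, retriever = load_llm_and_retriever()

# --- Memory Setup ---
# input_key and output_key are needed because the chain returns two outputs
# ("result" and "source_documents"). The memory lives in session_state so it
# survives reruns. Note: RetrievalQA's default prompt has no chat_history slot,
# so the memory records turns but follow-up questions are still answered independently.
if "memory" not in st.session_state:
    st.session_state.memory = ConversationBufferMemory(
        memory_key="chat_history",
        input_key="query",
        output_key="result",
        return_messages=True
    )
memory = st.session_state.memory

# --- RAG Chain Setup ---
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    memory=memory,
    return_source_documents=True,
    output_key="result"
)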

# --- Initialize Chat History ---
if "messages" not in st.session_state:
    st.session_state.messages = []

# --- User Input ---
query = st.chat_input("Ask your question about company policy...")

if query:
    with st.spinner("Thinking..."):
        response = qa_chain.invoke({"query": query})
        answer = response["result"]
        sources = response.get("source_documents", [])

        st.session_state.messages.append({"role": "user", "content": query})
        st.session_state.messages.append({"role": "assistant", "content": answer, "sources": sources})

# --- Display Conversation ---
for msg in st.session_state.messages:
    if msg["role"] == "user":
        with st.chat_message("user"):
            st.write(msg["content"])
    else:
        with st.chat_message("assistant"):
            st.write(msg["content"])
            if msg.get("sources"):
                with st.expander("Sources"):
                    for i, doc in enumerate(msg["sources"], 1):
                        st.markdown(f"**{i}.** `{doc.metadata.get('source', 'Unknown')}`")

# --- Clear Chat Button ---
if st.button("Clear Chat"):
    st.session_state.messages = []
    memory.clear()
    st.rerun()


Overwriting streamlit_rag_app.py


In [None]:
!ls

data  faiss_index  sample_data	streamlit_rag_app.py


In [None]:
# Save the FAISS vectorstore locally so streamlit_rag_app.py can load it.
# Run this cell before launching the app.
vectorstore.save_local("faiss_index")

In [None]:
!pip install streamlit
#!npm install -g localtunnel

In [None]:
!pip install -q pyngrok



In [None]:
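import os
from pyngrok import ngrok

# Read the ngrok auth token from the environment instead of hard-coding it in the notebook.
# NGROK_AUTHTOKEN is an assumed variable name; set it before running this cell.
ngrok.set_auth_token(os.environ["NGROK_AUTHTOKEN"])
public_url = ngrok.connect(8501)
print("Public URL:", public_url)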

Public URL: NgrokTunnel: "https://polyzoic-inalienably-rodolfo.ngrok-free.dev" -> "http://localhost:8501"


In [None]:
!streamlit run streamlit_rag_app.py --server.port 8501

Collecting usage statistics. To deactivate, set browser.gatherUsageStats to false.

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://172.28.0.12:8501
  External URL: http://35.197.58.236:8501

  Stopping...


In [None]:
from pyngrok import ngrok
ngrok.kill()