#### An AI Agent has four key components:

* Brain (LLM) - The language model that thinks and reasons
* Tools - Functions the agent can call (search, calculator, APIs)
* Memory - Remembers conversation history and context
* Planning - Breaks down complex tasks into steps
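
As a rough illustration of how these four components fit together, here is a minimal sketch in plain Python. Everything in it is hypothetical: `fake_brain` stands in for a real language model, the `TOOLS` registry holds a single toy calculator, and the "plan" is just the goal split on semicolons rather than something a model produced.

```python
from dataclasses import dataclass, field
from typing import Callable


def calculator(expression: str) -> str:
    """Toy tool: add the integers in an expression like '2+3'."""
    return str(sum(int(part) for part in expression.split("+")))


# Tools: functions the agent may call, looked up by name.
TOOLS: dict[str, Callable[[str], str]] = {"calculator": calculator}


def fake_brain(step: str) -> tuple[str, str]:
    """Stand-in for the LLM: choose a tool and its input for one step."""
    return "calculator", step


@dataclass
class ToyAgent:
    # Memory: a running log of steps and their results.
    memory: list[str] = field(default_factory=list)

    def plan(self, goal: str) -> list[str]:
        # Planning: naively split the goal into steps on ';'.
        return [step.strip() for step in goal.split(";") if step.strip()]

    def run(self, goal: str) -> list[str]:
        results = []
        for step in self.plan(goal):
            tool_name, tool_input = fake_brain(step)    # brain decides
            result = TOOLS[tool_name](tool_input)       # tool executes
            self.memory.append(f"{step} -> {result}")   # memory records
            results.append(result)
        return results


agent = ToyAgent()
print(agent.run("2+3; 10+20+30"))
print(agent.memory)
```

A real agent would replace `fake_brain` with a model call and let the model choose among several tools; the loop structure is the part this sketch is meant to show.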

**Goal**: index a small text file, create a retriever, wire up a retrieval chain, and get answers.

Steps & code
* Load documents, split into chunks
* Create embeddings & vectorstore (FAISS used as example)
* Create a retriever
* Wire up `create_retrieval_chain` and run queries

In [4]:
# Load the handbook as a list of Document objects
from langchain_community.document_loaders.text import TextLoader
loader = TextLoader('COMPANY POLICY HANDBOOK.txt')
docs = loader.load()

In [5]:
from langchain_text_splitters import RecursiveCharacterTextSplitter
# Split into chunks of at most 400 characters, with up to 50 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=50)
final_documents = text_splitter.split_documents(docs)
final_documents

[Document(metadata={'source': 'COMPANY POLICY HANDBOOK.txt'}, page_content='COMPANY POLICY HANDBOOK\n1. INTRODUCTION AND PURPOSE'),
 Document(metadata={'source': 'COMPANY POLICY HANDBOOK.txt'}, page_content='1. INTRODUCTION AND PURPOSE\nThis Company Policy Handbook outlines the principles, rules, and procedures that govern the organization’s operations, employee behavior, data management, and compliance obligations. The objective is to create a consistent, transparent, and ethical workplace where all employees, contractors, and partners operate with professionalism, respect, and accountability.'),
 Document(metadata={'source': 'COMPANY POLICY HANDBOOK.txt'}, page_content='All employees are required to read, understand, and comply with the policies contained herein. These policies are subject to change at the discretion of the Management Board, and all updates will be communicated through the official communication channels, including internal email notifications and the company intrane
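
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size chunking with overlap in plain Python. It is a simplified stand-in, not how `RecursiveCharacterTextSplitter` works internally: the real splitter prefers to break on separators such as paragraph and line breaks, which is why the chunks above vary in length. The function name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of at most chunk_size characters.

    Each window starts chunk_size - chunk_overlap characters after the
    previous one, so neighbours share chunk_overlap characters.
    """
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in sliding_window_chunks(sample, chunk_size=10, chunk_overlap=3):
    print(chunk)
# abcdefghij
# hijklmnopq
# opqrstuvwx
# vwxyz
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.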

In [6]:
# Vector embeddings
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

In [7]:

from langchain_community.vectorstores import FAISS
vector_store = FAISS.from_documents(
    final_documents,  # list of Document objects
    embeddings,
)
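
Conceptually, a flat vector index answers a query by comparing the query embedding with every stored embedding and keeping the closest ones. The sketch below shows that idea with brute-force Euclidean (L2) distance in plain Python. The three-dimensional vectors and chunk labels are made up for illustration; real embeddings from `all-mpnet-base-v2` have 768 dimensions, and FAISS performs this kind of comparison far more efficiently.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query: list[float], index: dict[str, list[float]], k: int) -> list[str]:
    """Return the labels of the k stored vectors closest to the query."""
    ranked = sorted(index, key=lambda label: l2_distance(query, index[label]))
    return ranked[:k]


# Made-up 3-dimensional "embeddings" keyed by a chunk label.
toy_index = {
    "leave policy": [0.9, 0.1, 0.0],
    "attendance": [0.7, 0.3, 0.1],
    "data protection": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.0, 0.0]  # pretend this encodes a question about leave

print(top_k(toy_query, toy_index, k=2))
# ['leave policy', 'attendance']
```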


In [8]:
vector_store.save_local("faiss_index")

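# Later, reload the index from disk. load_local unpickles the stored docstore,
# so set allow_dangerous_deserialization=True only for index files you created yourself.
vector_store = FAISS.load_local(
    "faiss_index",
    embeddings,
    allow_dangerous_deserialization=True
)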


In [10]:
# Retriever: return the 2 most similar chunks for each query
retriever = vector_store.as_retriever(search_kwargs={"k": 2})
docs = retriever.invoke("What is the leave policy?")
for d in docs:
    print("**************")
    print(d.page_content)

**************
Sick Leave: As per medical certificate and labor laws.

Maternity/Paternity Leave: Provided according to statutory regulations.

Emergency Leave: Subject to managerial approval.

All leave requests must be submitted through the HR portal and approved by supervisors.

5. DATA PROTECTION AND PRIVACY POLICY
5.1 Purpose
**************
4.2 Attendance and Absenteeism

Attendance records are maintained through the HR management system. Unauthorized absence for more than three consecutive days may be considered job abandonment. Repeated tardiness may result in disciplinary action.

4.3 Leave Policies

Employees are entitled to various types of leave:

Annual Leave: Minimum of 20 working days per year.


In [11]:
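import os
from dotenv import load_dotenv
# load_dotenv() copies variables from a local .env file into os.environ,
# so no manual assignment is needed; fail early if the key is still missing.
load_dotenv()
if not os.getenv("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")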


In [15]:
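# Chain (create_retrieval_chain)
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

# The original `from langchain.chains import ...` failed in this environment with
# "ModuleNotFoundError: No module named 'langchain.chains'". Newer LangChain
# releases ship these helpers in the separate langchain-classic package, so fall back to it.
try:
    from langchain.chains import create_retrieval_chain
    from langchain.chains.combine_documents import create_stuff_documents_chain
except ModuleNotFoundError:
    from langchain_classic.chains import create_retrieval_chain
    from langchain_classic.chains.combine_documents import create_stuff_documents_chain

llm = ChatOpenAI(temperature=0, model='gpt-3.5-turbo')

# create_stuff_documents_chain needs a prompt with a {context} placeholder,
# which it fills with the retrieved chunks; {input} carries the question.
prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\nQuestion: {input}"
)
doc_chain = create_stuff_documents_chain(llm, prompt)  # document-combining chain
retrieval_chain = create_retrieval_chain(retriever, doc_chain)  # RAG chain

resp = retrieval_chain.invoke({'input': 'How many paid leaves do employees get per year?'})
print(resp['answer'])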

Retriever → fetches relevant information from your knowledge base.

Doc chain → organizes that information into a prompt.

LLM → produces an answer grounded in context.

This lets you answer questions over large document collections without passing every document to the LLM: only the few retrieved chunks enter the prompt.
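
The retriever, doc chain, and LLM flow described above can be sketched without any library. In the sketch below, `toy_retrieve` ranks chunks by word overlap instead of embeddings, `stuff_prompt` pastes the retrieved chunks into one prompt (the "stuff" strategy), and `placeholder_llm` only reports the prompt size instead of calling a model. All three names are hypothetical, and the sample chunks are sentences copied from the retriever output shown earlier.

```python
def toy_retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by how many lowercase words they share with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda chunk: len(q_words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def stuff_prompt(question: str, context_chunks: list[str]) -> str:
    """Paste all retrieved chunks into a single prompt."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n\n{context}\n\nQuestion: {question}"


def placeholder_llm(prompt: str) -> str:
    """Placeholder for a real chat model call."""
    return f"(a model would answer here, given a {len(prompt)}-character prompt)"


chunks = [
    "Annual Leave: Minimum of 20 working days per year.",
    "Emergency Leave: Subject to managerial approval.",
    "Repeated tardiness may result in disciplinary action.",
]
question = "How many days of annual leave per year?"

retrieved = toy_retrieve(question, chunks, k=2)   # retriever
prompt = stuff_prompt(question, retrieved)        # doc chain
print(prompt)
print(placeholder_llm(prompt))                    # LLM
```

Only the `k` retrieved chunks reach the prompt, so the prompt size depends on `k` and the chunk size rather than on the size of the whole collection.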