## **Problem Statement**

In real-world applications, organizations often store important information in documents such as **policy manuals, handbooks, or technical PDFs**. Language models such as GPT-4 cannot see these documents unless the relevant text is supplied to them in the prompt.

This project demonstrates how to build a **Retrieval-Augmented Generation (RAG)** system using LangChain and OpenAI. It allows the language model to **search inside a PDF document**, retrieve relevant information, and generate accurate, context-specific answers.

---

## **Objectives**

1. Load content from a PDF file into a LangChain-compatible format.
2. Split the PDF content into manageable chunks for efficient vector storage.
3. Generate embeddings for each chunk using OpenAI Embeddings.
4. Store and search the chunks using the FAISS vector store.
5. Use GPT-4 Turbo as the language model to generate context-aware answers using retrieved chunks.
6. Ensure that the implementation uses the latest LangChain and OpenAI APIs without any deprecation warnings.

---

## **Expected Outcomes**

- A working RAG pipeline that uses PDF content as a knowledge base.
- The system can answer user queries based on the information stored inside the PDF.
- It will retrieve relevant passages from the PDF and use them to generate answers via GPT-4 Turbo.
- The setup will be modular, extensible, and aligned with the latest LangChain version.

---

In [1]:
# RAG Architecture
# https://miro.medium.com/v2/resize:fit:1400/1*YLrQl5CM7NjQPcfTCrf-sQ.png

# https://miro.medium.com/v2/resize:fit:1400/1*yfeUrFCr9oEVZofS8TvDEg.png

In [3]:
!pip install langchain_community -q

In [7]:
!pip install langchain_openai -q

In [8]:
import os

from langchain_community.document_loaders import PyPDFLoader  # loads text from PDF files, one Document per page

# RecursiveCharacterTextSplitter breaks large documents into chunks small enough to embed.
# LLMs have context length limits, so splitting documents into chunks is essential.
# Why recursive?
# It tries to split at logical boundaries (paragraphs, then lines, then words) before falling back to raw character limits.
from langchain_text_splitters import RecursiveCharacterTextSplitter

# OpenAIEmbeddings converts text into vectors using OpenAI's embedding API; ChatOpenAI wraps the chat models.
from langchain_openai import OpenAIEmbeddings, ChatOpenAI

from langchain_community.vectorstores import FAISS  # a fast in-memory vector index used for semantic search

# RetrievalQA is a chain that connects:
# - a retriever (like FAISS), which finds the relevant chunks
# - a language model (like GPT), which answers the question using the retrieved context
# This is the heart of a RAG system: search + generation.
from langchain.chains import RetrievalQA

In [14]:
!pip install openai -q
import openai


In [15]:
# Option 2: read the key from Colab secrets (works only in a Colab notebook)

from google.colab import userdata
openai.api_key = userdata.get('OPENAI_API_KEY')

# Also export the key as an environment variable so that LangChain clients can pick it up
os.environ["OPENAI_API_KEY"] = openai.api_key
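Outside Colab, `google.colab` is not available. A minimal sketch of one alternative, assuming the key is already exported as `OPENAI_API_KEY` or can be typed in interactively. The helper name `resolve_api_key` is hypothetical and not part of any library.

```python
import os
from getpass import getpass


def resolve_api_key(env=os.environ, name="OPENAI_API_KEY", prompt=getpass):
    """Return the key from `env`, prompting only when it is missing."""
    key = env.get(name, "").strip()
    if not key:
        # getpass hides the typed value so it does not end up in notebook output
        key = prompt(f"Enter {name}: ").strip()
    if not key:
        raise ValueError(f"{name} is empty")
    return key


# Demo with a stand-in environment, so no real secret is needed.
demo_key = resolve_api_key(env={"OPENAI_API_KEY": "sk-demo"})
print(demo_key[:3] + "...")
```

In the notebook this would replace the Colab-specific lines with `openai.api_key = resolve_api_key()`.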

In [18]:
!pip install pypdf -q

In [19]:
# https://python.langchain.com/docs/integrations/document_loaders/
# Step 2: Load PDF document (Replace with your actual file path)
pdf_path = "docs/company_policy.pdf"  # Make sure this file exists

# PyPDFLoader extracts the text of each page and returns one Document per page.
# By default images are ignored and table layout is lost (only plain text is extracted), so scanned PDFs or rich tables will not be extracted properly.
loader = PyPDFLoader(pdf_path)

documents = loader.load()

# Print the number of pages loaded
print(f"Loaded {len(documents)} pages from the PDF.")

Loaded 24 pages from the PDF.
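The "make sure this file exists" comment can be turned into an explicit check that fails early with a readable message. This is a small sketch using only the standard library; `require_pdf` is a hypothetical helper name.

```python
from pathlib import Path


def require_pdf(path):
    """Return `path` as a Path, or raise a clear error before any loader runs."""
    pdf = Path(path)
    if not pdf.is_file():
        raise FileNotFoundError(f"PDF not found: {pdf.resolve()}")
    if pdf.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {pdf.name}")
    return pdf


# Demo: report a missing file instead of failing deep inside the loader.
try:
    require_pdf("docs/company_policy.pdf")
except FileNotFoundError as err:
    print(err)
```

The returned path can then be passed on as `PyPDFLoader(str(require_pdf(pdf_path)))`.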


In [21]:
# documents

In [None]:
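"""
1. PDF Loaders
| Loader                  | Use When...                                   | Handles Images/Tables? | Code Example                        |
| ----------------------- | --------------------------------------------- | ---------------------- | ----------------------------------- |
| `PyPDFLoader`           | Simple text-based PDFs                        | No                     | `PyPDFLoader("path/to/pdf")`        |
| `PDFMinerLoader`        | Needs layout-aware text extraction            | No                     | `PDFMinerLoader("file.pdf")`        |
| `PDFPlumberLoader`      | Needs table extraction                        | Basic tables           | `PDFPlumberLoader("file.pdf")`      |
| `UnstructuredPDFLoader` | Complex structure, mixed text, tables, images | Yes (OCR + layout)     | `UnstructuredPDFLoader("file.pdf")` |
| `PyMuPDFLoader`         | Text + metadata (fast, accurate)              | Limited                | `PyMuPDFLoader("file.pdf")`         |

2. Word Documents (DOC / DOCX)
| Loader                           | Use When...                      | Code Example                                  |
| -------------------------------- | -------------------------------- | --------------------------------------------- |
| `UnstructuredWordDocumentLoader` | You have .docx files             | `UnstructuredWordDocumentLoader("file.docx")` |
| `Docx2txtLoader`                 | Basic text extraction from .docx | `Docx2txtLoader("file.docx")`                 |

3. Excel Files (XLS / XLSX)
| Loader                    | Use When...                     | Code Example                                                              |
| ------------------------- | ------------------------------- | ------------------------------------------------------------------------- |
| `UnstructuredExcelLoader` | Full sheet extraction           | `UnstructuredExcelLoader("file.xlsx")`                                    |
| `DataFrameLoader`         | Structured loading using pandas | `DataFrameLoader(pd.read_excel("file.xlsx"), page_content_column="text")` |

4. CSV Files
| Loader            | Use When...                        | Code Example                                                           |
| ----------------- | ---------------------------------- | ---------------------------------------------------------------------- |
| `CSVLoader`       | Simple CSV reading                 | `CSVLoader("file.csv")`                                                |
| `DataFrameLoader` | Use pandas for control & filtering | `DataFrameLoader(pd.read_csv("file.csv"), page_content_column="text")` |

(The column name "text" in the DataFrameLoader rows is only an example; use the column that holds your content.)

5. JSON / JSONL
| Loader       | Use When...                          | Code Example                                                                       |
| ------------ | ------------------------------------ | ---------------------------------------------------------------------------------- |
| `JSONLoader` | Simple .json file with nested fields | `JSONLoader(file_path="file.json", jq_schema='.data[].text', text_content=False)` |

6. Web & HTML
| Loader                   | Use When...                 | Code Example                          |
| ------------------------ | --------------------------- | ------------------------------------- |
| `WebBaseLoader`          | Load content from a URL     | `WebBaseLoader("https://example.com")` |
| `UnstructuredHTMLLoader` | Clean up raw HTML structure | `UnstructuredHTMLLoader("file.html")` |
"""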

"""
Load a PDF with tables using PDFPlumberLoader:
from langchain_community.document_loaders import PDFPlumberLoader
loader = PDFPlumberLoader("report_with_tables.pdf")
documents = loader.load()

Load an Excel sheet using UnstructuredExcelLoader:
from langchain_community.document_loaders import UnstructuredExcelLoader
loader = UnstructuredExcelLoader("financials.xlsx")
documents = loader.load()

Load a JSON file (JSONLoader requires the jq package):
from langchain_community.document_loaders import JSONLoader
loader = JSONLoader(file_path="data.json", jq_schema=".records[].summary", text_content=True)
documents = loader.load()

Load a DOCX file:
from langchain_community.document_loaders import UnstructuredWordDocumentLoader
loader = UnstructuredWordDocumentLoader("policy.docx")
documents = loader.load()
"""

"""
How to Choose the Right Loader?
| Content Type               | Recommended Loader                                          | Why                           |
| -------------------------- | ----------------------------------------------------------- | ----------------------------- |
| Simple PDFs                | `PyPDFLoader`                                               | Fast, reliable                |
| PDFs with tables           | `PDFPlumberLoader`                                          | Can extract table content     |
| PDFs with images / scanned | `UnstructuredPDFLoader`                                     | Uses OCR and layout modeling  |
| Word / Excel               | `UnstructuredWordDocumentLoader`, `UnstructuredExcelLoader` | Best structure preservation   |
| CSV / JSON                 | `CSVLoader`, `JSONLoader`                                   | Allows flexible preprocessing |
"""
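The loader recommendations above can be expressed as a small lookup keyed on file extension. The sketch below is only illustrative: it returns loader class names as strings so that it runs without LangChain installed, and the function name and the `scanned` / `has_tables` flags are hypothetical.

```python
from pathlib import Path

# Loader class names taken from the tables above; kept as strings so the
# sketch has no third-party dependencies.
DEFAULT_LOADERS = {
    ".pdf": "PyPDFLoader",
    ".docx": "Docx2txtLoader",
    ".xlsx": "UnstructuredExcelLoader",
    ".csv": "CSVLoader",
    ".json": "JSONLoader",
    ".html": "UnstructuredHTMLLoader",
}


def pick_loader_name(path, scanned=False, has_tables=False):
    """Suggest a loader class name for `path` based on its extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        if scanned:
            return "UnstructuredPDFLoader"  # OCR + layout handling
        if has_tables:
            return "PDFPlumberLoader"  # basic table extraction
    try:
        return DEFAULT_LOADERS[suffix]
    except KeyError:
        raise ValueError(f"No loader configured for '{suffix}' files") from None


print(pick_loader_name("docs/company_policy.pdf"))
print(pick_loader_name("scan.pdf", scanned=True))
```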

In [22]:
# Step 3: Split PDF text into smaller chunks
# https://miro.medium.com/v2/resize:fit:1400/1*yfeUrFCr9oEVZofS8TvDEg.png
# RecursiveCharacterTextSplitter tries to split at logical boundaries (paragraph → line → word → character), hence "recursive"
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,    # each chunk is at most 500 characters long
    chunk_overlap=100  # up to 100 characters from the end of one chunk are repeated at the start of the next for context continuity
)
chunks = text_splitter.split_documents(documents)

# Print the number of chunks created
print(f"Created {len(chunks)} text chunks from the PDF.")

# Print the second chunk as an example
print(f"Second chunk: {chunks[1]}")

Created 122 text chunks from the PDF.
Second chunk: page_content='SPIL Corporate HR Policies  
 
 
 
 
Section 1: Introduction  
 
This handbook is the summary of the policies, procedures, guidance and benefits to the employees 
and organization. It is an introduction to our vision, mission, values, what you expect from us and 
what we expect from you. We believe that employees are the assets of the organization and to 
understand them the positive work environment play an important role.' metadata={'producer': 'www.ilovepdf.com', 'creator': 'Microsoft® Word 2016', 'creationdate': '2020-08-26T06:56:00+00:00', 'author': 'hr', 'moddate': '2020-08-26T06:56:00+00:00', 'source': 'docs/company_policy.pdf', 'total_pages': 24, 'page': 1, 'page_label': '2'}
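To make the "recursive" idea concrete, here is a simplified pure-Python sketch. It is not LangChain's implementation: it ignores `chunk_overlap`, drops the separators it splits on, and assumes the separator list ends with the empty string as a last resort.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Split `text` into pieces of at most `chunk_size` characters,
    preferring paragraph breaks, then line breaks, then spaces."""
    if len(text) <= chunk_size:
        return [text] if text else []
    sep, rest = separators[0], separators[1:]
    if sep == "":
        # Last resort: cut at raw character positions.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece is still too long: retry with the next, finer separator.
            chunks.extend(recursive_split(piece, chunk_size, rest))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = "para one is here\n\nsecond para\nline two of it"
for chunk in recursive_split(sample, chunk_size=16):
    print(repr(chunk))
```

The paragraph break is tried first, and only the second paragraph, which is still too long, is split again at the line break.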


In [23]:
# Print the last chunk as an example
print(f"Last chunk: {chunks[-1]}")

Last chunk: page_content='Section : 20  Review and Amendment  
Management shall review this policy periodically and amendments required, if any shall be made 
accordingly.  
Section : 21 Residual Power 
This policy is basically guidelines and the management reserves the right to withdraw / modify to 
suit organization’s philosophy at any time without assigning any reason whatsoever. 
EFFECTIVE 
Commencement Of Policy  August 21, 2018  
 
 
 
Approved By : ___________SD/-_______________ 
Mr Sanjay Agarwal - CMD' metadata={'producer': 'www.ilovepdf.com', 'creator': 'Microsoft® Word 2016', 'creationdate': '2020-08-26T06:56:00+00:00', 'author': 'hr', 'moddate': '2020-08-26T06:56:00+00:00', 'source': 'docs/company_policy.pdf', 'total_pages': 24, 'page': 23, 'page_label': '24'}


In [None]:
"""
Why Is Chunking So Important in RAG?
LLMs (like GPT-4) have a limited context window (e.g., 8K or 32K tokens), so you can't send the whole document.
Chunking allows:
-Semantic search over smaller pieces of the document
-Better matching with the user's query
-Faster retrieval, better accuracy

Common Chunking Strategies (Used in Industry)
| Splitter Class                          | Description                                           | Use Case                                           |
| --------------------------------------- | ----------------------------------------------------- | -------------------------------------------------- |
| `RecursiveCharacterTextSplitter`        | Splits at paragraphs → lines → words → characters     | Best for generic documents (PDFs, policies, books) |
| `CharacterTextSplitter`                 | Splits at fixed character limits (naive)              | Simple logs, structured text                       |
| `TokenTextSplitter`                     | Splits by token count (uses tokenizer like tiktoken)  | Precise control for GPT-3.5/4                      |
| `SentenceTransformersTokenTextSplitter` | Splits by tokens of a sentence-transformers tokenizer | Sizing chunks for sentence-transformers embeddings |
| `MarkdownHeaderTextSplitter`            | Splits based on Markdown headers                      | Technical docs, blog posts, notebooks              |
| `HTMLHeaderTextSplitter`                | Splits based on HTML tags                             | Websites, web-scraped data                         |
| `Language` Splitters                    | Code-aware (Python, JS, etc.)                         | Splits by function/class — great for code docs     |


How to Choose chunk_size and chunk_overlap?
chunk_size: How big each chunk is (in characters or tokens)
| Scenario                       | Recommended Chunk Size               |
| ------------------------------ | ------------------------------------ |
| Short emails, chat transcripts | 300–500 chars                        |
| Policies, contracts, PDFs      | 500–1000 chars                       |
| Technical manuals or FAQs      | 800–1500 chars                       |
| Code or JSON files             | 100–300 chars (or by function/class) |
| Scientific papers (dense info) | 1000–1500 chars or 256–512 tokens    |
| For OpenAI GPT-4 (8k model)    | Chunk size ≤ 1000 tokens             |
| For GPT-4 Turbo (128k model)   | Chunk size ≤ 3000 tokens             |

chunk_overlap: How much to repeat from previous chunk
| Purpose                              | Suggested Overlap                                      |
| ------------------------------------ | ------------------------------------------------------ |
| Maintain sentence continuity         | 10–20% of chunk_size (e.g., 100 chars for 500-char chunks) |
| Prevent cutoff of entity names       | 100–150 chars                                          |
| Use with semantic search (retriever) | 10–20%                                                 |
| Use with QA or summarization         | 100–200 chars                                          |

Example Best Practices:
Contract Documents:
RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
Contracts often have long clauses — more overlap helps preserve clause boundaries.

FAQs or Web Pages:
RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=100)

Code Files:
PythonCodeTextSplitter(chunk_size=300, chunk_overlap=30)  # sizes are in characters; the values are illustrative
"""
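The interaction between `chunk_size` and `chunk_overlap` is easiest to see on a plain sliding window. The sketch below is a deliberately naive fixed-width splitter, not what `RecursiveCharacterTextSplitter` does internally; both function names are hypothetical.

```python
def overlap_for(chunk_size, fraction=0.2):
    """Overlap as a fraction of chunk_size, e.g. 20% of 500 characters."""
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    return round(chunk_size * fraction)


def sliding_chunks(text, chunk_size, chunk_overlap):
    """Fixed-width chunks where each chunk repeats the tail of the previous one."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - chunk_overlap  # how far the window advances each time
    last_start = max(len(text) - chunk_overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]


print(overlap_for(500))
print(sliding_chunks("abcdefghij", chunk_size=4, chunk_overlap=2))
```

A larger overlap means a smaller step, so the same text produces more chunks and therefore more embeddings to compute and store.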

In [25]:
# Step 4: Create embeddings for each chunk using OpenAI
embeddings = OpenAIEmbeddings(api_key=openai.api_key)  # defaults to the text-embedding-ada-002 model
# embeddings = OpenAIEmbeddings(model="text-embedding-ada-002", api_key=openai.api_key)
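Embeddings are useful because texts with similar meaning map to vectors that point in similar directions. A toy sketch of cosine similarity follows; the 3-dimensional vectors are invented for illustration, whereas real `text-embedding-ada-002` vectors have 1,536 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical toy "embeddings", made up for this example.
leave_question = [0.8, 0.2, 0.1]
leave_policy_chunk = [0.9, 0.1, 0.0]
recruitment_chunk = [0.0, 0.2, 0.9]

print(cosine_similarity(leave_question, leave_policy_chunk))
print(cosine_similarity(leave_question, recruitment_chunk))
```

The question vector scores much higher against the leave-policy chunk than against the recruitment chunk, which is the signal a retriever ranks on.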

In [27]:
!pip install faiss-cpu -q

In [28]:
# Step 5: Store chunks in FAISS vector store
"""Converts each chunk into a vector using the embeddings model (here, OpenAIEmbeddings).
Stores those vectors in a FAISS index.
Returns a vectorstore object that supports fast semantic search. LangChain's FAISS wrapper uses L2 (Euclidean) distance by default;
for unit-length vectors such as OpenAI embeddings, this ranks results in the same order as cosine similarity."""
vectorstore = FAISS.from_documents(chunks, embeddings)
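Conceptually, a flat vector index stores every vector and compares the query against all of them. The sketch below is a brute-force stand-in written in plain Python, not FAISS itself; the class name `TinyFlatIndex` and the 2-dimensional vectors are hypothetical.

```python
import math


class TinyFlatIndex:
    """Brute-force nearest-neighbour search by L2 distance."""

    def __init__(self):
        self._vectors = []
        self._payloads = []

    def add(self, vector, payload):
        self._vectors.append(list(vector))
        self._payloads.append(payload)

    def search(self, query, k=3):
        """Return up to k (payload, distance) pairs, closest first."""
        scored = [
            (payload, math.dist(query, vector))
            for vector, payload in zip(self._vectors, self._payloads)
        ]
        scored.sort(key=lambda pair: pair[1])
        return scored[:k]


index = TinyFlatIndex()
index.add([1.0, 0.0], "leave policy")
index.add([0.0, 1.0], "recruitment")
index.add([0.7, 0.7], "attendance")

for payload, distance in index.search([0.9, 0.1], k=2):
    print(payload, round(distance, 3))
```

FAISS performs the same kind of comparison in optimized native code and also offers approximate index types for larger collections.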

In [None]:
# https://www.linkedin.com/posts/drdarshaningle-instructor_best-tips-for-vectordb-selection-for-genai-activity-7174250159881547776-wL3U?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAicoJwBJOlKakGIOE2ywfb2Viu908DpjoI

"""
What is a Vector Store?
A vector store (or vector database) is a specialized storage engine that:
-Stores high-dimensional embeddings (vectors)
-Supports similarity search (e.g., cosine similarity, dot product)
-Returns the most relevant stored chunks given a query

Used in:
-Semantic search
-RAG systems (retrieval-augmented generation)
-Recommendation engines
-Image/audio search

Common Vector Stores in the Industry (Open-source + Paid)
| Name                     | Type                     | Open Source | Cloud Hosted           | Common Use Case                       | Notes                                   |
| ------------------------ | ------------------------ | ----------- | ---------------------- | ------------------------------------- | --------------------------------------- |
| **FAISS**                | In-memory                | Yes         | No                     | Fast local prototyping                | Fast; persistence is manual (save/load) |
| **Chroma**               | Embedded / local         | Yes         | Yes (through services) | Quick, simple RAG systems             | Needs its own install (e.g. chromadb)   |
| **Weaviate**             | Cloud-native             | Yes         | Yes                    | Enterprise-scale semantic search      | Supports hybrid search (text + vector)  |
| **Pinecone**             | Cloud-hosted             | No          | Yes                    | Scalable production RAG               | High availability, fast search          |
| **Qdrant**               | Cloud-native             | Yes         | Yes                    | High-perf vector search with metadata | Good for hybrid + multi-modal           |
| **Milvus**               | Cloud/on-prem            | Yes         | Yes                    | AI at scale (video, image, text)      | GPU acceleration available              |
| **ElasticSearch + k-NN** | General-purpose + plugin | Partially   | Yes                    | Enterprises with existing ES stack    | Not optimized for dense vectors         |
| **Redis + Redis Vector** | General-purpose + plugin | Yes         | Yes                    | Realtime vector search                | Limited advanced vector search features |

Vector Store Comparisons: Speed, Cost, and Use Cases
1. FAISS
Speed: Extremely fast (C++ backend, in-memory)
Cost: Free (open source)
Storage: In-memory only (unless manually persisted)
Best for: Prototyping, small datasets, no need for persistence
Limitations: Not suitable for production unless wrapped in a persistence layer

2. Pinecone
Speed: High (vector indexes stored in cloud)
Cost: Paid, with free tier (based on vector count + queries)
Best for: Scalable production RAG with real-time search
Organizations: Used by startups to large SaaS platforms
Strengths: Metadata filtering, hybrid search, horizontal scaling

3. Qdrant
Speed: Very good (Rust backend)
Cost: Free self-hosted; cloud pricing available
Best for: Production systems where filtering and custom payloads are needed
Organizations: ML teams, research labs, open-source apps
Strengths: JSON metadata, filtering, search inside images, multi-modal

4. Weaviate
Speed: Very good
Cost: Free (local) + paid cloud tier
Best for: NLP apps, semantic search, hybrid search
Strengths: GraphQL API, hybrid (BM25 + dense), built-in classification

5. Chroma
Speed: Fast for small-to-medium workloads
Cost: Free (open source)
Best for: Educational use, small RAGs
Limitations: Not built for production-scale retrieval yet

6. Milvus
Speed: Excellent (supports billions of vectors)
Cost: Free (community edition); paid cloud (Zilliz)
Best for: Large-scale systems (multi-modal), image/video/audio search
Organizations: Fintech, bioinformatics, autonomous systems

7. Redis Vector
Speed: Real-time optimized
Cost: Free open source; paid Redis Enterprise
Best for: If Redis is already part of your infra (e.g., caching, real-time apps)
Limitations: Lacks advanced vector indexing options

Which Vector Store Should You Use?
Use FAISS when:
You’re prototyping or building a small RAG system
You want speed without infrastructure
You don’t need persistence across sessions

Use Pinecone when:
You need scalable, production-grade retrieval
You want a managed solution (no infra worries)
You’re working with large document sets

Use Qdrant or Weaviate when:
You want metadata filtering
You want self-hosted + production-grade performance
You want to store documents + embeddings + metadata in one place

Use Milvus when:
You're working on large, multi-modal data
You need high throughput and GPU support

Real-World Scenarios:
| Use Case                                | Vector Store        |
| --------------------------------------- | ------------------- |
| Internal policy RAG for small org       | FAISS / Chroma      |
| Customer support FAQ bot (medium scale) | Qdrant / Weaviate   |
| Public-facing RAG search engine         | Pinecone / Weaviate |
| GenAI for PDFs + image documents        | Qdrant / Milvus     |
| Prototyping with LangChain locally      | FAISS / Chroma      |

Summary Table:
| Feature              | FAISS | Pinecone | Qdrant | Weaviate | Chroma  | Milvus |
| -------------------- | ----- | -------- | ------ | -------- | ------- | ------ |
| Open Source          | Yes   | No       | Yes    | Yes      | Yes     | Yes    |
| Cloud Hosted Option  | No    | Yes      | Yes    | Yes      | Limited | Yes    |
| Metadata Filtering   | No    | Yes      | Yes    | Yes      | Limited | Yes    |
| Multi-modal Support  | No    | No       | Yes    | Limited  | No      | Yes    |
| Persistence Built-in | No    | Yes      | Yes    | Yes      | Yes     | Yes    |
| Production Ready     | No    | Yes      | Yes    | Yes      | No      | Yes    |

"""
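"In-memory unless manually persisted" means that the index has to be written to disk and reloaded explicitly. LangChain's FAISS wrapper provides `save_local` and `load_local` for this. The sketch below shows the same round trip on a toy index using only the standard library; the JSON record layout and the function names are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_index(path, vectors, texts):
    """Write vectors and their source texts to a JSON file."""
    records = [{"vector": list(v), "text": t} for v, t in zip(vectors, texts)]
    Path(path).write_text(json.dumps(records), encoding="utf-8")


def load_index(path):
    """Read back the (vectors, texts) pair written by save_index."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    return [r["vector"] for r in records], [r["text"] for r in records]


# Demo: round-trip a two-entry toy index through a temporary directory.
with tempfile.TemporaryDirectory() as folder:
    index_file = Path(folder) / "toy_index.json"
    save_index(index_file, [[1.0, 0.0], [0.0, 1.0]], ["leave policy", "recruitment"])
    vectors, texts = load_index(index_file)
    print(texts, vectors)
```

Persisting the index avoids re-embedding the whole PDF, and paying for the embedding calls again, every time the notebook restarts.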

In [29]:
# Step 6: Initialize GPT-4 Turbo via LangChain
llm = ChatOpenAI(
    model="gpt-4-turbo",
    temperature=0.3,
    api_key=openai.api_key
)

In [30]:
# Step 7: Create a RetrievalQA chain (retriever + LLM)

"""Converts your vectorstore (FAISS in this case) into a retriever object.
A retriever knows how to fetch relevant chunks from the vector store using a query.
search_kwargs={"k": 3} - return the top 3 most similar chunks (based on vector similarity) for any question."""
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

rag_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True # Also returns the original source chunks used in generating the answer
)

"""
What Happens Behind the Scenes in RetrievalQA:
-User provides a question → rag_chain.invoke({"query": "..."})
-retriever fetches top k matching chunks from FAISS
-LangChain injects those chunks into the prompt
-llm (e.g., GPT-4) uses that context to generate a grounded answer
-Optionally, return_source_documents=True lets you see the supporting evidence
"""

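The step in which the retrieved chunks are injected into the prompt can be pictured as simple string assembly, which is the idea behind the default "stuff" chain type. The template wording below is invented for illustration and is not LangChain's actual prompt.

```python
def build_stuffed_prompt(question, passages):
    """Concatenate retrieved passages into one context block ahead of the question."""
    context = "\n\n".join(p.strip() for p in passages)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_stuffed_prompt(
    "What does SPIL stand for?",
    ["  This EHB will be applicable to the employees working in SPIL.  "],
)
print(prompt)
```

Because every retrieved chunk is pasted into a single prompt, `k` multiplied by `chunk_size` has to fit comfortably within the model's context window.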

In [33]:
# Step 8: Function to ask questions based on the PDF content
def ask_pdf_agent(query: str):
    print(f"\nUser Query: {query}")
    result = rag_chain.invoke({"query": query})
    # print("=="*30)
    # print(f"Entire Result: {result}")
    # print("=="*30)
    print()
    print("\nAnswer:")
    print(result["result"])
    print("\nRetrieved Passages:")
    for doc in result["source_documents"]:
        print("-->", doc.page_content.strip()[:200], "...")
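For use outside a notebook, it can be handier to return the answer and its sources than to print them. The following is a sketch of that variant; `StubDoc` and `StubChain` are hypothetical stand-ins that mimic the shape of the `rag_chain.invoke` result so the example runs without an API key.

```python
from dataclasses import dataclass, field


@dataclass
class StubDoc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class StubChain:
    """Hypothetical stand-in that returns the same keys as the RetrievalQA chain."""

    def invoke(self, inputs):
        return {
            "query": inputs["query"],
            "result": "SPIL stands for Sirca Paints India Limited.",
            "source_documents": [StubDoc("  Resources Department of SPIL.  ", {"page": 1})],
        }


def answer_with_sources(chain, query, preview_chars=200):
    """Run the chain and return the answer plus a short preview of each source."""
    result = chain.invoke({"query": query})
    sources = [
        {
            "page": doc.metadata.get("page"),
            "preview": doc.page_content.strip()[:preview_chars],
        }
        for doc in result["source_documents"]
    ]
    return {"answer": result["result"], "sources": sources}


print(answer_with_sources(StubChain(), "What does SPIL stand for?"))
```

In the notebook, `StubChain()` would be replaced by the real `rag_chain`.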

In [34]:
# Step 9: Run a sample query
# ask_pdf_agent("What is the company's remote work policy?")
ask_pdf_agent("What does SPIL stand for?")


User Query: What does SPIL stand for?


Answer:
SPIL stands for Sirca Paints India Limited.

Retrieved Passages:
--> Section 3: Recruitment  
 
The company policy on recruitment strives for equal opportunity to all irrespective of any 
distinction of gender, sexual orientation, caste or any disable applicants. All a ...
--> SPIL Corporate HR Policies  
 
 
 
Section 2: Company Profile 
 
SPIL is a company engaged in marketing and trading/distribution of wood coatings and allied 
products. It is the first company to launc ...
--> Resources Department of SPIL.  
 
Applicability  
 
This EHB will be applicable to the employees working in Sirca Paints Ind ia Limited (SPIL) w.e.f 
August 21, 2020 . This book contains all the notic ...


In [35]:
ask_pdf_agent("How many leaves per month is an employee eligible for?")

# Q2:
# How many leaves per month is an employee eligible for?

# A2:
# An employee is eligible for 2 leaves per month (pro-rated based on joining date).




User Query: How many leaves per month is an employee eligible for?


Answer:
An employee is eligible for 2 leaves per month.

Retrieved Passages:
--> g) Employees who are on “Official Duty” (OD) or “On Tour” (OT) are requested to take the 
written approval in advance from their Reporting Officer and submit the same to the HR 
Department for record  ...
--> i) The Leave given to the employee will be counted on Financial Year (i.e. April to March) and the 
balance as on March will be credited to the next financial year of employee leave balance 
account.  ...
--> completion of month. The leave given to the employee will be strictly based on the Date of 
Joining (i.e. Pro-Rata Basis). Employee who joins on the following dates will be eligible for the 
leave.  
 ...


In [36]:
ask_pdf_agent("What does the acronym “RESPECT” in SPIL’s sales vision stand for?")

# Q3:
# What does the acronym “RESPECT” in SPIL’s sales vision stand for?

# A3:
# R: Reliability
# E: Excellence
# S: Service
# P: People
# E: Empowerment
# C: Caring
# T: Teamwork




User Query: What does the acronym “RESPECT” in SPIL’s sales vision stand for?


Answer:
The acronym "RESPECT" in SPIL’s sales vision stands for:

R: Reliability - You can count on us  
E: Excellence - Is our Standard  
S: Service - Customer First and accomplish the needs  
P: People - Serve People with Fairness & Firmness  
E: (Not specified in the provided text)  
C: (Not specified in the provided text)  
T: (Not specified in the provided text)  

The details for E, C, and T are not provided in the information given.

Retrieved Passages:
--> To be one of the most respectable brands in the category through brand building initiatives, 
providing world class products with consistent, quality, leading to profitability and growth of 
everyone  ...
--> innovative and cost saving solution within their total production process. "Team Sirca” works had 
to understand their customer’s products and production processes to become their most reliable & 
dep ...
--> SPIL Corporate HR Policies  
  

In [37]:
ask_pdf_agent("What are the eligibility criteria and process for the Employee Children Merit Reward?")

# Q4:
# What are the eligibility criteria and process for the Employee Children Merit Reward?

# A4:
# Based on marks obtained in school/college:
# Class V: ₹2,500 if marks > 85%
# Class VIII: ₹3,500 if marks > 80%
# Class X: ₹5,000 if marks > 80%
# Class XII: ₹10,000 if marks > 80%
# Child’s name is published on the company website and HR newsletter.


User Query: What are the eligibility criteria and process for the Employee Children Merit Reward?


Answer:
The eligibility criteria and process for the Employee Children Merit Award at Sirca Paints India Limited are as follows:

**Eligibility Criteria:**
1. The award is available to children of employees of Sirca Paints India Limited.
2. The child must achieve a certain percentage of marks in their school examinations:
   - Class V: The child must secure marks above 85%.
   - Class VIII: The child must secure marks above 80%.
   - Class X: The child must secure marks above 80%.
   - Class XII: The child must secure marks above 80%.

**Award Details:**
- Class V: Rs 2,500 as a cash reward.
- Class VIII: Rs 3,500 as a cash reward.
- Class X: Rs 5,000 as a cash reward.
- Class XII: Rs 10,000 as a cash reward.

**Process:**
- The specific process for applying for the reward is not detailed in the provided information. However, typically, such awards require submission of the student's ac

In [38]:
ask_pdf_agent("If an employee takes leave on Saturday and Monday, how is Sunday treated in attendance?")

# Q:
# If an employee takes leave on Saturday and Monday, how is Sunday treated in attendance?

# A:
# Sunday will also be counted as leave.

# As per Section 6 (Attendance & Leave), if leave is taken on both Saturday and Monday, the Sunday in between is also treated as a leave.


User Query: If an employee takes leave on Saturday and Monday, how is Sunday treated in attendance?


Answer:
If an employee takes leave on both Saturday and Monday, then Sunday will also be treated as a leave according to the corporate HR policies.

Retrieved Passages:
--> 1961. 
 
q) If the employee will take the continuous leave after festival then the festival will be considered 
as a Leave. 
 
r) Sunday will be treated as a leave, if the employee will take a leave o ...
--> SPIL Corporate HR Policies  
 
 
 
l) Marketing Department will submit the “Daily Time Report” (DTR) to their Reporting Officers 
and based upon their DTR the attendance will be marked. Reporting Offi ...
--> a) Employees are required to register their attendance electronically while reporting to work and 
before leaving the office through Attendance Biometric installed in the office premises, if not 
done ...


In [39]:
ask_pdf_agent("Can an employee claim compensatory leave for working on a public holiday and then club it with another leave on Monday?")

# Q:
# Can an employee claim compensatory leave for working on a public holiday and then club it with another leave on Monday?

# A:
# No.

# Compensatory leave cannot be clubbed with other leaves to create long leaves. For example, if Saturday is claimed as Compensatory Leave and Monday as normal leave, Comp Off will not be granted.


User Query: Can an employee claim compensatory leave for working on a public holiday and then club it with another leave on Monday?


Answer:
Yes, an employee can claim compensatory leave for working on a public holiday and then club it with another leave on Monday. According to the provided rules, an employee who works on a public holiday is eligible for compensatory leave. After obtaining this compensatory leave, the employee can use it in conjunction with other leaves, such as taking a leave on the following Monday. However, it's important for the employee to inform or get approval from their Reporting Officer and HR Department regarding the arrangement of their leaves.

Retrieved Passages:
--> 1961. 
 
q) If the employee will take the continuous leave after festival then the festival will be considered 
as a Leave. 
 
r) Sunday will be treated as a leave, if the employee will take a leave o ...
--> Employee who work on Weekly off/Public Holiday due to some project or assignment em
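The expected answers kept in the comments of the query cells can be turned into a lightweight regression check. The sketch below uses naive keyword matching, which is an assumption made for illustration rather than a robust evaluation method; the canned answers are shortened from the outputs above so that the example runs without calling the model.

```python
def run_checks(answer_fn, cases):
    """Return (question, missing_keywords) for every case whose answer
    lacks one of the expected keywords (case-insensitive)."""
    failures = []
    for question, required in cases:
        answer = answer_fn(question).lower()
        missing = [kw for kw in required if kw.lower() not in answer]
        if missing:
            failures.append((question, missing))
    return failures


# Canned answers standing in for calls to the RAG chain.
CANNED = {
    "What does SPIL stand for?": "SPIL stands for Sirca Paints India Limited.",
    "How many leaves per month is an employee eligible for?":
        "An employee is eligible for 2 leaves per month.",
    "Can compensatory leave be clubbed with another leave?":
        "Yes, an employee can claim compensatory leave and then club it with another leave on Monday.",
}

CASES = [
    ("What does SPIL stand for?", ["Sirca Paints India Limited"]),
    ("How many leaves per month is an employee eligible for?", ["2 leaves"]),
    ("Can compensatory leave be clubbed with another leave?", ["cannot be clubbed"]),
]

for question, missing in run_checks(CANNED.get, CASES):
    print(f"FAILED: {question} (missing: {missing})")
```

A failing case is a prompt to inspect the retrieved passages, since the chunk containing the relevant rule may not have been among the top `k`.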

# Happy Learning