## Naive RAG

### Load API Keys

In [1]:
import os
from dotenv import load_dotenv

# Read variables from a local .env file into the process environment
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

### Set Up LangSmith Tracing and API Key

In [2]:
os.environ["LANGCHAIN_TRACING_V2"]="true"
os.environ["LANGSMITH_API_KEY"]=os.getenv("LANGSMITH_API_KEY")
os.environ["LANGCHAIN_PROJECT"]="NAIVE_RAG"
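Assigning the result of `os.getenv(...)` to `os.environ[...]` raises a `TypeError` when the variable is unset, because environment values must be strings. A minimal sketch of a fail-fast check that gives a clearer message; `require_env` and `DEMO_API_KEY` are hypothetical names used only for illustration:

```python
import os

def require_env(name):
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

# Demo with a hypothetical variable name so no real key is needed
os.environ["DEMO_API_KEY"] = "dummy-value"
print(require_env("DEMO_API_KEY"))
```

In the notebook, the same helper could wrap the lookups for `OPENAI_API_KEY` and `LANGSMITH_API_KEY`.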

## Load LLM from OpenAI

In [3]:
from langchain_openai import ChatOpenAI


llm = ChatOpenAI(
    model="gpt-4.1-nano",
    api_key=OPENAI_API_KEY,
    temperature=0.5,
    max_tokens=512,
)

### Test LLM 

In [4]:
test_llm_response = llm.invoke("What are Large Language Models?")
test_llm_response.content

"Large Language Models (LLMs) are advanced artificial intelligence systems designed to understand, generate, and interpret human language. They are built using deep learning techniques, particularly neural networks with many layers, and are trained on vast amounts of text data from books, websites, and other sources. This extensive training enables LLMs to recognize patterns, grasp context, and produce coherent and contextually relevant text across a wide range of topics.\n\nExamples of LLMs include OpenAI's GPT series (like GPT-3 and GPT-4), Google's BERT, and others. These models are used in various applications such as chatbots, translation services, content creation, summarization, question-answering systems, and more. Their ability to generate human-like language has significantly advanced natural language processing (NLP) capabilities, making interactions with machines more natural and efficient."

## Load Text Embedding Model from OpenAI

In [5]:
from langchain_openai import OpenAIEmbeddings

embedding_model = OpenAIEmbeddings(
    model="text-embedding-3-small",
)

### Test Embedding model

In [6]:
embedding_vector = embedding_model.embed_query("What are Large Language Models?")

In [7]:
len(embedding_vector)

1536
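The 1536 above is the length of the query vector; similarity search ranks chunks by how close their vectors are to it. A minimal sketch of cosine similarity, one common closeness measure, using made-up 3-dimensional vectors in place of real embeddings. The vectors and the `cosine_similarity` helper are illustrative assumptions, and the metric a vector store actually applies depends on its configuration:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for embedding vectors (real ones have 1536 components here)
query_vec = [0.9, 0.1, 0.0]
close_vec = [0.8, 0.2, 0.1]
far_vec = [0.0, 0.1, 0.9]

# The vector pointing in nearly the same direction scores higher
print(cosine_similarity(query_vec, close_vec) > cosine_similarity(query_vec, far_vec))
```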

## Load Documents

### CSV Loader

In [8]:
from langchain_community.document_loaders import CSVLoader

loader = CSVLoader(file_path="sample_docs/ElectroTV_Sales_Report_2024.csv")

csv_data = loader.load()

In [9]:
print(csv_data[0].page_content)

order_id: ORD00165
date: 2024-11-02
product_name: ElectroTV E32 Smart
units_sold: 2
unit_price_inr: 14999
total_sales_inr: 29998
sales_region: Central
sales_channel: Online


### PDF Loader

In [10]:
from langchain_community.document_loaders import PyPDFLoader

loader=PyPDFLoader(file_path="sample_docs/ElectroTV.pdf")

pdf_data = loader.load()

In [11]:
print(pdf_data[0].metadata['source'])

sample_docs/ElectroTV.pdf


### Merge CSV and PDF data

In [12]:
documents = csv_data + pdf_data

In [13]:
documents[0]

Document(metadata={'source': 'sample_docs/ElectroTV_Sales_Report_2024.csv', 'row': 0}, page_content='order_id: ORD00165\ndate: 2024-11-02\nproduct_name: ElectroTV E32 Smart\nunits_sold: 2\nunit_price_inr: 14999\ntotal_sales_inr: 29998\nsales_region: Central\nsales_channel: Online')

In [14]:
documents[-1]

Document(metadata={'producer': 'LibreOffice 24.2', 'creator': 'Writer', 'creationdate': '2025-12-25T14:00:02+05:30', 'source': 'sample_docs/ElectroTV.pdf', 'total_pages': 11, 'page': 10, 'page_label': '11'}, page_content='Andheri East,\nMumbai – 400069, Maharashtra\n Phone: +91-22-4455-9900📞\n Email: west.sales@electrotv.com📧\nEast Region Office\nElectroTV Regional Office – East\nInfinity IT Park, Block B,\nSalt Lake Sector V ,\nKolkata – 700091, West Bengal\n Phone: +91-33-4098-1122📞\n Email: east.support@electrotv.com📧\nCustomer Care & Service Support\nFor product installation, troubleshooting, warranty information, and service requests, customers \nmay contact our centralized support team.\n Toll-Free: 1800-555-ETV1 (1800-555-3881)📞\n Email: support@electrotv.com📧\n Website: www.electrotv.com 🌐 (fictional)\n11')

## Document Splitting

In [15]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=100)

chunks = text_splitter.split_documents(documents)

In [16]:
len(chunks)

530
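`chunk_size=700` caps each chunk at 700 characters, and `chunk_overlap=100` lets neighbouring chunks share up to 100 characters so that text cut at a boundary still appears whole in one of them. A simplified sketch of that windowing idea follows. It is a plain fixed-width slider, not the separator-aware algorithm that `RecursiveCharacterTextSplitter` uses, and `sliding_chunks` is a hypothetical helper:

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Cut text into fixed-width windows that overlap by chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
pieces = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
for piece in pieces:
    print(piece)
```

Each printed piece starts with the last three characters of the previous one, which is the overlap at work.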

In [17]:
print(chunks[0].page_content)

order_id: ORD00165
date: 2024-11-02
product_name: ElectroTV E32 Smart
units_sold: 2
unit_price_inr: 14999
total_sales_inr: 29998
sales_region: Central
sales_channel: Online


In [18]:
print(chunks[-1].page_content)

Andheri East,
Mumbai – 400069, Maharashtra
 Phone: +91-22-4455-9900📞
 Email: west.sales@electrotv.com📧
East Region Office
ElectroTV Regional Office – East
Infinity IT Park, Block B,
Salt Lake Sector V ,
Kolkata – 700091, West Bengal
 Phone: +91-33-4098-1122📞
 Email: east.support@electrotv.com📧
Customer Care & Service Support
For product installation, troubleshooting, warranty information, and service requests, customers 
may contact our centralized support team.
 Toll-Free: 1800-555-ETV1 (1800-555-3881)📞
 Email: support@electrotv.com📧
 Website: www.electrotv.com 🌐 (fictional)
11


### Add ids to chunks

In [19]:
from uuid import uuid4

uuids = [str(uuid4()) for _ in range(len(chunks))]

In [20]:
len(uuids)

530

In [21]:
uuids[:5]

['9c15b8e0-7f30-44af-9322-c76e261a0427',
 '4280af25-6325-4014-b140-78c1fb3cc861',
 'b0267033-753e-42f8-86bb-2f116c2d4fca',
 '3af6194a-473a-49cc-bdaf-641a446c1d08',
 '80787be0-929a-4d6a-b6c8-f718a42a92bb']
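`uuid4()` produces fresh random IDs on every run, so re-running the notebook against the same `persist_directory` would likely add the same chunks again under new IDs. One possible alternative, sketched below, derives a repeatable ID from each chunk's source, position, and text with `uuid5`. `stable_chunk_id` and the sample values are hypothetical:

```python
from uuid import NAMESPACE_URL, uuid5

def stable_chunk_id(source, content, index):
    """Derive a repeatable UUID from a chunk's source, position, and text."""
    return str(uuid5(NAMESPACE_URL, f"{source}#{index}#{content}"))

# Illustrative values; in the notebook these would come from
# chunk.metadata["source"], the chunk's position, and chunk.page_content
first = stable_chunk_id("sample_docs/ElectroTV.pdf", "Contact Us", 0)
again = stable_chunk_id("sample_docs/ElectroTV.pdf", "Contact Us", 0)
other = stable_chunk_id("sample_docs/ElectroTV.pdf", "Contact Us", 1)

print(first == again)  # same inputs give the same ID
print(first == other)  # a different position gives a different ID
```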

## Vector Store: Chroma DB

### Initialization

In [22]:
from langchain_chroma import Chroma

vector_store = Chroma(
    collection_name="ElectroTV",
    embedding_function=embedding_model,
    persist_directory="/home/abhishek/ad-workspace/chroma_db/ElectroTV/Naive",
)

### Add Chunks & ids

In [23]:
vector_store.add_documents(documents=chunks, ids=uuids)

['9c15b8e0-7f30-44af-9322-c76e261a0427',
 '4280af25-6325-4014-b140-78c1fb3cc861',
 'b0267033-753e-42f8-86bb-2f116c2d4fca',
 '3af6194a-473a-49cc-bdaf-641a446c1d08',
 '80787be0-929a-4d6a-b6c8-f718a42a92bb',
 'f22a4fd4-4475-4c31-8496-395d27057afd',
 '6cedb703-3eeb-4c24-b79d-7218f4d69275',
 '88093e8b-ea69-4828-b9e3-258038b8f5b4',
 'b165ab79-5ec9-4f4b-b8c6-766a1fae6427',
 'de38e28c-3398-48c2-a97b-68049b5f9af5',
 'fa9546c0-3fd5-444a-92c2-96e312b6ee7d',
 'c8068051-121c-48b6-b045-8cc2fda0b8c5',
 '4c0cff93-9cc9-4663-a667-2852d4826e56',
 '7e64df2b-ba9f-4d13-bac2-566a448f573e',
 '7d4774b7-29e5-4002-a1cd-1b1ff9223a4b',
 '1edb554c-4a6c-42b3-ba84-ccdd6cebfa53',
 '2ce19adf-4c33-4975-9634-450cce81e601',
 'b3441103-6499-4aa2-95a2-4fbd19a12e19',
 'a7050826-7531-42e0-b180-e7c49cdbbfd7',
 '683e9b35-1595-451f-84ac-a953385549aa',
 '44b4879a-263c-4336-a251-a66bbd5f1b62',
 '2b67b0aa-86f7-42b9-96f5-1e809293d30b',
 '752c040e-8d0f-45af-9bf1-ccb721d5dfeb',
 'a808f987-94a8-4317-87fc-4f7f684a6abe',
 '19dac7c9-ed74-

### Create Retriever

In [24]:
retriever = vector_store.as_retriever(search_kwargs={"k": 15})
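`search_kwargs={"k": 15}` asks the retriever for the 15 chunks scored as most similar to the query. Conceptually this is a top-k selection over per-chunk scores. A sketch with made-up scores follows; `top_k` and the chunk names are hypothetical, and real vector stores typically rely on a nearest-neighbour index rather than sorting every score:

```python
import heapq

def top_k(scores, k):
    """Return the k (chunk_id, score) pairs with the highest scores."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])

# Hypothetical similarity scores for five chunks
scores = {
    "chunk-a": 0.12,
    "chunk-b": 0.87,
    "chunk-c": 0.45,
    "chunk-d": 0.91,
    "chunk-e": 0.30,
}
print(top_k(scores, 3))
```

A larger `k` gives the model more context to work with but also more unrelated text to sift through.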

## Test Similarity Search [OPTIONAL]

### Test Query

In [25]:
test_query = "Where is the head office of ElectroTV?"

### Similarity Search

In [26]:
similar_docs = vector_store.similarity_search(test_query,k=3)

In [27]:
for i, doc in enumerate(similar_docs):
    print("=====================\n")
    print(f"Similar doc : {i}")
    print("=====================\n")
    print(doc.page_content)


Similar doc : 0

Contact Us
ElectroTV welcomes inquiries from customers, partners, and business stakeholders regarding our 
products, services, and support offerings. Our corporate and regional offices are structured to ensure 
prompt assistance, transparent communication, and efficient resolution of queries. Whether you are 
seeking product information, sales support, or after-sales service, our teams are available through 
the contact details provided below.
Head Office (Corporate Headquarters)
ElectroTV Electronics Pvt. Ltd.
ElectroTV Tower, Plot No. 42,
Tech Park Avenue, Sector 18,
Gurugram – 122015, Haryana, India
 Phone: +91-11-4567-8900📞
 Email: corporate@electrotv.com📧
Regional Branch Offices

Similar doc : 1

ElectroTV
About Us:
ElectroTV is a consumer electronics brand established with the singular purpose of redefining how  
modern households experience television and digital entertainment. From its inception, ElectroTV  
has focused on designing products that balan

## Naive RAG Pipeline

In [29]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser


prompt = ChatPromptTemplate.from_template("""
You are a helpful assistant.
Answer the question using ONLY the context below.
If you don't know the answer based on the context, say you don't know.
Please use bullet points & tables wherever possible in the answers

Context:
{context}

Question:
{question}
""")

naive_rag_chain = (
    {
        "context": retriever,
        "question": lambda x: x
    }
    | prompt
    | llm
    | StrOutputParser()
)
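
In the chain above, `retriever` returns a list of `Document` objects that is placed into `{context}` as-is, so the prompt receives their string form, metadata included. A commonly used alternative is to join only the `page_content` of each document before it reaches the prompt. A minimal sketch, using `SimpleNamespace` objects as stand-ins for retrieved documents; `format_docs` is a hypothetical helper, and adopting it would change what the model sees compared with the recorded answers in this notebook:

```python
from types import SimpleNamespace

def format_docs(docs):
    """Join the text of retrieved documents, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)

# Stand-ins for retrieved Document objects (only page_content is used)
docs = [
    SimpleNamespace(page_content="Head Office (Corporate Headquarters)", metadata={"page": 10}),
    SimpleNamespace(page_content="Regional Branch Offices", metadata={"page": 10}),
]
print(format_docs(docs))

# In the chain, the helper would sit right after the retriever, e.g.
#     "context": retriever | format_docs
```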


## Question and Answer

### Question-1

In [30]:
question = "Where is the head office of ElectroTV?"
print(naive_rag_chain.invoke(question))

- The head office of ElectroTV is located at:
  - ElectroTV Tower, Plot No. 42,
  - Tech Park Avenue, Sector 18,
  - Gurugram – 122015, Haryana, India

- Contact details:
  - Phone: +91-11-4567-8900
  - Email: corporate@electrotv.com


### Question-2

In [31]:
question = "How many regional offices does ElectroTV have? Give me their contact numbers."
print(naive_rag_chain.invoke(question))

ElectroTV has 4 regional offices. Their contact numbers are:

| Region   | Contact Number       |
|----------|----------------------|
| North    | +91-120-678-2345     |
| South    | +91-80-5123-7788     |
| East     | +91-33-4098-1122     |
| West     | +91-22-4455-9900     |


### Question-3

In [33]:
question = "How many television models has ElectroTV launched so far? List all of them."
print(naive_rag_chain.invoke(question))

ElectroTV has launched the following television models:

| Model Name                  | Size/Type                         | Description/Notes                                |
|------------------------------|----------------------------------|-------------------------------------------------|
| ElectroTV E32 Smart         | Compact 32-inch smart TV       | Entry-level, smart features                     |
| ElectroTV E40 Smart         | Mid-size Full HD TV            | General smart TV                              |
| ElectroTV E43 Smart         | Family-oriented TV             | Designed for family entertainment               |
| ElectroTV E50 Pro           | Entry-level 4K TV               | Basic 4K model                                |
| ElectroTV E55 Pro+          | Premium 55-inch 4K TV          | High-end large format                        |
| ElectroTV E58 Vision        | Large-format TV                | Larger size, likely premium                   |

**Total models launched

### Question-4

In [34]:
question = "Which ElectroTV products are QLED televisions and what are their listed prices?"
print(naive_rag_chain.invoke(question))

- ElectroTV Q55 Ultra: 54,999₹
- ElectroTV Q65 Ultra: 69,999₹


### Question-5

In [35]:
question = "What is the phone number and email address of the ElectroTV Head Office?"
print(naive_rag_chain.invoke(question))

- Phone number of ElectroTV Head Office: +91-11-4567-8900📞
- Email address of ElectroTV Head Office: corporate@electrotv.com📧


### Question-6

In [36]:
question = "How many ElectroTV regional offices are there? List them with contact details."
print(naive_rag_chain.invoke(question))

There are 4 ElectroTV regional offices. Their contact details are:

| Region       | Office Name                                   | Address                                                                 | Phone             | Email                          |
|--------------|----------------------------------------------|-------------------------------------------------------------------------|-------------------|-------------------------------|
| North        | ElectroTV Regional Office – North            | 2nd Floor, Orion Business Center, Noida Sector 62, Uttar Pradesh – 201309 | +91-120-678-2345  | north.sales@electrotv.com     |
| South        | ElectroTV Regional Office – South            | Sigma Tech Plaza, 5th Floor, Whitefield Main Road, Bengaluru – 560066  | +91-80-5123-7788  | south.support@electrotv.com   |
| West         | ElectroTV Regional Office – West             | Apex Commercial Complex, Andheri East, Mumbai – 400069                 | +91-22-4455-9900  | 

### Question-7

In [37]:
question = "Which ElectroTV models priced below ₹50,000 are available, and how many total units of these models were sold in 2024?"
print(naive_rag_chain.invoke(question))

ElectroTV models priced below ₹50,000 and their total units sold in 2024:

| Model               | Price (INR) | Units Sold in 2024 |
|---------------------|--------------|--------------------|
| ElectroTV E40 Smart | 19,999       | 26 (15 + 11)      |
| ElectroTV E43 Smart | 22,999       | 1                  |
| ElectroTV E32 Smart | 14,999       | 15                 |
| ElectroTV E50 Pro   | 34,999       | 52 (2 + 9 + 13 + 11 + 5) |

Total units sold of these models:

- **ElectroTV E40 Smart:** 15 + 11 = 26 units
- **ElectroTV E43 Smart:** 1 unit
- **ElectroTV E32 Smart:** 15 units
- **ElectroTV E50 Pro:** 2 + 9 + 13 + 11 + 5 = 40 units

**Total units sold across all these models:** 26 + 1 + 15 + 40 = **82 units**
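
The per-model figures in this answer can be re-added to cross-check the model's arithmetic. The snippet below only sums the numbers quoted in the answer itself. It does not verify them against the CSV, and with `k=15` the retriever supplies at most 15 chunks, so the listed orders may not cover every matching row:

```python
# Unit counts exactly as quoted in the answer above
quoted_units = {
    "ElectroTV E40 Smart": [15, 11],
    "ElectroTV E43 Smart": [1],
    "ElectroTV E32 Smart": [15],
    "ElectroTV E50 Pro": [2, 9, 13, 11, 5],
}

per_model = {model: sum(units) for model, units in quoted_units.items()}
grand_total = sum(per_model.values())

for model, total in per_model.items():
    print(f"{model}: {total}")
print(f"Total: {grand_total}")
```

The E50 Pro components add up to 40, which matches the bullet list and the grand total of 82 but not the 52 shown in the table row.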


### Question-8

In [38]:
question = "Which ElectroTV product generated the highest total revenue across all sales records?"
print(naive_rag_chain.invoke(question))

Based on the sales records, the ElectroTV product that generated the highest total revenue is:

| Product Name          | Total Revenue (INR) | Notes                                              |
|------------------------|---------------------|----------------------------------------------------|
| ElectroTV Q65 Ultra    | 3,494,938           | Sum of multiple sales records: 979,986 + 909,987 + 139,998 + 979,986 + 839,988 = 3,494,938 |

**Summary:**
- ElectroTV Q65 Ultra has the highest total revenue across all sales records with **INR 3,494,938**.


### Question-9

In [39]:
question = "What is the total revenue of Central region in 2024?"
print(naive_rag_chain.invoke(question))

Based on the provided data, the total revenue of the Central region in 2024 is:

- Sum of total_sales_inr from all listed transactions:

| Transaction ID | Total Sales (INR) |
|------------------|-------------------|
| ORD00400        | 149,990         |
| ORD00280        | 321,986         |
| ORD00496        | 252,989         |
| ORD00160        | 209,986         |
| ORD00240        | 59,996          |
| ORD00252        | 179,991         |
| ORD00274        | 252,989         |
| ORD00412        | 219,989         |
| ORD00028        | 44,997          |
| ORD00090        | 299,985         |
| ORD00451        | 44,997          |
| ORD00460        | 59,996          |
| ORD00490        | 149,990         |
| ORD00141        | 179,988         |
| ORD00260        | 171,996         |

- Calculating total:

149,990 + 321,986 + 252,989 + 209,986 + 59,996 + 179,991 + 252,989 + 219,989 + 44,997 + 299,985 + 44,997 + 59,996 + 149,990 + 179,988 + 171,996 = **3,994,938 INR**

**Answer:**
- The total r
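
A region-wide total depends on every matching row, while the chain only sees the chunks the retriever returns (`k=15`). For aggregate questions like this one, computing over the CSV directly is one alternative. A sketch with the standard `csv` module, using the column names from the sales report. The first row is the one printed earlier in the notebook, the other rows are made-up sample data, and `revenue_by_region` is a hypothetical helper:

```python
import csv
import io
from collections import defaultdict

# First row is from the notebook output; the rest are invented for illustration
SAMPLE_CSV = """order_id,date,product_name,units_sold,unit_price_inr,total_sales_inr,sales_region,sales_channel
ORD00165,2024-11-02,ElectroTV E32 Smart,2,14999,29998,Central,Online
ORD90001,2024-03-10,ElectroTV E32 Smart,1,14999,14999,Central,Retail
ORD90002,2024-05-21,ElectroTV E32 Smart,3,14999,44997,North,Online
"""

def revenue_by_region(csv_file):
    """Sum total_sales_inr per sales_region over every row of the file."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["sales_region"]] += int(row["total_sales_inr"])
    return dict(totals)

totals = revenue_by_region(io.StringIO(SAMPLE_CSV))
print(totals["Central"])
```

With the real file, `open("sample_docs/ElectroTV_Sales_Report_2024.csv", newline="")` would take the place of the `io.StringIO` object.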

### Question-10

In [40]:
question = "Which ElectroTV products are marketed for home cinema use?"
print(naive_rag_chain.invoke(question))

The ElectroTV products marketed for home cinema use are:

| Product Name                       | Description                                              |
|------------------------------------|----------------------------------------------------------|
| ElectroTV E65 Cinema               | Home cinema style TV                                   |
| ElectroTV E75 Cinema Max           | Home cinema TV with Ultra-large screen                |


### Question-11

In [41]:
question = "Which ElectroTV regional office should a customer in Bengaluru contact, and what are the contact details?"
print(naive_rag_chain.invoke(question))

A customer in Bengaluru should contact the South Region Office. The contact details are:

| **Office**                        | **Address**                                                                 | **Phone**             | **Email**                         |
|----------------------------------|---------------------------------------------------------------------------|----------------------|----------------------------------|
| ElectroTV Regional Office – South | Sigma Tech Plaza, 5th Floor, Whitefield Main Road, Bengaluru – 560066, Karnataka | +91-80-5123-7788📞     | south.support@electrotv.com📧   |


### Question-12

In [42]:
question = "For the E32 Smart model, give me the following details: price, features"
print(naive_rag_chain.invoke(question))

- Price: 14,999 INR
- Features:
  - Compact HD-ready TV for bedrooms and small living spaces
  - Optimized for streaming and cable TV
  - Energy-efficient design
  - Android TV
  - HD Ready Display
  - 20W Sound Output


### Question-13

In [43]:
question = "How many units of ElectroTV E32 Smart were sold in the North region in 2024?"
print(naive_rag_chain.invoke(question))

- Total units sold of ElectroTV E32 Smart in North region in 2024:

| Order ID     | Date        | Units Sold |
|--------------|-------------|--------------|
| ORD00422     | 2024-11-25  | 10           |
| ORD00443     | 2024-12-03  | 7            |
| ORD00453     | 2024-11-22  | 5            |
| ORD00367     | 2024-04-26  | 6            |
| ORD00155     | 2024-08-29  | 9            |

- Sum of units sold:

10 + 7 + 5 + 6 + 9 = **37 units**

**Answer:** 37 units of ElectroTV E32 Smart are sold in North region in 2024.


### Question-14

In [44]:
question = "What are the differences between ElectroTV E55+ Pro and ElectroTV E58 Vision?"
print(naive_rag_chain.invoke(question))

Based on the provided context, there is no specific information comparing the ElectroTV E55+ Pro+ and ElectroTV E58 Vision models directly. However, I can summarize what is known about each:

**ElectroTV E55 Pro+**
- Description: Premium 55-inch 4K TV
- Page Reference: 5

**ElectroTV E58 Vision**
- Description: Large-format TV, with sales data indicating units sold and price
- Page Reference: 6, 8

**Known differences:**
| Feature                     | ElectroTV E55 Pro+                         | ElectroTV E58 Vision                         |
|-----------------------------|--------------------------------------------|----------------------------------------------|
| Screen Size                 | 55 inches (Pro+)                         | Larger format (specific size not provided, but larger than 55") |
| Resolution                  | 4K (Pro+)                                | Not explicitly specified in the context       |
| Product Positioning         | Premium, high-end              

### Question-15

In [45]:
question = "Which are the cheapest and costliest ElectroTV models? Mention the model names and prices."
print(naive_rag_chain.invoke(question))

- Cheapest ElectroTV model:
  - Model Name: ElectroTV E32 Smart
  - Price: 14,999 INR

- Costliest ElectroTV model:
  - Model Name: ElectroTV E75 Cinema Max
  - Price: 84,999 INR
