## Naive RAG

### Load API Keys

In [1]:
import os
from dotenv import load_dotenv

# Read variables from a local .env file into the process environment
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
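
`os.getenv` returns `None` when a variable is missing, so a missing key only surfaces later, inside the client. A minimal sketch of a fail-fast check follows. `require_env` is a hypothetical helper (not part of `python-dotenv` or LangChain), and `DEMO_API_KEY` is a placeholder so the sketch runs without real credentials.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


# Placeholder variable (an assumption for the demo, not a real key)
os.environ["DEMO_API_KEY"] = "placeholder-value"
demo_key = require_env("DEMO_API_KEY")
print(len(demo_key))
```

In the notebook, the same check could wrap `"OPENAI_API_KEY"` and `"LANGSMITH_API_KEY"`.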

### Set Up LangSmith Tracing and API Key

In [2]:
os.environ["LANGCHAIN_TRACING_V2"] = "true"
# Note: this assignment raises TypeError if LANGSMITH_API_KEY is not set
os.environ["LANGSMITH_API_KEY"] = os.getenv("LANGSMITH_API_KEY")
os.environ["LANGCHAIN_PROJECT"] = "NAIVE_RAG"

## Load LLM from OpenAI

In [3]:
from langchain_openai import ChatOpenAI


llm = ChatOpenAI(
    model="gpt-4.1-nano",
    api_key=OPENAI_API_KEY,
    temperature=0.3,
    max_tokens=512,
)

### Test LLM 

In [4]:
test_llm_response=llm.invoke("What is Large Language Models")
test_llm_response.content

"Large Language Models (LLMs) are advanced artificial intelligence systems designed to understand, generate, and manipulate human language. They are built using deep learning techniques, particularly neural networks with many layers, which enable them to learn complex patterns in vast amounts of text data.\n\nKey characteristics of LLMs include:\n\n- **Scale:** They contain billions or even trillions of parameters, allowing them to capture nuanced language patterns.\n- **Training Data:** They are trained on extensive datasets from books, websites, articles, and other text sources to develop a broad understanding of language.\n- **Capabilities:** LLMs can perform a variety of tasks such as text generation, translation, summarization, question-answering, and more.\n- **Examples:** Popular LLMs include OpenAI's GPT series (like GPT-3 and GPT-4), Google's BERT, and Meta's LLaMA.\n\nOverall, LLMs are powerful tools that have significantly advanced natural language processing (NLP), enabling

## Load Text Embedding Model from OpenAI

In [5]:
from langchain_openai import OpenAIEmbeddings

embedding_model = OpenAIEmbeddings(
    model="text-embedding-3-small",
)

### Test Embedding Model

In [6]:
embedding_vector=embedding_model.embed_query("What is Large Language Models")

In [7]:
len(embedding_vector)

1536
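
Each query or chunk is mapped to a vector of 1536 floats, and retrieval ranks chunks by how close their vectors are to the query vector. The sketch below shows one common closeness measure, cosine similarity, on toy 3-dimensional vectors. The vectors are made up for illustration, and the metric the vector store actually applies depends on its configuration.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors (illustrative only, not real embeddings)
query_vec = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]
orthogonal = [0.0, 3.0, 0.0]

print(cosine_similarity(query_vec, same_direction))  # close to 1: same direction
print(cosine_similarity(query_vec, orthogonal))      # 0: nothing in common
```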

## Load Documents

### CSV Loader

In [8]:
from langchain_community.document_loaders import CSVLoader

loader = CSVLoader(file_path="sample_docs/ElectroTV_Sales_Report_2024.csv")

csv_data = loader.load()

In [9]:
print(csv_data[0].page_content)

order_id: ORD00165
date: 2024-11-02
product_name: ElectroTV E32 Smart
units_sold: 2
unit_price_inr: 14999
total_sales_inr: 29998
sales_region: Central
sales_channel: Online


### PDF Loader

In [10]:
from langchain_community.document_loaders import PyPDFLoader

loader=PyPDFLoader(file_path="sample_docs/ElectroTV.pdf")

pdf_data = loader.load()

In [11]:
print(pdf_data[0].metadata['source'])

sample_docs/ElectroTV.pdf


### Merge CSV and PDF data

In [12]:
documents = csv_data + pdf_data

In [13]:
documents[0]

Document(metadata={'source': 'sample_docs/ElectroTV_Sales_Report_2024.csv', 'row': 0}, page_content='order_id: ORD00165\ndate: 2024-11-02\nproduct_name: ElectroTV E32 Smart\nunits_sold: 2\nunit_price_inr: 14999\ntotal_sales_inr: 29998\nsales_region: Central\nsales_channel: Online')

In [14]:
documents[-1]

Document(metadata={'producer': 'LibreOffice 24.2', 'creator': 'Writer', 'creationdate': '2025-12-25T14:00:02+05:30', 'source': 'sample_docs/ElectroTV.pdf', 'total_pages': 11, 'page': 10, 'page_label': '11'}, page_content='Andheri East,\nMumbai – 400069, Maharashtra\n Phone: +91-22-4455-9900📞\n Email: west.sales@electrotv.com📧\nEast Region Office\nElectroTV Regional Office – East\nInfinity IT Park, Block B,\nSalt Lake Sector V ,\nKolkata – 700091, West Bengal\n Phone: +91-33-4098-1122📞\n Email: east.support@electrotv.com📧\nCustomer Care & Service Support\nFor product installation, troubleshooting, warranty information, and service requests, customers \nmay contact our centralized support team.\n Toll-Free: 1800-555-ETV1 (1800-555-3881)📞\n Email: support@electrotv.com📧\n Website: www.electrotv.com 🌐 (fictional)\n11')

## Document Splitting

In [15]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=100)

chunks = text_splitter.split_documents(documents)

In [16]:
len(chunks)

542

In [17]:
print(chunks[0].page_content)

order_id: ORD00165
date: 2024-11-02
product_name: ElectroTV E32 Smart
units_sold: 2
unit_price_inr: 14999
total_sales_inr: 29998
sales_region: Central
sales_channel: Online


In [18]:
print(chunks[-1].page_content)

may contact our centralized support team.
 Toll-Free: 1800-555-ETV1 (1800-555-3881)📞
 Email: support@electrotv.com📧
 Website: www.electrotv.com 🌐 (fictional)
11
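
`chunk_size` caps the length of each chunk in characters. `chunk_overlap` is the number of characters that neighbouring chunks may share, so text cut at a boundary can still appear whole in one of them. The sketch below shows that idea with a simplified fixed window. It is not the algorithm `RecursiveCharacterTextSplitter` uses, which prefers to split on separators such as paragraph and line breaks, and `sliding_window_chunks` is a hypothetical name.

```python
def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into fixed-size windows; consecutive windows share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


for piece in sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=2):
    print(piece)
```

With `chunk_size=4` and `chunk_overlap=2`, each window starts two characters after the previous one, so every chunk repeats the last two characters of its predecessor.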


### Add IDs to Chunks

In [19]:
from uuid import uuid4

# One random ID per chunk
uuids = [str(uuid4()) for _ in chunks]

In [20]:
len(uuids)

542

In [21]:
uuids[:5]

['15f83259-cc7c-4649-abc8-540154f85e84',
 '80e014d2-2c5d-4575-b551-6ede1b764e4f',
 'a273af95-2447-48c3-a168-a56f89526d8b',
 'c872379a-92e9-49cc-8243-c5feb23406b1',
 'e0ef238b-159c-4321-8b37-80c6a47f17a3']
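
`uuid4` generates random IDs, so a second run of the notebook produces a completely different ID for every chunk. If the same chunks are then added to an already persisted collection, they would most likely be stored again under the new IDs. One possible alternative derives a stable ID with `uuid5`, on the assumption that a chunk is identified by its source path and its text. `stable_chunk_id` is a hypothetical helper.

```python
from uuid import NAMESPACE_URL, uuid5


def stable_chunk_id(source: str, text: str) -> str:
    """Derive the same ID every time for the same (source, text) pair."""
    return str(uuid5(NAMESPACE_URL, f"{source}\n{text}"))


first = stable_chunk_id("sample_docs/ElectroTV.pdf", "Head Office (Corporate Headquarters)")
second = stable_chunk_id("sample_docs/ElectroTV.pdf", "Head Office (Corporate Headquarters)")
print(first == second)  # True: identical input gives an identical ID
```

A caveat of this design: two chunks with identical text from the same source would receive the same ID, so a chunk index may need to be mixed into the key.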

## Vector Store: Chroma DB

### Initialization

In [22]:
from langchain_chroma import Chroma

vector_store = Chroma(
    collection_name="ElectroTV",
    embedding_function=embedding_model,
    persist_directory="/home/abhishek/ad-workspace/chroma_db/ElectroTV",
)

### Add Chunks & IDs

In [23]:
vector_store.add_documents(documents=chunks, ids=uuids)

['15f83259-cc7c-4649-abc8-540154f85e84',
 '80e014d2-2c5d-4575-b551-6ede1b764e4f',
 'a273af95-2447-48c3-a168-a56f89526d8b',
 'c872379a-92e9-49cc-8243-c5feb23406b1',
 'e0ef238b-159c-4321-8b37-80c6a47f17a3',
 '871ad147-3c09-4cf6-9db3-4c665295b57f',
 '15f88d60-89cd-4d7f-80ff-6d4b630c0c8f',
 '135391e4-42e4-42c1-927e-8dad5293a246',
 '03622023-75a7-4e5c-9468-85faa029cdf0',
 '82abdec6-3cef-4ef8-9ccd-1d10aa4451de',
 'c1655047-4702-4a00-a82b-b0dabcda414f',
 '70649296-febb-4b13-9fe1-f1e61ae71a74',
 'b6c2eafe-0a0e-428a-9b49-5fa04145ed19',
 '44c5889b-aac2-49b5-b76f-313cbbcde545',
 '1b61ffed-5cd9-40c5-9e1a-d9c9b4e6023d',
 '2e40aaf1-092b-4a1a-870e-0c034c89e103',
 '47836ca9-69f2-4da0-9f7e-17de66e40949',
 'ee28d26f-399f-4b43-a537-7b1bea3c9ea9',
 '696c141e-951d-4421-9d1a-b077054c40ee',
 '66b703bc-7285-4a29-800b-8216ab8a660a',
 '5f6745a7-4faa-4b2e-841d-5a5a43e992bd',
 'b1eaa2be-fb90-4dc3-9ae3-1dc33e024c77',
 '5d55c85d-5b74-417f-a941-35e0a40a2005',
 '101d1a09-ca5f-431d-8053-8a7df5be1704',
 'bec5787e-80cb-

### Create Retriever

In [45]:
retriever = vector_store.as_retriever(search_kwargs={"k": 15})

## Test Similarity Search [OPTIONAL]

### Test Query

In [25]:
test_query = "Where is the Head office of ElectroTV"

### Similarity Search

In [26]:
similar_docs = vector_store.similarity_search(test_query,k=3)

In [27]:
for i, doc in enumerate(similar_docs):
    print("=====================\n")
    print(f"Similar doc : {i}")
    print("=====================\n")
    print(doc.page_content)


Similar doc : 0

Contact Us
ElectroTV welcomes inquiries from customers, partners, and business stakeholders regarding our 
products, services, and support offerings. Our corporate and regional offices are structured to ensure 
prompt assistance, transparent communication, and efficient resolution of queries. Whether you are 
seeking product information, sales support, or after-sales service, our teams are available through 
the contact details provided below.
Head Office (Corporate Headquarters)

Similar doc : 1

the contact details provided below.
Head Office (Corporate Headquarters)
ElectroTV Electronics Pvt. Ltd.
ElectroTV Tower, Plot No. 42,
Tech Park Avenue, Sector 18,
Gurugram – 122015, Haryana, India
 Phone: +91-11-4567-8900📞
 Email: corporate@electrotv.com📧
Regional Branch Offices
North Region Office
ElectroTV Regional Office – North
2nd Floor, Orion Business Center,
Noida Sector 62, Uttar Pradesh – 201309
 Phone: +91-120-678-2345📞
 Email: north.sales@electrotv

## Naive RAG Pipeline

In [28]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser


prompt = ChatPromptTemplate.from_template("""
You are a helpful assistant.
Answer the question using ONLY the context below.
If you don't know the answer based on the context, say you don't know.

Context:
{context}

Question:
{question}
""")

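naive_rag_chain = (
    {
        # The retriever output (a list of Document objects) fills {context}
        "context": retriever,
        # The input string passes through unchanged and fills {question}
        "question": lambda x: x
    }
    | prompt
    | llm
    | StrOutputParser()
)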
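
In the chain above, `{context}` receives the retriever's list of `Document` objects, so the prompt ends up containing roughly their string representation, metadata included. A common variation joins only the `page_content` fields before they reach the prompt. The sketch below shows that formatting step; `Doc` is a stand-in for the LangChain `Document` class so the example runs on its own, and `format_docs` is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    """Stand-in for a LangChain Document; only page_content is modelled."""
    page_content: str


def format_docs(docs) -> str:
    """Join the text of the retrieved documents, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


sample_docs = [
    Doc("order_id: ORD00165\nproduct_name: ElectroTV E32 Smart"),
    Doc("Head Office (Corporate Headquarters)"),
]
print(format_docs(sample_docs))

# Assumed wiring inside the chain (not run here):
#   "context": retriever | format_docs
```

Whether this changes answer quality for this dataset is untested here. It mainly removes metadata noise from the prompt.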


## Question and Answer

### Question-1

In [46]:
question = "Where is the head office of ElectroTV"
print(naive_rag_chain.invoke(question))

The head office of ElectroTV is located at ElectroTV Tower, Plot No. 42, Tech Park Avenue, Sector 18, Gurugram – 122015, Haryana, India.


### Question-2

In [47]:
question = "How many regional officies does ElectroTV has and give me their contact numbers"
print(naive_rag_chain.invoke(question))

ElectroTV has three regional offices. Their contact numbers are:

- North Region Office: +91-120-678-2345
- South Region Office: +91-80-5123-7788
- West Region Office: (Contact number not fully provided in the context)


### Question-3

In [48]:
question = "How many Televion models are launched by ElectroTV so far. List down all of them"
print(naive_rag_chain.invoke(question))

Based on the provided context, two television models are mentioned as launched by ElectroTV:
1. ElectroTV E43 Smart
2. ElectroTV E50 Pro


### Question-4

In [49]:
question = "Which ElectroTV products are QLED televisions and what are their listed prices?"
print(naive_rag_chain.invoke(question))

The ElectroTV products that are QLED televisions are:

- ElectroTV Q55 Ultra, priced at 54,999₹
- ElectroTV Q65 Ultra, priced at 69,999₹


### Question-5

In [50]:
question = "What is the phone number and email address of the ElectroTV Head Office?"
print(naive_rag_chain.invoke(question))

The phone number of the ElectroTV Head Office is +91-11-4567-8900, and the email address is corporate@electrotv.com.


### Question-6

In [51]:
question = "Which ElectroTV product is identified as the flagship model, and what feature justifies this positioning?"
print(naive_rag_chain.invoke(question))

I don't know.


### Question-7

In [52]:
question = "Which ElectroTV models priced below ₹50,000 are available, and how many total units of these models were sold in 2024?"
print(naive_rag_chain.invoke(question))

The ElectroTV models priced below ₹50,000 that are available are the ElectroTV E40 Smart and the ElectroTV E50 Pro. 

Total units sold in 2024:
- ElectroTV E40 Smart: 15 units
- ElectroTV E50 Pro: 2 units (from May 8) + 13 units (from December 14) = 15 units

Total units of these models sold in 2024: 15 + 15 = 30 units.
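
The totals above are computed by the model from the 15 retrieved chunks (`k=15`), not from the full sales report, so they may not cover every row. For comparison, the sketch below sums `units_sold` per product directly from CSV rows. The column names come from the loader output shown earlier. The first sample row is the one printed earlier, and the `DEMO` rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict

# First data row is from the notebook output; DEMO rows are invented for illustration
SAMPLE_CSV = """order_id,date,product_name,units_sold,unit_price_inr,total_sales_inr,sales_region,sales_channel
ORD00165,2024-11-02,ElectroTV E32 Smart,2,14999,29998,Central,Online
DEMO0001,2024-01-15,ElectroTV E32 Smart,3,14999,44997,North,Retail
DEMO0002,2024-02-20,Demo Model X,1,10000,10000,South,Online
"""


def units_sold_by_product(csv_text: str, max_price_inr: int) -> dict[str, int]:
    """Sum units_sold per product for rows priced below max_price_inr."""
    totals: dict[str, int] = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["unit_price_inr"]) < max_price_inr:
            totals[row["product_name"]] += int(row["units_sold"])
    return dict(totals)


print(units_sold_by_product(SAMPLE_CSV, max_price_inr=50000))
```

Run against the real file, a calculation like this covers every row, which top-k retrieval does not guarantee.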


### Question-8

In [53]:
question = "Which ElectroTV product generated the highest total revenue across all sales records?"
print(naive_rag_chain.invoke(question))

The ElectroTV product that generated the highest total revenue across all sales records is the ElectroTV E55 Pro+.


### Question-9

In [54]:
question = "Which sales channel contributed the most revenue for premium-priced ElectroTV televisions (price above ₹60,000)?"
print(naive_rag_chain.invoke(question))

Based on the provided data, the premium-priced ElectroTV televisions (price above ₹60,000) are:

- ElectroTV Q65 Ultra with a unit price of ₹69,999 and total sales of ₹979,986 (West region, online channel).

Since only one such product is listed, the sales channel contributing the most revenue for these premium-priced televisions is the **Online** channel.


### Question-10

In [56]:
question = "Which ElectroTV products are marketed for home cinema use?"
print(naive_rag_chain.invoke(question))

The ElectroTV products marketed for home cinema use are the ElectroTV E65 Cinema and the ElectroTV E75 Cinema Max.


### Question-11

In [57]:
question = "Which ElectroTV regional office should a customer in Bengaluru contact, and what are the contact details?"
print(naive_rag_chain.invoke(question))

A customer in Bengaluru should contact the South Region Office. The contact details are:

ElectroTV Regional Office – South  
Sigma Tech Plaza, 5th Floor,  
Whitefield Main Road,  
Bengaluru – 560066, Karnataka  
Phone: +91-80-5123-7788📞  
Email: south.support@electrotv.com📧


### Question-12

In [58]:
question = "For E32 Smart model, Give me the following details: Price, Features "
print(naive_rag_chain.invoke(question))

Price: 14,999 INR  
Features: Android TV, HD Ready Display, 20W Sound Output


### Question-13

In [59]:
question = "How many units of ElectroTV E32 Smart are sold in Northern region in the year 2024 "
print(naive_rag_chain.invoke(question))

I don't know.


### Question-14

In [61]:
question = "What are the differences between ElectroTV E55+ Pro and ElectroTV E58 Vision?"
print(naive_rag_chain.invoke(question))

I don't know.


### Question-15

In [62]:
question = "Which is the cheapest and costliest ElectroTV models. Mention the model names and prices?"
print(naive_rag_chain.invoke(question))

The cheapest ElectroTV model is the ElectroTV E32 Smart, priced at 14,999₹.  
The costliest ElectroTV model mentioned is the ElectroTV E50 Pro, but the price is not provided in the given context.
