<a href="https://colab.research.google.com/github/nw-tn/RAG-with-OpenAI/blob/main/RAG_with_OpenAI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [4]:
!pip install -r requirements.txt


In [7]:
from openai import OpenAI
from langchain_openai import ChatOpenAI
from google.colab import drive, userdata
import os

In [6]:
drive.mount('/content/drive')

ROOT_DIR = '/content/drive/My Drive/Experimenting with Langchain/RAG/'
os.chdir(ROOT_DIR)

Mounted at /content/drive


In [8]:
openai_key = userdata.get('OpenAIKey')

client = OpenAI(api_key=openai_key)

In [9]:
from langchain_community.document_loaders import PyPDFLoader

# Load the PDF from the project folder; PyPDFLoader returns one Document per page.
loader = PyPDFLoader(os.path.join(ROOT_DIR, 'Tommys_Kitchen_Business_Overview.pdf'))

docs = loader.load()
print(docs[0])

page_content='Tommy's Kitchen - Business Overview
1. Business Name and Logo
Business Name: Tommy's Kitchen
Logo: A stylized chef's hat over a rolling pin, with warm earthy tones representing freshness and home-baked
goodness.
2. Business Overview
Tommy's Kitchen is a privately owned bakery established in 2022. We specialize in artisan baked goods
made with locally sourced, organic ingredients. Our bakery combines traditional recipes with a modern twist
to serve a diverse and health-conscious community.
3. Products and Services
We offer a variety of baked goods including sourdough bread, croissants, muffins, cookies, and cakes.
Custom cake orders, catering for events, and baking classes are also part of our services.
4. Unique Selling Proposition
What sets Tommy's Kitchen apart is our commitment to quality and community. All products are handmade
daily with no preservatives. We also engage with local farmers for ingredients, supporting sustainability.
5. Target Market
Our target market 

In [10]:
print(docs[1])

page_content='Tommy's Kitchen - Business Overview
Tommy's Kitchen is located in the heart of Bristol, UK. We operate from a cozy storefront with walk-in
services, and also offer local delivery within a 10-mile radius. Our products are available through our website
and select local cafes and grocery stores.
7. Production Process
All baking is done in-house in our commercial kitchen. The process begins with sourcing fresh ingredients
each morning. Our team of experienced bakers prepares the doughs and batters, followed by baking, cooling,
and packaging. Strict hygiene and quality control standards are maintained throughout.' metadata={'producer': 'PyFPDF 1.7.2 http://pyfpdf.googlecode.com/', 'creator': 'PyPDF', 'creationdate': 'D:20250420100648', 'source': '/content/drive/My Drive/Experimenting with Langchain/RAG/Tommys_Kitchen_Business_Overview.pdf', 'total_pages': 2, 'page': 1, 'page_label': '2'}


In [11]:
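from langchain.text_splitter import CharacterTextSplitter

# CharacterTextSplitter splits on its separator (default "\n\n") and merges the
# pieces into chunks of up to chunk_size characters, with chunk_overlap characters
# shared between neighbours. Text that never contains the separator may stay as
# a single chunk even when it is longer than chunk_size.
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
data = text_splitter.split_documents(docs)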
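
To make the `chunk_size` and `chunk_overlap` settings above concrete, here is a minimal pure-Python sketch of fixed-size chunking with overlap. It is a simplified sliding window, not the algorithm `CharacterTextSplitter` uses (which splits on a separator first), and `chunk_text` is a hypothetical helper name.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Toy input: 2500 characters of repeating digits.
sample = "".join(str(i % 10) for i in range(2500))
print([len(c) for c in chunk_text(sample)])  # [1000, 1000, 900]
```

Each chunk starts 800 characters after the previous one, so the last 200 characters of one chunk reappear at the start of the next.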

In [12]:
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma


embedding_function = OpenAIEmbeddings(api_key=openai_key, model='text-embedding-3-small')

In [13]:
vecstore = Chroma.from_documents(
    data,
    embedding=embedding_function,
    persist_directory=ROOT_DIR
)

In [14]:
vecretriever = vecstore.as_retriever(
    search_type='similarity',
    search_kwargs={'k': 2}
)
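
The retriever above returns the `k` chunks whose embeddings are closest to the query embedding. The sketch below illustrates that idea with toy 3-dimensional vectors and cosine similarity. Both are assumptions for illustration: real embeddings have many more dimensions, and the distance metric Chroma actually applies depends on how the collection is configured. `cosine` and `top_k` are hypothetical helper names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: list[float], doc_vecs: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the names of the k documents most similar to the query."""
    ranked = sorted(doc_vecs, key=lambda name: cosine(query_vec, doc_vecs[name]), reverse=True)
    return ranked[:k]


# Made-up vectors standing in for chunk embeddings.
toy_vectors = {
    "products": [1.0, 0.0, 0.0],
    "location": [0.0, 1.0, 0.0],
    "process": [0.9, 0.1, 0.0],
}
print(top_k([1.0, 0.0, 0.0], toy_vectors, k=2))  # ['products', 'process']
```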

In [15]:
# Use a separate name so the loaded PDF pages in `docs` are not overwritten.
retrieved_docs = vecretriever.invoke("Give me information about Tommy's kitchen products.")
print(retrieved_docs)

[Document(id='49952bfd-6252-4f93-92dc-7fb813363ae0', metadata={'page_label': '1', 'source': '/content/drive/My Drive/Experimenting with Langchain/RAG/Tommys_Kitchen_Business_Overview.pdf', 'page': 0, 'total_pages': 2, 'producer': 'PyFPDF 1.7.2 http://pyfpdf.googlecode.com/', 'creator': 'PyPDF', 'creationdate': 'D:20250420100648'}, page_content="Tommy's Kitchen - Business Overview\n1. Business Name and Logo\nBusiness Name: Tommy's Kitchen\nLogo: A stylized chef's hat over a rolling pin, with warm earthy tones representing freshness and home-baked\ngoodness.\n2. Business Overview\nTommy's Kitchen is a privately owned bakery established in 2022. We specialize in artisan baked goods\nmade with locally sourced, organic ingredients. Our bakery combines traditional recipes with a modern twist\nto serve a diverse and health-conscious community.\n3. Products and Services\nWe offer a variety of baked goods including sourdough bread, croissants, muffins, cookies, and cakes.\nCustom cake orders, c

In [16]:
from langchain_core.prompts import PromptTemplate

TEMPLATE = """
  Answer questions concerning Tommy's kitchen

  Tommy's kitchen:
  {tom_kitchen}
"""

In [17]:
llm = ChatOpenAI(model='gpt-4o-mini',
                 api_key=openai_key,
                 temperature=0)

In [18]:
prompt_template = PromptTemplate.from_template(template=TEMPLATE)

In [19]:
prompt_template

PromptTemplate(input_variables=['tom_kitchen'], input_types={}, partial_variables={}, template="\n  Answer questions concerning Tommy's kitchen\n\n  Tommy's kitchen:\n  {tom_kitchen}\n")
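
The template above has a single placeholder, `{tom_kitchen}`. This notebook never passes the template to the chain it builds later, so the sketch below only shows, in plain Python, what filling that placeholder with retrieved text could look like. `SKETCH_TEMPLATE`, `fill_template`, and the two sample chunks are hypothetical stand-ins, and `str.format` is used here in place of the `PromptTemplate` object.

```python
SKETCH_TEMPLATE = """
  Answer questions concerning Tommy's kitchen

  Tommy's kitchen:
  {tom_kitchen}
"""

# Stand-ins for the text of retrieved chunks (sentences taken from the PDF).
sample_chunks = [
    "We offer a variety of baked goods including sourdough bread, croissants, "
    "muffins, cookies, and cakes.",
    "Tommy's Kitchen is located in the heart of Bristol, UK.",
]


def fill_template(template: str, chunks: list[str]) -> str:
    """Join the chunk texts and substitute them for the {tom_kitchen} placeholder."""
    context = "\n\n".join(chunks)
    return template.format(tom_kitchen=context)


filled = fill_template(SKETCH_TEMPLATE, sample_chunks)
print(filled)
```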

In [20]:
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory

In [21]:
memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)

rag_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=vecretriever,
    memory=memory,
    verbose=False
)



In [22]:
# Chain.run is deprecated; invoke returns a dict whose "answer" key holds the reply.
result = rag_chain.invoke({"question": "Give me information about Tommy's kitchen's products"})
print(result["answer"])


Tommy's Kitchen offers a variety of baked goods including sourdough bread, croissants, muffins, cookies, and cakes. Additionally, they provide custom cake orders, catering for events, and baking classes as part of their services. All products are handmade daily with no preservatives, using locally sourced, organic ingredients.


In [23]:
# The follow-up relies on the chat history kept in `memory`.
result = rag_chain.invoke({"question": "Tell me the location"})
print(result["answer"])

Tommy's Kitchen is located in the heart of Bristol, UK.
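
The follow-up "Tell me the location" names no business, yet the answer is about Tommy's Kitchen. That works because `ConversationalRetrievalChain` first has the LLM rewrite the follow-up, together with the chat history, into a standalone question before retrieving. The sketch below shows roughly how such a rewriting input could be assembled. The helper name `build_condense_input` and the prompt wording are hypothetical, not the library's actual prompt.

```python
def build_condense_input(chat_history: list[tuple[str, str]], follow_up: str) -> str:
    """Format earlier (question, answer) turns plus a follow-up into one rewriting prompt."""
    lines = []
    for human, assistant in chat_history:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {assistant}")
    history = "\n".join(lines)
    return (
        "Rephrase the follow-up question as a standalone question.\n\n"
        f"Chat history:\n{history}\n\n"
        f"Follow-up question: {follow_up}\n"
        "Standalone question:"
    )


history = [
    (
        "Give me information about Tommy's kitchen's products",
        "Tommy's Kitchen offers a variety of baked goods including sourdough bread.",
    )
]
print(build_condense_input(history, "Tell me the location"))
```

Given this input, a model could plausibly produce something like "Where is Tommy's Kitchen located?", which is then used for retrieval in place of the bare follow-up.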
