# Retrieval-augmented generation (RAG) prototype for home appliance maintenance

A RAG prototype that uses OpenAI and Pinecone to answer maintenance questions from a supplied PDF manual.

Uses:
- langchain
- Pinecone
- OpenAI GPT
- Python libraries for text extraction from PDF

Sources (a mix of two tutorials):
- https://medium.com/@anderson.riciamorim/a-quick-guide-to-use-your-own-data-in-gpt-with-retrieval-augmented-generation-73f3e9d54bcd
- The libraries used in that guide did not work together, so most of this notebook follows https://python.langchain.com/v0.2/docs/integrations/vectorstores/pinecone/

In [None]:
# Using venv in vscode
# open new terminal
# python3 -m venv venv
# then Cmd-Shift-P and select Python: Select Interpreter

In [None]:
%pip install --upgrade --quiet  \
    langchain-pinecone \
    langchain-openai \
    langchain \
    langchain-community \
    pypdf \
    python-dotenv

In [None]:
import os
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv(), override=True)
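
Since every later cell depends on the keys loaded from `.env`, a quick check that they are actually set can save a confusing authentication error further down. The sketch below is only an illustration: `missing_env_vars` is a hypothetical helper (not part of python-dotenv or LangChain), and the variable names are assumptions. `PINECONE_API_KEY` is the name read explicitly later in this notebook, and `OPENAI_API_KEY` is the name the OpenAI integrations read by default.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping.

    Hypothetical helper for illustration; defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


# Assumed variable names; adjust them to whatever your .env file defines.
REQUIRED = ["OPENAI_API_KEY", "PINECONE_API_KEY"]

missing = missing_env_vars(REQUIRED)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this right after `load_dotenv` prints nothing when both keys are present.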

In [5]:
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from pypdf import PdfReader

# Extract the text of every page of the manual into one string
file = 'Honeywell-HEV320-Cool-Mist-Humidifier-Manual.pdf'
reader = PdfReader(file)
text = ""
for page in reader.pages:
    # Guard against pages that yield no text (e.g. image-only pages)
    text += (page.extract_text() or "") + "\n"

# Split into chunks of at most 1000 characters that overlap by 200,
# preferring to break on paragraphs, then lines, then words
text_splitter = RecursiveCharacterTextSplitter(
    separators=["\n\n", "\n", " ", ""],
    chunk_size=1000,
    chunk_overlap=200)
chunks = text_splitter.create_documents(texts=[text])

# Reads OPENAI_API_KEY from the environment
embeddings = OpenAIEmbeddings()
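
To make the roles of `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sliding-window splitter. It is not how `RecursiveCharacterTextSplitter` works internally (that class also tries to break on the listed separators); `sliding_window_chunks` is a hypothetical function that only illustrates how the overlap makes neighbouring chunks share text.

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows that overlap by chunk_overlap characters.

    Simplified illustration only; it ignores separators entirely.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


sample = "abcdefghij"
print(sliding_window_chunks(sample, chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, which is the same idea as the 200-character overlap above: a sentence cut at a chunk boundary still appears whole in at least one chunk.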

In [None]:
from pinecone import Pinecone, ServerlessSpec
from langchain_pinecone import PineconeVectorStore
import time

pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY"))

index_name = "my-index"

# Create the serverless index on the first run only
if index_name not in [index_info["name"] for index_info in pc.list_indexes()]:
    pc.create_index(
        name=index_name,
        dimension=1536,  # must match the embedding size (1536 for text-embedding-ada-002)
        metric="cosine",
        spec=ServerlessSpec(cloud="aws", region="us-east-1"),
    )
    # Block until the new index is ready to accept writes
    while not pc.describe_index(index_name).status["ready"]:
        time.sleep(1)

index = pc.Index(index_name)

# Embed each chunk and upsert the vectors into the index
docsearch = PineconeVectorStore.from_documents(chunks, embeddings, index_name=index_name)
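
Conceptually, a similarity search against an index created with `metric="cosine"` ranks stored vectors by their cosine similarity to the query vector. The sketch below shows that idea with plain Python lists. It is an assumption-laden toy, not Pinecone's implementation: `cosine_similarity` and `top_k` are hypothetical helpers, the three-dimensional vectors stand in for real 1536-dimensional embeddings, and a real index uses approximate search rather than scoring every vector.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=4):
    """Return the k (score, text) pairs most similar to query_vec.

    stored is a list of (text, vector) tuples; brute-force scan for illustration.
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "embeddings" made up for this example
stored = [
    ("clean weekly", [1.0, 0.0, 0.0]),
    ("replace filter", [0.0, 1.0, 0.0]),
    ("warranty terms", [0.0, 0.0, 1.0]),
]
for score, text in top_k([0.9, 0.1, 0.0], stored, k=2):
    print(f"{score:.3f}  {text}")
```

The query vector points mostly along the first axis, so "clean weekly" ranks first and "replace filter" second.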

In [13]:
query = "How often should I clean my humidifier?"
docs = docsearch.similarity_search(query)
print(docs[0].page_content)

6 7CARING FOR YOUR HUMIDIFIER  (Continued)
To keep your humidifier running efficiently, clean it regularly. Weekly cleaning is 
recommended. All maintenance should be done in the kitchen or bathroom on a water resistant surface near a faucet. Do not wash any components of this humidifier  in a dishwasher.
To properly clean your humidifier we recommend the separate processes of Scale 
Removal and Disinfecting. These two processes must be done separately.
Before CleaningCARING FOR YOUR HUMIDIFIERCARING FOR YOUR FILTER  (Continued)
To prolong the life of your Wicking Filter, turn it over each time you fill the Water Tank. This will keep the top of the Filter from drying out and will help the Filter to age more evenly.
If you notice the Filter getting hard, you may soak it in cool water to 
loosen mineral buildup. This will temporarily improve the performance of the Wicking Filter until a replacement Filter is purchased.


In [14]:
query = "Do I need to change the filter?"
docs = docsearch.similarity_search(query)
print(docs[0].page_content)

like the Honeywell HHM10 or H10C, available at retailers or on HoneywellPluggedIn.com.USING YOUR HUMIDIFIER Step 1
CARING FOR YOUR FILTER
It is recommended you change your Wicking Filter every 30-60 days depending on water quality and usage. If you have hard water you may need to replace your filter more frequently.
Change your filter if you notice:• The filter is hard and crusty• The filter starts to give off an odor• Moisture output is decreasedBe sure to use only genuine Honeywell Replacement  
Filters which are Protec
® antimicrobially treated. This  
helps prevent the migration of mold, algae and bacteria  
on the filter.Replacement Wicking  
Filter Honeywell  
HC-888 Series Type C
 Step 2  Step 3


In [31]:
import langchain
from langchain.chains import RetrievalQA
from langchain_core.prompts import PromptTemplate
# ChatOpenAI now lives in langchain-openai (installed above); the
# langchain.chat_models import path is deprecated
from langchain_openai import ChatOpenAI

# Use the same OPENAI_API_KEY variable that OpenAIEmbeddings() read earlier
llm = ChatOpenAI(model_name="gpt-4o", temperature=0, openai_api_key=os.getenv("OPENAI_API_KEY"))

embedding_generator = OpenAIEmbeddings(openai_api_key=os.getenv("OPENAI_API_KEY"))

# prompt template taken from: https://github.com/smatiolids/astra-agent-memory/tree/main

prompt_template = """
Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES"). 
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer. 


QUESTION: {question}
=========
{summaries}
=========
FINAL ANSWER:"""
PROMPT = PromptTemplate(
    template=prompt_template, input_variables=["summaries", "question"]
)

# Create a "RetrievalQA" chain
langchain.verbose = False
chainSim = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=docsearch.as_retriever(),
    chain_type_kwargs={
        'prompt': PROMPT,
        'document_variable_name': 'summaries'
    }
)
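
For intuition about what this chain sends to the model: with the default "stuff" strategy, the retrieved chunks are concatenated and substituted into the `{summaries}` slot, and the user question into `{question}`. The sketch below mimics that with plain string formatting. It is an approximation under stated assumptions, not LangChain's code: `build_stuffed_prompt` is a hypothetical helper, the template is shortened, and the blank-line separator is assumed.

```python
# Shortened version of the prompt template used above
PROMPT_TEMPLATE = """QUESTION: {question}
=========
{summaries}
=========
FINAL ANSWER:"""


def build_stuffed_prompt(question, chunks, separator="\n\n"):
    """Join the retrieved chunks and substitute them into the template.

    Hypothetical helper; the separator is an assumption for illustration.
    """
    summaries = separator.join(chunks)
    return PROMPT_TEMPLATE.format(question=question, summaries=summaries)


retrieved = [
    "Weekly cleaning is recommended.",
    "Change your Wicking Filter every 30-60 days.",
]
print(build_stuffed_prompt("What maintenance is required?", retrieved))
```

Because the chunks created earlier carry no page or file metadata, the model has nothing specific to cite, which is one plausible reason the "SOURCES" line in the answers is so generic.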

In [30]:
# Model used for this run: gpt-3.5-turbo
QUERY = 'What maintenance is required for the humidifier?'

# Run the chain and print the answer (invoke replaces the deprecated run)
responseSim = chainSim.invoke({"query": QUERY})["result"]
print(responseSim)

To maintain the humidifier, it is recommended to clean it weekly. This involves separate processes of Scale Removal and Disinfecting. Additionally, it is important to turn over the Wicking Filter each time the Water Tank is filled to prevent drying out. If the Filter becomes hard, soaking it in cool water can help loosen mineral buildup temporarily. It is crucial not to operate the humidifier without water, regularly clean it, and avoid using it outdoors. Attempting to repair or adjust any electrical or mechanical functions on the humidifier will void the warranty. For residential use only. 

SOURCES: The provided document on caring for the humidifier.


In [32]:
# Model used for this run: gpt-4o
QUERY = 'What maintenance is required for the humidifier?'

# Run the chain and print the answer (invoke replaces the deprecated run)
responseSim = chainSim.invoke({"query": QUERY})["result"]
print(responseSim)

To maintain your humidifier efficiently, follow these steps:

1. **Regular Cleaning**: Clean the humidifier weekly to ensure it runs efficiently. Perform the cleaning in the kitchen or bathroom on a water-resistant surface near a faucet. Do not wash any components in a dishwasher.

2. **Scale Removal and Disinfecting**: These two processes should be done separately to properly clean your humidifier.

3. **Filter Maintenance**: To prolong the life of the Wicking Filter, turn it over each time you fill the Water Tank. If the filter becomes hard, soak it in cool water to loosen mineral buildup, which will temporarily improve its performance until a replacement is purchased.

4. **General Precautions**:
   - Do not operate the humidifier without water. Turn off and unplug the unit when the tank is empty.
   - Do not attempt to repair or adjust any electrical or mechanical functions on the humidifier, as this will void your warranty.
   - The humidifier is intended for indoor residential us