## Ingesting PDF

In [1]:
from langchain_community.document_loaders import UnstructuredPDFLoader

In [7]:
local_path = "../pdf_files/Owners_Manual-Ram_1500_25_Crew_Cab.pdf"

# Load the local PDF file
if local_path:
    loader = UnstructuredPDFLoader(file_path=local_path)
    data = loader.load()
else:
    print("Upload a PDF file")
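
Because `local_path` is a non-empty string literal, the `if local_path:` test is always true and does not check that the file exists. A minimal sketch of a stricter check follows; `resolve_pdf_path` is a hypothetical helper, not part of LangChain.

```python
from pathlib import Path


def resolve_pdf_path(path_str: str) -> Path:
    """Return the path as a Path, or raise if it is not an existing .pdf file.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    path = Path(path_str)
    if not path.is_file():
        raise FileNotFoundError(f"PDF not found: {path}")
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {path.name}")
    return path
```

The result could then be passed to the loader as `UnstructuredPDFLoader(file_path=str(resolve_pdf_path(local_path)))`, so a wrong path fails before any parsing starts.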

In [3]:
from IPython.display import display, Markdown


def view_text_in_markdown(page_content):
    """Render a string as Markdown in the notebook output."""
    display(Markdown(page_content))

## Vector Embeddings

In [8]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma

In [9]:
# Split the loaded document into overlapping chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
chunks = text_splitter.split_documents(data)
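
To make `chunk_size=800` and `chunk_overlap=100` concrete, here is a simplified fixed-width sliding window in plain Python. It is only an illustration of what overlap means: `RecursiveCharacterTextSplitter` prefers to break at separators such as paragraph and line breaks, so its chunks are often shorter than the limit. The name `sliding_window_chunks` is hypothetical.

```python
def sliding_window_chunks(text: str, chunk_size: int = 800, chunk_overlap: int = 100) -> list[str]:
    """Cut text into fixed-width windows that share chunk_overlap characters.

    Simplified illustration only; this is not the algorithm used by
    RecursiveCharacterTextSplitter.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


sample = "".join(chr(97 + i % 26) for i in range(2000))
pieces = sliding_window_chunks(sample)
print([len(p) for p in pieces])  # [800, 800, 600]
```

The last 100 characters of each window reappear at the start of the next one, which is what keeps a sentence that straddles a boundary retrievable from at least one chunk.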

In [10]:
chunks

[Document(metadata={'source': '../pdf_files/Owners_Manual-Ram_1500_25_Crew_Cab.pdf'}, page_content="ROADSIDE ASSISTANCE\n\n24 HOURS, 7 DAYS A WEEK AT YOUR SERVICE.\n\nCALL 1-800-521-2779 OR VISIT CHRYSLER.RSAHELP.COM (USA)\n\nCALL 1-800-363-4869 OR VISIT FCA.ROADSIDEAID.COM (CANADA)\n\nSERVICES: Flat Tire Service, Out Of Gas/Fuel Delivery, Battery Jump Assistance, Lockout Service and Towing Service.\n\nFCA US LLC reserves the right to modify the terms or discontinue the Roadside Assistance Program at any time. The Roadside Assistance Program is subject to restrictions and conditions of use, that are determined solely by FCA US LLC.\n\nPlease see the Customer Assistance chapter in this Owner's Manual for further information."),
 Document(metadata={'source': '../pdf_files/Owners_Manual-Ram_1500_25_Crew_Cab.pdf'}, page_content="This Owner's Manual illustrates and describes the operation of features and equipment that are either standard or optional on this vehicle. This manual may also in

In [11]:
# Add to vector database
vector_db = Chroma.from_documents(
    documents=chunks, 
    embedding=OllamaEmbeddings(model="nomic-embed-text", show_progress=True),
    collection_name="local-rag"
)

OllamaEmbeddings: 100%|██████████| 1468/1468 [03:21<00:00,  7.28it/s]
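
The progress bar above shows 1468 chunks being embedded. Once stored, retrieval amounts to comparing a query vector against the stored vectors and keeping the closest ones. The sketch below shows that idea with cosine similarity on made-up three-dimensional vectors. Real `nomic-embed-text` embeddings have far more dimensions, and the metric Chroma applies depends on how the collection is configured, so treat the metric choice here as an assumption.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k entries most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy index: the vectors are invented for illustration, not real embeddings.
toy_index = [
    ("hazard flashers button", [0.9, 0.1, 0.0]),
    ("refueling funnel", [0.1, 0.9, 0.0]),
    ("auxiliary switches", [0.0, 0.2, 0.9]),
]

print(top_k([1.0, 0.2, 0.0], toy_index, k=2))
# ['hazard flashers button', 'refueling funnel']
```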


## Retrieval

In [13]:
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama
from langchain_core.runnables import RunnablePassthrough
from langchain.retrievers.multi_query import MultiQueryRetriever

In [14]:
# LLM from Ollama
local_model = "mistral"
llm = ChatOllama(model=local_model)

In [15]:
QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model assistant. Your task is to generate five
    different versions of the given user question to retrieve relevant documents from
    a vector database. By generating multiple perspectives on the user question, your
    goal is to help the user overcome some of the limitations of the distance-based
    similarity search. Provide these alternative questions separated by newlines.
    Original question: {question}""",
)
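
The prompt asks the model to return its alternative questions separated by newlines, so the reply has to be split back into individual queries before each one is searched. A minimal sketch of that parsing step follows; `parse_query_lines` is a hypothetical name, and the parser LangChain uses internally may differ in detail.

```python
def parse_query_lines(llm_output: str) -> list[str]:
    """Split a newline-separated LLM reply into non-empty, trimmed queries."""
    return [line.strip() for line in llm_output.splitlines() if line.strip()]


reply = "Where is the hazard button?\n\n  How do I turn on the hazard lights?  \n"
print(parse_query_lines(reply))
# ['Where is the hazard button?', 'How do I turn on the hazard lights?']
```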

In [16]:
retriever = MultiQueryRetriever.from_llm(
    vector_db.as_retriever(), 
    llm,
    prompt=QUERY_PROMPT
)

# RAG prompt
template = """Answer the question based ONLY on the following context:
{context}
Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)

In [17]:
retrieve_docs = retriever.invoke('Where is located the hazard flashers button?') # get relevant documents

OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.37s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  7.20it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 22.32it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  7.61it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 13.44it/s]


In [16]:
# Results when the short excerpt (owner_manual_p283-p300.pdf) is indexed
retrieve_docs

 Document(metadata={'source': '../pdf_files/owner_manual_p283-p300.pdf'}, page_content="not in use (i.e., cellular devices, etc.). Eventually, if plugged in long enough without engine operation, the vehicle’s battery will discharge sufficiently to degrade battery life and/or prevent the engine from starting.\n\nREFUELING IN EMERGENCY\n\nThe vehicle is equipped with a refueling funnel for a capless fuel system. The refueling funnel is located under the passenger's seat along with the jack and\n\n294 IN CASE OF EMERGENCY\n\ntools. If refueling is necessary, while using an approved gas can, insert the refueling funnel into the filler neck opening. Take care to open both flappers with the funnel to avoid spills.\n\nFuel Funnel Location\n\nNOTE:"),
 Document(metadata={'source': '../pdf_files/owner_manual_p283-p300.pdf'}, page_content="Shift the transmission into PARK (P). Turn the ignition OFF.\n\noa FWD\n\nBlock both front and rear of the wheel diagonally opposite of the jacking position. 

In [18]:
# Results when the full manual (Owners_Manual-Ram_1500_25_Crew_Cab.pdf) is indexed
retrieve_docs

 Document(metadata={'source': '../pdf_files/Owners_Manual-Ram_1500_25_Crew_Cab.pdf'}, page_content='The auxiliary switches manage the relays that power four or six blunt cut wires. These wires are located under the hood to the right, near the battery.\n\nIn addition to the four or six auxiliary switch wires, a fused battery wire and ignition wire are also found in this location.\n\nSERVICING AND MAINTENANCE 327\n\nA kit of splices and heat shrink tubing are provided with the auxiliary switches to aid in the connection/ installation of your electrical devices.\n\nFuse And Wire Color Chart\n\nNOTE:\n\nFuses for the auxiliary switches can be found in the auxiliary Power Distribution Center (PDC), located in the engine compartment toward the front of the vehicle, in front of the main PDC. Remove upper shield to access. If equipped, additional auxiliary switch fuses will be located in the main PDC.'),
 Document(metadata={'source': '../pdf_files/Owners_Manual-Ram_1500_25_Crew_Cab.pdf'}, page

In [19]:
len(retrieve_docs)

11
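
Each of the five generated questions triggers its own similarity search (one progress bar per question above). A count of 11 is consistent with the per-question results being merged into one list with duplicates dropped, which is how `MultiQueryRetriever` is understood to combine them. The sketch below shows that merge on plain strings; the query names, chunk ids, and the `fake_hits` lookup are all invented stand-ins for a real vector search.

```python
from typing import Callable, Iterable


def unique_union(result_lists: Iterable[list[str]]) -> list[str]:
    """Merge several result lists, keeping the first occurrence of each item in order."""
    seen = set()
    merged = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


def multi_query_retrieve(queries: list[str], search: Callable[[str], list[str]]) -> list[str]:
    """Run one search per query and return the de-duplicated union of the hits."""
    return unique_union(search(q) for q in queries)


# Hypothetical search results keyed by query; stands in for the vector store.
fake_hits = {
    "q1": ["a", "b", "c", "d"],
    "q2": ["b", "c", "e", "f"],
    "q3": ["a", "f", "g", "h"],
}

merged = multi_query_retrieve(["q1", "q2", "q3"], lambda q: fake_hits[q])
print(len(merged), merged)
# 8 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
```

Twelve raw hits collapse to eight unique ones here, in the same way that overlapping results for the rephrased questions can collapse to 11 documents.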

In [20]:
view_text_in_markdown(retrieve_docs[1].page_content)

SAFETY

281

282 IN CASE OF EMERGENCY

IN CASE OF EMERGENCY

HAZARD WARNING FLASHERS

The Hazard Warning Flashers button is located on the upper switch bank just below the radio.

Hazard Warning Flashers Button

NOTE:

If your vehicle is equipped with a 12-inch Uconnect display, the Hazard Warning Flashers button is located above the display.

Hazard Warning Flashers Button with 12-inch display

NOTE:

If your vehicle is equipped with a 14.5-inch Uconnect display, the Hazard Warning Flashers button is located to the left of the display.

Hazard Warning Flashers Button with 14.5-inch display

In [22]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
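
The pipe expression reads left to right: retrieve context for the question, fill the prompt template, call the model, and reduce the reply to a string. The plain-Python sketch below mirrors those four steps with stand-in callables. `build_prompt`, `run_chain`, and the two lambdas are hypothetical, and joining chunk texts with blank lines is a simplification, since the chain above hands the retriever output to the template as-is.

```python
from typing import Callable

RAG_TEMPLATE = """Answer the question based ONLY on the following context:
{context}
Question: {question}
"""


def build_prompt(question: str, retrieve: Callable[[str], list[str]]) -> str:
    """Fetch context chunks for the question and fill the RAG template."""
    context = "\n\n".join(retrieve(question))
    return RAG_TEMPLATE.format(context=context, question=question)


def run_chain(question: str,
              retrieve: Callable[[str], list[str]],
              generate: Callable[[str], str]) -> str:
    """retrieve -> prompt -> model -> plain string, mirroring the piped chain."""
    return generate(build_prompt(question, retrieve)).strip()


# Stand-ins for the retriever and the LLM so the sketch runs without Ollama.
fake_retrieve = lambda q: ["chunk one", "chunk two"]
fake_generate = lambda prompt: "  stub answer  "

print(run_chain("Where is the button?", fake_retrieve, fake_generate))
# stub answer
```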

In [23]:
resposta = chain.invoke("Where is located the hazard flashers button?")
view_text_in_markdown(resposta)

OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  2.48it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 13.83it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  5.97it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 15.45it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  9.30it/s]


 The Hazard Warning Flashers button can be found on the upper switch bank just below the radio for vehicles not equipped with a 12-inch or 14.5-inch Uconnect display. For vehicles equipped with a 12-inch Uconnect display, the Hazard Warning Flashers button is located above the display. If your vehicle has a 14.5-inch Uconnect display, the button is located to the left of the display.

In [24]:
resposta = chain.invoke("Please list all the support centers that assist button can connect")
view_text_in_markdown(resposta)

OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  3.03it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 52.69it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  5.29it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 12.23it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  5.57it/s]


1. Roadside Assistance
2. Brand Connect Customer Care (If available)
3. Vehicle Customer Care
4. Uconnect Customer Care

In [25]:
resposta = chain.invoke("This is an owner manual of a vehicle. Can you specify which vehicle?")
view_text_in_markdown(resposta)

OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  8.54it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 13.80it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  5.99it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 10.89it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  7.72it/s]


 The document does not specify an exact model or make of the vehicle, but based on the source file name, it appears to be a Ram 1500 25 Crew Cab vehicle.

In [26]:
resposta = chain.invoke("Can you specify which vehicle?")
view_text_in_markdown(resposta)

OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  6.60it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 10.78it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  4.86it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 12.57it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  8.50it/s]


 I can't definitively say which vehicle as the provided information does not contain enough details to identify a specific model. The given data are excerpts from the owner's manual of a Ram 1500 truck, but that's only one possibility among many vehicles with similar manuals.

In [27]:
# Delete the "local-rag" collection backing this vector store
vector_db.delete_collection()