## Ingesting PDF

In [None]:
%pip install -q unstructured langchain
%pip install -q "unstructured[all-docs]"

In [1]:
from langchain_community.document_loaders import UnstructuredPDFLoader

In [2]:
local_path = "../pdf_files/owner_manual_p283-p300.pdf"

# Load the PDF from the local path, if one is set
if local_path:
    loader = UnstructuredPDFLoader(file_path=local_path)
    data = loader.load()
else:
    print("Upload a PDF file")
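
The check above only tests that `local_path` is a non-empty string. A slightly stricter guard, sketched here with the standard library only, could confirm that the file exists before it is handed to the loader. The helper name `is_readable_pdf` is hypothetical and not part of the notebook or of LangChain.

```python
from pathlib import Path


def is_readable_pdf(path_str: str) -> bool:
    """Return True if path_str points to an existing file with a .pdf suffix."""
    path = Path(path_str)
    return path.is_file() and path.suffix.lower() == ".pdf"


# Same path as in the cell above; whether it exists depends on the local setup.
candidate = "../pdf_files/owner_manual_p283-p300.pdf"
if is_readable_pdf(candidate):
    print(f"Ready to load: {candidate}")
else:
    print(f"No PDF found at: {candidate}")
```

With a guard like this, a mistyped path would likely surface as a clear message rather than as an error from inside the loader.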


In [19]:
from IPython.display import display, Markdown

def view_text_in_markdown(page_content):
    """Render a chunk's text as Markdown in the notebook output."""
    display(Markdown(page_content))

## Vector Embeddings

In [None]:
!ollama pull nomic-embed-text

In [None]:
!ollama pull mistral

In [None]:
!ollama list

In [None]:
%pip install -q chromadb
%pip install -q langchain-text-splitters

In [3]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma

In [4]:
# Split the loaded documents into chunks of up to 800 characters with a 100-character overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100)
chunks = text_splitter.split_documents(data)
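
To build intuition for what `chunk_size` and `chunk_overlap` control, here is a deliberately simplified sketch: a fixed-size sliding window over characters. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter also tries to break on separators such as paragraphs and sentences); the function name `chunk_text` and the sample string are made up for illustration.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = chunk_text(sample, chunk_size=10, chunk_overlap=3)
for piece in pieces:
    print(piece)
```

Each piece begins with the last three characters of the previous one. That overlap is what helps a sentence cut at a chunk boundary remain retrievable from at least one chunk.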

In [35]:
chunks

 Document(metadata={'source': '../pdf_files/owner_manual_p283-p300.pdf'}, page_content="NOTE:\n\n@ Your vehicle may be transmitting data as authorized by the subscriber >> page 357.\n\n@ The ASSIST and SOS buttons will only function if you are connected to an operable LTE (voice/data) or 4G (data) network, which comes as a built-in function. Other services will only be operable if your Brand Connect service is active and you are connected to an operable 4G (data) network.\n\nASSIST Call\n\nThe ASSIST button is used to automatically connect you to any one of the following support centers:\n\n@ Roadside Assistance - If you get a flat tire, or need a tow, just push the ASSIST button, select the Roadside Assistance button to be connected to someone who can help. Roadside Assistance will know what vehicle you're driving and its location. Additional fees may apply for roadside Assistance."),
 Document(metadata={'source': '../pdf_files/owner_manual_p283-p300.pdf'}, page_content='@ Brand Conne

In [5]:
# Embed each chunk with nomic-embed-text and store the vectors in a Chroma collection
vector_db = Chroma.from_documents(
    documents=chunks,
    embedding=OllamaEmbeddings(model="nomic-embed-text", show_progress=True),
    collection_name="local-rag"
)

OllamaEmbeddings: 100%|██████████| 60/60 [00:10<00:00,  5.76it/s]
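
Conceptually, a vector store keeps one embedding per chunk and answers a query by returning the chunks whose vectors lie closest to the query vector. The sketch below shows that idea with hand-written three-dimensional vectors and cosine similarity. The vectors, the labels, and the helper names `cosine_similarity` and `top_k` are all invented for illustration; the metric and index that Chroma actually uses may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "embeddings": real ones come from the embedding model and have many more dimensions.
toy_store = [
    ("hazard flashers button", [0.9, 0.1, 0.0]),
    ("refueling funnel", [0.1, 0.9, 0.1]),
    ("jack location", [0.0, 0.2, 0.9]),
]
query = [0.8, 0.2, 0.1]
results = top_k(query, toy_store, k=2)
print(results)
```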


## Retrieval

In [7]:
from langchain.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.chat_models import ChatOllama
from langchain_core.runnables import RunnablePassthrough
from langchain.retrievers.multi_query import MultiQueryRetriever

In [8]:
# LLM from Ollama
local_model = "mistral"
llm = ChatOllama(model=local_model)

In [9]:
QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template="""You are an AI language model assistant. Your task is to generate five
    different versions of the given user question to retrieve relevant documents from
    a vector database. By generating multiple perspectives on the user question, your
    goal is to help the user overcome some of the limitations of the distance-based
    similarity search. Provide these alternative questions separated by newlines.
    Original question: {question}""",
)

In [10]:
retriever = MultiQueryRetriever.from_llm(
    vector_db.as_retriever(), 
    llm,
    prompt=QUERY_PROMPT
)

# RAG prompt
template = """Answer the question based ONLY on the following context:
{context}
Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
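
The RAG template has two placeholders, `{context}` and `{question}`. As a rough illustration of what filling them produces, the sketch below uses plain `str.format` instead of `ChatPromptTemplate`. The helper `build_prompt` is hypothetical, and joining chunks with blank lines is my own choice, not necessarily how the chain serializes retrieved documents.

```python
template = """Answer the question based ONLY on the following context:
{context}
Question: {question}
"""


def build_prompt(context_chunks, question):
    """Join the retrieved chunk texts and substitute them into the template."""
    context = "\n\n".join(context_chunks)
    return template.format(context=context, question=question)


filled = build_prompt(
    ["Push the button to turn on the Hazard Warning Flashers."],
    "How do I turn on the hazard flashers?",
)
print(filled)
```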

In [15]:
retrieve_docs = retriever.invoke('Where is the hazard flashers button located?')  # get relevant documents

OllamaEmbeddings: 100%|██████████| 1/1 [00:02<00:00,  2.31s/it]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 14.60it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  8.08it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  8.41it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 14.89it/s]


In [16]:
retrieve_docs

 Document(metadata={'source': '../pdf_files/owner_manual_p283-p300.pdf'}, page_content="not in use (i.e., cellular devices, etc.). Eventually, if plugged in long enough without engine operation, the vehicle’s battery will discharge sufficiently to degrade battery life and/or prevent the engine from starting.\n\nREFUELING IN EMERGENCY\n\nThe vehicle is equipped with a refueling funnel for a capless fuel system. The refueling funnel is located under the passenger's seat along with the jack and\n\n294 IN CASE OF EMERGENCY\n\ntools. If refueling is necessary, while using an approved gas can, insert the refueling funnel into the filler neck opening. Take care to open both flappers with the funnel to avoid spills.\n\nFuel Funnel Location\n\nNOTE:"),
 Document(metadata={'source': '../pdf_files/owner_manual_p283-p300.pdf'}, page_content="Shift the transmission into PARK (P). Turn the ignition OFF.\n\noa FWD\n\nBlock both front and rear of the wheel diagonally opposite of the jacking position. 

In [17]:
len(retrieve_docs)

8
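
A count of 8 is plausibly smaller than the number of generated queries times the results returned per query, because a multi-query retriever is generally expected to merge the per-query result lists and drop duplicates. The sketch below shows that merge step in isolation. The chunk IDs are made up, and `unique_union` here is a stand-alone illustration, not the library's code.

```python
def unique_union(result_lists):
    """Merge several result lists, keeping the first occurrence of each item in order."""
    seen = set()
    merged = []
    for results in result_lists:
        for doc in results:
            if doc not in seen:
                seen.add(doc)
                merged.append(doc)
    return merged


# Hypothetical results for three rephrasings of the same question.
per_query_results = [
    ["chunk-12", "chunk-13", "chunk-40", "chunk-07"],
    ["chunk-12", "chunk-13", "chunk-41", "chunk-07"],
    ["chunk-13", "chunk-12", "chunk-40", "chunk-22"],
]
merged = unique_union(per_query_results)
print(len(merged))
```

Twelve retrieved items collapse to six unique ones here, since rephrased questions tend to hit many of the same chunks.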

In [21]:
view_text_in_markdown(retrieve_docs[1].page_content)

Hazard Warning Flashers Button with 14.5-inch display

Push the button to turn on the Hazard Warning Flashers. When the button is activated, all directional turn signals will flash on and off to warn oncoming, traffic of an emergency. Push the button a second time ‘to turn off the Hazard Warning Flashers.

This is an emergency warning system and it should not be used when the vehicle is in motion. Use only when your vehicle is disabled or signaling a safety hazard warning for other motorists.

When leaving the vehicle to seek assistance, the

Hazard Warning Flashers will continue to operate even though the ignition is placed in the OFF position.

NOTE: With extended use the Hazard Warning Flashers may wear down your battery.

ASSIST AND SOS SYSTEM — IF EQUIPPED

‘A0703000080US

In [23]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
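
The chain reads left to right: the dictionary sends the same input to the retriever and to a passthrough, and each `|` feeds one step's output into the next. The toy below imitates that flow with ordinary Python so the data path is visible. `Step`, `fan_out`, and the fake components are hypothetical stand-ins written for this sketch; they are not LangChain classes and say nothing about how LangChain implements its runnables.

```python
class Step:
    """Wrap a function so that steps can be composed with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Run this step first, then pass its result to the next step.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)


def fan_out(**branches):
    """Send the same input to every branch and collect the results in a dict."""
    return Step(lambda value: {name: step.invoke(value) for name, step in branches.items()})


# Stand-ins for the retriever, prompt, and model.
fake_retriever = Step(lambda q: f"[context for: {q}]")
passthrough = Step(lambda q: q)
fake_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_llm = Step(lambda p: p.upper())

toy_chain = fan_out(context=fake_retriever, question=passthrough) | fake_prompt | fake_llm
answer = toy_chain.invoke("where is the button?")
print(answer)
```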

In [25]:
resposta = chain.invoke("Where is the hazard flashers button located?")

OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  7.27it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 15.47it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  5.61it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00, 14.09it/s]
OllamaEmbeddings: 100%|██████████| 1/1 [00:00<00:00,  8.18it/s]


In [26]:
view_text_in_markdown(resposta)

 The Hazard Warning Flashers button can be located on the upper switch bank just below the radio, above the 12-inch Uconnect display, or to the left of the 14.5-inch Uconnect display depending on the vehicle's equipment.

In [None]:
# Delete the "local-rag" collection from the vector database
vector_db.delete_collection()