# Lesson 4: Q&A over Documents

### **1. Components of the Retrieval Q&A Chain**

#### **1.1 Document Store**
The document store is where all your documents are stored and indexed for retrieval. Popular options include:

- **FAISS (Facebook AI Similarity Search):** For vector-based similarity search.
- **Pinecone:** A scalable vector database for high-performance retrieval.
- **Weaviate or Chroma:** Modern alternatives with feature-rich capabilities.

Here, we use **DocArrayInMemorySearch**, which suits small-scale applications like this one; the options above are better suited to large-scale applications.

The document store allows for the efficient retrieval of documents based on vector similarity.
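To make "retrieval based on vector similarity" concrete, here is a minimal sketch of a brute-force in-memory store. It is not how DocArrayInMemorySearch is implemented; `TinyVectorStore`, its methods, and the 2-d vectors are hypothetical names and values chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical store: keeps (vector, text) pairs in a plain list."""

    def __init__(self):
        self._items = []

    def add(self, vector, text):
        self._items.append((vector, text))

    def search(self, query_vector, k=2):
        # Score every stored vector against the query, highest first.
        scored = [(cosine_similarity(query_vector, v), t) for v, t in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = TinyVectorStore()
store.add([1.0, 0.0], "needle handover")
store.add([0.0, 1.0], "vessel dilation")
store.add([0.9, 0.1], "needle lift")

top_hits = store.search([1.0, 0.0], k=2)
print(top_hits)
```

Scanning every vector is fine for a handful of pages; large-scale stores typically rely on approximate indexes instead.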

#### **1.2 Embedding Model**
The embedding model converts documents and user queries into dense vector representations. These embeddings capture semantic meaning, so texts with similar meaning end up close together in vector space, which is what makes similarity search possible.
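The idea of "text in, vector out" can be sketched with a toy stand-in. Note the assumption: this uses bag-of-words count vectors over a tiny hand-picked vocabulary, not a learned dense embedding, so it only matches exact words. `toy_embed` and `vocabulary` are hypothetical names.

```python
import math
from collections import Counter


def toy_embed(text, vocabulary):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


vocabulary = ["robot", "needle", "surgery", "stock", "market"]

doc_a = toy_embed("robot needle surgery", vocabulary)
doc_b = toy_embed("stock market", vocabulary)
query = toy_embed("needle surgery robot arm", vocabulary)

sim_a = cosine(query, doc_a)  # shares three vocabulary words with the query
sim_b = cosine(query, doc_b)  # shares none
print(sim_a, sim_b)
```

A real embedding model would also place paraphrases near each other, which word counts cannot do.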

#### **1.3 Retriever**
The retriever searches the document store and returns the documents most relevant to the query embedding. Two common retrieval methods are:

- **Similarity-based retrieval:** Finds documents closest to the query in vector space.
- **Hybrid retrieval:** Combines traditional keyword search with vector similarity.
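
As a rough illustration of hybrid retrieval, the sketch below blends a keyword-overlap score with a vector-similarity score using a weight `alpha`. The weighting scheme, the function names, and the similarity values are assumptions for this example; production systems often use BM25 and rank-fusion methods instead.

```python
def keyword_score(query, text):
    """Fraction of query words that also appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def hybrid_rank(query, docs, vector_scores, alpha=0.5):
    """Score each doc as alpha * vector + (1 - alpha) * keyword, best first."""
    ranked = []
    for doc, vector_score in zip(docs, vector_scores):
        score = alpha * vector_score + (1 - alpha) * keyword_score(query, doc)
        ranked.append((score, doc))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked


docs = [
    "needle handover between two arms",
    "vessel dilation procedure",
    "suture needle sizes",
]
# Hypothetical vector-similarity values, as if returned by a vector store.
vector_scores = [0.70, 0.75, 0.40]

ranked = hybrid_rank("needle handover", docs, vector_scores, alpha=0.5)
for score, doc in ranked:
    print(round(score, 3), doc)
```

Here the second document has the highest vector score, but the exact keyword match lifts the first document to the top.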

#### **1.4 Large Language Model (LLM)**
The LLM reads the retrieved documents and generates an answer to the user’s query that is grounded in their content.

#### **1.5 Chain Logic**
Chains in LangChain enable the combination of multiple components into a coherent pipeline. For Retrieval Q&A, the chain typically involves:

- Embedding the query.
- Retrieving relevant documents.
- Generating the answer with the LLM.
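
The three steps above can be sketched as plain functions wired together. Everything here is a stand-in: `embed_query`, `retrieve`, `build_prompt`, and `fake_llm` are hypothetical helpers, and `fake_llm` does not call a model. The returned keys (`input`, `context`, `answer`) mirror the ones used later in this lesson.

```python
def embed_query(query):
    """Stand-in embedder: counts two keywords to form a 2-d vector."""
    words = query.lower().split()
    return [float(words.count("needle")), float(words.count("vessel"))]


def retrieve(query_vector, store, k=1):
    """Return the k texts whose vectors have the largest dot product."""
    def score(item):
        return sum(q * v for q, v in zip(query_vector, item[0]))
    ranked = sorted(store, key=score, reverse=True)
    return [text for _, text in ranked[:k]]


def build_prompt(question, documents):
    """Place the retrieved documents ahead of the question."""
    context = "\n".join(documents)
    return f"Context:\n{context}\n\nQuestion: {question}"


def fake_llm(prompt):
    # Placeholder: a real chain would send the prompt to a model here.
    return "ANSWER BASED ON: " + prompt.splitlines()[1]


def qa_chain(question, store):
    query_vector = embed_query(question)              # 1. embed the query
    documents = retrieve(query_vector, store, k=1)    # 2. retrieve documents
    answer = fake_llm(build_prompt(question, documents))  # 3. generate answer
    return {"input": question, "context": documents, "answer": answer}


store = [
    ([1.0, 0.0], "Needle handover uses both arms."),
    ([0.0, 1.0], "Vessel dilation pulls the vessel 5 mm."),
]
result = qa_chain("how does needle handover work", store)
print(result["answer"])
```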


As in the earlier lessons, Andrew Ng's lesson uses deprecated classes, so here I use the current ones recommended in LangChain's migration guide: https://python.langchain.com/docs/versions/migrating_chains/retrieval_qa/

In [18]:
from dotenv import load_dotenv

load_dotenv()

True

In [3]:
from langchain_core.output_parsers import StrOutputParser
from langchain_ollama import ChatOllama, OllamaLLM, OllamaEmbeddings
# Loaders and vector stores are imported from langchain_community;
# the langchain.document_loaders / langchain.vectorstores paths are deprecated.
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import DocArrayInMemorySearch
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain import hub

### 1. Creating Document Store

#### 1.1 Loading Document

In [4]:
loader = PyPDFLoader(
    file_path="SuFIA.pdf",
    extract_images=True,
)

# lazy_load() yields one Document per page; collect them into a list
pages = list(loader.lazy_load())

print(len(pages))  # the PDF has 8 pages, so this prints 8
print(pages[0])

8
page_content='SUFIA: Language-Guided Augmented Dexterity
for Robotic Surgical Assistants
Masoud Moghani1, Lars Doorenbos 2, William Chung-Ho Panitch 3,
Sean Huver4, Mahdi Azizian 4, Ken Goldberg 3, Animesh Garg 1,4,5
Abstract— In this work, we present SUFIA , the first frame-
work for natural language-guided augmented dexterity for
robotic surgical assistants. SUFIA incorporates the strong
reasoning capabilities of large language models (LLMs) with
perception modules to implement high-level planning and low-
level control of a robot for surgical sub-task execution. This
enables a learning-free approach to surgical augmented dexterity
without any in-context examples or motion primitives. SUFIA
uses a human-in-the-loop paradigm by restoring control to
the surgeon in the case of insufficient information, mitigating
unexpected errors for mission-critical tasks. We evaluate SUFIA
on four surgical sub-tasks in a simulation environment and two
sub-tasks on a physical surgical robotic platfo

In [5]:
print(type(pages[0]))

<class 'langchain_core.documents.base.Document'>


In [6]:
print(len(pages[0].page_content))

4681


#### 1.2 Configuring the DB with documents and embeddings

In [8]:
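# Embedding model served locally by Ollama
embeddings = OllamaEmbeddings(model="llama3.2")

# Embed every page and index the resulting vectors in memory
db = DocArrayInMemorySearch.from_documents(pages, embeddings)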




In [9]:
llm = ChatOllama(model="llama3.2")  

### 2. Creating the QnA chain

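`db.as_retriever()` wraps the vector store in a retriever interface, so the chain can fetch the documents most similar to a query. `create_stuff_documents_chain` inserts ("stuffs") the retrieved documents into the prompt and calls the LLM, and `create_retrieval_chain` connects the retriever to that chain.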
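
To illustrate what "wrapping a store in a retriever" means, here is a small adapter sketch: it hides the embed-then-search steps behind a single `invoke(query)` call. This is not LangChain's implementation; `SketchRetriever`, `embed`, `search`, and `PAGES` are hypothetical names, and the keyword-count embedder is an assumption made for the example.

```python
class SketchRetriever:
    """Hypothetical adapter: text query in, top-k texts out."""

    def __init__(self, embed, search, k=2):
        self.embed = embed    # function: str -> vector
        self.search = search  # function: (vector, k) -> list of texts
        self.k = k

    def invoke(self, query):
        return self.search(self.embed(query), self.k)


def embed(text):
    """Toy embedder: counts two keywords."""
    words = text.lower().split()
    return [words.count("needle"), words.count("shunt")]


PAGES = [
    ([2, 0], "page about needles"),
    ([0, 3], "page about shunts"),
    ([1, 1], "overview page"),
]


def search(vector, k):
    """Rank the stored pages by dot product with the query vector."""
    def score(item):
        return sum(q * v for q, v in zip(vector, item[0]))
    ranked = sorted(PAGES, key=score, reverse=True)
    return [text for _, text in ranked[:k]]


retriever = SketchRetriever(embed, search, k=2)
hits = retriever.invoke("which needle sizes are used")
print(hits)
```

The chain only needs the `invoke` call, so the store behind the retriever can be swapped without changing the rest of the pipeline.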

In [20]:
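# Pull a ready-made retrieval Q&A chat prompt from the LangChain Hub
retrieval_qa_chat_prompt = hub.pull("langchain-ai/retrieval-qa-chat")

# Chain that inserts the retrieved documents into the prompt and calls the LLM
combine_docs_chain = create_stuff_documents_chain(llm, retrieval_qa_chat_prompt)
# Chain that retrieves documents for the input, then runs combine_docs_chain
rag_chain = create_retrieval_chain(db.as_retriever(), combine_docs_chain)

response = rag_chain.invoke({"input": "What are autonomous agents?"})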



In [22]:
print(response["answer"])

Autonomous agents, also known as self-driving or intelligent agents, refer to software systems that can perceive their environment, make decisions, and take actions without human intervention. These agents use artificial intelligence (AI) and machine learning (ML) algorithms to learn from data and adapt to new situations.

Autonomous agents are designed to operate in a wide range of applications, including:

1. Robotics: Autonomous robots that can navigate and interact with their environment.
2. Autonomous vehicles: Self-driving cars, drones, and trucks that can operate without human intervention.
3. Smart homes: Intelligent systems that can control lighting, temperature, security, and entertainment systems.
4. Healthcare: Personalized medicine, disease diagnosis, and treatment planning.
5. Finance: Algorithmic trading, risk management, and portfolio optimization.

Characteristics of autonomous agents:

1. Autonomy: They operate independently, making decisions without human oversight.


In [28]:
response["context"]

[Document(metadata={'source': 'SuFIA.pdf', 'page': 5}, page_content='(N1) (N2) (N3)\n(N4) (N5)\nFig. 5: Needle variations in simulation. We consider five instances\nof simulated suture needles (N1 - N5) with various sizes and shapes\nto conduct the generalizability experiment in O RBIT -Surgical.\nNeedle Handover – "Pick up the needle with the arm\nclosest to it, move it directly to the handover location between\nthe two arms, and keep holding the needle. Grasp the right\nside of the needle with the other robot arm, then right after\nthat, release the needle from the first robot and stay put."\nVessel Dilation – "Grasp the vessel from its leftmost\nside with robot 0 and pull it backward to the left by 5\nmillimeters while holding on to it to dilate. When grasping\nthe vessel, grasp it 15 millimeters below the left point."\nShunt Insertion – "Lift the small shunt from the middle\nand insert it into the left opening of the large tube. Approach\nthe large tube from the left. Only lift the