# Project Prototype: Agentic RAG System with Nursing Handbooks and Transes as Knowledge Base

The goal of this project is to create an **Agentic RAG-based system** that helps nursing students retrieve relevant information from nursing handbooks and their personal study notes. The system augments its responses with context from those notes, making them more personalized and adaptive to the user's learning.

## Flow Overview

The project involves two primary stages:
1. **Prepopulating the Vector DB** (embedding the nursing handbooks and personal notes into a database)
2. **RAG Modeling** (retrieving relevant information and augmenting responses using both the handbooks and personal notes)

Additionally, there will be an **Agentic Layer** that intelligently routes queries to the appropriate source (nursing handbooks or personal study notes).

### Technologies to use

- **Docling**: For parsing the nursing handbooks and personal notes.
- **OpenAI Embedding Model (large)**: For embedding both nursing handbooks and personal study notes.
- **ChromaDB**: For storing and querying the embeddings.
- **Deepseek LLM**: For augmenting responses based on the retrieved content.
- **Pydantic AI**: For the AI agent that routes queries between the nursing handbooks and personal notes.
- **Pydantic**: For type safety and data validation.
- **FastAPI**: For building the API.
- **Docker**: For containerization and deployment.
- **Pydantic Graphs**: For workflow pipelines.

## Initialization

In [70]:
import os
import joblib
import threading
import asyncio
import chromadb
import nanoid
import json
from pathlib import Path
import concurrent.futures
from rich import print
from docling.document_converter import DocumentConverter
from abc import ABC, abstractmethod
import chromadb.utils.embedding_functions as embedding_functions
from docling_core.transforms.chunker.hierarchical_chunker import DocChunk
from pydantic import BaseModel
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIModel
from docling.chunking import HybridChunker

from dotenv import load_dotenv

load_dotenv()
knowledge_base_raw_path = './data/knowledge-base/raw/'
knowledge_base_pickled_path = './data/knowledge-base/pickled/'

### Stage 1: Prepopulate the Vector DB

In this stage, we process nursing handbooks and personal study notes, embedding them into a vector database for efficient retrieval during RAG modeling.

#### Steps:

1. **Upload Nursing Handbooks**:
   - Upload nursing handbooks in popular formats like PDF or DOCX.
   - These documents may include textbooks on topics like **medical-surgical nursing, pharmacology, pediatric nursing**, and other specialized nursing areas.

2. **Parsing the documents and retaining the page metadata**:
   - Use **Docling** to convert the nursing handbooks and personal notes into Docling documents.
   - The conversion preserves the PDF metadata (such as page numbers), so retrieved passages can later be traced back to their source page.

3. **Generate Embeddings**:
   - Use **OpenAI's large embedding model** to convert the nursing handbooks and personal notes into embeddings. These embeddings will capture the semantic meaning of each section, allowing for efficient similarity-based searches.
   - Both the nursing handbooks and the personal study notes will be embedded into the vector database.

4. **Save to Vector DB (ChromaDB)**:
   - Store the generated embeddings in **ChromaDB** for fast retrieval during RAG modeling.
   - The vector database will allow the system to quickly access the most relevant information when a query is made.

**Note**: Converting and embedding long documents like nursing handbooks may take some time (e.g., **15-30 minutes** per document).
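
The steps above can be sketched end to end without any external service. This is only a toy illustration under stated assumptions: `toy_embed` is a bag-of-words stand-in for the OpenAI embedding model, `InMemoryIndex` is a hypothetical stand-in for a ChromaDB collection, and `index_sections` is a helper name invented here. What it shows is the shape of the data that ends up in the vector store: one id, one vector, one text, and one metadata record per section.

```python
from collections import Counter
from dataclasses import dataclass, field


def toy_embed(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


@dataclass
class InMemoryIndex:
    """Stand-in for a vector DB collection: parallel lists, one entry per section."""
    ids: list[str] = field(default_factory=list)
    vectors: list[Counter] = field(default_factory=list)
    documents: list[str] = field(default_factory=list)
    metadatas: list[dict] = field(default_factory=list)

    def add(self, doc_id: str, text: str, metadata: dict) -> None:
        # Embed the text and keep it alongside its id and metadata.
        self.ids.append(doc_id)
        self.vectors.append(toy_embed(text))
        self.documents.append(text)
        self.metadatas.append(metadata)


def index_sections(index: InMemoryIndex, filename: str, sections: list[tuple[int, str]]) -> None:
    """Store each (page_no, text) section with the page metadata retained."""
    for n, (page_no, text) in enumerate(sections):
        index.add(f"{filename}-{n}", text, {"filename": filename, "page_no": page_no})
```

Keeping `filename` and `page_no` next to each vector is what later allows an answer to cite its source.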



In [33]:
# List the raw files to convert
os.listdir(knowledge_base_raw_path)

['Handbook of Clinical Nursing_ Medical-Surgical Nursing -- Joyce Fitzpatrick -- 2018 -- Springer Publishing Company -- 9780826130785 -- 26f2533f396508e653d45e0e76aadc53 -- Anna’s Archive.pdf',
 'NRG 304 LEC_ WEEK 2_CHAPTER 1 ASSESSMENT OF THE DIGESTIVE AND GASTROINTESTINAL FUNCTION.pdf']

In [34]:
document_converter = DocumentConverter()

knowledge_base_raw_files = os.listdir(knowledge_base_raw_path)
knowledge_base_pickled_files = os.listdir(knowledge_base_pickled_path)

for file in knowledge_base_raw_files:
	file_path = os.path.join(knowledge_base_raw_path, file)
	file_name = os.path.splitext(file)[0]

	# Skip anything that is not a regular file (e.g. subdirectories)
	if not os.path.isfile(file_path):
		continue

	# Check if file is already pickled
	if f"{file_name}.docling" in knowledge_base_pickled_files:
		print(f"File `{file[:97]}`... is already pickled.")
		continue

	# Pickling the document to retain the metadata (there is currently no way to get the paging metadata from exported md)
	print(f"Pickling file: `{file[:97]}`...")
	conversion_result = document_converter.convert(file_path)
	
	joblib.dump(
		conversion_result.document, 
		os.path.join(knowledge_base_pickled_path, f"{file_name}.docling"),
		compress=3
	)
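
The skip-if-already-pickled check in the loop above can also be factored into a small helper, which makes it easy to see up front which files still need the slow conversion step. The following is a minimal sketch using `pathlib`; the function name `pending_files` and its parameters are hypothetical and not part of the notebook.

```python
from pathlib import Path


def pending_files(raw_dir: Path, cache_dir: Path, suffix: str = ".docling") -> list[Path]:
    """Return the raw files that have no cached conversion yet, sorted by name."""
    # Stems of the cached files, e.g. "notes.docling" -> "notes"
    cached_stems = {p.stem for p in cache_dir.iterdir() if p.suffix == suffix}
    return sorted(
        p for p in raw_dir.iterdir()
        if p.is_file() and p.stem not in cached_stems
    )
```

Calling `pending_files(Path(knowledge_base_raw_path), Path(knowledge_base_pickled_path))` before the loop would list only the documents that still have to be converted.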

#### Create the collections

In [35]:
chroma_client = await chromadb.AsyncHttpClient(host="localhost", port=8001)

Using the OpenAI embedding model named in the `OPENAI_EMBEDDING_MODEL` environment variable (for example, `text-embedding-3-small`)

In [36]:
embed_fn_openai = embedding_functions.OpenAIEmbeddingFunction(
	api_key=os.getenv("OPENAI_API_KEY"),
	model_name=os.getenv("OPENAI_EMBEDDING_MODEL"),
)

Creating the collections

In [37]:
collection_base_documents = await chroma_client.get_or_create_collection(name="base_documents", embedding_function=embed_fn_openai)

Loading the NRG 304 document as a test case for chunking

In [38]:
document = joblib.load(os.path.join(
	knowledge_base_pickled_path, 
	"NRG 304 LEC_ WEEK 2_CHAPTER 1 ASSESSMENT OF THE DIGESTIVE AND GASTROINTESTINAL FUNCTION.docling"
))

### Stage 2: RAG Modeling

Once the vector database is populated, the system will retrieve relevant information from both nursing handbooks and personal study notes. The retrieval will be augmented using an LLM (Large Language Model) for more accurate and contextually rich responses.

#### Steps:

1. **Retrieve Relevant Information**:
   - When a query is input by the user (e.g., "What are the symptoms of diabetes?"), the system will retrieve the most relevant content from the vector database.
   - ChromaDB will find the sections of the nursing handbooks or personal study notes that are most similar to the query.

2. **Augment Response Using Deepseek LLM**:
   - The retrieved sections will be passed through the **Deepseek LLM**, which will generate a response based on the content of the handbooks and notes.
   - The LLM will combine the relevant information and format it into a coherent, accurate response tailored to the question.

In [39]:
chunker = HybridChunker(max_tokens=100)
chunk_iter = chunker.chunk(dl_doc=document)

for i, chunk in enumerate(chunk_iter):
	print(f"=== {i} ===")
	print(chunk.meta.export_json_dict())
	enriched_text = chunker.serialize(chunk=chunk)
	print(enriched_text)
	break
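
To make the effect of a token budget such as `max_tokens` more concrete, here is a deliberately simplified sketch. It is not Docling's `HybridChunker` algorithm: it assumes whitespace-separated words as tokens, packs whole paragraphs greedily, and prepends the section heading to each chunk in the spirit of the enriched text printed above. The function name `chunk_paragraphs` is hypothetical.

```python
def chunk_paragraphs(heading: str, paragraphs: list[str], max_tokens: int) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_tokens whitespace tokens.

    Simplifications: the heading is not counted against the budget, and a single
    paragraph longer than max_tokens is kept whole rather than split.
    """
    chunks: list[str] = []
    current: list[str] = []
    used = 0
    for paragraph in paragraphs:
        size = len(paragraph.split())
        # Start a new chunk when adding this paragraph would exceed the budget.
        if current and used + size > max_tokens:
            chunks.append(f"{heading}\n" + "\n".join(current))
            current, used = [], 0
        current.append(paragraph)
        used += size
    if current:
        chunks.append(f"{heading}\n" + "\n".join(current))
    return chunks
```

A small budget yields many short chunks with precise page references; a larger one yields fewer chunks that carry more surrounding context each.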

In [40]:
class Metadata(BaseModel):
	user_id: str
	filename: str
	heading: str
	page_no: int

class Records(BaseModel):
	documents: list[str] = []
	metadatas: list[dict] = []  # Metadata objects are stored as plain dicts (model_dump()) for ChromaDB
	ids: list[str] = []

In [41]:
chunker = HybridChunker(max_tokens=1000)
chunk_iter = chunker.chunk(dl_doc=document)
document_chunks = list(chunk_iter)

records = Records()

# Lock for thread-safe appends
lock = threading.Lock()

def extract_and_store(chunk: DocChunk):
	enriched_text = chunker.serialize(chunk=chunk)
	chunk_id = nanoid.generate()
	doc_chunk_metadata = chunk.meta.export_json_dict()
	# A chunk that sits under no heading may have no 'headings' entry; fall back to an empty string
	headings = doc_chunk_metadata.get('headings') or [""]
	metadata = Metadata(
		user_id="jiya",
		filename=doc_chunk_metadata['origin']['filename'],
		heading=headings[0],
		page_no=doc_chunk_metadata['doc_items'][0]['prov'][0]['page_no']
	)

	with lock:
		records.documents.append(enriched_text)
		records.metadatas.append(metadata.model_dump())
		records.ids.append(chunk_id)

with concurrent.futures.ThreadPoolExecutor() as executor:
	# Consume the iterator so an exception raised in any worker is re-raised here
	# instead of being silently dropped
	list(executor.map(extract_and_store, document_chunks))

In [42]:
await collection_base_documents.add(
	documents=records.documents,
	metadatas=records.metadatas,
	ids=records.ids
)

In [104]:
query_result = await collection_base_documents.query(
    query_texts=["upper gastrointestinal tract study nursing interventions"],
    n_results=3,
    where={"user_id": "jiya"},
)

In [105]:
print(query_result)

In [106]:
documents = query_result["documents"][0]
metadatas = query_result["metadatas"][0]

for doc, metadata in zip(documents, metadatas):
	print(doc)
	print(f"Source: {metadata['filename']}, Page: {metadata['page_no']}")

#### Augmenting the response with DeepSeek R1

In [107]:
class Reference(BaseModel):
	content: str
	page_number: int
	filename: str
	heading: str

In [108]:
query_result['documents']

[["Nursing Interventions :\n1. Clear  liquid  diet,  with  nothing  by  mouth  (NPO)  from midnight the night before the study.\n2. Patient is advised to not smoke or chew gum during the NPO period because these can increase gastric secretions and salivation.\n3. Polyethylene glycol (PEG)-based solutions are considered the most effective bowel cleansing preparatory agent.\n4. Oral  medications  are  withheld  on  the  morning  of  the study  and  resumed  that  evening,  but  each  patient's medication regimen should be evaluated on an individual basis.\n5. When a patient with insulin dependent diabetes is NPO, their insulin requirements will need  to be  adjusted accordingly.\n6. Instruct pt to increase OFI after the procedure  to facilitate evacuation of stool and barium.\n1. Low-residue diet 1 to 2 days before the test, a clear liquid diet and  a laxative  the  evening  before,  NPO  after midnight,  and  cleansing  enemas  until  returns  are  clear the following morning.\n2. Enema

In [109]:
references: list[Reference] = []
for metadata, document in zip(query_result["metadatas"][0], query_result["documents"][0]):
	references.append(
		Reference(
			content=document,
			page_number=metadata['page_no'],
			filename=metadata['filename'],
			heading=metadata['heading']
		)
	)

In [113]:
class AgentDependencies(BaseModel):
	references: list[Reference]

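# The model is reached through OpenRouter's OpenAI-compatible API, hence OpenAIModel with a custom base_url.
# Note that this cell selects a Gemini model rather than DeepSeek.
model = OpenAIModel(
	'google/gemini-2.0-flash-thinking-exp:free',
	base_url=os.getenv("OPEN_ROUTER_BASE_URL"),
	api_key=os.getenv("OPEN_ROUTER_API_KEY"),
)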

agent = Agent(
	model=model,
	deps_type=AgentDependencies,
	result_type=str,
	system_prompt="""
		# Role
		You are a helpful medical assistant. 
		
		# Task
		You are tasked to answer a question based on a set of documents that you have been provided with and then explain based on what you know about the matter.
		Please provide a detailed and accurate answer to the question.
		Always include the references.

	""",
)


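@agent.system_prompt
async def add_references(
	ctx: RunContext[AgentDependencies],
) -> str:
	# Serialize every reference, one JSON object per line, so the model can cite each of them.
	# The join is done outside the f-string because backslashes inside f-string
	# expressions are a syntax error before Python 3.12.
	references_block = '\n'.join(
		json.dumps(reference.model_dump()) for reference in ctx.deps.references
	)

	return f"""
		# References
		{references_block}

	"""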

In [114]:
result = await agent.run(
	"can you explain me how the nursing interventions for upper gastrointestinal tract study are done",
	deps = AgentDependencies(
		references = references
	)
)

In [115]:
print(result.data)

### Agentic Layer: Agentic RAG System

The **Agentic Layer** adds an intelligent routing mechanism to the RAG system. Instead of always retrieving from a single static source, the system decides where to look based on the context of the query, that is, whether it pertains to the nursing handbooks or to the user's personal study notes.

#### Steps:

0. **Incorporate Personal Study Notes (Transes)**:
   - Personal study notes (transes) are documents or annotations written by the user, capturing key insights, case studies, or specific experiences.
   - These notes will be embedded in the same way as the nursing handbooks and stored in the same vector database.

1. **Agent Routes the Query**:
   - When a user (nursing student) asks a question, the **agent** will decide whether to retrieve data from the nursing handbooks or from the user's personal study notes.
   - The agent will consider:
     - The **context** of the question (general knowledge vs. personalized experiences).
     - Whether the user's **personal study notes** are more relevant (e.g., if the user has already written notes on a particular case or condition).
     - If the query is general, the agent will prioritize retrieving information from the **nursing handbooks**.

2. **Retrieve from Both Sources**:
   - The system retrieves information from the **nursing handbooks**, the **personal study notes**, or both, based on the agent’s decision.
   - If the question pertains to something the user has written down in their personal notes, the agent will prioritize retrieving those notes and combine them with general knowledge from the handbooks.

3. **Augment Response Using Deepseek LLM**:
   - The retrieved content from both sources is fed to the **Deepseek LLM**, which will merge the insights and generate a complete response.
   - The LLM will ensure the answer is not only accurate but also personalized based on the notes, making it more useful and contextually rich.
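
The routing steps above could be prototyped with a rule-based stand-in before the LLM agent is wired in. The sketch below is only an assumption about how such a decision might look: `Route`, `PERSONAL_CUES`, `route_query`, `build_filters`, and the `source` metadata field are all hypothetical names, and the keyword heuristic stands in for the judgment the Pydantic AI agent would make. The filters are plain dicts for illustration and would still need translating into the vector store's own filter syntax.

```python
from dataclasses import dataclass

# Hypothetical phrases suggesting the user is asking about their own notes.
PERSONAL_CUES = ("my notes", "my transes", "i wrote", "our lecture")


@dataclass
class Route:
    sources: list[str]  # sources to query, in priority order
    reason: str         # short explanation of the decision


def route_query(query: str, note_topics: set[str]) -> Route:
    """Decide which sources to search for a query (keyword heuristic, not an LLM)."""
    text = query.lower()
    if any(cue in text for cue in PERSONAL_CUES):
        return Route(["personal_notes", "handbooks"], "query refers to the user's own notes")
    if any(topic in text for topic in note_topics):
        return Route(["personal_notes", "handbooks"], "the user has notes on this topic")
    return Route(["handbooks"], "general question")


def build_filters(route: Route, user_id: str) -> list[dict]:
    """Translate a route into one metadata filter per source."""
    filters = []
    for source in route.sources:
        metadata_filter = {"source": source}
        if source == "personal_notes":
            # Personal notes are always scoped to the user who owns them.
            metadata_filter["user_id"] = user_id
        filters.append(metadata_filter)
    return filters
```

Each filter would drive one retrieval call, and the results, ordered by the route's priority, would then be merged into the context passed to the LLM.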
