<br>

<h1 style="text-align:center;">Medical Research using Llama3 RAG</h1>

<br>

## Introduction

---

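In this project, we use Llama 3, served locally through Ollama, to build a retrieval-augmented generation (RAG) pipeline that answers questions from a medical textbook. The pipeline loads the PDF, splits it into chunks, indexes the chunks in a Chroma vector store, and passes the retrieved chunks to the model as context for each question.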

In [17]:
# Import the libraries
import ollama
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter, SentenceTransformersTokenTextSplitter
from langchain_experimental.text_splitter import SemanticChunker
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import OllamaEmbeddings
from pprint import pprint

<br>

## Download Medical Datasets

---

Sources:
- https://www.moscmm.org/uploads/userfiles/Current%20Essentials%20of%20Medicine(1)(1).pdf

In [2]:
# Create a loader for the PDF file
loader = PyPDFLoader("datasets/cecil-textbook-of-medicine.pdf")

# Load the pages
pages = loader.load()

In [3]:
# TODO: remove this cell; it keeps only a 5-page sample (indices 100-104) for quick testing
pages = pages[100:105]

In [4]:
# PDF info
print("Number of pages: ", len(pages))      

Number of pages:  5


In [5]:
# Retrieve a page
page = pages[0]                      # First page of the sample

# Page info
print(page.page_content[0:500])      # Print first 500 characters
print(page.metadata)                 # Page metadata  

and hypomagnesemia. 
53
 Figure 16-3  Screening and brief intervention for alcohol problems in clinical practice.
Alcoholic hypoglycemia (see Chapter 243)  can be evaluated rapidly by a bedside blood glucose determination. If laboratory results are delayed, 12.5 to 25 g glucose 
should be given intravenously but must be preceded by or accompanied by 100 mg intravenous thiamine to avoid precipitating Wern icke's encephalopathy (see 
Chapter 489)  . Alcoholic ketoacidosis (see Chapter 102)  will b
{'source': 'datasets/cecil-textbook-of-medicine.pdf', 'page': 100}


<br>

## Data Preprocessing

---

In [6]:
# Initialize the text splitter (the commented-out lines are alternative splitters)
# text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=10)
text_splitter = SentenceTransformersTokenTextSplitter(chunk_size=500, chunk_overlap=10)
# text_splitter = SemanticChunker(embeddings)


In [7]:
splits = text_splitter.split_documents(pages)
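
The general idea behind a chunk size and a chunk overlap is a sliding window over the token sequence: each chunk holds at most a fixed number of tokens, and consecutive chunks share a few tokens so that a sentence cut at a boundary still appears whole in one of them. The sketch below illustrates this with a hypothetical helper, `chunk_tokens`, and toy string "tokens". It is not the LangChain implementation, and real splitters tokenize with a model vocabulary rather than a plain list.

```python
def chunk_tokens(tokens, chunk_size=500, chunk_overlap=10):
    """Split a token list into windows of at most chunk_size tokens.

    Consecutive windows share chunk_overlap tokens.
    Hypothetical helper for illustration only.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        # Stop once the current window has reached the end of the sequence
        if start + chunk_size >= len(tokens):
            break
    return chunks


# Toy example: 12 placeholder tokens, windows of 5 with an overlap of 2
tokens = [f"t{i}" for i in range(12)]
chunks = chunk_tokens(tokens, chunk_size=5, chunk_overlap=2)
for chunk in chunks:
    print(chunk)
```

A larger overlap reduces the chance of splitting a relevant passage across two chunks, at the cost of indexing more redundant text.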

<br>

## Vector Database

---

In [8]:
# Create Ollama embeddings 
embeddings = OllamaEmbeddings(model="llama3")

In [9]:
# Index documents using Chroma and OllamaEmbeddings
vector_store = Chroma(embedding_function=embeddings)
vector_store.add_documents(splits)

['621fbf5a-5117-4fad-8a73-f540d9d1a16b',
 '4f364154-634b-4710-9456-058df17c5da6',
 '5c4c6c6a-6cb9-4d48-9387-669a78319752',
 'e7dce7f3-604a-4d34-a68a-0144d32adc28',
 'f7e44ca7-a4fd-4d7c-80a2-7f7ab67b12eb',
 '2314b326-d662-4559-9f49-4f12dc8156e3',
 '767211fc-4eba-4a86-8e9d-f2c186745a5e',
 '44a5f5b8-f6fa-4148-b625-7ac6356c1e40',
 '2459e1e7-ffe7-47f9-9fb3-753879c3f34e',
 '57eca68b-048c-4550-af99-020f7a3c7ea7',
 'b192344f-19e5-4ca6-bff5-989213438aec',
 '7a606bff-3e7c-4320-9fe9-fa5e4841f98f',
 'a53a7803-8825-4d2e-b7a6-ac3dd0837690',
 '5d9c30fd-8cb3-4b36-bb19-f28dd5f40307',
 'cd9ff211-9dae-4fe2-93d1-d37deb5718dc',
 'f5f04ced-c318-4785-9133-8c0fdc5fc7ba',
 'e9baa24f-77e7-4d36-bbaa-cc1018488afd',
 'ce66009f-30fa-4f98-a2f9-46bc3588dd4e',
 '76d239ed-3fcc-4897-a445-72bbccd28798',
 '33b89450-1ea6-4a8c-89d5-f011b3e8a058',
 '2beb4e4e-fc5a-4340-8f4a-82fdfac16b9b',
 '1ce2f587-f75f-46d7-89e1-700a9ebab268',
 'b50f91e9-d301-43af-ac53-eff69d0a653d']

In [10]:
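# Create the vector store used for retrieval
# Note: the Chroma(...) cell above and from_documents() here both rely on the default
# collection name, so depending on the Chroma client setup the chunks may be indexed
# twice; running one of the two cells should be enough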
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)

In [11]:
vectorstore

<langchain_community.vectorstores.chroma.Chroma at 0x16290c8c8b0>

In [12]:
retriever = vectorstore.as_retriever()
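
Conceptually, the retriever embeds the question and returns the stored chunks whose embeddings are closest to it. The sketch below shows that ranking step on toy 2-dimensional vectors. The names `cosine_similarity` and `top_k` are hypothetical, and cosine similarity is an assumption made for illustration: the distance metric a Chroma collection actually uses depends on its configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=4):
    """Return the indices of the k document vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    scored.sort(reverse=True)  # highest similarity first
    return [i for _, i in scored[:k]]


# Toy embeddings standing in for real chunk embeddings
doc_vecs = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0]]
query_vec = [0.9, 0.1]

best = top_k(query_vec, doc_vecs, k=2)
print(best)
```

In the notebook, the vector store performs this search internally, and `retriever.invoke(question)` returns the matching documents rather than their indices.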

<br>

## Ollama Llama-3 Model


---

In [27]:
# Define a function to handle the entire RAG chain process
def answer_question_with_context(question):

    # Retrieve documents relevant to the question
    retrieved_docs = retriever.invoke(question)
    
    # Combine the content of the retrieved documents into a single string
    formatted_context = "\n\n".join(doc.page_content for doc in retrieved_docs)
    
    # Format the prompt for the LLM
    formatted_prompt = f"Question: {question}\n\nContext: {formatted_context}"
    
    # Get the response from the LLM
    response = ollama.chat(model='llama3', messages=[{'role': 'user', 'content': formatted_prompt}])
    
    # Return the content of the response
    return response['message']['content']
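
The prompt above simply concatenates the question and the retrieved context. One possible variation, shown here as a sketch, numbers the chunks and asks the model to rely only on them. The helper `build_grounded_prompt` and the instruction wording are hypothetical and not part of the original notebook; whether they improve answers with Llama 3 would need to be checked on real questions.

```python
def build_grounded_prompt(question, context_chunks):
    """Build a prompt that asks the model to answer from the given chunks only.

    Hypothetical helper for illustration; the wording is an assumption.
    """
    # Number each chunk so an answer can refer back to its source passage
    context = "\n\n".join(
        f"[{i + 1}] {chunk.strip()}" for i, chunk in enumerate(context_chunks)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Placeholder chunks standing in for retrieved page content
example_chunks = ["  first retrieved passage ", "second retrieved passage"]
prompt_text = build_grounded_prompt("What does the context say?", example_chunks)
print(prompt_text)
```

The returned string could be passed as the `content` of the user message in `ollama.chat`, in place of `formatted_prompt`.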

In [28]:
# Sample question
prompt = "What are the withdrawal symptoms of heroin?"

# Get an answer 
result = answer_question_with_context(prompt)

result

'Withdrawal symptoms of heroin:\n\n* Vital signs:\n\t+ Tachycardia\n\t+ Hypertension\n\t+ Fever\n* Central nervous system:\n\t+ Craving\n\t+ Restlessness\n\t+ Insomnia\n\t+ Muscle cramps\n\t+ Yawning\n\t+ Miosis (pinpoint pupils)\n* Eyes, nose, and lacrimation:\n\t+ Lacrimation (tearing)\n\t+ Rhinorrhea (runny nose)\n* Skin:\n\t+ Perspiration'

In [29]:
pprint(result.split("\n"))

['Withdrawal symptoms of heroin:',
 '',
 '* Vital signs:',
 '\t+ Tachycardia',
 '\t+ Hypertension',
 '\t+ Fever',
 '* Central nervous system:',
 '\t+ Craving',
 '\t+ Restlessness',
 '\t+ Insomnia',
 '\t+ Muscle cramps',
 '\t+ Yawning',
 '\t+ Miosis (pinpoint pupils)',
 '* Eyes, nose, and lacrimation:',
 '\t+ Lacrimation (tearing)',
 '\t+ Rhinorrhea (runny nose)',
 '* Skin:',
 '\t+ Perspiration']
