In [1]:
!pip install openai==0.28.1
!pip install tiktoken==0.6.0
!pip install langchain==0.1.20
!pip install chromadb==0.5.0
!pip install faiss-cpu

Collecting openai==0.28.1
  Downloading openai-0.28.1-py3-none-any.whl.metadata (11 kB)
Downloading openai-0.28.1-py3-none-any.whl (76 kB)
Installing collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 1.52.2
    Uninstalling openai-1.52.2:
      Successfully uninstalled openai-1.52.2
Successfully installed openai-0.28.1
Collecting tiktoken==0.6.0
  Downloading tiktoken-0.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.6 kB)
Downloading tiktoken-0.6.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.8 MB)
Installing collected packages: tiktoken
Successfully installed tiktoken-0.6.0
Collecting langchain==0.1.20
  Downloading langchain-0.1.20-py3-none-any.whl.m

In [2]:
import openai
import numpy as np
import pandas as pd
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.prompts import ChatPromptTemplate
from langchain.vectorstores import Chroma
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from openai.embeddings_utils import get_embedding
import faiss
import warnings
import os
warnings.filterwarnings("ignore")

In [3]:
API_KEY = 'YOUR API KEY HERE'

In [4]:
openai.api_key = API_KEY
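
Hard-coding the key in a cell makes it easy to leak when the notebook is shared. A minimal sketch of reading it from the environment instead is below. The helper name `resolve_api_key` is hypothetical, and the variable name `OPENAI_API_KEY` is an assumption about how the key is stored on your machine.

```python
import os

def resolve_api_key(env_var="OPENAI_API_KEY", placeholder="YOUR API KEY HERE"):
    """Return the key stored in env_var; fail loudly if it is missing or still the placeholder."""
    key = os.environ.get(env_var, "").strip()
    if not key or key == placeholder:
        raise RuntimeError(f"Set the {env_var} environment variable before running the notebook.")
    return key

# Possible usage in the cell above (not executed here):
# openai.api_key = resolve_api_key()
```

Raising early keeps a missing key from surfacing later as an authentication error in the embedding or chat call.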

In [5]:
SYSTEM_PROMPT = '''
You are an AI chatbot designed to assist patients and healthcare professionals by providing medical information sourced from a RAG (retrieval-augmented generation) database. Your responses should follow the RICE framework:

Role:
You are a virtual assistant, trained to provide reliable, compassionate, and up-to-date medical information to both patients and healthcare professionals. Your responses should be based on the RAG database, ensuring factual accuracy and relevance.

Instructions:
Respond with empathy, especially when patients discuss symptoms or medical conditions.
Offer precise, evidence-based information, using language appropriate to the user’s background (simple for patients, technical for healthcare professionals).
Always reference data from the RAG database to ensure accuracy and reliability.
If the information is unavailable, clearly indicate the limitation and recommend consulting a healthcare provider for more tailored advice.
Prioritize clarity and conciseness, aiming to give users the most relevant information efficiently.
Context:
You will be interacting with both patients and healthcare professionals.
For patients, your responses should focus on general medical knowledge and guidance, explaining terms simply and empathetically.
For healthcare professionals, you should provide detailed, evidence-based, technical information aligned with medical guidelines and best practices.
Constraints:
Do not offer a diagnosis or treatment plan.
Always base responses on verified, trusted information from the RAG database.
Do not speculate on medical issues; if the information is unavailable, suggest the user consult a medical professional.
Ensure a respectful and professional tone at all times.
Responses should be concise and focused on answering the user's primary concern.
Examples:
For Patients:
User: I’ve been having headaches every day. Should I see a doctor?
Chatbot: I’m sorry to hear that you’ve been dealing with daily headaches. While many headaches are not serious, frequent headaches could be a sign of an underlying issue. I recommend you speak with a healthcare provider to discuss your symptoms and get an accurate diagnosis. It’s always best to be proactive about health concerns.

For Healthcare Professionals:
User: Can you provide the latest guidelines on managing diabetes in elderly patients?
Chatbot: The latest guidelines from the RAG database emphasize personalized treatment plans for elderly patients with diabetes. Key recommendations include adjusting medication based on renal function, addressing polypharmacy, and ensuring regular monitoring of blood glucose levels. Be sure to tailor interventions to each patient's specific health needs and comorbidities.
'''

In [6]:
prompt = "Hi give me a summary of Jane Does EHR and see if theres anything wrong with it"

struc = [{"role": "system", "content": SYSTEM_PROMPT}]

## Joining each row into the ['combined'] column, then converting those strings to embeddings

In [7]:
df = pd.read_csv('https://raw.githubusercontent.com/asocastro/AI_First_Day_4_Datasets/refs/heads/main/patient_ehr.csv')
df['combined'] = df.apply(lambda row : ' '.join(row.values.astype(str)), axis = 1)

docs = df['combined'].tolist()
embeddings = [get_embedding(doc, engine = "text-embedding-3-small") for doc in docs]
embedding_dim = len(embeddings[0])
embeddings_np = np.array(embeddings).astype('float32')

index = faiss.IndexFlatL2(embedding_dim)
index.add(embeddings_np)

query_embed = get_embedding(prompt, engine='text-embedding-3-small')
query_embed_np = np.array([query_embed]).astype('float32')
_, indices = index.search(query_embed_np, 2)
retrieved_docs = [docs[i] for i in indices[0]]
context = ' '.join(retrieved_docs)

structured_prompt = f"Context:\n{context}\n\nQuery:\n{prompt}\n\nResponse:"
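
The retrieval step above uses `faiss.IndexFlatL2`, which does an exact brute-force search by squared Euclidean distance. The sketch below shows the same idea in plain Python, for intuition only. The function name `l2_search` and the toy vectors are illustrative and are not part of FAISS or of this notebook.

```python
def l2_search(vectors, query, k):
    """Return (distances, indices) of the k vectors closest to query by squared L2 distance."""
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(vector, query)), position)
        for position, vector in enumerate(vectors)
    )
    top = scored[:k]
    return [distance for distance, _ in top], [position for _, position in top]

# Toy 2-dimensional "embeddings"; real embeddings have many more dimensions.
toy_vectors = [[0, 0], [1, 0], [5, 5]]
distances, indices = l2_search(toy_vectors, [1, 1], k=2)
print(distances, indices)
```

Note that a top-`k` search always returns `k` rows, however weak the match. With `k = 2`, the context therefore contains two patient records even when the query names only one person.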

In [8]:
df.head()

Unnamed: 0,Patient ID,Name,Age,Gender,Date of Birth,Height (cm),Weight (kg),BMI,Blood Pressure (mmHg),Heart Rate (bpm),...,Last Visit,Smoking Status,Alcohol Intake,Physical Activity Level,Emergency Contact,Facility ID,Facility Name,Nurse ID,Nurse Name,combined
0,PAT201240,John Doe,45,M,1979-03-15,178,80,25.2,120/80,72,...,2024-10-01,Former,Occasional,Moderate,Mary Doe (555-1234),FAC001,Altercare Newark South,NUR100,Maria Johnson,PAT201240 John Doe 45 M 1979-03-15 178 80 25.2...
1,PAT201241,Jane Smith,50,F,1974-07-10,165,65,23.9,130/85,78,...,2024-09-20,Non-Smoker,Never,Low,Robert Smith (555-5678),FAC002,Altercare Majora Lane,NUR101,David Lee,PAT201241 Jane Smith 50 F 1974-07-10 165 65 23...
2,PAT201242,Alice Brown,35,F,1989-11-02,170,68,23.5,118/76,65,...,2024-10-15,Non-Smoker,Moderate,High,Michael Brown (555-8765),FAC003,Altercare Lanfair,NUR102,Susan Green,PAT201242 Alice Brown 35 F 1989-11-02 170 68 2...
3,PAT201243,Mark White,60,M,1964-05-23,175,85,27.8,140/90,75,...,2024-08-30,Former,Occasional,Moderate,Sarah White (555-4321),FAC004,Altercare Newark North,NUR103,Karen Smith,PAT201243 Mark White 60 M 1964-05-23 175 85 27...
4,PAT201244,Laura Green,29,F,1995-01-10,160,54,21.1,115/75,70,...,2024-09-05,Non-Smoker,Never,Moderate,Emma Green (555-3456),FAC005,Country Lawn Center for Rehab,NUR104,James Brown,PAT201244 Laura Green 29 F 1995-01-10 160 54 2...


In [9]:
print(structured_prompt)

Context:
PAT201240 John Doe 45 M 1979-03-15 178 80 25.2 120/80 72 98 36.6 110 Hypertension Penicillin Atenolol 2024-10-01 Former Occasional Moderate Mary Doe (555-1234) FAC001 Altercare Newark South NUR100 Maria Johnson PAT201241 Jane Smith 50 F 1974-07-10 165 65 23.9 130/85 78 96 37.0 105 Diabetes Type 2 nan Metformin 2024-09-20 Non-Smoker Never Low Robert Smith (555-5678) FAC002 Altercare Majora Lane NUR101 David Lee

Query:
Hi give me a summary of Jane Does EHR and see if theres anything wrong with it

Response:
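
The context printed above is a run of raw values with no column names, and a missing value appears as the literal string `nan`. The model then has to guess what each value means. In the response further down, for example, the `Last Visit` date is reported as a next appointment. A minimal sketch of a labelled serialization is below. The helper name `row_to_text` and the `Notes` column are hypothetical and used only for illustration.

```python
import math

def row_to_text(row):
    """Serialize one record as 'column: value' pairs, skipping missing values."""
    parts = []
    for column, value in row.items():
        is_missing = value is None or (isinstance(value, float) and math.isnan(value))
        if is_missing:
            continue  # leave the field out instead of emitting the string 'nan'
        parts.append(f"{column}: {value}")
    return "; ".join(parts)

example_row = {"Name": "Jane Smith", "Age": 50, "Last Visit": "2024-09-20", "Notes": float("nan")}
print(row_to_text(example_row))
```

With pandas, this could presumably be applied as `df.apply(lambda row: row_to_text(row.to_dict()), axis=1)` in place of the plain `' '.join(...)`.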


In [10]:
chat = openai.ChatCompletion.create(
    model="gpt-4o-mini",
    messages=struc + [{"role": "user", "content": structured_prompt}],
    temperature=0.5, max_tokens=1500, top_p=1, frequency_penalty=0, presence_penalty=0,
)
struc.append({"role": "user", "content": prompt})
response = chat.choices[0].message.content
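
The cell above stores only the user's prompt in `struc`. The assistant's reply is never added, so a follow-up call would not see the earlier answer. If the chat is meant to continue over several turns, one option is to record both sides of each exchange. The helper `record_turn` below is a hypothetical sketch of that idea, not something the notebook defines.

```python
def record_turn(history, user_text, assistant_text):
    """Append one user/assistant exchange to a chat history list, in order."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})
    return history

history = [{"role": "system", "content": "system prompt goes here"}]
record_turn(history, "first question", "first answer")
print([message["role"] for message in history])
```

Storing the short `prompt` rather than the full `structured_prompt`, as the notebook does, keeps the retrieved context out of the history. That saves tokens, but later turns no longer see the records the answer was based on.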

In [11]:
print(response)

Jane Smith is a 50-year-old female with a date of birth on July 10, 1974. Here’s a summary of her electronic health record (EHR):

- **Height:** 165 cm
- **Weight:** 65 kg
- **BMI:** 23.9 (within the normal range)
- **Blood Pressure:** 130/85 mmHg (elevated, but not hypertensive)
- **Heart Rate:** 78 bpm (normal range)
- **Temperature:** 37.0 °C (normal)
- **Blood Glucose Level:** 105 mg/dL (slightly elevated, may indicate prediabetes)
- **Medical Condition:** Type 2 Diabetes
- **Medication:** Metformin (commonly prescribed for managing Type 2 diabetes)
- **Smoking Status:** Non-Smoker
- **Alcohol Use:** Never
- **Next Appointment:** Scheduled for September 20, 2024
- **Emergency Contact:** Robert Smith (555-5678)
- **Facility:** Altercare Majora Lane
- **Nurse:** David Lee

**Assessment:**
While Jane's BMI is normal and her blood pressure is slightly elevated but not classified as hypertension, her blood glucose level of 105 mg/dL may indicate prediabetes, which warrants monitoring an