In [None]:
# Step 1: Define the problem clearly
# Patients receive lab test results (blood tests, urine tests, imaging reports, etc.).
# These results often contain medical jargon and numbers that are hard to interpret.
# Goal: translate results into plain, human-friendly explanations (without replacing a doctor's judgement).

# Step 2: Think about inputs and outputs
# Input options:
# - Digital copy → the patient uploads a PDF, a text file, or a scan.
# - Physical copy → the patient takes a photo of the paper result → OCR extracts the text.

# Output:
# A plain-language summary explaining:
# - What each test is
# - What the result means
# - Whether it is within the normal range
# - Whether follow-up with a doctor is important
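
The "within normal range" part of the output can also be computed deterministically, independent of the LLM. The following is a minimal sketch, not part of the original notebook: the function name `classify` and the choice to treat range bounds as inclusive are assumptions made for illustration.

```python
from typing import Optional


def classify(value: float, low: Optional[float], high: Optional[float]) -> str:
    """Compare a result against a reference range.

    Pass None for a missing bound (e.g. a range written as "<200" has no lower bound).
    Bounds are treated as inclusive, which is an assumption of this sketch.
    """
    if low is not None and value < low:
        return "low"
    if high is not None and value > high:
        return "high"
    return "normal"


# Illustrative values only
print(classify(10.2, 13.5, 17.5))  # two-sided range
print(classify(240, None, 200))    # upper bound only
print(classify(14.5, 13.0, 16.5))  # inside the range
```

A check like this could serve as a cross-check on whatever status the model reports, rather than as a replacement for the explanation itself.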

In [1]:
# Import the dependencies needed to talk to the LLM
import os
import openai

from dotenv import load_dotenv
load_dotenv("file.env")
openai.api_key = os.environ['OPENAI_API_KEY']
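
If `file.env` is missing or does not define the key, `os.environ['OPENAI_API_KEY']` fails with a bare `KeyError`. A small helper can give a clearer message. This is a sketch under assumptions: `require_env` is a hypothetical name, not something the notebook or any library provides.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or fail with a readable message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Add it to file.env or export it before running."
        )
    return value


# Possible usage in the cell above (left commented so this sketch runs without a key):
# openai.api_key = require_env("OPENAI_API_KEY")
```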

In [2]:
# Import the LangChain components used in this notebook

from langchain_openai import ChatOpenAI
from langchain_core.prompts import PromptTemplate
# Legacy chain class; the cells shown here compose `prompt | llm` instead.
from langchain.chains import LLMChain



In [8]:
# temperature=0 keeps the explanations as repeatable as possible between runs
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

prompt = PromptTemplate(
    input_variables=["lab_results"],
    template="""
You are a medical assistant helping patients understand their lab test results.

For each lab test in the results, provide:
1. Test name
2. Result and normal range
3. Whether it is normal, low, or high
4. A short plain language explanation

At the end, add:
⚠️ Disclaimer: This explanation is for understanding purposes only and not medical advice.

Lab Results:
{lab_results}

Respond in a clear bullet-point format.

"""
)

In [9]:
# Pipe the prompt into the model: the formatted prompt becomes the model's input
chain = prompt | llm

lab_text = """
- Hemoglobin: 10.2 g/dL (Normal: 13.5 - 17.5)
- WBC: 12,500 /µL (Normal: 4,500 - 11,000)
- Cholesterol: 240 mg/dL (Normal: <200)
"""

response = chain.invoke({"lab_results": lab_text})
print(response.content)

- **Test Name:** Hemoglobin  
  - **Result and Normal Range:** 10.2 g/dL (Normal: 13.5 - 17.5)  
  - **Status:** Low  
  - **Explanation:** Hemoglobin is a protein in red blood cells that carries oxygen. A low level may indicate anemia, which means your body might not be getting enough oxygen.

- **Test Name:** WBC (White Blood Cell Count)  
  - **Result and Normal Range:** 12,500 /µL (Normal: 4,500 - 11,000)  
  - **Status:** High  
  - **Explanation:** White blood cells help fight infections. A high count can suggest that your body is responding to an infection, inflammation, or other medical conditions.

- **Test Name:** Cholesterol  
  - **Result and Normal Range:** 240 mg/dL (Normal: <200)  
  - **Status:** High  
  - **Explanation:** Cholesterol is a fatty substance in your blood. A high level can increase the risk of heart disease and stroke, indicating that you may need to make lifestyle changes or consider medication.

⚠️ Disclaimer: This explanation is for understanding purpo
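
Because the sample lines follow a regular `- Name: value unit (Normal: range)` pattern, the Low/High/Normal labels can be recomputed without the model and compared with its answer. The sketch below assumes exactly that line layout; `LINE_RE`, `parse_range`, and `status` are hypothetical names, and range bounds are treated as inclusive.

```python
import re

# Matches lines such as "- WBC: 12,500 /µL (Normal: 4,500 - 11,000)"
LINE_RE = re.compile(
    r"^-\s*(?P<name>[^:]+):\s*(?P<value>[\d.,]+)\s*(?P<unit>\S+)\s*"
    r"\(Normal:\s*(?P<range>[^)]+)\)\s*$"
)


def to_number(text):
    """Convert '12,500' or ' 17.5' to a float."""
    return float(text.replace(",", ""))


def parse_range(text):
    """Return (low, high); a missing bound is None."""
    text = text.strip()
    if text.startswith("<"):
        return None, to_number(text[1:])
    if text.startswith(">"):
        return to_number(text[1:]), None
    low, high = (to_number(part) for part in text.split("-"))
    return low, high


def status(value, low, high):
    if low is not None and value < low:
        return "Low"
    if high is not None and value > high:
        return "High"
    return "Normal"


sample = """
- Hemoglobin: 10.2 g/dL (Normal: 13.5 - 17.5)
- WBC: 12,500 /µL (Normal: 4,500 - 11,000)
- Cholesterol: 240 mg/dL (Normal: <200)
"""

for line in sample.strip().splitlines():
    match = LINE_RE.match(line.strip())
    if match is None:
        continue  # skip anything that does not fit the assumed layout
    low, high = parse_range(match["range"])
    print(match["name"], status(to_number(match["value"]), low, high))
```

For the three sample lines this yields Low, High, and High, which agrees with the statuses in the model output above.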

In [10]:
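# Newer LangChain releases ship these in separate packages
# (langchain-community and langchain-text-splitters; PyPDFLoader also needs pypdf).
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter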


In [15]:
# Load the PDF; PyPDFLoader returns one Document per page
file = r"C:\Users\ayodelefalajiki\Desktop\pathology_lab_result.pdf"
loader = PyPDFLoader(file)
documents = loader.load()
print("📄 Pages loaded:", len(documents))
print("First page content preview:\n", documents[0].page_content[:600])

📄 Pages loaded: 19
First page content preview:
 Hemoglobin g/dL 13.0 - 16.5Colorimetric 14.5
RBC Count million/cmm 4.5 - 5.5Electrical impedance4.79
Hematocrit % 40 - 49Calculated 43.3
MCV fL 83 - 101Derived 90.3
MCH pg 27.1 - 32.5Calculated 30.2
MCHC g/dL 32.5 - 36.7Calculated 33.4
RDW CV % 11.6 - 14Calculated 13.60
Total WBC and Differential Count
WBC Count H /cmm 4000 - 10000SF Cube cell analysis10570
Neutrophils % 40 - 80Microscopic 73
Lymphocytes % 20 - 40Microscopic 19
Eosinophils % 1 - 6Microscopic 02
Monocytes % 2 - 10Microscopic 06
Basophils % 0 - 2Microscopic 00
/cmm 2000 - 67007716
/cmm 1100 - 33002008
/cmm 00 - 400211
/cmm 200 -
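
The preview shows that PDF extraction fuses columns: the reference range runs straight into the method name, and the method sometimes runs into the result (`5.5Electrical impedance4.79`). A regular expression can pull most rows apart before they reach the model. This is a sketch that assumes the column order seen above (name, optional H/L flag, unit, range, method, result); `ROW_RE` and `parse_row` are hypothetical names.

```python
import re

ROW_RE = re.compile(
    r"^(?P<name>.+?)"                      # test name, e.g. "RBC Count"
    r"(?:\s+(?P<flag>[HL]))?"              # optional out-of-range flag
    r"\s+(?P<unit>\S+)"                    # unit, e.g. "g/dL" or "%"
    r"\s+(?P<low>\d+(?:\.\d+)?)\s*-\s*(?P<high>\d+(?:\.\d+)?)"
    r"(?P<method>[A-Za-z ]+?)\s*"          # method text fused to the range
    r"(?P<value>\d+(?:\.\d+)?)$"           # the patient's result
)


def parse_row(line):
    """Return a dict for a row in the assumed layout, or None if it does not fit."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    row = match.groupdict()
    row["method"] = row["method"].strip()
    for key in ("low", "high", "value"):
        row[key] = float(row[key])
    return row


for line in [
    " Hemoglobin g/dL 13.0 - 16.5Colorimetric 14.5",
    "RBC Count million/cmm 4.5 - 5.5Electrical impedance4.79",
    "WBC Count H /cmm 4000 - 10000SF Cube cell analysis10570",
    "/cmm 2000 - 67007716",
]:
    print(parse_row(line))
```

Rows such as `/cmm 2000 - 67007716` return `None` on purpose: with no name or method between the range and the result, the digits cannot be split reliably, so the sketch leaves them alone rather than guessing.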


In [16]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=50,
)
chunks = text_splitter.split_documents(documents)

print("Number of Chunks:", len(chunks))
print("Chunk sample:\n", chunks[0].page_content)

Number of Chunks: 55
Chunk sample:
 Hemoglobin g/dL 13.0 - 16.5Colorimetric 14.5
RBC Count million/cmm 4.5 - 5.5Electrical impedance4.79
Hematocrit % 40 - 49Calculated 43.3
MCV fL 83 - 101Derived 90.3
MCH pg 27.1 - 32.5Calculated 30.2
MCHC g/dL 32.5 - 36.7Calculated 33.4
RDW CV % 11.6 - 14Calculated 13.60
Total WBC and Differential Count
WBC Count H /cmm 4000 - 10000SF Cube cell analysis10570
Neutrophils % 40 - 80Microscopic 73
Lymphocytes % 20 - 40Microscopic 19
Eosinophils % 1 - 6Microscopic 02
Monocytes % 2 - 10Microscopic 06
Basophils % 0 - 2Microscopic 00
/cmm 2000 - 67007716
/cmm 1100 - 33002008
/cmm 00 - 400211
/cmm 200 - 700634
/cmm 0 - 1000
Platelet Count /cmm 150000 - 410000Electrical impedance 150000
MPV H fL 7.5 - 10.3Calculated 14.00
Peripheral Smear Examination
RBC Morphology Normochromic Normocytic
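
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter. It is not the algorithm `RecursiveCharacterTextSplitter` uses (that class prefers to break at separators such as paragraph breaks, newlines, and spaces); it only illustrates how consecutive chunks share a small overlap. The function name `sliding_chunks` is hypothetical.

```python
def sliding_chunks(text, chunk_size=800, chunk_overlap=50):
    """Cut text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
        start += step
    return chunks


demo_text = "".join(str(i % 10) for i in range(2000))
demo_chunks = sliding_chunks(demo_text)
print(len(demo_chunks), [len(c) for c in demo_chunks])
# The tail of one chunk is repeated at the head of the next
print(demo_chunks[0][-50:] == demo_chunks[1][:50])
```

The overlap means a lab row that straddles a chunk boundary is more likely to appear whole in at least one chunk, at the cost of some duplicated text.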


In [18]:
# Send only the first chunk for now; the remaining chunks are not explained in this cell
pdf_text = chunks[0].page_content

result = chain.invoke({"lab_results": pdf_text})
print(result.content)

### Lab Test Results

- **Test Name:** Hemoglobin  
  - **Result:** 14.5 g/dL  
  - **Normal Range:** 13.0 - 16.5 g/dL  
  - **Status:** Normal  
  - **Explanation:** Hemoglobin is the protein in red blood cells that carries oxygen. Your level is within the normal range, indicating good oxygen transport in your blood.

- **Test Name:** RBC Count  
  - **Result:** 4.79 million/cmm  
  - **Normal Range:** 4.5 - 5.5 million/cmm  
  - **Status:** Normal  
  - **Explanation:** This measures the number of red blood cells. Your count is normal, suggesting adequate red blood cell production.

- **Test Name:** Hematocrit  
  - **Result:** 43.3%  
  - **Normal Range:** 40 - 49%  
  - **Status:** Normal  
  - **Explanation:** Hematocrit indicates the proportion of blood volume that is made up of red blood cells. Your result is normal, reflecting a healthy blood composition.

- **Test Name:** MCV (Mean Corpuscular Volume)  
  - **Result:** 90.3 fL  
  - **Normal Range:** 83 - 101 fL  
  - **Status