<a href="https://colab.research.google.com/github/UdaraChamidu/Eye-Disease-Classification-With-Integrated-Chatbot/blob/main/LLava_Med_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

🔍 Project Summary (Vision-Language Image Analysis)
This notebook analyzes unlabeled OCT/fundus eye images using two models:

✅ 1. InceptionV3 (Keras)
Extracts 2048-dim feature vectors from each image.

Useful for clustering, similarity search or model training.

✅ 2. LLaVA-Med (llava-med-v1.5-mistral-7b)
Generates medical captions from eye images.

Example: “Possible macular edema and hemorrhages seen”

✅ 3. Dataset Setup
My dataset is stored in subfolders per disease (e.g., Glaucoma/, AMD/, etc.)

Each image is processed and saved with its filename, folder name (pseudo-label), InceptionV3 features, and LLaVA-Med caption.
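
The similarity-search use mentioned for the InceptionV3 features can be sketched roughly as follows. This is a minimal illustration in plain Python, not the notebook's own method: the helper names `cosine_similarity` and `most_similar` are hypothetical, and the toy 3-d vectors are made-up stand-ins for the real 2048-d features. For real feature matrices, a NumPy version would be the more typical choice.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)

def most_similar(query, features_by_name, top_k=2):
    """Return the top_k (name, score) pairs ranked by similarity to query."""
    scored = [
        (name, cosine_similarity(query, vec))
        for name, vec in features_by_name.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy 3-d vectors stand in for the real 2048-d features (made-up values)
toy_features = {
    "img_a.png": [1.0, 0.0, 0.0],
    "img_b.png": [0.9, 0.1, 0.0],
    "img_c.png": [0.0, 1.0, 0.0],
}
ranked = most_similar([1.0, 0.0, 0.0], toy_features)
print(ranked)
```

Cosine similarity ignores vector length and compares direction only, which is one common choice for comparing CNN embeddings; Euclidean distance would be an equally reasonable alternative.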

In [11]:
# Install Transformers from source, plus accelerate and bitsandbytes.
# (To start from a clean slate first: !pip uninstall -y transformers accelerate)
!pip install git+https://github.com/huggingface/transformers.git
!pip install accelerate bitsandbytes



Collecting git+https://github.com/huggingface/transformers.git
  Cloning https://github.com/huggingface/transformers.git to /tmp/pip-req-build-xhkvlov5
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers.git /tmp/pip-req-build-xhkvlov5
  Resolved https://github.com/huggingface/transformers.git to commit 12b612830dc76a3a91d3fe1486b1bcb77b6ac4c4
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done


# Load Dataset

In [12]:
from google.colab import drive
import os

# Mount Google Drive
drive.mount('/content/drive')

# Path to your dataset root folder (with disease folders inside)
root_folder = '/content/drive/MyDrive/dataset/Dataset'

# Collect image paths and folder (label) names, one level below the root folder
image_paths = []
image_labels = []

for folder_name in os.listdir(root_folder):
    folder_path = os.path.join(root_folder, folder_name)
    if os.path.isdir(folder_path):
        for filename in os.listdir(folder_path):
            if filename.lower().endswith(('.jpg', '.jpeg', '.png')):
                image_paths.append(os.path.join(folder_path, filename))
                image_labels.append(folder_name)

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
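
Before extracting features, it can help to check how many images each disease folder contributed. The sketch below is one possible way to do that; `summarize_labels` is a hypothetical helper, and the example label list (including the `Normal` folder name) is made up to stand in for `image_labels`.

```python
from collections import Counter

def summarize_labels(labels):
    """Return (label, count) pairs sorted from most to least frequent."""
    return Counter(labels).most_common()

# Made-up labels standing in for `image_labels`
example_labels = ["Glaucoma", "AMD", "Glaucoma", "Glaucoma", "AMD", "Normal"]
summary = summarize_labels(example_labels)
for label, count in summary:
    print(f"{label}: {count}")
```

In the notebook itself, `summarize_labels(image_labels)` would give the same kind of summary for the real dataset, and an empty result would indicate that `root_folder` points at the wrong directory.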


# Load InceptionV3 Model & Extract Features

In [13]:
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras.preprocessing import image
import numpy as np
from tensorflow.keras.models import Model
from PIL import Image

# Load pre-trained model without top layer
base_model = InceptionV3(weights='imagenet', include_top=False, pooling='avg')  # outputs 2048-d vector

def extract_features(img_path):
    img = image.load_img(img_path, target_size=(299, 299))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    return base_model.predict(x, verbose=0)[0]  # 2048-d vector; verbose=0 hides the per-image progress bar

# Install & Load LLaVA-Med from Hugging Face

In [20]:
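# Load model directly
# Note: this call fails on stock Transformers (see the error below), because the
# checkpoint's model type `llava_mistral` is not a registered architecture.
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("microsoft/llava-med-v1.5-mistral-7b", torch_dtype="auto")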

ValueError: The checkpoint you are trying to load has model type `llava_mistral` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`

# Generate Medical Caption from LLaVA-Med

In [None]:
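def generate_caption(img_path):
    # Assumes `model` and a matching `processor` were loaded successfully in an
    # earlier cell; `processor` is not defined anywhere in this notebook yet.
    img = Image.open(img_path).convert("RGB")
    prompt = "Describe any abnormalities in this retinal image."

    # Pass text and image by keyword so the call does not depend on argument order
    inputs = processor(text=prompt, images=img, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=100)
    return processor.batch_decode(output_ids, skip_special_tokens=True)[0]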
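
Depending on the model and processor, the decoded text may begin with a copy of the prompt, because `generate` can return the input tokens followed by the new ones. Whether that happens here is an assumption, not something the notebook shows. Under that assumption, a small post-processing step might look like the sketch below; `strip_prompt` is a hypothetical helper.

```python
def strip_prompt(decoded_text, prompt):
    """Remove a leading copy of the prompt from decoded model output, if present."""
    text = decoded_text.strip()
    if text.startswith(prompt):
        text = text[len(prompt):]
    return text.strip()

prompt = "Describe any abnormalities in this retinal image."
# Simulated decoder output: the prompt echoed back, then the caption
decoded = prompt + " Possible macular edema and hemorrhages seen"
caption_only = strip_prompt(decoded, prompt)
print(caption_only)
```

Text that does not start with the prompt passes through unchanged apart from whitespace trimming, so the helper is safe to apply either way.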

# Combine Results for Each Image

In [None]:
results = []

# Pair each path with its folder name so the pseudo-label is saved too
for path, label in zip(image_paths, image_labels):
    features = extract_features(path)  # From InceptionV3
    caption = generate_caption(path)   # From LLaVA-Med

    results.append({
        "filename": os.path.basename(path),
        "label": label,
        "inception_features": features.tolist(),
        "llava_caption": caption
    })

# Save the Results

In [None]:
import json

with open("image_results.json", "w") as f:
    json.dump(results, f, indent=2)
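
To confirm the file round-trips, the saved records can be read back with `json.load`. The sketch below uses a made-up record with filename, features, and caption keys and a temporary directory, so it runs without the dataset. The `.tolist()` call in the previous cell matters here because NumPy arrays are not JSON-serializable, while plain lists of floats are.

```python
import json
import os
import tempfile

# Made-up record shaped like the entries in `results`
example_results = [
    {
        "filename": "img_a.png",
        "inception_features": [0.1, 0.2, 0.3],
        "llava_caption": "example caption",
    }
]

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "image_results.json")
    with open(out_path, "w") as f:
        json.dump(example_results, f, indent=2)
    with open(out_path) as f:
        loaded = json.load(f)

print(loaded == example_results)
```

With 2048 floats per image, `indent=2` puts every value on its own line; dropping `indent` is one way to keep the file smaller if readability is not needed.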

# Send Caption + Symptoms to Gemini/RAG Chatbot

In [None]:
caption = results[0]["llava_caption"]
symptom = "Patient complains of blurred vision and eye pain"

query = f"Image finding: {caption}. Patient symptom: {symptom}. What is the most likely diagnosis?"
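
If the caption already ends with a period, the f-string above produces a doubled period. One way to guard against that is to wrap the formatting in a small function. This is a sketch only: `build_query` is a hypothetical helper, and sending the query to Gemini or a RAG pipeline is left out because the notebook does not show that step.

```python
def build_query(caption, symptom):
    """Combine an image finding and a patient symptom into one question."""
    # Trim whitespace and any trailing period so the template adds exactly one
    finding = caption.strip().rstrip(".")
    complaint = symptom.strip().rstrip(".")
    return (
        f"Image finding: {finding}. "
        f"Patient symptom: {complaint}. "
        "What is the most likely diagnosis?"
    )

query = build_query(
    "Possible macular edema and hemorrhages seen.",
    "Patient complains of blurred vision and eye pain",
)
print(query)
```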