# 🏥 Pharma-Safe Lens - Kaggle Validation Notebook

**Complete validation for all phases**

## Setup Instructions
1. **Phase 1-2**: CPU only (no GPU needed)
2. **Phase 3+**: Enable GPU accelerator (T4 x2 or P100)

## Important: Run cells in order!

## Cell 0: Install System Dependencies (REQUIRED)

Tesseract OCR must be installed before Python packages.

In [None]:
# Install Tesseract OCR engine
!apt-get update -y
!apt-get install -y tesseract-ocr

# Verify
!tesseract --version
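
The shell check above can also be done from Python, which is handy if later cells should stop early when the binary is missing. This is a minimal sketch using only the standard library; the helper names `find_tesseract` and `tesseract_version` are illustrative and not part of the repository.

```python
import shutil
import subprocess


def find_tesseract():
    """Return the path to the tesseract binary, or None if it is not on PATH."""
    return shutil.which("tesseract")


def tesseract_version(path):
    """Return the first line printed by `tesseract --version`."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Some builds write the version banner to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


path = find_tesseract()
if path is None:
    print("tesseract not found on PATH; re-run the apt-get cell above")
else:
    print(f"Found {path}: {tesseract_version(path)}")
```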

## Cell 1: Clone Repository

In [None]:
# Clone from GitHub (replace the username in the URL if you are using a fork)
!git clone https://github.com/AdtiyaLingam/pharma-safe-lens.git
%cd pharma-safe-lens

## Cell 2: Install Python Dependencies

In [None]:
%cd backend
!pip install -r requirements.txt
!pip install transformers accelerate bitsandbytes

## Cell 3: Verify Imports

In [None]:
import sys
sys.path.insert(0, '/kaggle/working/pharma-safe-lens')

# Test imports
import easyocr
import pytesseract
import cv2
from backend.app.drug_db import DrugDatabase
from backend.app.ocr import extract_text
from backend.app.interaction_logic import InteractionChecker
from backend.app.prompts import PromptTemplates

print("✅ All imports successful!")
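
If the import cell fails, it stops at the first missing package. A quick way to list every missing third-party module at once is sketched below. It assumes only the standard library, and `missing_modules` is a hypothetical helper name. It checks top-level module names only, so the `backend.app.*` imports still need the cell above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Third-party modules imported by the cell above.
required = ["easyocr", "pytesseract", "cv2"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All third-party modules are importable")
```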

## Phase 1 & 2 Validation: Logic Core (CPU)

In [None]:
# 1. Initialize Modules
db = DrugDatabase()
checker = InteractionChecker()

print(f"Loaded {len(db.drug_map)} drugs")
print(f"Loaded {len(checker.interactions)} interactions")

# 2. Test Drug Normalization
raw_input = ['ECOSPRIN 75', 'WARFARIN 5MG']
normalized_drugs = db.normalize(raw_input)
print(f"\nInput: {raw_input} -> Normalized: {normalized_drugs}")

# 3. Test Interaction Logic
interactions = checker.check_multiple(normalized_drugs)
for i in interactions:
    print(f"\n⚠️ RISK FOUND: {i['risk_level'].upper()}")
    print(f"Reason: {i['clinical_effect']}")
    
# 4. Test Prompt Generation
if interactions:
    prompt = PromptTemplates.format_explanation_prompt(interactions[0])
    print(f"\nGenerated prompt preview:\n{prompt[:200]}...")
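
For readers who want a mental model of what a multi-drug check involves, here is a minimal sketch of pairwise lookup over normalized names. It is not the repository's `InteractionChecker`. The table, its single entry, and the `check_pairs` helper are illustrative assumptions, and the record keys simply mirror the `risk_level` and `clinical_effect` fields printed above.

```python
from itertools import combinations

# Hypothetical interaction table keyed by an unordered pair of generic names.
# The single entry is illustrative, not data from the project.
INTERACTIONS = {
    frozenset({"aspirin", "warfarin"}): {
        "risk_level": "high",
        "clinical_effect": "Increased risk of bleeding",
    },
}


def check_pairs(drugs, table=INTERACTIONS):
    """Return one record for every pair of drugs that appears in the table."""
    found = []
    # Lowercase and deduplicate so each pair is examined exactly once.
    unique = sorted({d.lower() for d in drugs})
    for a, b in combinations(unique, 2):
        record = table.get(frozenset({a, b}))
        if record is not None:
            found.append({"drugs": (a, b), **record})
    return found


print(check_pairs(["Aspirin", "Warfarin", "Paracetamol"]))
```

Keying the table by `frozenset` makes the lookup order-independent, so "aspirin + warfarin" and "warfarin + aspirin" resolve to the same record.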

## Phase 3 Validation: MedGemma Reasoning (GPU REQUIRED)

In [None]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Check GPU
if not torch.cuda.is_available():
    raise RuntimeError("❌ GPU not detected! Enable Accelerator in Kaggle settings.")

print(f"✅ GPU Detected: {torch.cuda.get_device_name(0)}")

In [None]:
# Load the model (MedGemma-2b or a similar open medical LLM).
# For this demo, the standard 'google/gemma-2b-it' is a placeholder until MedGemma access is available;
# swap MODEL_ID for a biomedical-tuned variant if you have one.
MODEL_ID = "google/gemma-2b-it"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto"
)

print("✅ Model Loaded Successfully")
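
Loading with `torch_dtype=torch.float16` halves the weight footprint compared with float32. The back-of-the-envelope arithmetic can be sketched as follows. The parameter count is an assumed round figure rather than a value taken from the model card, and the estimate covers weights only, not activations or the KV cache.

```python
def weight_memory_gib(n_params, bytes_per_param):
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bytes_per_param / 1024**3


# Assumed parameter count for a "2B"-class model; check the model card.
N_PARAMS = 2.5e9

for label, nbytes in [("float32", 4), ("float16", 2), ("4-bit", 0.5)]:
    print(f"{label:>8}: {weight_memory_gib(N_PARAMS, nbytes):.1f} GiB")
```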

In [None]:
# Full Inference Pipeline

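def generate_explanation(prompt):
    # Send inputs to whichever device device_map="auto" placed the model on
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=200)
    # Decode only the newly generated tokens so the prompt is not echoed back
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)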

# Run on our detected interaction
if interactions:
    print("üß† Generating Explanation for: Aspirin + Warfarin...")
    explanation = generate_explanation(prompt)
    
    print("\n" + "="*40)
    print("MEDGEMMA OUTPUT:")
    print("="*40)
    print(explanation)