# 🏥 Pharma-Safe Lens - Kaggle Validation Notebook

**Complete validation for all phases**

## Setup Instructions
1. **Phase 1-2**: CPU only (no GPU needed)
2. **Phase 3+**: Enable GPU accelerator (T4 x2 or P100)

## Important: Run cells in order!

## Cell 0: Install System Dependencies (REQUIRED)

Tesseract OCR must be installed before Python packages.

In [None]:
# Install Tesseract OCR engine
!apt-get update -y
!apt-get install -y tesseract-ocr

# Verify
!tesseract --version
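
The same check can be made from Python, which lets a later cell fail fast if the OCR engine is missing. This is a minimal sketch using only the standard library; `tesseract_available` is a hypothetical helper name and is not part of the repository.

```python
import shutil
import subprocess


def tesseract_available(binary="tesseract"):
    """Return True if the executable is on PATH and `--version` exits cleanly."""
    path = shutil.which(binary)
    if path is None:
        return False
    try:
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if not tesseract_available():
    print("Tesseract not found: re-run the apt-get cell above before continuing.")
```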

## Cell 1: Clone Repository

In [None]:
# Clone from GitHub (replace the account name in the URL if you are using your own fork)
!git clone https://github.com/AdtiyaLingam/pharma-safe-lens.git
%cd pharma-safe-lens

## Cell 2: Install Python Dependencies

In [None]:
%cd backend
!pip install -r requirements.txt
!pip install transformers accelerate bitsandbytes

## Cell 3: Verify Imports

In [None]:
import sys
sys.path.insert(0, '/kaggle/working/pharma-safe-lens')

# Test imports
import easyocr
import pytesseract
import cv2
from backend.app.drug_db import DrugDatabase
from backend.app.ocr import extract_text
from backend.app.interaction_logic import InteractionChecker
from backend.app.prompts import PromptTemplates

# New in Phase 4
from backend.app.safety import SafetyGuard

print("✅ All imports successful!")
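
A plain `import` stops at the first missing package, so it can take several runs to find every gap. The sketch below, which assumes only the standard library, reports all missing third-party modules in one pass; `missing_modules` is a hypothetical helper, and it only handles top-level module names.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Third-party modules imported in the cell above.
required = ["easyocr", "pytesseract", "cv2"]
missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All third-party modules found.")
```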

## Phase 1 & 2 Validation: Logic Core (CPU)

In [None]:
# 1. Initialize Modules
db = DrugDatabase()
checker = InteractionChecker()

print(f"Loaded {len(db.drug_map)} drugs")
print(f"Loaded {len(checker.interactions)} interactions")

# 2. Test Drug Normalization
raw_input = ['ECOSPRIN 75', 'WARFARIN 5MG']
normalized_drugs = db.normalize(raw_input)
print(f"\nInput: {raw_input} -> Normalized: {normalized_drugs}")

# 3. Test Interaction Logic
interactions = checker.check_multiple(normalized_drugs)
for i in interactions:
    print(f"\n⚠️ RISK FOUND: {i['risk_level'].upper()}")
    print(f"Reason: {i['clinical_effect']}")
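
To make the two steps above concrete, here is a self-contained sketch of name normalization followed by a pairwise interaction lookup. It is not the repository's implementation: the lookup tables are toy data, and the module-level `normalize` and `check_multiple` functions only mirror the method names used in the cell. The `risk_level` and `clinical_effect` keys follow the fields printed above; the `drugs` key is an addition for illustration.

```python
import re
from itertools import combinations

# Toy lookup tables for illustration only; the real data lives in the repository.
BRAND_TO_GENERIC = {"ecosprin": "aspirin", "warfarin": "warfarin"}
INTERACTIONS = {
    frozenset({"aspirin", "warfarin"}): {
        "risk_level": "high",
        "clinical_effect": "Increased risk of bleeding",
    }
}


def normalize(raw_names):
    """Strip dosage tokens, lowercase, and map brand names to generic names."""
    normalized = []
    for raw in raw_names:
        # Drop dose fragments such as "75" or "5MG".
        name = re.sub(r"\b\d+(\.\d+)?\s*(mg|mcg|g|ml)?\b", "", raw, flags=re.IGNORECASE)
        generic = BRAND_TO_GENERIC.get(name.strip().lower())
        if generic and generic not in normalized:
            normalized.append(generic)
    return normalized


def check_multiple(drugs):
    """Look up every unordered pair of drugs in the interaction table."""
    found = []
    for a, b in combinations(drugs, 2):
        hit = INTERACTIONS.get(frozenset({a, b}))
        if hit:
            found.append({"drugs": (a, b), **hit})
    return found


drugs = normalize(["ECOSPRIN 75", "WARFARIN 5MG"])
for hit in check_multiple(drugs):
    print(hit["drugs"], hit["risk_level"].upper(), hit["clinical_effect"])
```

Keying the table on a `frozenset` makes the lookup order-independent, so (aspirin, warfarin) and (warfarin, aspirin) resolve to the same entry.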

## Phase 3 Validation: MedGemma Reasoning (GPU REQUIRED)

In [None]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Check GPU
if not torch.cuda.is_available():
    raise RuntimeError("❌ GPU not detected! Enable Accelerator in Kaggle settings.")

print(f"✅ GPU Detected: {torch.cuda.get_device_name(0)}")

In [None]:
# Load Model (Recommend google/gemma-2b-it or 4b-it)
MODEL_ID = "google/gemma-2b-it"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto"
)

print(f"✅ Model {MODEL_ID} Loaded Successfully")

In [None]:
# Full Inference Pipeline with Chat Templates

def generate_with_chat_template(user_prompt):
    # Create chat message structure
    messages = [
        {"role": "user", "content": user_prompt}
    ]
    
    # Apply chat template and move the prompt to the model's device
    input_ids = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_tensors="pt"
    ).to(model.device)
    
    # Generate response
    outputs = model.generate(
        input_ids, 
        max_new_tokens=256,
        do_sample=True, 
        temperature=0.7,
        top_p=0.9
    )
    
    # Decode only new tokens
    response = outputs[0][input_ids.shape[-1]:]
    return tokenizer.decode(response, skip_special_tokens=True)

# 1. Generate Explanation
explanation = ""
if interactions:
    print("🧠 Generating Explanation for: Aspirin + Warfarin...")
    
    prompt_content = PromptTemplates.format_explanation_prompt(interactions[0])
    explanation = generate_with_chat_template(prompt_content)
    
    print("\n" + "="*40)
    print("MEDGEMMA OUTPUT (Raw):")
    print("="*40)
    print(explanation)
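
The "decode only new tokens" step relies on a causal language model's `generate()` returning the prompt tokens followed by the continuation, so slicing at the prompt length leaves just the reply. The sketch below shows that slicing with plain Python lists standing in for tensors; the token ids are made up and `strip_prompt` is a hypothetical helper.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the tokens generated after the prompt."""
    return output_ids[prompt_length:]


# Made-up token ids: the first four stand in for the templated prompt.
prompt_ids = [2, 106, 1645, 108]
generated = prompt_ids + [651, 1378, 13970]  # prompt followed by the continuation

new_tokens = strip_prompt(generated, len(prompt_ids))
print(new_tokens)  # [651, 1378, 13970]
```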

## Phase 4 Validation: Safety & Localization

In [None]:
# 1. Run Safety Guard
print("🛡️ Running Safety Check...")
is_safe, safe_explanation = SafetyGuard.validate_output(explanation)

if is_safe:
    print("✅ Safety Check Passed.")
else:
    print("❌ Safety Violation Detected!")
    print(f"Warning: {safe_explanation}")

# 2. Translate to Hindi (Localization)
if is_safe:
    print("\n🌐 Generating Hindi Translation...")
    
    trans_prompt = PromptTemplates.format_translation_prompt(safe_explanation, "Hindi")
    hindi_explanation = generate_with_chat_template(trans_prompt)
    
    print("\n" + "="*40)
    print("HINDI TRANSLATION:")
    print("="*40)
    print(hindi_explanation)
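
For readers without the repository at hand, the sketch below shows one way a guard with the same `(is_safe, text)` return shape could work: reject empty output and output that contains prescriptive dosing language. The patterns, the fallback message, and the module-level `validate_output` function are all assumptions for illustration; the actual rules in `SafetyGuard` are not shown in this notebook and may differ.

```python
import re

# Hypothetical patterns; the repository's SafetyGuard may use different rules.
BLOCKED_PATTERNS = [
    r"\bstop taking\b",
    r"\b(increase|decrease|double) (your|the) dos(e|age)\b",
]
FALLBACK = "Please consult a doctor or pharmacist about this combination."


def validate_output(text):
    """Return (is_safe, text_or_fallback) for a model-generated explanation."""
    if not text.strip():
        # An empty explanation is treated as unsafe rather than passed along.
        return False, FALLBACK
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, text, flags=re.IGNORECASE):
            return False, FALLBACK
    return True, text


ok, message = validate_output("Taking these together can raise the risk of bleeding.")
print(ok, message)
```

Returning a fixed fallback on failure means the caller never has to display unvetted model text.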