# HealthVest AI - Comprehensive Lab Report Analyzer

**MedGemma Impact Challenge Submission**

An AI-powered lab report analyzer that helps Indian patients understand their blood test results with **comprehensive health insights**.

## Features
- **AI-Powered Extraction**: Extract lab values from report images using MedGemma 1.5
- **Plain English Explanations**: Understand what each test means
- **Health Risk Score**: Overall health assessment based on all values
- **Pattern Detection**: Identify related conditions (diabetes, anemia, thyroid issues)
- **Visual Dashboard**: Charts showing where your values fall
- **Indian Diet Recommendations**: Culturally relevant food suggestions
- **Hindi Support**: Explanations in Hindi for wider accessibility
- **Comprehensive Report**: Downloadable health summary

## Problem We're Solving
In India, millions of patients receive lab reports they can't understand. Medical jargon creates anxiety and prevents proactive health decisions. HealthVest AI bridges this gap using Google's MedGemma.

In [None]:
# Install dependencies
# Quote version specifiers so the shell does not treat ">" as a redirect
!pip install -q "transformers>=4.50.0" accelerate pillow pdf2image
!pip install -q "protobuf>=3.20"
!pip install -q matplotlib plotly

import warnings
warnings.filterwarnings('ignore')

In [None]:
import torch
import json
from PIL import Image
from transformers import pipeline
import os
import matplotlib.pyplot as plt
import numpy as np
from IPython.display import HTML, display, Markdown
from datetime import datetime

print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

## Load MedGemma Model

Using MedGemma 1.5 4B, Google's open-weight medical AI model (access requires accepting its license on Hugging Face).

In [None]:
# Model configuration - MedGemma 1.5 4B (latest version)
MODEL_ID = "google/medgemma-1.5-4b-it"

# Get HF token from Kaggle secrets
# IMPORTANT: In Kaggle, you must:
# 1. Add secret: Settings (right panel) > Secrets > Add Secret > Name: "HF_TOKEN"
# 2. ATTACH the secret: Toggle ON next to your HF_TOKEN secret
# 3. Accept model license at: https://huggingface.co/google/medgemma-1.5-4b-it

HF_TOKEN = None

# Method 1: Kaggle Secrets (preferred)
try:
    from kaggle_secrets import UserSecretsClient
    user_secrets = UserSecretsClient()
    HF_TOKEN = user_secrets.get_secret("HF_TOKEN")
    print("HF_TOKEN loaded from Kaggle Secrets")
except Exception as e:
    print(f"Kaggle secrets error: {e}")

# Method 2: Environment variable fallback
if not HF_TOKEN:
    HF_TOKEN = os.environ.get('HF_TOKEN', None)
    if HF_TOKEN:
        print("HF_TOKEN loaded from environment")

# No token found: print setup instructions
if not HF_TOKEN:
    print("ERROR: HF_TOKEN not found!")
    print("\nTo fix this on Kaggle:")
    print("1. Right panel > Secrets > Add Secret")
    print("2. Name: HF_TOKEN, Value: your_token")
    print("3. TOGGLE ON the secret to attach it to notebook")
    print("4. Accept license: https://huggingface.co/google/medgemma-1.5-4b-it")
else:
    # Confirm a token was found without printing any part of it
    print("HF_TOKEN found (value hidden)")

In [None]:
# Load MedGemma using pipeline (recommended approach)
print("Loading MedGemma model (this takes 2-3 minutes on GPU)...")

pipe = pipeline(
    "image-text-to-text",
    model=MODEL_ID,
    token=HF_TOKEN,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

print("MedGemma loaded successfully!")

## Extraction Prompt

Carefully crafted prompt for extracting lab values from Indian lab report formats.

In [None]:
# ==================== PROMPTS ====================

EXTRACTION_PROMPT = """You are a medical lab report analyzer. Extract all test values from this lab report image.

For each test, provide:
- test_name: Name of the test (e.g., "Hemoglobin", "Fasting Blood Sugar", "TSH")
- value: Numeric value as shown
- unit: Unit of measurement (e.g., "g/dL", "mg/dL", "mIU/L")
- reference_range: Normal range as shown on report
- status: "normal", "high", or "low" based on reference range

Return ONLY a JSON array. Example:
[
  {"test_name": "Hemoglobin", "value": 14.2, "unit": "g/dL", "reference_range": "13.0-17.0", "status": "normal"}
]

Extract ALL tests visible. Use exact values. Handle Indian lab formats (Thyrocare, SRL, Dr. Lal PathLabs).
"""

EXPLANATION_PROMPT = """You are a friendly medical educator. Explain this lab value simply:

Test: {test_name}
Value: {value} {unit}
Normal Range: {reference_range}
Status: {status}

In under 80 words, explain:
1. What this test measures
2. What your result means
3. One actionable tip (if needed)

Use simple language. Never diagnose - suggest discussing with doctor if abnormal.
"""

# Hindi version of EXPLANATION_PROMPT (same fields and constraints)
HINDI_EXPLANATION_PROMPT = """आप एक मित्रवत चिकित्सा शिक्षक हैं। इस लैब वैल्यू को सरल हिंदी में समझाएं:

टेस्ट: {test_name}
वैल्यू: {value} {unit}
सामान्य रेंज: {reference_range}
स्थिति: {status}

80 शब्दों में बताएं:
1. यह टेस्ट क्या मापता है
2. आपके परिणाम का क्या मतलब है
3. एक सलाह (यदि जरूरी हो)

सरल भाषा में समझाएं। कभी निदान न करें - असामान्य होने पर डॉक्टर से मिलने की सलाह दें।
"""

INDIAN_DIET_PROMPT = """Based on this lab result, suggest Indian diet recommendations:

Test: {test_name}
Value: {value} {unit}
Status: {status}

Provide 3-4 specific Indian food recommendations that can help. Include:
- Common Indian foods (dal, sabzi, fruits available in India)
- Home remedies if applicable
- Foods to avoid

Keep it practical for an Indian household. Be specific (e.g., "palak dal" not just "leafy greens").
Format as a short bullet list.
"""

HEALTH_SUMMARY_PROMPT = """Analyze these lab results and provide a comprehensive health summary:

Lab Results:
{lab_results}

Provide:
1. OVERALL HEALTH SCORE (0-100) with brief justification
2. KEY FINDINGS (most important observations)
3. POTENTIAL HEALTH PATTERNS (e.g., signs of diabetes, anemia, thyroid issues)
4. PRIORITY ACTIONS (what to address first)
5. LIFESTYLE RECOMMENDATIONS

Be thorough but concise. This is for patient education, not diagnosis.
"""

CORRELATION_PROMPT = """Analyze these lab values for medical correlations:

{lab_results}

Identify:
1. Related abnormalities that suggest a pattern (e.g., low Hb + low MCV + low iron = iron deficiency anemia)
2. Values that affect each other
3. Potential underlying conditions these patterns suggest
4. Which specialist to consult if needed

Be specific about the correlations. Format clearly.
"""

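The extraction prompt asks for a bare JSON array, but a chat model may still wrap its reply in a Markdown fence or add a sentence before or after it. Below is a minimal sketch of a tolerant parser, assuming the reply contains at most one top-level array of interest. `parse_json_array` is a hypothetical helper introduced here for illustration, not part of this notebook.

```python
import json


def parse_json_array(reply: str) -> list:
    """Return the first JSON array found in `reply`, or [] if none parses."""
    decoder = json.JSONDecoder()
    start = reply.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at `start` and
            # ignores any text that follows it.
            value, _ = decoder.raw_decode(reply, start)
            if isinstance(value, list):
                return value
        except json.JSONDecodeError:
            pass  # this "[" did not start valid JSON; try the next one
        start = reply.find("[", start + 1)
    return []


reply = (
    "Here are the values:\n```json\n"
    '[{"test_name": "TSH", "value": 2.5}]\n'
    "```\nLet me know [if] anything is unclear."
)
print(parse_json_array(reply))
```

Unlike slicing from the first `[` to the last `]`, this approach is not confused by stray brackets in the text that follows the array.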
## Core Functions

In [None]:
# ==================== CORE FUNCTIONS ====================

def query_medgemma(prompt: str, image: Image.Image = None) -> str:
    """Query MedGemma with text or image+text."""
    if image is not None:
        messages = [{"role": "user", "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": prompt}
        ]}]
    else:
        messages = [{"role": "user", "content": [
            {"type": "text", "text": prompt}
        ]}]
    
    output = pipe(messages, max_new_tokens=1024)
    return output[0]["generated_text"][-1]["content"]


def extract_lab_values(image: Image.Image) -> list:
    """Extract lab values from a lab report image using MedGemma."""
    max_size = 1024
    if max(image.size) > max_size:
        ratio = max_size / max(image.size)
        new_size = (int(image.size[0] * ratio), int(image.size[1] * ratio))
        image = image.resize(new_size, Image.Resampling.LANCZOS)
    
    response = query_medgemma(EXTRACTION_PROMPT, image)
    
    try:
        start = response.find('[')
        end = response.rfind(']') + 1
        if start != -1 and end > start:
            return json.loads(response[start:end])
    except json.JSONDecodeError as e:
        print(f"JSON parsing error: {e}")
    return []


def explain_lab_value(test_name: str, value: float, unit: str, 
                      reference_range: str, status: str, language: str = "english") -> str:
    """Generate explanation in English or Hindi."""
    prompt_template = HINDI_EXPLANATION_PROMPT if language == "hindi" else EXPLANATION_PROMPT
    prompt = prompt_template.format(
        test_name=test_name, value=value, unit=unit,
        reference_range=reference_range, status=status
    )
    return query_medgemma(prompt)


def get_indian_diet_tips(test_name: str, value: float, unit: str, status: str) -> str:
    """Get Indian diet recommendations for a lab value."""
    prompt = INDIAN_DIET_PROMPT.format(
        test_name=test_name, value=value, unit=unit, status=status
    )
    return query_medgemma(prompt)


def calculate_health_score(results: list) -> dict:
    """Calculate overall health risk score based on lab values."""
    if not results:
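        # Return every key the dashboard reads, and use a valid hex color
        return {"score": 0, "category": "Unknown", "color": "#6c757d",
                "normal": 0, "high": 0, "low": 0, "total": 0}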
    
    total = len(results)
    normal = sum(1 for r in results if r.get('status') == 'normal')
    high = sum(1 for r in results if r.get('status') == 'high')
    low = sum(1 for r in results if r.get('status') == 'low')
    
    # Base score
    score = (normal / total) * 100
    
    # Penalty for abnormal values
    score -= (high * 5)  # -5 for each high value
    score -= (low * 5)   # -5 for each low value
    
    score = max(0, min(100, score))  # Clamp between 0-100
    
    if score >= 80:
        category, color = "Excellent", "#28a745"
    elif score >= 60:
        category, color = "Good", "#17a2b8"
    elif score >= 40:
        category, color = "Fair", "#ffc107"
    else:
        category, color = "Needs Attention", "#dc3545"
    
    return {"score": round(score), "category": category, "color": color,
            "normal": normal, "high": high, "low": low, "total": total}


def detect_health_patterns(results: list) -> list:
    """Detect common health patterns from lab values."""
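    import re  # local import: used only to split test names into words

    patterns = []
    # Index each result by its full lowercase name and by each word in it, so
    # "Total Cholesterol" matches "cholesterol" and "SGPT (ALT)" matches "sgpt".
    # When two results share a key, an abnormal one takes precedence.
    test_names = {}
    for r in results:
        name = r.get('test_name', '').lower()
        for key in {name, *re.findall(r'[a-z0-9]+', name)}:
            if test_names.get(key, {}).get('status', 'normal') == 'normal':
                test_names[key] = r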
    
    # Diabetes indicators
    diabetes_tests = ['fasting blood sugar', 'fbs', 'glucose', 'hba1c', 'pp blood sugar']
    diabetes_high = any(test_names.get(t, {}).get('status') == 'high' for t in diabetes_tests)
    if diabetes_high:
        patterns.append({
            "name": "Diabetes Risk",
            "icon": "🩸",
            "description": "Elevated blood sugar levels detected. Monitor carbohydrate intake.",
            "severity": "high",
            "specialist": "Diabetologist/Endocrinologist"
        })
    
    # Anemia indicators
    anemia_tests = ['hemoglobin', 'hb', 'rbc', 'mcv', 'iron', 'ferritin']
    anemia_low = any(test_names.get(t, {}).get('status') == 'low' for t in anemia_tests)
    if anemia_low:
        patterns.append({
            "name": "Anemia Indicators",
            "icon": "🔴",
            "description": "Low blood cell indicators. May cause fatigue and weakness.",
            "severity": "medium",
            "specialist": "Hematologist"
        })
    
    # Thyroid issues
    thyroid_tests = ['tsh', 't3', 't4', 'free t3', 'free t4']
    thyroid_abnormal = any(test_names.get(t, {}).get('status') in ['high', 'low'] for t in thyroid_tests)
    if thyroid_abnormal:
        patterns.append({
            "name": "Thyroid Imbalance",
            "icon": "🦋",
            "description": "Thyroid hormone levels outside normal range.",
            "severity": "medium",
            "specialist": "Endocrinologist"
        })
    
    # Lipid/Cholesterol issues
    lipid_tests = ['cholesterol', 'ldl', 'hdl', 'triglycerides', 'vldl']
    lipid_high = any(test_names.get(t, {}).get('status') == 'high' for t in lipid_tests)
    if lipid_high:
        patterns.append({
            "name": "Cardiovascular Risk",
            "icon": "❤️",
            "description": "Elevated cholesterol/lipid levels increase heart disease risk.",
            "severity": "high",
            "specialist": "Cardiologist"
        })
    
    # Kidney function
    kidney_tests = ['creatinine', 'urea', 'bun', 'uric acid', 'egfr']
    kidney_abnormal = any(test_names.get(t, {}).get('status') in ['high', 'low'] for t in kidney_tests)
    if kidney_abnormal:
        patterns.append({
            "name": "Kidney Function",
            "icon": "🫘",
            "description": "Kidney markers outside normal range. Stay hydrated.",
            "severity": "medium",
            "specialist": "Nephrologist"
        })
    
    # Liver function
    liver_tests = ['sgpt', 'sgot', 'alt', 'ast', 'bilirubin', 'albumin']
    liver_abnormal = any(test_names.get(t, {}).get('status') in ['high', 'low'] for t in liver_tests)
    if liver_abnormal:
        patterns.append({
            "name": "Liver Function",
            "icon": "🫁",
            "description": "Liver enzyme levels need attention.",
            "severity": "medium",
            "specialist": "Hepatologist/Gastroenterologist"
        })
    
    return patterns


def get_comprehensive_analysis(results: list) -> str:
    """Get MedGemma's comprehensive analysis of all results."""
    lab_summary = "\n".join([
        f"- {r['test_name']}: {r['value']} {r['unit']} ({r['status'].upper()})"
        for r in results
    ])
    prompt = HEALTH_SUMMARY_PROMPT.format(lab_results=lab_summary)
    return query_medgemma(prompt)


def get_correlations(results: list) -> str:
    """Get medical correlations between lab values."""
    lab_summary = "\n".join([
        f"- {r['test_name']}: {r['value']} {r['unit']} ({r['status'].upper()})"
        for r in results
    ])
    prompt = CORRELATION_PROMPT.format(lab_results=lab_summary)
    return query_medgemma(prompt)
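
To make the scoring rule concrete, the arithmetic in `calculate_health_score` can be traced by hand for a report with 3 normal, 3 high, and 2 low results, which are the counts in the sample data used in the demo. The sketch below restates the formula in a standalone helper. `score_breakdown` is a name introduced here for illustration only.

```python
def score_breakdown(normal: int, high: int, low: int) -> dict:
    """Restate the scoring rule: percent normal minus 5 points per abnormal value."""
    total = normal + high + low
    base = (normal / total) * 100       # share of results that are normal
    penalty = 5 * (high + low)          # flat penalty per abnormal result
    score = max(0, min(100, base - penalty))
    return {"base": base, "penalty": penalty, "score": round(score)}


print(score_breakdown(normal=3, high=3, low=2))
# {'base': 37.5, 'penalty': 25, 'score': 12}
```

Each abnormal result counts twice: it lowers the base share and also adds to the penalty, so scores fall quickly as abnormal values accumulate. The final 12.5 becomes 12 because Python's `round` rounds halves to the nearest even integer.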

In [None]:
# ==================== VISUALIZATION DASHBOARD ====================

def create_health_dashboard(results: list, health_score: dict, patterns: list):
    """Create a visual health dashboard."""
    
    fig, axes = plt.subplots(2, 2, figsize=(14, 10))
    fig.suptitle('HealthVest AI - Lab Report Dashboard', fontsize=16, fontweight='bold')
    
    # 1. Health Score Gauge (top-left)
    ax1 = axes[0, 0]
    score = health_score['score']
    ax1.pie([score, 100-score], colors=[health_score['color'], '#e9ecef'],
            startangle=90, counterclock=False)
    circle = plt.Circle((0, 0), 0.7, color='white')
    ax1.add_patch(circle)
    ax1.text(0, 0, f"{score}", fontsize=36, ha='center', va='center', fontweight='bold')
    ax1.text(0, -0.25, health_score['category'], fontsize=12, ha='center', va='center')
    ax1.set_title('Health Score', fontsize=12, fontweight='bold')
    
    # 2. Test Status Distribution (top-right)
    ax2 = axes[0, 1]
    status_counts = [health_score['normal'], health_score['high'], health_score['low']]
    status_labels = ['Normal', 'High', 'Low']
    status_colors = ['#28a745', '#dc3545', '#ffc107']
    bars = ax2.bar(status_labels, status_counts, color=status_colors)
    ax2.set_ylabel('Number of Tests')
    ax2.set_title('Test Results Overview', fontsize=12, fontweight='bold')
    for bar, count in zip(bars, status_counts):
        ax2.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.1,
                str(count), ha='center', va='bottom', fontweight='bold')
    
    # 3. Individual Test Values (bottom-left)
    ax3 = axes[1, 0]
    test_names = [r['test_name'][:15] for r in results[:8]]  # Limit to 8 tests
    statuses = [r['status'] for r in results[:8]]
    colors = ['#28a745' if s == 'normal' else '#dc3545' if s == 'high' else '#ffc107' for s in statuses]
    y_pos = np.arange(len(test_names))
    ax3.barh(y_pos, [1]*len(test_names), color=colors)
    ax3.set_yticks(y_pos)
    ax3.set_yticklabels(test_names)
    ax3.set_xlim(0, 1.5)
    ax3.set_xticks([])
    for i, (status, r) in enumerate(zip(statuses, results[:8])):
        ax3.text(1.1, i, f"{r['value']} {r['unit']}", va='center', fontsize=9)
    ax3.set_title('Test Values', fontsize=12, fontweight='bold')
    
    # 4. Detected Patterns (bottom-right)
    ax4 = axes[1, 1]
    ax4.axis('off')
    if patterns:
        pattern_text = "DETECTED HEALTH PATTERNS:\n\n"
        for p in patterns:
            severity_color = "🔴" if p['severity'] == 'high' else "🟡"
            pattern_text += f"{p['icon']} {p['name']} {severity_color}\n"
            pattern_text += f"   {p['description']}\n"
            pattern_text += f"   Consult: {p['specialist']}\n\n"
    else:
        pattern_text = "✅ No concerning patterns detected!\n\nAll your results look good."
    ax4.text(0.1, 0.9, pattern_text, transform=ax4.transAxes, fontsize=10,
             verticalalignment='top', fontfamily='monospace',
             bbox=dict(boxstyle='round', facecolor='#f8f9fa', edgecolor='#dee2e6'))
    ax4.set_title('Health Patterns', fontsize=12, fontweight='bold')
    
    plt.tight_layout()
    plt.savefig('health_dashboard.png', dpi=150, bbox_inches='tight')
    plt.show()
    print("Dashboard saved as 'health_dashboard.png'")


def display_comprehensive_results(results: list, health_score: dict, patterns: list,
                                   show_hindi: bool = False, show_diet: bool = True):
    """Display comprehensive results with all features."""
    
    # Health Score Card
    html = f"""
    <div style='font-family: Arial, sans-serif; max-width: 900px; margin: 0 auto;'>
        <div style='background: linear-gradient(135deg, {health_score['color']}22, {health_score['color']}44);
                    border: 2px solid {health_score['color']}; border-radius: 15px; padding: 20px;
                    text-align: center; margin-bottom: 20px;'>
            <h2 style='margin: 0; color: #333;'>Your Health Score</h2>
            <div style='font-size: 72px; font-weight: bold; color: {health_score['color']};'>{health_score['score']}</div>
            <div style='font-size: 24px; color: {health_score['color']};'>{health_score['category']}</div>
            <p style='color: #666; margin-top: 10px;'>
                {health_score['normal']} Normal | {health_score['high']} High | {health_score['low']} Low
            </p>
        </div>
    """
    
    # Detected Patterns
    if patterns:
        html += "<div style='background: #fff3cd; border: 1px solid #ffc107; border-radius: 10px; padding: 15px; margin-bottom: 20px;'>"
        html += "<h3 style='margin-top: 0; color: #856404;'>⚠️ Detected Health Patterns</h3>"
        for p in patterns:
            severity_badge = "🔴 High Priority" if p['severity'] == 'high' else "🟡 Medium Priority"
            html += f"""
            <div style='background: white; border-radius: 8px; padding: 12px; margin: 10px 0;'>
                <strong>{p['icon']} {p['name']}</strong> <span style='font-size: 12px;'>{severity_badge}</span>
                <p style='margin: 5px 0; color: #666;'>{p['description']}</p>
                <p style='margin: 0; color: #17a2b8; font-size: 13px;'>👨‍⚕️ Recommended: {p['specialist']}</p>
            </div>
            """
        html += "</div>"
    
    # Individual Results
    html += "<h3>📋 Detailed Test Results</h3>"
    
    for r in results:
        status = r.get('status', 'normal')
        color = '#28a745' if status == 'normal' else '#dc3545' if status == 'high' else '#ffc107'
        badge = '✓ Normal' if status == 'normal' else '↑ High' if status == 'high' else '↓ Low'
        
        html += f"""
        <div style='border: 1px solid #ddd; border-left: 4px solid {color}; 
                    padding: 15px; margin: 10px 0; border-radius: 4px; background: white;'>
            <div style='display: flex; justify-content: space-between; align-items: center;'>
                <h4 style='margin: 0; color: #333;'>{r.get('test_name', 'Unknown')}</h4>
                <span style='background: {color}; color: white; padding: 4px 12px; 
                             border-radius: 20px; font-size: 12px;'>{badge}</span>
            </div>
            <p style='font-size: 28px; margin: 10px 0; color: #333;'>
                <strong>{r.get('value', 'N/A')}</strong> 
                <span style='font-size: 14px; color: #666;'>{r.get('unit', '')}</span>
            </p>
            <p style='color: #666; font-size: 13px;'>Reference: {r.get('reference_range', 'N/A')}</p>
            <hr style='border: none; border-top: 1px solid #eee;'>
            <p style='color: #444;'><strong>📖 Explanation:</strong> {r.get('explanation', '')}</p>
        """
        
        if show_hindi and r.get('explanation_hindi'):
            html += f"<p style='color: #444;'><strong>🇮🇳 हिंदी:</strong> {r.get('explanation_hindi', '')}</p>"
        
        if show_diet and r.get('diet_tips') and status != 'normal':
            html += f"<p style='color: #28a745;'><strong>🥗 Diet Tips:</strong> {r.get('diet_tips', '')}</p>"
        
        html += "</div>"
    
    html += "</div>"
    display(HTML(html))
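
The display function above interpolates model-generated text (explanations, Hindi text, diet tips) directly into HTML. Model output can contain characters such as `<` or `&`, and it usually contains line breaks that HTML collapses. Below is a minimal sketch of a sanitizing step, assuming it is applied to each text field before interpolation. `to_safe_html` is a hypothetical helper, not part of this notebook.

```python
import html


def to_safe_html(text: str) -> str:
    """Escape HTML-special characters and keep line breaks visible."""
    escaped = html.escape(str(text))    # &, <, > and quotes become entities
    return escaped.replace("\n", "<br>")


tip = "Keep fasting sugar < 100 mg/dL.\n- Eat palak dal & whole grains"
print(to_safe_html(tip))
# Keep fasting sugar &lt; 100 mg/dL.<br>- Eat palak dal &amp; whole grains
```

Escaping first and inserting `<br>` afterwards keeps the line-break tags from being escaped themselves.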

In [ ]:
# ==================== COMPREHENSIVE DEMO ====================
# Showcasing ALL features with sample Indian patient data

# Sample lab report data (typical Indian patient with multiple concerns)
sample_results = [
    {"test_name": "Hemoglobin", "value": 10.5, "unit": "g/dL", "reference_range": "13.0-17.0", "status": "low"},
    {"test_name": "Fasting Blood Sugar", "value": 142, "unit": "mg/dL", "reference_range": "70-100", "status": "high"},
    {"test_name": "HbA1c", "value": 7.2, "unit": "%", "reference_range": "4.0-5.6", "status": "high"},
    {"test_name": "Total Cholesterol", "value": 245, "unit": "mg/dL", "reference_range": "< 200", "status": "high"},
    {"test_name": "TSH", "value": 2.5, "unit": "mIU/L", "reference_range": "0.4-4.0", "status": "normal"},
    {"test_name": "Creatinine", "value": 1.0, "unit": "mg/dL", "reference_range": "0.7-1.3", "status": "normal"},
    {"test_name": "SGPT (ALT)", "value": 35, "unit": "U/L", "reference_range": "7-56", "status": "normal"},
    {"test_name": "Vitamin D", "value": 18, "unit": "ng/mL", "reference_range": "30-100", "status": "low"},
]

print("=" * 70)
print("    HEALTHVEST AI - COMPREHENSIVE LAB REPORT ANALYSIS")
print("    Powered by Google MedGemma 1.5")
print("=" * 70)

# Step 1: Calculate Health Score
print("\n📊 Calculating Health Score...")
health_score = calculate_health_score(sample_results)
print(f"   Score: {health_score['score']}/100 - {health_score['category']}")

# Step 2: Detect Health Patterns
print("\n🔍 Analyzing Health Patterns...")
patterns = detect_health_patterns(sample_results)
for p in patterns:
    print(f"   {p['icon']} {p['name']} - {p['severity'].upper()} priority")

# Step 3: Generate Explanations (English + Hindi + Diet Tips)
print("\n📝 Generating Personalized Explanations...")
for i, test in enumerate(sample_results):
    print(f"   Processing {i+1}/{len(sample_results)}: {test['test_name']}...")
    
    # English explanation
    test['explanation'] = explain_lab_value(
        test['test_name'], test['value'], test['unit'],
        test['reference_range'], test['status'], language="english"
    )
    
    # Hindi explanation (for abnormal values)
    if test['status'] != 'normal':
        test['explanation_hindi'] = explain_lab_value(
            test['test_name'], test['value'], test['unit'],
            test['reference_range'], test['status'], language="hindi"
        )
        
        # Indian diet tips (for abnormal values)
        test['diet_tips'] = get_indian_diet_tips(
            test['test_name'], test['value'], test['unit'], test['status']
        )

print("\n✅ Analysis Complete!")

In [None]:
# ==================== VISUAL DASHBOARD ====================
print("Creating Visual Health Dashboard...")
create_health_dashboard(sample_results, health_score, patterns)

In [None]:
# ==================== COMPREHENSIVE RESULTS DISPLAY ====================
# Show all results with explanations, Hindi translations, and diet tips
display_comprehensive_results(sample_results, health_score, patterns, 
                              show_hindi=True, show_diet=True)

In [None]:
# ==================== AI-POWERED COMPREHENSIVE ANALYSIS ====================
print("🤖 MedGemma Comprehensive Health Analysis")
print("=" * 60)

comprehensive_analysis = get_comprehensive_analysis(sample_results)
print(comprehensive_analysis)

## Impact & Innovation

### What Makes HealthVest AI Different

| Feature | Traditional Apps | HealthVest AI |
|---------|-----------------|---------------|
| Language | English only | English + Hindi |
| Context | Generic advice | Indian-specific diet tips |
| Analysis | Single values | Pattern detection across values |
| Insights | Basic ranges | AI-powered correlations |
| Output | Text only | Visual dashboard + detailed report |

### Technical Innovation with MedGemma

1. **Multimodal Intelligence**: Extract data from lab report images using vision capabilities
2. **Medical Reasoning**: Identify correlations between different lab values
3. **Culturally Aware**: Indian food recommendations (dal, sabzi, local fruits)
4. **Bilingual Output**: Hindi explanations for 500M+ Hindi speakers
5. **Pattern Recognition**: Detect diabetes, anemia, thyroid issues automatically

### Real-World Impact for India

- **Patients across India** (a population of over 1.3 billion) can better understand their lab reports
- **500 million Hindi speakers** get explanations in their language
- **Rural patients** get plain-language explanations and a suggested specialist type before deciding on a city visit
- **Preventive care** through early pattern detection

### Future Scope

- WhatsApp integration for rural India
- Voice explanations in regional languages
- Integration with ABDM (Ayushman Bharat Digital Mission)
- Trend tracking across multiple reports

In [None]:
# Demo: Explain sample lab values without needing an image
sample_tests = [
    {"test_name": "Hemoglobin", "value": 11.5, "unit": "g/dL", "reference_range": "13.0-17.0", "status": "low"},
    {"test_name": "Fasting Blood Sugar", "value": 126, "unit": "mg/dL", "reference_range": "70-100", "status": "high"},
    {"test_name": "TSH", "value": 2.5, "unit": "mIU/L", "reference_range": "0.4-4.0", "status": "normal"},
]

print("Demo: Generating explanations for sample lab values\n")
print("="*60)

for test in sample_tests:
    print(f"\n{test['test_name']}: {test['value']} {test['unit']} ({test['status'].upper()})")
    print("-"*40)
    
    explanation = explain_lab_value(
        test['test_name'],
        test['value'],
        test['unit'],
        test['reference_range'],
        test['status']
    )
    print(explanation)
    print()

In [None]:
# Option 1: Upload a file using Kaggle's file browser
# Click "Add Input" in the right panel > Upload > Select your lab report image/PDF

# Option 2: Use a sample from Kaggle datasets
# from kaggle_datasets import KaggleDatasets

# List uploaded files
import glob
uploaded_files = glob.glob('/kaggle/input/**/*.*', recursive=True)
print("Available input files:")
for f in uploaded_files[:10]:
    print(f"  {f}")

# Load your lab report image
# Change this path to your uploaded file
IMAGE_PATH = "/kaggle/input/your-lab-report.jpg"  # Update this path

if os.path.exists(IMAGE_PATH):
    image = Image.open(IMAGE_PATH).convert('RGB')
    print(f"Loaded image: {IMAGE_PATH}")
    print(f"Image size: {image.size}")
else:
    print(f"File not found: {IMAGE_PATH}")
    print("Upload a lab report using 'Add Input' in the right panel")
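
Rather than editing `IMAGE_PATH` by hand, the path can be chosen from the files that `glob` finds. Below is a minimal sketch, assuming the first file (in sorted order) with a common image extension is the lab report. `pick_report_image`, the extension set, and the example paths are illustrative choices, not part of the notebook.

```python
from pathlib import Path
from typing import Iterable, Optional

# Assumed set of image formats; adjust to match the files actually uploaded.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def pick_report_image(paths: Iterable[str]) -> Optional[str]:
    """Return the first path (sorted) with an image extension, else None."""
    for p in sorted(paths):
        if Path(p).suffix.lower() in IMAGE_EXTENSIONS:
            return p
    return None


# Hypothetical upload listing
candidates = ["/kaggle/input/reports/notes.txt", "/kaggle/input/reports/cbc_scan.JPG"]
print(pick_report_image(candidates))
```

PDF reports are not covered by this sketch. They would need to be converted to an image first, for example with `pdf2image.convert_from_path`.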

In [None]:
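# analyze_report is not defined elsewhere in this notebook; reconstructed here from how its result is used below.
def analyze_report(image: Image.Image) -> dict:
    tests = extract_lab_values(image)
    for t in tests:
        t['explanation'] = explain_lab_value(t.get('test_name', 'Unknown'), t.get('value'),
            t.get('unit', ''), t.get('reference_range', 'N/A'), t.get('status', 'normal'))
    normal = sum(1 for t in tests if t.get('status') == 'normal')
    return {"results": tests, "total_tests": len(tests), "normal": normal, "abnormal": len(tests) - normal}

# Run analysis
results = analyze_report(image)

print("\n" + "="*60)
print("ANALYSIS COMPLETE")
print("="*60)
print(f"Total tests: {results['total_tests']}")
print(f"Normal: {results['normal']}")
print(f"Abnormal: {results['abnormal']}")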

In [None]:
# Display results with nice formatting
from IPython.display import HTML, display

def display_results(results):
    """Display analysis results with nice HTML formatting."""
    html = "<div style='font-family: Arial, sans-serif;'>"
    
    for r in results['results']:
        status = r.get('status', 'normal')
        color = '#28a745' if status == 'normal' else '#dc3545' if status == 'high' else '#ffc107'
        badge = 'Normal' if status == 'normal' else 'High' if status == 'high' else 'Low'
        
        html += f"""
        <div style='border: 1px solid #ddd; border-left: 4px solid {color}; 
                    padding: 15px; margin: 10px 0; border-radius: 4px;'>
            <div style='display: flex; justify-content: space-between; align-items: center;'>
                <h3 style='margin: 0; color: #333;'>{r.get('test_name', 'Unknown')}</h3>
                <span style='background: {color}; color: white; padding: 4px 12px; 
                             border-radius: 20px; font-size: 12px;'>{badge}</span>
            </div>
            <p style='font-size: 24px; margin: 10px 0; color: #333;'>
                <strong>{r.get('value', 'N/A')}</strong> 
                <span style='font-size: 14px; color: #666;'>{r.get('unit', '')}</span>
            </p>
            <p style='color: #666; font-size: 13px; margin: 5px 0;'>
                Reference: {r.get('reference_range', 'N/A')}
            </p>
            <hr style='border: none; border-top: 1px solid #eee; margin: 10px 0;'>
            <p style='color: #444; line-height: 1.5;'>{r.get('explanation', 'No explanation available.')}</p>
        </div>
        """
    
    html += "</div>"
    display(HTML(html))

# Display results if available
if 'results' in dir() and results:
    display_results(results)

## Impact & Summary

### Problem We're Solving
In India, millions of patients receive lab reports filled with medical jargon, confusing reference ranges, and numbers that mean nothing to them. This creates anxiety and prevents patients from taking proactive steps to improve their health.

### How MedGemma Helps
MedGemma 1.5 enables us to:
1. **Extract** structured data from lab report images (OCR + understanding)
2. **Interpret** values by comparing to reference ranges
3. **Explain** results in simple, actionable language
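
The interpretation step can also be cross-checked in plain Python, since the model's `status` field is not guaranteed to agree with the printed range. Below is a minimal sketch, assuming ranges appear as `low-high`, `< limit`, or `> limit` (the first two are the formats seen in the sample data). `status_from_range` is a hypothetical helper and returns `None` for any format it does not recognize.

```python
import re
from typing import Optional

_NUM = r"(\d+(?:\.\d+)?)"  # a non-negative integer or decimal


def status_from_range(value: float, reference_range: str) -> Optional[str]:
    """Classify `value` as 'low', 'normal', or 'high' against a printed range."""
    text = reference_range.strip()
    m = re.fullmatch(rf"{_NUM}\s*-\s*{_NUM}", text)   # e.g. "13.0-17.0"
    if m:
        low, high = float(m.group(1)), float(m.group(2))
        if value < low:
            return "low"
        return "high" if value > high else "normal"
    m = re.fullmatch(rf"<\s*{_NUM}", text)            # upper limit only, e.g. "< 200"
    if m:
        return "normal" if value < float(m.group(1)) else "high"
    m = re.fullmatch(rf">\s*{_NUM}", text)            # lower limit only, e.g. "> 40"
    if m:
        return "normal" if value > float(m.group(1)) else "low"
    return None                                       # unrecognized format


print(status_from_range(10.5, "13.0-17.0"))  # low
print(status_from_range(245, "< 200"))       # high
print(status_from_range(2.5, "0.4-4.0"))     # normal
```

A mismatch between this result and the model's `status` is a useful signal that the extracted value or range should be checked against the original report.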

### Real-World Impact
- **Accessibility**: Patients can understand their own health data
- **Empowerment**: Informed patients make better health decisions
- **Healthcare efficiency**: Doctors spend less time explaining basics
- **Early intervention**: Patients notice abnormalities sooner

### Technical Highlights
- Uses MedGemma 1.5 4B instruction-tuned model
- Handles multimodal input (image + text)
- Relies on a model trained on medical text and images, which helps with health-specific accuracy
- Generates patient-friendly explanations

### Future Roadmap
- Mobile app for instant report scanning
- Trend tracking across multiple reports
- Additional regional languages beyond Hindi (Tamil, etc.)
- Integration with hospital systems

In [None]:
# Save results to JSON
with open('analysis_results.json', 'w') as f:
    json.dump(results, f, indent=2)
print("Results saved to analysis_results.json")