# 🏥 DermaCheck AI - Kaggle Deployment
## MedGemma Local Inference on T4 GPU

**HAI-DEF Compliant** | **Zero Cost** | **Production Ready**

---

### Prerequisites Checklist
- ✅ HuggingFace access to `google/medgemma-4b-it` (approved)
- ✅ Kaggle Secrets configured (TELEGRAM_BOT_TOKEN, HF_TOKEN, NGROK_TOKEN)
- ✅ Kaggle Settings: GPU T4 + Internet ON

### Expected Performance
- Model load: ~2-3 minutes
- Photo analysis: ~15-20 seconds
- Text consultation: ~8-10 seconds
- GPU memory: ~12-14GB / 16GB

## 📦 Step 1: Install Dependencies

In [None]:
%%time
print("📦 Installing dependencies...\n")

# Install all required packages.
# Version specifiers are quoted so the shell does not treat ">" as an output redirect.
!pip install -q python-telegram-bot==21.0.0
!pip install -q "transformers>=4.40.0"
!pip install -q "torch>=2.1.0" "torchvision>=0.16.0"
!pip install -q "bitsandbytes>=0.43.0" "accelerate>=0.27.0"
!pip install -q sentencepiece protobuf
!pip install -q pyngrok python-dotenv Pillow

print("\n✅ All dependencies installed!")
print("📊 Package versions:")
!pip list | grep -E "transformers|torch|telegram|bitsandbytes|accelerate"
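
The `pip list | grep` line prints the installed versions but does not compare them with the minimums requested above. A minimal sketch of such a check follows, using only the standard library. The `MINIMUMS` table mirrors the specifiers in the install cell; `release_tuple`, `meets_minimum`, and `check_installed` are hypothetical helper names, not part of the repository.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the pip commands in Step 1.
MINIMUMS = {
    "transformers": "4.40.0",
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "bitsandbytes": "0.43.0",
    "accelerate": "0.27.0",
}


def release_tuple(ver: str) -> tuple:
    """Turn '2.1.0+cu121' into (2, 1, 0), ignoring any suffix after the numbers."""
    match = re.match(r"\d+(?:\.\d+)*", ver)
    return tuple(int(part) for part in match.group(0).split(".")) if match else ()


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numerically, so that '0.9.0' is correctly older than '0.43.0'."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so '2.1' equals '2.1.0'
    b += (0,) * (width - len(b))
    return a >= b


def check_installed(minimums: dict) -> dict:
    """Map each package name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(found, minimum) else f"too old ({found})"
    return report


for package, status in check_installed(MINIMUMS).items():
    print(f"{package}: {status}")
```

Comparing parsed tuples rather than raw strings matters here because a plain string comparison would rank `0.9.0` above `0.43.0`.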

## 🔐 Step 2: Setup Secrets & Environment

In [None]:
from kaggle_secrets import UserSecretsClient
import os

print("🔐 Loading Kaggle secrets...\n")

# Initialize secrets client
user_secrets = UserSecretsClient()

# Load secrets
try:
    os.environ['TELEGRAM_BOT_TOKEN'] = user_secrets.get_secret("TELEGRAM_BOT_TOKEN")
    os.environ['HF_TOKEN'] = user_secrets.get_secret("HF_TOKEN")
    os.environ['NGROK_TOKEN'] = user_secrets.get_secret("NGROK_TOKEN")
    
    print("✅ All secrets loaded successfully!\n")
    # Report only that each secret is present; do not echo token contents into notebook output
    print(f"📱 Telegram token: loaded ({len(os.environ['TELEGRAM_BOT_TOKEN'])} chars)")
    print(f"🤗 HF token: loaded ({len(os.environ['HF_TOKEN'])} chars)")
    print(f"🌐 ngrok token: loaded ({len(os.environ['NGROK_TOKEN'])} chars)")
    
except Exception as e:
    print(f"❌ Failed to load secrets: {e}")
    print("\n📝 Add secrets in Kaggle:")
    print("   1. Click 'Add-ons' → 'Secrets'")
    print("   2. Add: TELEGRAM_BOT_TOKEN, HF_TOKEN, NGROK_TOKEN")
    print("   3. Enable 'Notebook access'")
    raise
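
When some confirmation of a secret's value is useful while debugging, a masking helper keeps most of it out of the notebook output, which may end up in saved or shared versions. The sketch below is one way to do that; `mask_secret` is a hypothetical helper, not part of the repository or of `kaggle_secrets`.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Hide a secret, leaving at most the last `visible` characters readable."""
    if not value:
        return "<missing>"
    if len(value) <= visible:
        # Too short to reveal anything safely.
        return "***"
    return "***" + value[-visible:]


# Example with a made-up value; real tokens would come from os.environ.
print(mask_secret("example-token-1234567890"))
```

Showing the tail rather than the head is a deliberate choice in this sketch: the leading part of a Telegram token is the numeric bot ID, which identifies the bot.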

## 📂 Step 3: Clone Repository

In [None]:
%%time
import os

# Clone DermaCheck AI repository
REPO_URL = "https://github.com/YOUR_USERNAME/dermacheck-ai.git"  # ← UPDATE THIS!

print(f"📂 Cloning repository: {REPO_URL}\n")

# Remove if exists
!rm -rf dermacheck-ai

# Clone
!git clone {REPO_URL}

# Change directory
%cd dermacheck-ai

print("\n✅ Repository cloned successfully!")
print(f"📍 Working directory: {os.getcwd()}")
print("\n📁 Files:")
!ls -lh
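
If the placeholder in `REPO_URL` is left unchanged, the cell fails only at `git clone`, after `rm -rf` has already run. One way to fail earlier is to validate the URL first. The sketch below assumes an HTTPS remote; `validate_repo_url` and `PLACEHOLDER` are hypothetical names introduced for illustration.

```python
from urllib.parse import urlparse

PLACEHOLDER = "YOUR_USERNAME"  # the marker used in the template URL


def validate_repo_url(url: str) -> str:
    """Return the URL unchanged, or raise ValueError if it cannot be cloned as-is."""
    if PLACEHOLDER in url:
        raise ValueError(f"Replace {PLACEHOLDER} in REPO_URL with your GitHub username")
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"Not a usable HTTPS repository URL: {url!r}")
    return url


# Example with an illustrative account name.
print(validate_repo_url("https://github.com/example-user/dermacheck-ai.git"))
```

Calling `validate_repo_url(REPO_URL)` before the `rm -rf` line would stop the cell before anything is deleted.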

## 🤗 Step 4: Authenticate HuggingFace

In [None]:
from huggingface_hub import login

print("🤗 Authenticating with HuggingFace...\n")

HF_TOKEN = os.environ.get('HF_TOKEN')

if HF_TOKEN:
    try:
        login(token=HF_TOKEN)
        print("✅ HuggingFace authentication successful!")
        # login() only validates the token; gated access is confirmed when the model loads in Step 6
        print("🔑 Token accepted (MedGemma access is checked at model load)")
    except Exception as e:
        print(f"❌ Authentication failed: {e}")
        print("\n⚠️  Common issues:")
        print("   1. HF access to MedGemma not approved yet")
        print("   2. Invalid token format")
        raise
else:
    print("❌ HF_TOKEN not found in secrets")
    raise ValueError("Add HF_TOKEN to Kaggle Secrets")

## 🎮 Step 5: Verify GPU

In [None]:
import torch

print("🎮 GPU Verification\n")
print("="*50)

print(f"CUDA available: {torch.cuda.is_available()}")

if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
    gpu_memory = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"GPU Memory: {gpu_memory:.2f} GB")
    print(f"CUDA Version: {torch.version.cuda}")
    print(f"PyTorch Version: {torch.__version__}")
    print("\n✅ GPU ready for MedGemma!")

    if "T4" in torch.cuda.get_device_name(0):
        print("✅ T4 detected - optimal for 4-bit quantized MedGemma")
    else:
        print(f"⚠️  GPU is {torch.cuda.get_device_name(0)} (T4 recommended)")

else:
    print("❌ No GPU detected!")
    print("\n📝 Enable GPU:")
    print("   1. Kaggle Notebook Settings")
    print("   2. Accelerator → GPU T4")
    print("   3. Save and restart kernel")
    raise RuntimeError("GPU required for MedGemma")

## 🧪 Step 6: Test MedGemma Model Loading

In [None]:
%%time
print("🧪 Testing MedGemma model loading...\n")
print("⏳ This will take ~2-3 minutes (downloads + loads model)\n")
print("="*60)

from utils.model_loader import load_medgemma

try:
    # Load with 4-bit quantization
    model, processor = load_medgemma(
        model_name="google/medgemma-4b-it",
        quantize=True
    )
    
    print("\n" + "="*60)
    print("✅ MedGemma loaded successfully!")
    print(f"📊 Model device: {model.device}")
    print(f"💾 Model dtype: {model.dtype}")

    # Check memory usage
    if torch.cuda.is_available():
        allocated = torch.cuda.memory_allocated() / 1e9
        reserved = torch.cuda.memory_reserved() / 1e9
        print(f"🎯 GPU Memory Allocated: {allocated:.2f} GB")
        print(f"🎯 GPU Memory Reserved: {reserved:.2f} GB")

    print("\n🗑️  Cleaning up test...")
    # Free memory for actual bot
    del model
    del processor
    torch.cuda.empty_cache()
    print("✅ Test model unloaded, memory cleared")

except Exception as e:
    print(f"\n❌ Model loading failed: {e}")
    print("\n🔍 Troubleshooting:")
    print("   1. Check HF access to MedGemma was approved")
    print("   2. Verify GPU is T4 (16GB VRAM)")
    print("   3. Ensure Internet is ON in Kaggle settings")
    print("   4. Try restarting the kernel")
    raise

## 🌐 Step 7: Setup ngrok Tunnel

In [None]:
from pyngrok import ngrok
import time

print("🌐 Setting up ngrok tunnel...\n")

# Set ngrok auth token
NGROK_TOKEN = os.environ.get('NGROK_TOKEN')
ngrok.set_auth_token(NGROK_TOKEN)

# Kill any existing tunnels
ngrok.kill()
time.sleep(2)

# Start new tunnel on port 8080
print("🚀 Starting ngrok tunnel on port 8080...")
# connect() returns an NgrokTunnel object; the URL itself is its public_url attribute
ngrok_tunnel = ngrok.connect(8080)
public_url = ngrok_tunnel.public_url

print("\n✅ ngrok tunnel active!")
print(f"🌐 Public URL: {public_url}")

# Save for later use
os.environ['PUBLIC_URL'] = public_url

# Show active tunnels
time.sleep(1)
tunnels = ngrok.get_tunnels()
print(f"\n📊 Active tunnels: {len(tunnels)}")
for tunnel in tunnels:
    print(f"   - {tunnel.public_url}")

## 🚀 Step 8: RUN DERMACHECK AI BOT!

**This cell will run continuously. Stop with: Kernel → Interrupt**

In [None]:
print("🚀 Starting DermaCheck AI Bot with MedGemma!")
print("="*60)
print("\n⏳ Loading MedGemma model (2-3 minutes)...")
print("📊 Monitor progress below:\n")

# Run the bot
!python telegram_bot_medgemma.py

# Note: This will run continuously
# Stop with: Kernel → Interrupt
# Or close the notebook
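
Because `!python ...` blocks the kernel, no other cell (log monitoring, cleanup) can run until it is interrupted. One alternative, sketched here under assumptions, is to start the bot as a background process with `subprocess.Popen`. `start_background` and the `bot_stdout.log` file name are hypothetical; whether `telegram_bot_medgemma.py` also writes its own `bot.log` is not shown in this notebook.

```python
import subprocess
import sys


def start_background(command: list, log_path: str) -> subprocess.Popen:
    """Start `command` without blocking; its stdout and stderr are appended to `log_path`."""
    with open(log_path, "a") as log_file:
        # The child keeps its own copy of the file descriptor,
        # so closing ours when the `with` block exits is safe.
        return subprocess.Popen(command, stdout=log_file, stderr=subprocess.STDOUT)


# Usage in the notebook (left commented out so this cell has no side effects):
# bot = start_background([sys.executable, "-u", "telegram_bot_medgemma.py"], "bot_stdout.log")
# bot.poll() is None   -> the bot is still running
# bot.terminate()      -> stop it from a later cell
```

The `-u` flag asks Python not to buffer output, so lines reach the log file as they are printed.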

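## 📊 Step 9: Monitor Logs (Optional - Run Separately)

In [None]:
# Follow the log in real time until this cell is interrupted.
# Notebook cells run one at a time, so this only shows live output
# when the bot is running in the background rather than in the blocking cell above.
!tail -f bot.log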
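
For a one-off look at recent log lines without a cell that runs until interrupted, the file can be read from Python. This is a small sketch; `tail` is a hypothetical helper, and `bot.log` is assumed to be a UTF-8 text file in the working directory.

```python
from collections import deque


def tail(path: str, lines: int = 50) -> list:
    """Return the last `lines` lines of a text file, without trailing newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the most recent lines while streaming the file.
        return [line.rstrip("\n") for line in deque(handle, maxlen=lines)]


# Usage in the notebook:
# for line in tail("bot.log", 20):
#     print(line)
```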

## 🧹 Cleanup (Run After Demo)

In [None]:
print("🧹 Cleaning up...\n")

# Kill bot process
!pkill -f telegram_bot_medgemma
print("✅ Bot process stopped")

# Kill ngrok
ngrok.kill()
print("✅ ngrok tunnel closed")

# Clear GPU memory
if torch.cuda.is_available():
    torch.cuda.empty_cache()
    print("✅ GPU memory cleared")

print("\n🎉 Cleanup complete!")

---

## 📝 Troubleshooting Guide

### Model Loading Fails (403/404)
**Cause**: HuggingFace access not granted yet  
**Solution**: Wait for approval email from HuggingFace (~1-2 hours)

### Out of Memory Error
**Cause**: GPU VRAM insufficient  
**Solution**: Verify that 4-bit quantization is enabled (`quantize=True`)
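
The repository's `load_medgemma` is not shown in this notebook, so what `quantize=True` does internally is not confirmed here. As a rough illustration only, 4-bit loading with `transformers` and `bitsandbytes` commonly looks like the sketch below. `quantization_kwargs` and `load_medgemma_sketch` are hypothetical names, the NF4 and double-quantization settings are assumptions, and the code assumes a `transformers` release recent enough to provide `AutoModelForImageTextToText` and support this model.

```python
def quantization_kwargs(quantize: bool = True) -> dict:
    """Arguments for BitsAndBytesConfig; an empty dict means full-precision loading."""
    if not quantize:
        return {}
    return {
        "load_in_4bit": True,               # store weights in 4 bits
        "bnb_4bit_quant_type": "nf4",       # assumption: NF4 rather than FP4
        "bnb_4bit_use_double_quant": True,  # assumption: also quantize the scaling constants
    }


def load_medgemma_sketch(model_name: str = "google/medgemma-4b-it", quantize: bool = True):
    """Hypothetical stand-in for utils.model_loader.load_medgemma."""
    # Imported lazily so that defining this function does not require a GPU stack.
    from transformers import (
        AutoModelForImageTextToText,
        AutoProcessor,
        BitsAndBytesConfig,
    )

    kwargs = quantization_kwargs(quantize)
    config = BitsAndBytesConfig(**kwargs) if kwargs else None
    model = AutoModelForImageTextToText.from_pretrained(
        model_name,
        quantization_config=config,
        device_map="auto",  # let accelerate place the weights on the available GPU
    )
    processor = AutoProcessor.from_pretrained(model_name)
    return model, processor
```

If an out-of-memory error persists with quantization on, comparing `torch.cuda.memory_allocated()` before and after loading shows whether the weights themselves or the inference inputs are using the memory.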

### Bot Not Responding
**Check**:
1. The bot token is correct (from @BotFather)
2. The ngrok tunnel is active (`ngrok.get_tunnels()`)
3. The logs show no errors: `!tail bot.log`
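
The first check can be done directly against the Telegram Bot API, whose `getMe` method returns the bot's identity for a valid token. A minimal standard-library sketch follows; `getme_url` and `check_bot_token` are hypothetical helper names, and the request needs Internet access to be enabled.

```python
import json
import urllib.request


def getme_url(token: str) -> str:
    """Build the Bot API URL for the getMe method."""
    return f"https://api.telegram.org/bot{token}/getMe"


def check_bot_token(token: str, timeout: float = 10.0) -> dict:
    """Call getMe and return the parsed JSON reply.

    An invalid token makes urlopen raise urllib.error.HTTPError.
    """
    with urllib.request.urlopen(getme_url(token), timeout=timeout) as response:
        return json.load(response)


# Usage in the notebook:
# import os
# reply = check_bot_token(os.environ["TELEGRAM_BOT_TOKEN"])
# print(reply["result"]["username"])
```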

### GPU Not Detected
**Solution**: Settings → Accelerator → GPU T4 → Save → Restart Kernel

---

## 🎯 Performance Expectations

| Metric | Expected Value |
|--------|---------------|
| Model load | 2-3 minutes |
| Photo analysis | 15-20 seconds |
| Text consultation | 8-10 seconds |
| GPU memory usage | 12-14 GB |
| Concurrent users | 1-3 (demo) |
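
To compare a real session against the table above, individual calls can be timed with a small context manager. This is a sketch; `timed` is a hypothetical helper, and `analyze_photo` in the usage comment stands in for whatever function the bot actually uses.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Record the wall-clock duration of the enclosed block in results[label], in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


timings = {}
with timed("example", timings):
    time.sleep(0.1)  # placeholder for the work being measured
print(f"example: {timings['example']:.2f} s")

# Usage with the bot's own functions (names are illustrative):
# with timed("photo_analysis", timings):
#     analyze_photo(image)
```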

---

**Status**: Production Ready! 🎉  
**Compliance**: HAI-DEF ✅  
**Cost**: $0 ✅