# 🏥 Shifaa - Arabic Medical AI Platform

Welcome to Shifaa! This notebook demonstrates how to use all three modules:
1. **Datasets** - Access Arabic medical datasets
2. **RAG** - Medical information retrieval
3. **Vision** - Medical image analysis

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AhmedSeelim/shifaa/blob/main/Shifaa_Examples.ipynb)

---

**Package Information:**
- **GitHub:** https://github.com/AhmedSeelim/shifaa
- **HuggingFace:** https://huggingface.co/Ahmed-Selem
- **PyPI:** https://pypi.org/project/shifaa/

---

## 📦 Installation

First, let's install the Shifaa package and its dependencies.

In [None]:
# Install Shifaa package
!pip install -q shifaa

print("✓ Shifaa installed successfully!")

---

# 📊 Module 1: Datasets

The Shifaa Datasets module provides easy access to curated Arabic medical datasets hosted on HuggingFace.

**Available Datasets:**
- Mental Health Consultations (35,648 consultations)
- Medical Consultations (84,422 consultations)

In [None]:
# Import the datasets module
from shifaa.datasets import (
    load_shifaa_mental_dataset,
    load_shifaa_medical_dataset,
    list_available_datasets
)

### List Available Datasets

Let's see what datasets are available.

In [None]:
# List all available datasets
datasets = list_available_datasets()

for name, info in datasets.items():
    print(f"\n📊 {name.upper()}")
    print(f"   Name: {info['name']}")
    print(f"   Size: {info['size']:,} consultations")
    print(f"   Specializations: {info['specializations']}")
    print(f"   Language: {info['language']}")

### Load Mental Health Dataset

Load and explore the Arabic Mental Health Consultations dataset.

In [None]:
# Load mental health dataset
print("Loading mental health dataset...")
mental_data = load_shifaa_mental_dataset()

print(f"✓ Loaded {len(mental_data)} consultations")
print(f"\nDataset features: {mental_data.features}")
print(f"\nFirst consultation:")
print(mental_data[0])

### Load Medical Consultations Dataset

Load and explore the comprehensive Medical Consultations dataset.

In [None]:
# Load medical consultations dataset
print("Loading medical consultations dataset...")
medical_data = load_shifaa_medical_dataset()

print(f"✓ Loaded {len(medical_data)} consultations")
print(f"\nDataset features: {medical_data.features}")
print(f"\nSample consultation:")
print(medical_data[0])

**Expected Output:**
- The dataset will be downloaded from HuggingFace (first time only)
- You'll see dataset statistics and a sample consultation
- Subsequent runs will use the cached dataset
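
Once a dataset is loaded, a quick way to get a feel for it is to count consultations per category. The sketch below assumes each record behaves like a dictionary (as rows of a HuggingFace `Dataset` do when iterated) and uses a hypothetical field name, `Specialization`. Check `mental_data.features` for the actual column names before adapting it. The sample records are invented for illustration.

```python
from collections import Counter

def count_by_field(records, field):
    """Count how many records carry each value of `field`.

    Records that lack the field are grouped under "<missing>".
    """
    return Counter(record.get(field, "<missing>") for record in records)

# Illustrative stand-in records; the field name "Specialization" is hypothetical.
sample_records = [
    {"Specialization": "Psychiatry"},
    {"Specialization": "Cardiology"},
    {"Specialization": "Psychiatry"},
    {},
]

counts = count_by_field(sample_records, "Specialization")
for value, n in counts.most_common():
    print(f"{value}: {n}")
```

With a real dataset, the same helper could be called as `count_by_field(mental_data, "<column name>")` once the column name is confirmed.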

---

# 🤖 Module 2: Medical RAG System

The Shifaa RAG (Retrieval-Augmented Generation) system provides intelligent medical information retrieval using:
- Automatic specialty detection
- Semantic search over 84K+ consultations
- Medical insight extraction

**⚠️ Note:** You need a Google API key to use the RAG system. Get one from [Google AI Studio](https://makersuite.google.com/app/apikey).

In [None]:
# Set up Google API key
import os
from getpass import getpass

# Option 1: Use Colab secrets (recommended)
# Go to the key icon (🔑) on the left sidebar and add GOOGLE_API_KEY
try:
    from google.colab import userdata  # Only available inside Colab
    os.environ["GOOGLE_API_KEY"] = userdata.get('GOOGLE_API_KEY')
    print("✓ API key loaded from Colab secrets")
except Exception:
    # Option 2: Enter directly (less secure); getpass hides the key as you type
    api_key = getpass("Enter your Google API key: ")
    os.environ["GOOGLE_API_KEY"] = api_key
    print("✓ API key set")

In [None]:
# Import RAG module
from shifaa.rag import MedicalRAGSystem

### Initialize RAG System

The system will automatically download the vector database on first use.

In [None]:
# Initialize Medical RAG System
print("Initializing Medical RAG System...")
print("(This will download the vector database on first use)")

rag = MedicalRAGSystem()

print("\n✓ RAG system initialized successfully!")

### Query the RAG System

Let's ask a medical question in Arabic.

In [None]:
# Process a medical query
query = "ما هي أعراض السكري؟"  # "What are the symptoms of diabetes?"

print(f"Query: {query}")
print("\nProcessing...\n")

results = rag.process_query(query)

if results:
    print("=" * 60)
    print("RESULTS")
    print("=" * 60)
    
    # Show detected specialties
    print("\n### Detected Specialties ###")
    for specialty in results.specialties:
        print(f"\n• {specialty.specialty}")
        print(f"  {specialty.explanation}")
    
    # Show topic paths
    print("\n### Topic Paths ###")
    for topic in results.topic_paths:
        print(f"\n• {topic.path}")
        print(f"  {topic.explanation}")
    
    # Show consultations
    print("\n### Retrieved Consultations ###")
    for i, consultation in enumerate(results.consultations, 1):
        print(f"\n[{i}] {consultation['metadata']['Question Title']}")
        print(f"    Similarity: {1 - consultation['distance']:.3f}")
        print(f"    Doctor: {consultation['metadata']['Doctor Name']}")
    
    # Show insights
    print("\n### Medical Insights ###")
    for i, insight in enumerate(results.insights, 1):
        print(f"\n[{i}] {insight.information}")
        print(f"    Relevance: {insight.relevance}")
else:
    print("⚠ Query not recognized as medical")

### Try More Queries

Test with different medical questions.

In [None]:
# Try multiple queries
queries = [
    "كيف أعالج الصداع المزمن؟",  # How to treat chronic headaches?
    "ما هي أسباب آلام المعدة؟",    # What causes stomach pain?
    "كيف أتعامل مع الأرق؟"         # How to deal with insomnia?
]

for query in queries:
    print(f"\n{'='*60}")
    print(f"Query: {query}")
    print('='*60)
    
    results = rag.process_query(query)
    
    if results:
        print(f"\n✓ Found {len(results.specialties)} specialties")
        print(f"✓ Retrieved {len(results.consultations)} consultations")
        print(f"✓ Extracted {len(results.insights)} insights")
        
        if results.insights:
            print(f"\nFirst insight: {results.insights[0].information[:100]}...")
    else:
        print("⚠ Not recognized as medical query")

**Expected Output:**
- Detected medical specialties with explanations
- Relevant topic paths from the medical hierarchy
- Retrieved similar consultations with similarity scores
- Extracted medical insights relevant to the query
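
The similarity score printed for each consultation is `1 - distance`. The sketch below shows one way the retrieved consultations could be filtered and ranked by that score. It assumes each consultation is a dictionary with a `distance` key and a `metadata` dictionary, as in the query cell above, and that the distance is a cosine-style distance for which `1 - distance` is meaningful. The helper name `rank_by_similarity` and the sample values are illustrative and not part of Shifaa.

```python
def rank_by_similarity(consultations, min_similarity=0.0):
    """Return (similarity, title) pairs sorted from most to least similar."""
    ranked = []
    for consultation in consultations:
        similarity = 1 - consultation["distance"]
        if similarity >= min_similarity:
            title = consultation["metadata"].get("Question Title", "<untitled>")
            ranked.append((similarity, title))
    return sorted(ranked, key=lambda pair: pair[0], reverse=True)

# Made-up sample data shaped like the entries in results.consultations.
sample = [
    {"distance": 0.42, "metadata": {"Question Title": "Title A"}},
    {"distance": 0.15, "metadata": {"Question Title": "Title B"}},
    {"distance": 0.80, "metadata": {"Question Title": "Title C"}},
]

top = rank_by_similarity(sample, min_similarity=0.5)
for similarity, title in top:
    print(f"{similarity:.3f}  {title}")
```

A threshold like this is a judgment call: a higher `min_similarity` returns fewer but closer matches.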

---

# 👁️ Module 3: Vision - Medical Image Analysis

The Shifaa Vision module provides pre-trained models for medical image analysis:

**Classification Models:**
- Brain Tumor Detection
- COVID-19 Chest X-ray
- Diabetic Retinopathy
- Eye Disease Classification

**Segmentation Models:**
- Heart CT Segmentation
- Skin Cancer Segmentation
- Breast Cancer Segmentation

In [None]:
# Import vision module
from shifaa.vision import VisionModelFactory
import matplotlib.pyplot as plt

### List Available Models

See all available vision models.

In [None]:
# List all available vision models
models = VisionModelFactory.list_available_models()

print("📊 CLASSIFICATION MODELS")
print("=" * 60)
for name, info in models["classification"].items():
    status = "✓" if info.get('is_downloaded', False) else "✗"
    print(f"\n{status} {name}")
    print(f"   Architecture: {info['architecture']}")
    print(f"   Classes: {len(info['classes'])} ({', '.join(info['classes'][:3])}...)")
    print(f"   Input Size: {info['input_size']}")

print("\n\n🎭 SEGMENTATION MODELS")
print("=" * 60)
for name, info in models["segmentation"].items():
    status = "✓" if info.get('is_downloaded', False) else "✗"
    print(f"\n{status} {name}")
    print(f"   Architecture: {info['architecture']}")
    print(f"   Task: {info.get('task', 'segmentation')}")
    print(f"   Input Size: {info['input_size']}")

### Classification Example: Brain Tumor Detection

Let's load a classification model and see how it works.

**Note:** For this demo, we'll show the code structure. You'll need actual medical images to run inference.

In [None]:
# Initialize Brain Tumor classification model
print("Loading Brain Tumor classification model...")
print("(Model will be downloaded from HuggingFace on first use)\n")

brain_model = VisionModelFactory.create_model(
    model_type="classification",
    model_name="Brain_Tumor"
)

print("✓ Model loaded successfully!")

# Get model information
info = brain_model.get_info()
print(f"\nModel Information:")
print(f"  Architecture: {info['architecture']}")
print(f"  Classes: {', '.join(info['classes'])}")
print(f"  Input Size: {info['input_size']}")

In [None]:
# Example: How to use the model with an image
# Uncomment and provide your image path

# result = brain_model.run("path/to/brain_scan.jpg", show_image=True)
# print(f"\nPrediction: {result['predicted_class']}")
# print(f"Confidence: {result['confidence']:.2f}%")

print("\n💡 To use this model:")
print("1. Upload a brain MRI image to Colab")
print("2. Uncomment the code above")
print("3. Replace the path with your image path")
print("4. Run the cell to get predictions with visualization")
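
The classification result exposes a predicted class and a confidence value. How Shifaa computes these internally is not shown in this notebook. A common approach, sketched here purely as an assumption, is to apply a softmax to the model's raw scores and report the largest probability. The function name, class names, and scores below are invented for illustration.

```python
import math

def top_prediction(scores, class_names):
    """Apply a softmax to raw scores and return (class_name, confidence_percent)."""
    if len(scores) != len(class_names):
        raise ValueError("scores and class_names must have the same length")
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    probabilities = [e / total for e in exps]
    # Index of the most probable class (first one wins on ties).
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], 100.0 * probabilities[best]

# Hypothetical class names and raw scores, for illustration only.
label, confidence = top_prediction([2.0, 0.5, -1.0], ["class_a", "class_b", "class_c"])
print(f"Prediction: {label}")
print(f"Confidence: {confidence:.2f}%")
```

A softmax confidence reflects how strongly the model prefers one class over the others; it is not a calibrated probability of disease.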

### Segmentation Example: Skin Cancer Segmentation

Now let's try a segmentation model.

In [None]:
# Initialize Skin Cancer segmentation model
print("Loading Skin Cancer segmentation model...")

skin_model = VisionModelFactory.create_model(
    model_type="segmentation",
    model_name="Skin_Cancer"
)

print("✓ Model loaded successfully!")

# Get model information
info = skin_model.get_info()
print(f"\nModel Information:")
print(f"  Architecture: {info['architecture']}")
print(f"  Task: {info['task']}")
print(f"  Input Size: {info['input_size']}")

In [None]:
# Example: How to use the segmentation model
# Uncomment and provide your image path

# results = skin_model.run("path/to/skin_image.jpg", show_image=True)
# image = results["image"]
# mask = results["predicted_mask"]
#
# # Visualize results
# plt.figure(figsize=(12, 6))
# plt.subplot(1, 2, 1)
# plt.imshow(image)
# plt.title("Original Image")
# plt.axis('off')
#
# plt.subplot(1, 2, 2)
# plt.imshow(mask, cmap='gray')
# plt.title("Predicted Mask")
# plt.axis('off')
# plt.tight_layout()
# plt.show()

print("\n💡 To use this model:")
print("1. Upload a skin lesion image to Colab")
print("2. Uncomment the code above")
print("3. Replace the path with your image path")
print("4. Run the cell to see the segmentation results")
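
A predicted mask can also be summarised numerically. The sketch below computes the fraction of pixels marked as foreground, assuming the mask is a 2D grid of numbers in which values above a threshold count as foreground. The actual value range of `predicted_mask` is not stated in this notebook, so the default threshold of 0.5 is an assumption to adjust (for example, 127 for masks stored as 0-255).

```python
def mask_coverage(mask, threshold=0.5):
    """Return the fraction of pixels in a 2D mask whose value exceeds `threshold`."""
    total = 0
    foreground = 0
    for row in mask:
        for value in row:
            total += 1
            if value > threshold:
                foreground += 1
    if total == 0:
        raise ValueError("mask is empty")
    return foreground / total

# Tiny made-up mask: 1 marks foreground, 0 marks background.
toy_mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]

print(f"Foreground covers {mask_coverage(toy_mask):.1%} of the image")
```

Because the helper only iterates over rows and values, it accepts nested lists as well as 2D NumPy arrays; for large images a vectorised NumPy expression would be faster.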

### Upload and Test Your Own Medical Images

Use this cell to upload your own medical images and test the models.

In [None]:
from google.colab import files
import os

# Upload files
print("Click 'Choose Files' to upload medical images")
uploaded = files.upload()

# List uploaded files
print("\nUploaded files:")
for filename in uploaded.keys():
    print(f"  • {filename}")

# Example: Process first uploaded image
if uploaded:
    first_image = list(uploaded.keys())[0]
    print(f"\n💡 Use this path in the models: '{first_image}'")
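
`files.upload()` returns a dictionary keyed by filename, so uploads that are not images can be filtered out before they reach a model. A minimal sketch follows, assuming a fixed set of accepted extensions; which formats the Shifaa models actually accept is not stated here, and the helper name `select_image_files` is hypothetical.

```python
from pathlib import Path

# Assumed set of accepted extensions; adjust to what your models support.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}

def select_image_files(filenames):
    """Return the filenames whose extension looks like a supported image type."""
    return [
        name for name in filenames
        if Path(name).suffix.lower() in IMAGE_EXTENSIONS
    ]

# Stand-in for the dictionary returned by files.upload(): filename -> bytes.
uploaded_example = {"scan_01.PNG": b"", "notes.txt": b"", "lesion.jpg": b""}

image_paths = select_image_files(uploaded_example)
print(image_paths)
```

In the upload cell, `select_image_files(uploaded)` could replace the direct use of `uploaded.keys()`.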

**Expected Output:**
- For classification: Predicted class and confidence score with visualization
- For segmentation: Original image and predicted segmentation mask side by side

---

## 🎯 Summary

You've learned how to use all three Shifaa modules:

1. **Datasets** ✅
   - Load Arabic mental health and medical consultations
   - Access 120K+ curated medical consultations

2. **RAG System** ✅
   - Process medical queries in Arabic
   - Get relevant consultations and insights
   - Automatic specialty detection

3. **Vision** ✅
   - Classification: Detect diseases from medical images
   - Segmentation: Identify regions of interest
   - 7 pre-trained models ready to use

---

## 📚 Next Steps

- **Documentation:** [Read the full docs](https://github.com/AhmedSeelim/shifaa)
- **HuggingFace:** [Browse datasets and models](https://huggingface.co/Ahmed-Selem)
- **Contribute:** [Join us on GitHub](https://github.com/AhmedSeelim/shifaa)

---

## ⚠️ Medical Disclaimer

**Important:** Shifaa is for research and educational purposes only. It is NOT intended for clinical diagnosis or treatment decisions. Always consult qualified healthcare professionals for medical advice.

---

Made with ❤️ for the MENA healthcare community
