<a href="https://colab.research.google.com/github/Plutobi/Former/blob/main/Medical_Imaging_Tracker_Analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🏥 Medical Imaging Examination Tracker - Data Analysis & ML

This notebook provides:
- **Data Analysis**: Examination statistics and trends
- **Visualization**: Interactive dashboards and charts
- **Machine Learning**: Priority prediction and anomaly detection
- **Report Generation**: Automated summary reports

---
**⚠️ IMPORTANT**: This notebook uses synthetic data only. Never use real patient data (PHI) in Colab.

---

## 📦 Setup & Installation

Install required packages for data analysis and visualization.

In [1]:
# Install required packages
!pip install pandas numpy matplotlib seaborn plotly scikit-learn faker -q

print("✅ All packages installed successfully!")

✅ All packages installed successfully!


In [2]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
from datetime import datetime, timedelta
from faker import Faker
import random
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
import warnings
warnings.filterwarnings('ignore')

# Set style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

print("✅ Libraries imported successfully!")

✅ Libraries imported successfully!


## 🔬 Generate Synthetic Medical Imaging Data

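Create synthetic examination records for analysis (100% synthetic, no real PHI). Status, priority, and scheduling time are drawn at random and independently of the patient and exam attributes, so the data exercises the pipeline rather than reproducing real clinical patterns.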

In [3]:
def generate_medical_imaging_data(n_records=500):
    """
    Generate synthetic medical imaging examination data.

    Parameters:
    - n_records: Number of examination records to generate

    Returns:
    - DataFrame with synthetic examination data
    """
    fake = Faker()
    Faker.seed(42)
    random.seed(42)
    np.random.seed(42)

    # Exam types and their typical durations
    exam_types = {
        'X-Ray - Chest': {'duration': 15, 'abnormal_rate': 0.15},
        'X-Ray - Extremity': {'duration': 20, 'abnormal_rate': 0.20},
        'CT Scan - Head': {'duration': 30, 'abnormal_rate': 0.25},
        'CT Scan - Chest': {'duration': 35, 'abnormal_rate': 0.30},
        'CT Scan - Abdomen/Pelvis': {'duration': 45, 'abnormal_rate': 0.28},
        'MRI - Brain': {'duration': 60, 'abnormal_rate': 0.22},
        'MRI - Spine': {'duration': 50, 'abnormal_rate': 0.26},
        'MRI - Musculoskeletal': {'duration': 55, 'abnormal_rate': 0.24},
        'Ultrasound - Abdominal': {'duration': 25, 'abnormal_rate': 0.18},
        'Mammography': {'duration': 20, 'abnormal_rate': 0.12}
    }

    statuses = ['scheduled', 'in_progress', 'completed', 'results_ready']
    priorities = ['routine', 'urgent', 'critical']
    radiologists = [f"Dr. {fake.last_name()}" for _ in range(10)]

    data = []

    for i in range(n_records):
        exam_type = random.choice(list(exam_types.keys()))
        exam_info = exam_types[exam_type]

        # Generate timestamps
        scheduled_date = fake.date_time_between(start_date='-30d', end_date='now')

        # Status progression
        status = random.choice(statuses)

        # Completion time based on status
        if status in ['completed', 'results_ready']:
            completed_date = scheduled_date + timedelta(minutes=exam_info['duration'] + random.randint(-5, 15))
        else:
            completed_date = None

        # Priority (critical cases are less common)
        priority_weights = [0.6, 0.3, 0.1]  # routine, urgent, critical
        priority = random.choices(priorities, weights=priority_weights)[0]

        # Patient age distribution (realistic for imaging)
        age = int(np.random.normal(55, 18))
        age = max(1, min(95, age))  # Clamp between 1 and 95

        # Abnormal findings
        abnormal = random.random() < exam_info['abnormal_rate']

        # Generate features for ML
        wait_time = random.randint(5, 60)  # minutes
        time_of_day = scheduled_date.hour
        day_of_week = scheduled_date.weekday()

        record = {
            'exam_id': f'IMG-2024-{i+1:04d}',
            'patient_id': f'PT-{random.randint(10000, 99999)}',
            'patient_name': fake.name(),
            'age': age,
            'gender': random.choice(['M', 'F', 'Other']),
            'exam_type': exam_type,
            'modality': exam_type.split(' - ')[0],
            'body_part': exam_type.split(' - ')[1] if ' - ' in exam_type else exam_type,
            'status': status,
            'priority': priority,
            'scheduled_date': scheduled_date,
            'completed_date': completed_date,
            'duration_minutes': exam_info['duration'] + random.randint(-5, 10) if completed_date else None,
            'wait_time_minutes': wait_time,
            'abnormal': abnormal,
            'radiologist': random.choice(radiologists) if status in ['completed', 'results_ready'] else None,
            'ordering_physician': f"Dr. {fake.last_name()}",
            'time_of_day': time_of_day,
            'day_of_week': day_of_week,
            'is_weekend': day_of_week >= 5,
            'month': scheduled_date.month
        }

        data.append(record)

    df = pd.DataFrame(data)

    # Calculate turnaround time in hours (NaN where the exam has no completion time)
    completed = pd.to_datetime(df['completed_date'])
    df['turnaround_hours'] = (completed - df['scheduled_date']).dt.total_seconds() / 3600

    return df

# Generate data
df = generate_medical_imaging_data(500)

print(f"✅ Generated {len(df)} synthetic examination records")
print(f"\nDataset shape: {df.shape}")
print(f"\nColumns: {df.columns.tolist()}")
df.head()

✅ Generated 500 synthetic examination records

Dataset shape: (500, 22)

Columns: ['exam_id', 'patient_id', 'patient_name', 'age', 'gender', 'exam_type', 'modality', 'body_part', 'status', 'priority', 'scheduled_date', 'completed_date', 'duration_minutes', 'wait_time_minutes', 'abnormal', 'radiologist', 'ordering_physician', 'time_of_day', 'day_of_week', 'is_weekend', 'month', 'turnaround_hours']


|   | exam_id | patient_id | patient_name | age | gender | exam_type | modality | body_part | status | priority | ... | duration_minutes | wait_time_minutes | abnormal | radiologist | ordering_physician | time_of_day | day_of_week | is_weekend | month | turnaround_hours |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | IMG-2024-0001 | PT-23434 | Alyssa Gonzalez | 63 | Other | X-Ray - Extremity | X-Ray | Extremity | scheduled | urgent | ... |  | 13 | False |  | Dr. Santos | 7 | 6 | True | 1 |  |
| 1 | IMG-2024-0002 | PT-38657 | Kevin Pacheco | 52 | M | Ultrasound - Abdominal | Ultrasound | Abdominal | scheduled | routine | ... |  | 10 | True |  | Dr. Smith | 2 | 2 | False | 1 |  |
| 2 | IMG-2024-0003 | PT-81426 | Gina Moore | 66 | F | Ultrasound - Abdominal | Ultrasound | Abdominal | scheduled | routine | ... |  | 49 | False |  | Dr. Bernard | 22 | 2 | False | 2 |  |
| 3 | IMG-2024-0004 | PT-30926 | Brent Abbott | 82 | Other | CT Scan - Chest | CT Scan | Chest | results_ready | routine | ... | 43.0 | 53 | False | Dr. Rhodes | Dr. Munoz | 20 | 2 | False | 1 | 0.8 |
| 4 | IMG-2024-0005 | PT-22156 | Kimberly Dudley | 50 | F | CT Scan - Abdomen/Pelvis | CT Scan | Abdomen/Pelvis | in_progress | routine | ... |  | 11 | False |  | Dr. Gray | 23 | 2 | False | 2 |  |
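
The generator encodes a few implicit rules: completion timestamps and radiologists exist only for finished exams, and ages are clamped to 1 through 95. A minimal sketch of how those rules could be checked is shown below. The helper name `check_exam_invariants` and the two-row sample frame are illustrative assumptions, not part of the notebook.

```python
import pandas as pd

DONE_STATUSES = {"completed", "results_ready"}

def check_exam_invariants(df: pd.DataFrame) -> list[str]:
    """Return human-readable invariant violations (empty list if none)."""
    problems = []
    done = df["status"].isin(DONE_STATUSES)
    has_completion = df["completed_date"].notna()

    # Completion timestamps should exist exactly for finished exams.
    if (done != has_completion).any():
        problems.append("completed_date does not match status")
    # Ages are clamped to [1, 95] by the generator.
    if not df["age"].between(1, 95).all():
        problems.append("age outside [1, 95]")
    # A radiologist is only assigned once an exam is finished.
    if (df["radiologist"].notna() & ~done).any():
        problems.append("radiologist assigned to unfinished exam")
    return problems

# Hypothetical sample: the second row deliberately breaks two of the rules.
sample = pd.DataFrame({
    "status": ["completed", "scheduled"],
    "completed_date": [pd.Timestamp("2026-01-15 10:30"), pd.Timestamp("2026-01-16 09:00")],
    "age": [63, 120],
    "radiologist": ["Dr. Example", None],
})
violations = check_exam_invariants(sample)
print(violations)
```

Running the same helper on the generated `df` should return an empty list if the generator behaves as described.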


## 📊 Basic Statistics & Overview

In [4]:
# Summary statistics
print("="*60)
print("EXAMINATION OVERVIEW")
print("="*60)

print(f"\n📅 Date Range: {df['scheduled_date'].min().date()} to {df['scheduled_date'].max().date()}")
print(f"\n👥 Total Examinations: {len(df)}")
print(f"Unique Patients: {df['patient_id'].nunique()}")
print(f"\n⚡ Status Distribution:")
print(df['status'].value_counts())
print(f"\n🎯 Priority Distribution:")
print(df['priority'].value_counts())
print(f"\n⚠️ Abnormal Findings: {df['abnormal'].sum()} ({df['abnormal'].mean()*100:.1f}%)")
print(f"\n🏥 Examination Modalities:")
print(df['modality'].value_counts())

# Detailed statistics
print("\n" + "="*60)
print("PATIENT DEMOGRAPHICS")
print("="*60)
print(f"\nAge Statistics:")
print(df['age'].describe())
print(f"\nGender Distribution:")
print(df['gender'].value_counts())

EXAMINATION OVERVIEW

📅 Date Range: 2026-01-11 to 2026-02-10

👥 Total Examinations: 500
Unique Patients: 500

⚡ Status Distribution:
status
scheduled        144
completed        125
in_progress      122
results_ready    109
Name: count, dtype: int64

🎯 Priority Distribution:
priority
routine     278
urgent      159
critical     63
Name: count, dtype: int64

⚠️ Abnormal Findings: 98 (19.6%)

🏥 Examination Modalities:
modality
MRI            154
CT Scan        135
X-Ray          109
Mammography     65
Ultrasound      37
Name: count, dtype: int64

PATIENT DEMOGRAPHICS

Age Statistics:
count    500.000000
mean      54.526000
std       17.296811
min        1.000000
25%       42.000000
50%       55.000000
75%       66.000000
max       95.000000
Name: age, dtype: float64

Gender Distribution:
gender
Other    178
M        161
F        161
Name: count, dtype: int64
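
The observed 19.6% abnormal rate can be compared with what the generator implies. Because exam types are picked uniformly, the expected overall rate is the plain mean of the ten per-type rates. The sketch below works through that arithmetic; the standard-error comparison is an added illustration and assumes records are independent.

```python
import math

# Per-exam-type abnormal rates copied from the generator's exam_types dict.
abnormal_rates = [0.15, 0.20, 0.25, 0.30, 0.28, 0.22, 0.26, 0.24, 0.18, 0.12]

# Exam types are chosen uniformly, so the expected overall rate is the mean.
expected_rate = sum(abnormal_rates) / len(abnormal_rates)

n_records = 500
observed_rate = 98 / n_records  # 98 abnormal findings in the overview output

# Standard error of a proportion under the expected rate.
std_error = math.sqrt(expected_rate * (1 - expected_rate) / n_records)
z_score = (observed_rate - expected_rate) / std_error

print(f"expected={expected_rate:.3f} observed={observed_rate:.3f} z={z_score:.2f}")
```

The expected rate works out to 22%, and the observed value sits a little over one standard error below it, which is consistent with ordinary sampling variation.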


## 📈 Interactive Visualizations

In [5]:
# Create subplot figure
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=('Examination Status Distribution', 'Priority Levels',
                    'Modality Distribution', 'Abnormal Findings by Priority'),
    specs=[[{'type':'bar'}, {'type':'pie'}],
           [{'type':'bar'}, {'type':'bar'}]]
)

# Status distribution
status_counts = df['status'].value_counts()
colors_status = {'scheduled': '#64748b', 'in_progress': '#3b82f6',
                'completed': '#f59e0b', 'results_ready': '#10b981'}
fig.add_trace(
    go.Bar(x=status_counts.index, y=status_counts.values,
           marker_color=[colors_status.get(s, 'gray') for s in status_counts.index],
           name='Status'),
    row=1, col=1
)

# Priority pie chart
priority_counts = df['priority'].value_counts()
colors_priority = {'critical': '#dc2626', 'urgent': '#f59e0b', 'routine': '#64748b'}
fig.add_trace(
    go.Pie(labels=priority_counts.index, values=priority_counts.values,
           marker_colors=[colors_priority.get(p, 'gray') for p in priority_counts.index],
           name='Priority'),
    row=1, col=2
)

# Modality distribution
modality_counts = df['modality'].value_counts()
fig.add_trace(
    go.Bar(x=modality_counts.index, y=modality_counts.values,
           marker_color='#06b6d4', name='Modality'),
    row=2, col=1
)

# Abnormal findings by priority
abnormal_by_priority = df.groupby('priority')['abnormal'].sum()
fig.add_trace(
    go.Bar(x=abnormal_by_priority.index, y=abnormal_by_priority.values,
           marker_color='#dc2626', name='Abnormal'),
    row=2, col=2
)

fig.update_layout(height=800, showlegend=False,
                  title_text="Medical Imaging Dashboard Overview")
fig.show()

In [6]:
# Timeline visualization
df_sorted = df.sort_values('scheduled_date')
df_sorted['date'] = df_sorted['scheduled_date'].dt.date

# Exams per day
exams_per_day = df_sorted.groupby('date').size().reset_index(name='count')

fig = go.Figure()
fig.add_trace(go.Scatter(
    x=exams_per_day['date'],
    y=exams_per_day['count'],
    mode='lines+markers',
    name='Examinations',
    line=dict(color='#3b82f6', width=2),
    marker=dict(size=8)
))

fig.update_layout(
    title='Daily Examination Volume',
    xaxis_title='Date',
    yaxis_title='Number of Examinations',
    height=400,
    hovermode='x unified'
)
fig.show()

In [7]:
# Age distribution with abnormal findings
fig = go.Figure()

# Normal findings
fig.add_trace(go.Histogram(
    x=df[~df['abnormal']]['age'],
    name='Normal',
    marker_color='#10b981',
    opacity=0.7,
    nbinsx=20
))

# Abnormal findings
fig.add_trace(go.Histogram(
    x=df[df['abnormal']]['age'],
    name='Abnormal',
    marker_color='#dc2626',
    opacity=0.7,
    nbinsx=20
))

fig.update_layout(
    title='Patient Age Distribution by Finding Type',
    xaxis_title='Age',
    yaxis_title='Count',
    barmode='overlay',
    height=400
)
fig.show()

In [8]:
# Turnaround time analysis
df_completed = df[df['turnaround_hours'].notna()]

fig = go.Figure()

colors_map = {'critical': '#dc2626', 'urgent': '#f59e0b', 'routine': '#64748b'}

for priority in ['routine', 'urgent', 'critical']:
    priority_hours = df_completed[df_completed['priority'] == priority]['turnaround_hours']

    fig.add_trace(go.Box(
        y=priority_hours,
        name=priority.capitalize(),
        marker_color=colors_map[priority]
    ))

fig.update_layout(
    title='Turnaround Time by Priority Level',
    yaxis_title='Hours',
    height=400,
    showlegend=True
)
fig.show()

print("\n📊 Turnaround Time Statistics (hours):")
print(df_completed.groupby('priority')['turnaround_hours'].describe())


📊 Turnaround Time Statistics (hours):
          count      mean       std       min       25%       50%       75%  \
priority                                                                      
critical   29.0  0.711494  0.276855  0.183333  0.566667  0.750000  0.950000   
routine   123.0  0.702033  0.291232  0.166667  0.450000  0.716667  0.958333   
urgent     82.0  0.646138  0.282167  0.200000  0.404167  0.566667  0.929167   

               max  
priority            
critical  1.233333  
routine   1.250000  
urgent    1.233333  
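
The minimum and maximum in the table follow directly from the generator: turnaround is the exam's nominal duration plus `random.randint(-5, 15)` minutes, and priority never enters the calculation. The short check below reproduces the bounds from the durations in `exam_types`; it is a worked calculation, not part of the original notebook.

```python
# Nominal durations (minutes) copied from the generator's exam_types dict.
durations = [15, 20, 30, 35, 45, 60, 50, 55, 25, 20]
offset_min, offset_max = -5, 15  # bounds of random.randint(-5, 15)

# Shortest exam with the most negative offset, longest with the most positive.
min_hours = (min(durations) + offset_min) / 60
max_hours = (max(durations) + offset_max) / 60
print(f"turnaround range: {min_hours:.6f} h to {max_hours:.6f} h")
```

These bounds (10 and 75 minutes) match the `min` of 0.166667 and `max` of 1.25 hours reported for routine exams, and explain why the three priority groups show nearly identical distributions.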


## 🤖 Machine Learning: Priority Prediction

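Train a Random Forest classifier to predict examination priority from patient and exam characteristics. In this synthetic dataset, priority is sampled independently of every feature (fixed weights of 0.6, 0.3, and 0.1), so the model has no real signal to learn and accuracy near the majority-class rate is the expected result.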

In [9]:
# Prepare data for ML
df_ml = df.copy()

# Encode categorical variables
from sklearn.preprocessing import LabelEncoder

le_modality = LabelEncoder()
le_gender = LabelEncoder()

df_ml['modality_encoded'] = le_modality.fit_transform(df_ml['modality'])
df_ml['gender_encoded'] = le_gender.fit_transform(df_ml['gender'])

# Select features
features = ['age', 'modality_encoded', 'gender_encoded', 'time_of_day',
            'day_of_week', 'is_weekend', 'wait_time_minutes', 'abnormal']

X = df_ml[features]
y = df_ml['priority']

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

print(f"✅ Training set: {len(X_train)} samples")
print(f"✅ Test set: {len(X_test)} samples")
print(f"\nFeatures used: {features}")

✅ Training set: 400 samples
✅ Test set: 100 samples

Features used: ['age', 'modality_encoded', 'gender_encoded', 'time_of_day', 'day_of_week', 'is_weekend', 'wait_time_minutes', 'abnormal']


In [10]:
# Train Random Forest classifier
print("Training Random Forest model...\n")

clf = RandomForestClassifier(n_estimators=100, random_state=42, max_depth=10)
clf.fit(X_train, y_train)

# Make predictions
y_pred = clf.predict(X_test)

# Evaluate
print("="*60)
print("MODEL PERFORMANCE")
print("="*60)
print(f"\nAccuracy: {clf.score(X_test, y_test):.3f}")
print(f"\nClassification Report:")
print(classification_report(y_test, y_pred))

Training Random Forest model...

MODEL PERFORMANCE

Accuracy: 0.540

Classification Report:
              precision    recall  f1-score   support

    critical       0.00      0.00      0.00        15
     routine       0.61      0.84      0.70        61
      urgent       0.20      0.12      0.15        24

    accuracy                           0.54       100
   macro avg       0.27      0.32      0.29       100
weighted avg       0.42      0.54      0.47       100
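
An accuracy of 0.540 is easier to judge next to a majority-class baseline. The sketch below rebuilds the label counts from the outputs in this notebook (overall priority counts minus the test-set support in the report) and scores scikit-learn's `DummyClassifier`. The placeholder feature arrays and the variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.dummy import DummyClassifier

# Class counts taken from the notebook outputs: overall priority counts
# and the per-class support shown in the classification report.
test_support = {"critical": 15, "routine": 61, "urgent": 24}
total_counts = {"critical": 63, "routine": 278, "urgent": 159}
train_counts = {k: total_counts[k] - test_support[k] for k in total_counts}

y_train_labels = np.repeat(list(train_counts), list(train_counts.values()))
y_test_labels = np.repeat(list(test_support), list(test_support.values()))

# Features are irrelevant to a majority-class baseline, so use placeholders.
X_train_dummy = np.zeros((len(y_train_labels), 1))
X_test_dummy = np.zeros((len(y_test_labels), 1))

# Always predict the most frequent training label ("routine").
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_train_dummy, y_train_labels)
baseline_accuracy = baseline.score(X_test_dummy, y_test_labels)
print(f"Majority-class baseline accuracy: {baseline_accuracy:.2f}")
```

Always predicting `routine` scores 0.61 on this test split, so the Random Forest does not beat the trivial baseline, which is what one would expect when labels are independent of the features.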



In [11]:
# Feature importance
feature_importance = pd.DataFrame({
    'feature': features,
    'importance': clf.feature_importances_
}).sort_values('importance', ascending=False)

fig = go.Figure(go.Bar(
    x=feature_importance['importance'],
    y=feature_importance['feature'],
    orientation='h',
    marker_color='#3b82f6'
))

fig.update_layout(
    title='Feature Importance for Priority Prediction',
    xaxis_title='Importance',
    yaxis_title='Feature',
    height=400
)
fig.show()

print("\n📊 Feature Importance Rankings:")
print(feature_importance)


📊 Feature Importance Rankings:
             feature  importance
6  wait_time_minutes    0.236620
0                age    0.236015
3        time_of_day    0.194992
4        day_of_week    0.110754
1   modality_encoded    0.092062
2     gender_encoded    0.075933
7           abnormal    0.032232
5         is_weekend    0.021391
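
The ranking above should be read with care: impurity-based importances are computed on the training data and can favor features with many distinct values (such as `wait_time_minutes` and `age`) even when no feature carries signal. The sketch below illustrates the effect on purely random data and compares it with `permutation_importance` on held-out rows. The data, split sizes, and the 60/30/10 label weights are assumptions chosen to mirror the notebook's setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 300

# Two pure-noise features: one with many distinct values, one binary.
X_noise = np.column_stack([
    rng.integers(5, 61, size=n),  # high-cardinality, like wait_time_minutes
    rng.integers(0, 2, size=n),   # binary, like is_weekend
])
# Labels drawn independently of the features (assumed 60/30/10 split).
y_noise = rng.choice(["routine", "urgent", "critical"], size=n, p=[0.6, 0.3, 0.1])

X_fit, X_eval = X_noise[:200], X_noise[200:]
y_fit, y_eval = y_noise[:200], y_noise[200:]

model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_fit, y_fit)

# Impurity-based importances come from the training data and sum to 1.
print("impurity-based:", model.feature_importances_.round(3))

# Permutation importance on held-out data measures the change in accuracy
# when each feature's values are shuffled.
perm = permutation_importance(model, X_eval, y_eval, n_repeats=10, random_state=0)
print("permutation:   ", perm.importances_mean.round(3))
```

If the impurity-based scores strongly prefer the high-cardinality column while the permutation scores stay close to zero for both, the ranking reflects how the trees split rather than any predictive value.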


In [12]:
# Confusion matrix
cm = confusion_matrix(y_test, y_pred, labels=['routine', 'urgent', 'critical'])

fig = go.Figure(data=go.Heatmap(
    z=cm,
    x=['Routine', 'Urgent', 'Critical'],
    y=['Routine', 'Urgent', 'Critical'],
    colorscale='Blues',
    text=cm,
    texttemplate='%{text}',
    textfont={"size": 16}
))

fig.update_layout(
    title='Confusion Matrix - Priority Prediction',
    xaxis_title='Predicted',
    yaxis_title='Actual',
    height=500
)
fig.show()

## 🔍 Anomaly Detection: Unusual Wait Times

In [13]:
from sklearn.ensemble import IsolationForest

# Prepare data for anomaly detection
df_anomaly = df[df['wait_time_minutes'].notna()][['wait_time_minutes', 'age']].copy()

# Train Isolation Forest
iso_forest = IsolationForest(contamination=0.1, random_state=42)
df_anomaly['anomaly'] = iso_forest.fit_predict(df_anomaly)

# -1 for anomalies, 1 for normal
anomalies = df_anomaly[df_anomaly['anomaly'] == -1]

print(f"🔍 Detected {len(anomalies)} anomalous wait times out of {len(df_anomaly)} examinations")
print(f"\nAnomaly rate: {len(anomalies)/len(df_anomaly)*100:.2f}%")

# Visualize anomalies
fig = go.Figure()

# Normal points
normal = df_anomaly[df_anomaly['anomaly'] == 1]
fig.add_trace(go.Scatter(
    x=normal['age'],
    y=normal['wait_time_minutes'],
    mode='markers',
    name='Normal',
    marker=dict(color='#10b981', size=8, opacity=0.6)
))

# Anomaly points
fig.add_trace(go.Scatter(
    x=anomalies['age'],
    y=anomalies['wait_time_minutes'],
    mode='markers',
    name='Anomaly',
    marker=dict(color='#dc2626', size=12, symbol='x', line=dict(width=2))
))

fig.update_layout(
    title='Anomaly Detection: Unusual Wait Times',
    xaxis_title='Patient Age',
    yaxis_title='Wait Time (minutes)',
    height=500
)
fig.show()

🔍 Detected 50 anomalous wait times out of 500 examinations

Anomaly rate: 10.00%
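
The 10.00% anomaly rate is a consequence of `contamination=0.1`, which sets the Isolation Forest threshold so that about 10% of the training rows are flagged whatever the data looks like. A rule with a fixed, data-independent criterion gives a useful contrast. The sketch below applies the common 1.5 × IQR fence to stand-in wait times; the uniform sample is an assumption that mimics the generator's `random.randint(5, 60)`, not the notebook's actual column.

```python
import random
import statistics

random.seed(42)
# Assumed stand-in for the notebook's wait times: uniform integers in [5, 60].
wait_times = [random.randint(5, 60) for _ in range(500)]

# Quartiles and the conventional 1.5 * IQR outlier fences.
q1, _median, q3 = statistics.quantiles(wait_times, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [w for w in wait_times if w < lower_fence or w > upper_fence]
print(f"fences: [{lower_fence:.1f}, {upper_fence:.1f}], outliers: {len(outliers)}")
```

For uniformly distributed wait times the fences fall outside the 5 to 60 minute range, so this rule flags nothing. The points highlighted by the Isolation Forest are therefore the least typical combinations of age and wait time, not evidence of genuinely abnormal waits.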


## 📊 Advanced Analytics

In [14]:
# Radiologist performance analysis
radiologist_stats = df[df['radiologist'].notna()].groupby('radiologist').agg({
    'exam_id': 'count',
    'abnormal': 'sum',
    'turnaround_hours': 'mean'
}).rename(columns={
    'exam_id': 'total_exams',
    'abnormal': 'abnormal_findings',
    'turnaround_hours': 'avg_turnaround_hours'
}).sort_values('total_exams', ascending=False)

radiologist_stats['abnormal_rate'] = (radiologist_stats['abnormal_findings'] /
                                       radiologist_stats['total_exams'] * 100)

print("="*80)
print("RADIOLOGIST PERFORMANCE METRICS")
print("="*80)
print(radiologist_stats.round(2))

# Visualize
fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=('Exams per Radiologist', 'Average Turnaround Time')
)

fig.add_trace(
    go.Bar(x=radiologist_stats.index, y=radiologist_stats['total_exams'],
           marker_color='#3b82f6', name='Total Exams'),
    row=1, col=1
)

fig.add_trace(
    go.Bar(x=radiologist_stats.index, y=radiologist_stats['avg_turnaround_hours'],
           marker_color='#f59e0b', name='Avg Hours'),
    row=1, col=2
)

fig.update_xaxes(tickangle=45)
fig.update_layout(height=400, showlegend=False)
fig.show()

RADIOLOGIST PERFORMANCE METRICS
               total_exams  abnormal_findings  avg_turnaround_hours  \
radiologist                                                           
Dr. Johnson             55                 10                  0.66   
Dr. Miller              32                  4                  0.72   
Dr. Doyle               28                  3                  0.72   
Dr. Henderson           26                  8                  0.70   
Dr. Mcclain             23                  7                  0.61   
Dr. Hill                21                  3                  0.73   
Dr. Rhodes              18                  4                  0.71   
Dr. Fowler              17                  5                  0.65   
Dr. Walker              14                  5                  0.62   

               abnormal_rate  
radiologist                   
Dr. Johnson            18.18  
Dr. Miller             12.50  
Dr. Doyle              10.71  
Dr. Henderson          30.77  


In [15]:
# Time-based patterns
hourly_distribution = df.groupby('time_of_day').size()
weekly_distribution = df.groupby('day_of_week').size()

day_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=('Examinations by Hour of Day', 'Examinations by Day of Week')
)

fig.add_trace(
    go.Bar(x=hourly_distribution.index, y=hourly_distribution.values,
           marker_color='#06b6d4', name='By Hour'),
    row=1, col=1
)

fig.add_trace(
    go.Bar(x=[day_names[i] for i in weekly_distribution.index],
           y=weekly_distribution.values,
           marker_color='#8b5cf6', name='By Day'),
    row=1, col=2
)

fig.update_xaxes(title_text="Hour", row=1, col=1)
fig.update_xaxes(title_text="Day", tickangle=45, row=1, col=2)
fig.update_yaxes(title_text="Count", row=1, col=1)
fig.update_layout(height=400, showlegend=False)
fig.show()

print("\n📅 Peak Hours:")
print(f"Busiest hour: {hourly_distribution.idxmax()}:00 ({hourly_distribution.max()} exams)")
print(f"\n📅 Peak Days:")
peak_day_idx = weekly_distribution.idxmax()
print(f"Busiest day: {day_names[peak_day_idx]} ({weekly_distribution.max()} exams)")


📅 Peak Hours:
Busiest hour: 22:00 (29 exams)

📅 Peak Days:
Busiest day: Monday (90 exams)


## 📄 Generate Executive Summary Report

In [16]:
def generate_executive_summary(df):
    """
    Generate a comprehensive executive summary report.
    """
    report = f"""
{'='*80}
MEDICAL IMAGING DEPARTMENT - EXECUTIVE SUMMARY REPORT
{'='*80}

Report Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}
Period: {df['scheduled_date'].min().date()} to {df['scheduled_date'].max().date()}

{'='*80}
KEY PERFORMANCE INDICATORS
{'='*80}

📊 VOLUME METRICS
  • Total Examinations: {len(df):,}
  • Unique Patients: {df['patient_id'].nunique():,}
  • Average Daily Volume: {len(df) / df['scheduled_date'].dt.date.nunique():.1f}
  • Completed Exams: {(df['status'].isin(['completed', 'results_ready'])).sum():,} ({(df['status'].isin(['completed', 'results_ready'])).sum()/len(df)*100:.1f}%)

🚨 PRIORITY BREAKDOWN
  • Critical: {(df['priority'] == 'critical').sum()} ({(df['priority'] == 'critical').sum()/len(df)*100:.1f}%)
  • Urgent: {(df['priority'] == 'urgent').sum()} ({(df['priority'] == 'urgent').sum()/len(df)*100:.1f}%)
  • Routine: {(df['priority'] == 'routine').sum()} ({(df['priority'] == 'routine').sum()/len(df)*100:.1f}%)

⚠️ CLINICAL FINDINGS
  • Abnormal Findings: {df['abnormal'].sum()} ({df['abnormal'].mean()*100:.1f}%)
  • Normal Results: {(~df['abnormal']).sum()} ({(~df['abnormal']).sum()/len(df)*100:.1f}%)

⏱️ TURNAROUND TIME (Completed Exams)
  • Average: {df['turnaround_hours'].mean():.2f} hours
  • Median: {df['turnaround_hours'].median():.2f} hours
  • Critical Cases Avg: {df[df['priority']=='critical']['turnaround_hours'].mean():.2f} hours

🏥 MODALITY UTILIZATION
"""

    for modality, count in df['modality'].value_counts().items():
        report += f"  • {modality}: {count} ({count/len(df)*100:.1f}%)\n"

    report += f"""
{'='*80}
OPERATIONAL INSIGHTS
{'='*80}

📈 TRENDS
  • Peak Operating Hour: {df.groupby('time_of_day').size().idxmax()}:00
  • Busiest Day: {['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'][df.groupby('day_of_week').size().idxmax()]}
  • Weekend Volume: {df['is_weekend'].sum()} exams ({df['is_weekend'].sum()/len(df)*100:.1f}%)

👥 STAFFING
  • Active Radiologists: {df['radiologist'].nunique()}
  • Avg Exams per Radiologist: {df[df['radiologist'].notna()].groupby('radiologist').size().mean():.1f}

⚡ EFFICIENCY METRICS
  • Average Wait Time: {df['wait_time_minutes'].mean():.1f} minutes
  • Completed Within 24 Hours: {(df['turnaround_hours'] <= 24).sum() / df['turnaround_hours'].notna().sum() * 100:.1f}%

{'='*80}
RECOMMENDATIONS
{'='*80}

1. 📊 Resource Allocation:
   - Consider additional staffing during peak hours ({df.groupby('time_of_day').size().idxmax()}:00)
   - Weekend coverage optimization may be needed

2. ⏱️ Turnaround Time:
   - Critical cases average {df[df['priority']=='critical']['turnaround_hours'].mean():.1f}h turnaround
   - Target: < 4 hours for critical findings

3. 🔍 Quality Assurance:
   - Abnormal finding rate at {df['abnormal'].mean()*100:.1f}%
   - Continue monitoring for consistency

4. 📈 Capacity Planning:
   - Daily average: {len(df) / df['scheduled_date'].dt.date.nunique():.1f} exams
   - Consider expansion if consistently at capacity

{'='*80}
END OF REPORT
{'='*80}
"""

    return report

# Generate and display report
df['date'] = df['scheduled_date'].dt.date
executive_summary = generate_executive_summary(df)
print(executive_summary)


MEDICAL IMAGING DEPARTMENT - EXECUTIVE SUMMARY REPORT

Report Generated: 2026-02-10 18:35:14
Period: 2026-01-11 to 2026-02-10

KEY PERFORMANCE INDICATORS

📊 VOLUME METRICS
  • Total Examinations: 500
  • Unique Patients: 500
  • Average Daily Volume: 16.1
  • Completed Exams: 234 (46.8%)

🚨 PRIORITY BREAKDOWN
  • Critical: 63 (12.6%)
  • Urgent: 159 (31.8%)
  • Routine: 278 (55.6%)

⚠️ CLINICAL FINDINGS
  • Abnormal Findings: 98 (19.6%)
  • Normal Results: 402 (80.4%)

⏱️ TURNAROUND TIME (Completed Exams)
  • Average: 0.68 hours
  • Median: 0.65 hours
  • Critical Cases Avg: 0.71 hours

🏥 MODALITY UTILIZATION
  • MRI: 154 (30.8%)
  • CT Scan: 135 (27.0%)
  • X-Ray: 109 (21.8%)
  • Mammography: 65 (13.0%)
  • Ultrasound: 37 (7.4%)

OPERATIONAL INSIGHTS

📈 TRENDS
  • Peak Operating Hour: 22:00
  • Busiest Day: Monday
  • Weekend Volume: 132 exams (26.4%)

👥 STAFFING
  • Active Radiologists: 9
  • Avg Exams per Radiologis

## 💾 Export Data & Results

In [17]:
# Export to CSV
df.to_csv('medical_imaging_examinations.csv', index=False)
print("✅ Data exported to 'medical_imaging_examinations.csv'")

# Export summary statistics
summary_stats = pd.DataFrame({
    'Metric': [
        'Total Examinations',
        'Unique Patients',
        'Critical Priority',
        'Abnormal Findings',
        'Avg Turnaround (hours)',
        'Completion Rate'
    ],
    'Value': [
        len(df),
        df['patient_id'].nunique(),
        (df['priority'] == 'critical').sum(),
        df['abnormal'].sum(),
        f"{df['turnaround_hours'].mean():.2f}",
        f"{(df['status'].isin(['completed', 'results_ready'])).sum()/len(df)*100:.1f}%"
    ]
})

summary_stats.to_csv('summary_statistics.csv', index=False)
print("✅ Summary statistics exported to 'summary_statistics.csv'")

# Save executive report
with open('executive_summary.txt', 'w', encoding='utf-8') as f:
    f.write(executive_summary)
print("✅ Executive summary saved to 'executive_summary.txt'")

print("\n📦 All files ready for download!")

✅ Data exported to 'medical_imaging_examinations.csv'
✅ Summary statistics exported to 'summary_statistics.csv'
✅ Executive summary saved to 'executive_summary.txt'

📦 All files ready for download!
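
CSV files do not store column types, so the datetime columns come back as plain strings unless they are parsed on load. The sketch below shows one way to restore them with `parse_dates`. It reads a small in-memory stand-in instead of the exported file, and the two sample rows are assumptions made for illustration.

```python
import io
import pandas as pd

# Small stand-in for the exported file; the second exam is not yet completed.
csv_text = (
    "exam_id,scheduled_date,completed_date\n"
    "IMG-2024-0001,2026-01-15 10:00:00,2026-01-15 10:40:00\n"
    "IMG-2024-0002,2026-01-16 09:00:00,\n"
)

# parse_dates converts the listed columns to datetimes; blanks become NaT.
restored = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["scheduled_date", "completed_date"],
)
print(restored.dtypes)

# Turnaround can be recomputed once the columns are datetimes again.
turnaround_hours = (
    (restored["completed_date"] - restored["scheduled_date"]).dt.total_seconds() / 3600
)
print(turnaround_hours.tolist())
```

With the real export, the same call would take the path `'medical_imaging_examinations.csv'` in place of the `StringIO` object.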
