# Pathology Analysis Tutorial

This notebook demonstrates how to use the `onem_path` module for dual-mode pathology image analysis: traditional CellProfiler-based radiomics and deep transfer learning with the TITAN model.

## 📋 Table of Contents
1. [Setup and Imports](#setup)
2. [CellProfiler Feature Extraction](#cellprofiler)
3. [TITAN Deep Learning Features](#titan)
4. [Combined Feature Analysis](#combined)
5. [WSI Processing](#wsi)
6. [Feature Fusion and Selection](#fusion)
7. [Comparative Analysis](#comparison)

## 🔧 Setup and Imports {#setup}

In [None]:
# Core imports
import os
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Image processing imports
from PIL import Image
import cv2
import openslide

# Add project root to path
project_root = Path().absolute().parent
sys.path.append(str(project_root))

# Import onem_path modules
from onem_path import PathologyAnalyzer
from onem_path.extractors.cellprofiler_extractor import CellProfilerExtractor
from onem_path.extractors.titan_extractor import TITANExtractor
from onem_path.config.settings import get_preset_config

# Set plotting style
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")

print("✅ All modules imported successfully!")
print(f"Project root: {project_root}")

## 🧪 CellProfiler Feature Extraction {#cellprofiler}

In [None]:
# Initialize pathology analyzer
analyzer = PathologyAnalyzer()
print("🧪 Pathology analyzer initialized")

# Setup paths
image_dir = "sample_data/pathology_images/"
output_dir = "output/pathology_features/"

# Create output directory
os.makedirs(output_dir, exist_ok=True)

# Check if sample data exists
if os.path.exists(image_dir):
    print(f"📁 Processing pathology images from: {image_dir}")
    
    # Extract CellProfiler features
    print("🚀 Extracting CellProfiler features...")
    
    cp_features = analyzer.extract_features(
        image_dir=image_dir,
        method='cellprofiler',
        config_name='default',
        output_csv=os.path.join(output_dir, 'cellprofiler_features.csv'),
        parallel=True,
        n_workers=4
    )
    
    print("✅ CellProfiler extraction completed!")
    print(f"📊 Features saved to: {os.path.join(output_dir, 'cellprofiler_features.csv')}")
    
    if cp_features:
        # Convert to DataFrame for analysis
        cp_df = pd.DataFrame(cp_features).T  # Transpose to have features as columns
        cp_df.index.name = 'ImageID'
        cp_df.reset_index(inplace=True)
        
        print(f"📈 Processed {len(cp_df)} images")
        print(f"📋 Extracted {len(cp_df.columns) - 1} features")  # -1 for ImageID
        
        # Display feature categories
        feature_categories = {
            'Nuclear Features': [col for col in cp_df.columns if 'nuclear' in col.lower()],
            'Cellular Features': [col for col in cp_df.columns if 'cell' in col.lower()],
            'Texture Features': [col for col in cp_df.columns if any(tex in col.lower() 
                               for tex in ['texture', 'haralick', 'glcm'])],
            'Morphological': [col for col in cp_df.columns if any(morph in col.lower() 
                              for morph in ['area', 'perimeter', 'shape', 'circularity'])]
        }
        
        print("\n🔍 Feature Categories:")
        for category, features in feature_categories.items():
            print(f"  {category}: {len(features)} features")
            if features:
                print(f"    Sample: {features[:3]}...")
else:
    print(f"⚠️  Sample directory not found: {image_dir}")
    print("Please replace with your actual pathology image directory")
    
# Create dummy CellProfiler data for demonstration
# (note: this overwrites any cp_df built from real images above)
print("\n🎭 Creating dummy CellProfiler features for demonstration...")
dummy_cp_data = {
    'Image_001': {
        'nuclear_area_mean': 45.2, 'nuclear_perimeter_mean': 23.8, 'nuclear_circularity_mean': 0.78,
        'cell_area_mean': 89.5, 'cell_perimeter_mean': 34.2, 'cell_eccentricity_mean': 0.65,
        'texture_glcm_contrast': 0.23, 'texture_glcm_homogeneity': 0.87, 'texture_glcm_entropy': 1.45,
        'morphological_solidity': 0.92, 'morphological_extent': 0.68, 'morphological_aspect_ratio': 1.23
    },
    'Image_002': {
        'nuclear_area_mean': 52.8, 'nuclear_perimeter_mean': 25.4, 'nuclear_circularity_mean': 0.81,
        'cell_area_mean': 95.3, 'cell_perimeter_mean': 36.8, 'cell_eccentricity_mean': 0.59,
        'texture_glcm_contrast': 0.31, 'texture_glcm_homogeneity': 0.82, 'texture_glcm_entropy': 1.67,
        'morphological_solidity': 0.89, 'morphological_extent': 0.71, 'morphological_aspect_ratio': 1.18
    },
    'Image_003': {
        'nuclear_area_mean': 38.9, 'nuclear_perimeter_mean': 21.2, 'nuclear_circularity_mean': 0.74,
        'cell_area_mean': 78.6, 'cell_perimeter_mean': 30.5, 'cell_eccentricity_mean': 0.71,
        'texture_glcm_contrast': 0.18, 'texture_glcm_homogeneity': 0.91, 'texture_glcm_entropy': 1.23,
        'morphological_solidity': 0.94, 'morphological_extent': 0.65, 'morphological_aspect_ratio': 1.35
    }
}

cp_df = pd.DataFrame.from_dict(dummy_cp_data, orient='index')
cp_df.index.name = 'ImageID'
cp_df.reset_index(inplace=True)

print(f"📊 Dummy CellProfiler features created: {len(cp_df)} images, {len(cp_df.columns) - 1} features")
print("\n👀 Sample CellProfiler features:")
display(cp_df.head())
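
The substring matching used for the feature categories above can place one column in several categories (for example, `nuclear_area_mean` contains both `nuclear` and `area`). Below is a minimal sketch of a mutually exclusive grouping, assuming the `<group>_<measurement>` naming used in the dummy data. The helper name `group_by_prefix` is hypothetical and not part of `onem_path`.

```python
from collections import defaultdict


def group_by_prefix(columns, id_column="ImageID"):
    """Group column names by the text before the first underscore."""
    groups = defaultdict(list)
    for col in columns:
        if col == id_column:
            continue  # the identifier is not a feature
        prefix = str(col).split("_", 1)[0]
        groups[prefix].append(col)
    return dict(groups)


# Made-up column names that follow the dummy data's naming pattern
example_columns = [
    "ImageID",
    "nuclear_area_mean", "nuclear_perimeter_mean",
    "cell_area_mean",
    "texture_glcm_contrast",
    "morphological_solidity",
]
groups = group_by_prefix(example_columns)
for prefix, cols in groups.items():
    print(f"{prefix}: {len(cols)} feature(s)")
```

Each column lands in exactly one group, so the per-category counts add up to the total number of features.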

## 🤖 TITAN Deep Learning Features {#titan}

In [None]:
# Extract TITAN deep learning features
if os.path.exists(image_dir):
    print("🚀 Extracting TITAN deep learning features...")
    
    titan_features = analyzer.extract_features(
        image_dir=image_dir,
        method='titan',
        config_name='titan_pretrained',
        output_csv=os.path.join(output_dir, 'titan_features.csv'),
        model_type='resnet50',  # or 'efficientnet', 'vit', etc.
        feature_layer='last_conv',  # Extract from last convolutional layer
        parallel=True,
        n_workers=2  # TITAN uses more memory, so fewer workers
    )
    
    print("✅ TITAN extraction completed!")
    print(f"📊 Features saved to: {os.path.join(output_dir, 'titan_features.csv')}")
    
    if titan_features:
        # Convert to DataFrame
        titan_df = pd.DataFrame(titan_features).T
        titan_df.index.name = 'ImageID'
        titan_df.reset_index(inplace=True)
        
        print(f"📈 Processed {len(titan_df)} images")
        print(f"📋 Extracted {len(titan_df.columns) - 1} deep features")
        
        # Analyze deep feature dimensions
        feature_cols = [col for col in titan_df.columns if col != 'ImageID']
        print("\n🔍 Deep Feature Analysis:")
        print(f"  Feature dimension: {len(feature_cols)}")
        print(f"  Feature range: [{titan_df[feature_cols].min().min():.4f}, {titan_df[feature_cols].max().max():.4f}]")
        print(f"  Feature mean: {titan_df[feature_cols].mean().mean():.4f}")
        print(f"  Feature std: {titan_df[feature_cols].std().mean():.4f}")
else:
    print(f"⚠️  Sample directory not found: {image_dir}")
    
# Create dummy TITAN features for demonstration
print("\n🎭 Creating dummy TITAN features for demonstration...")
np.random.seed(42)  # For reproducibility

dummy_titan_data = {}
feature_dim = 2048  # Typical ResNet50 feature dimension

for i, image_id in enumerate(cp_df['ImageID']):
    # Create realistic deep features with some structure
    base_features = np.random.randn(feature_dim) * 0.1
    
    # Add some image-specific patterns
    if i == 0:  # Image_001
        base_features[:100] += 0.3
        base_features[100:200] -= 0.2
    elif i == 1:  # Image_002
        base_features[:100] -= 0.2
        base_features[200:300] += 0.4
    else:  # Image_003
        base_features[300:400] += 0.3
        base_features[400:500] -= 0.1
    
    dummy_titan_data[image_id] = base_features.tolist()

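titan_df = pd.DataFrame.from_dict(dummy_titan_data, orient='index')
# Use string column names so the deep features have the same name type as the CellProfiler columns
titan_df.columns = [f'titan_{j:04d}' for j in range(feature_dim)]
titan_df.index.name = 'ImageID'
titan_df.reset_index(inplace=True)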

print(f"📊 Dummy TITAN features created: {len(titan_df)} images, {len(titan_df.columns) - 1} features")
print("\n👀 Sample TITAN features (first 10 dimensions):")
display(titan_df[['ImageID'] + [col for col in titan_df.columns if col != 'ImageID'][:10]].head())
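
Deep feature vectors are often compared with cosine similarity, which looks at direction rather than magnitude. The sketch below computes pairwise cosine similarity with NumPy on small made-up vectors. The helper name `cosine_similarity_matrix` is an illustrative assumption, not an `onem_path` function.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Return the pairwise cosine similarity of the rows of `features`."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows
    unit = features / np.clip(norms, 1e-12, None)
    return unit @ unit.T


# Toy vectors standing in for per-image deep features
toy_features = np.array([
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],   # same direction as row 0
    [0.0, 3.0, 0.0],   # orthogonal to rows 0 and 1
])
similarity = cosine_similarity_matrix(toy_features)
print(np.round(similarity, 3))
```

The same function could be applied to the numeric columns of `titan_df` (everything except `ImageID`) to see which images have similar embeddings.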

## 🔗 Combined Feature Analysis {#combined}

In [None]:
# Combine CellProfiler and TITAN features
if 'cp_df' in locals() and 'titan_df' in locals():
    print("🔗 Combining CellProfiler and TITAN features...")
    
    # Merge on ImageID. Note: `suffixes` only renames columns whose names
    # collide, so the feature lists below are taken from the source frames.
    combined_df = pd.merge(cp_df, titan_df, on='ImageID', suffixes=('_cp', '_titan'))
    
    print("✅ Features combined successfully!")
    print(f"📊 Combined dataset: {len(combined_df)} images, {len(combined_df.columns) - 1} total features")
    
    # Separate feature types
    cp_features = [col for col in cp_df.columns if col in combined_df.columns]  # includes ImageID
    titan_features = [col for col in titan_df.columns if col != 'ImageID' and col in combined_df.columns]
    
    print("\n🔍 Feature Breakdown:")
    print(f"  CellProfiler features: {len(cp_features) - 1}")  # -1 for ImageID
    print(f"  TITAN features: {len(titan_features)}")
    print(f"  Total features: {len(combined_df.columns) - 1}")
    
    # Feature statistics comparison
    print("\n📈 Feature Statistics Comparison:")
    
    # CellProfiler stats
    cp_cols = [col for col in cp_features if col != 'ImageID']
    if cp_cols:
        cp_stats = {
            'Mean': combined_df[cp_cols].mean().mean(),
            'Std': combined_df[cp_cols].std().mean(),
            'Min': combined_df[cp_cols].min().min(),
            'Max': combined_df[cp_cols].max().max()
        }
        print(f"\n  CellProfiler Features:")
        for stat, value in cp_stats.items():
            print(f"    {stat}: {value:.4f}")
    
    # TITAN stats
    if titan_features:
        titan_stats = {
            'Mean': combined_df[titan_features].mean().mean(),
            'Std': combined_df[titan_features].std().mean(),
            'Min': combined_df[titan_features].min().min(),
            'Max': combined_df[titan_features].max().max()
        }
        print(f"\n  TITAN Features:")
        for stat, value in titan_stats.items():
            print(f"    {stat}: {value:.4f}")
    
    # Correlation between feature types
    print("\n🔗 Cross-Feature Type Correlation:")
    
    # Compute average correlation between CP and TITAN features
    if cp_cols and titan_features:
        cross_correlations = []
        for cp_col in cp_cols[:5]:  # Sample to avoid computation explosion
            for titan_col in titan_features[:5]:
                corr = combined_df[cp_col].corr(combined_df[titan_col])
                cross_correlations.append(corr)
        
        avg_cross_corr = np.mean(cross_correlations)
        print(f"  Average CP-TITAN correlation: {avg_cross_corr:.4f}")
else:
    print("⚠️  Cannot combine features - missing data")
    
# Save combined features
if 'combined_df' in locals():
    combined_output = os.path.join(output_dir, 'combined_pathology_features.csv')
    combined_df.to_csv(combined_output, index=False)
    print(f"\n💾 Combined features saved to: {combined_output}")
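
`pd.merge` applies `suffixes` only to non-key columns whose names appear in both frames, so suffixes are not a reliable way to tell feature sources apart. The small sketch below uses made-up column names and tags each source explicitly with `add_prefix` before joining.

```python
import pandas as pd

left = pd.DataFrame({"ImageID": ["a", "b"], "area": [1.0, 2.0], "score": [0.1, 0.2]})
right = pd.DataFrame({"ImageID": ["a", "b"], "score": [0.9, 0.8], "depth": [5, 6]})

# Only the colliding column "score" receives a suffix
merged = pd.merge(left, right, on="ImageID", suffixes=("_cp", "_titan"))
print(list(merged.columns))

# Prefixing every feature column makes the source explicit
left_tagged = left.set_index("ImageID").add_prefix("cp_")
right_tagged = right.set_index("ImageID").add_prefix("titan_")
tagged = left_tagged.join(right_tagged, how="inner").reset_index()
print(list(tagged.columns))
```

With prefixed names, selecting one feature family becomes a simple `str.startswith` filter on the column index.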

## 🔬 WSI Processing {#wsi}

In [None]:
# Demonstrate Whole Slide Image processing
wsi_path = "sample_data/pathology_wsi/sample_slide.svs"
wsi_output_dir = "output/wsi_analysis/"

os.makedirs(wsi_output_dir, exist_ok=True)

# Check if WSI file exists
if os.path.exists(wsi_path):
    print(f"🔬 Processing Whole Slide Image: {wsi_path}")
    
    try:
        # Open the WSI
        slide = openslide.OpenSlide(wsi_path)
        
        # Get slide information
        slide_info = {
            'dimensions': slide.dimensions,
            'level_count': slide.level_count,
            'level_downsamples': slide.level_downsamples,
            'vendor': slide.properties.get(openslide.PROPERTY_NAME_VENDOR, 'Unknown'),
            'magnification': slide.properties.get(openslide.PROPERTY_NAME_OBJECTIVE_POWER, 'Unknown')
        }
        
        print("\n📊 Slide Information:")
        for key, value in slide_info.items():
            print(f"  {key}: {value}")
        
        # Extract patches for analysis
        patch_size = 512
        patch_overlap = 128
        analysis_level = 2  # Use a mid-resolution level for analysis
        
        print(f"\n🔍 Extracting patches (size: {patch_size}x{patch_size}, overlap: {patch_overlap})")
        
        # Calculate patch positions
        level_dimensions = slide.level_dimensions[analysis_level]
        step_size = patch_size - patch_overlap
        
        # read_region expects integer level-0 coordinates, while
        # level_downsamples holds floats, so scale and cast explicitly
        downsample = slide.level_downsamples[analysis_level]
        
        patches = []
        for y in range(0, level_dimensions[1] - patch_size + 1, step_size):
            for x in range(0, level_dimensions[0] - patch_size + 1, step_size):
                # Extract patch
                patch = slide.read_region(
                    (int(x * downsample), int(y * downsample)),
                    analysis_level,
                    (patch_size, patch_size)
                )
                patches.append({
                    'x': x, 'y': y,
                    'patch': patch,
                    'patch_rgb': patch.convert('RGB')
                })
        
        print(f"✅ Extracted {len(patches)} patches")
        
        # Analyze first few patches with both methods
        print("\n🔬 Analyzing patches with both methods...")
        
        sample_patches = patches[:5]  # Analyze first 5 patches
        wsi_results = []
        
        for i, patch_data in enumerate(sample_patches):
            # Save patch temporarily
            patch_path = os.path.join(wsi_output_dir, f'patch_{i:03d}.png')
            patch_data['patch_rgb'].save(patch_path)
            
            # Extract features using both methods
            cp_features = analyzer.extract_features(
                image_dir=wsi_output_dir,
                method='cellprofiler',
                config_name='nuclear_focused',
                file_pattern=f'patch_{i:03d}.png'
            )
            
            titan_features = analyzer.extract_features(
                image_dir=wsi_output_dir,
                method='titan',
                config_name='titan_pretrained',
                file_pattern=f'patch_{i:03d}.png'
            )
            
            wsi_results.append({
                'patch_id': i,
                'x': patch_data['x'],
                'y': patch_data['y'],
                'cp_features': cp_features,
                'titan_features': titan_features
            })
            
            # Clean up temporary patch file
            if os.path.exists(patch_path):
                os.remove(patch_path)
        
        print(f"✅ WSI patch analysis completed for {len(wsi_results)} patches")
        
        # Close slide
        slide.close()
        
    except Exception as e:
        print(f"❌ Error processing WSI: {e}")
else:
    print(f"⚠️  WSI file not found: {wsi_path}")
    print("This is a demonstration - replace with your actual SVS file path")
    
# Create dummy WSI results for demonstration
print("\n🎭 Creating dummy WSI patch analysis for demonstration...")

dummy_wsi_results = [
    {'patch_id': 0, 'x': 0, 'y': 0, 'cp_nuclear_count': 45, 'titan_class': 'epithelial'},
    {'patch_id': 1, 'x': 384, 'y': 0, 'cp_nuclear_count': 32, 'titan_class': 'stromal'},
    {'patch_id': 2, 'x': 768, 'y': 0, 'cp_nuclear_count': 67, 'titan_class': 'tumor'},
    {'patch_id': 3, 'x': 0, 'y': 384, 'cp_nuclear_count': 28, 'titan_class': 'stromal'},
    {'patch_id': 4, 'x': 384, 'y': 384, 'cp_nuclear_count': 53, 'titan_class': 'tumor'}
]

wsi_df = pd.DataFrame(dummy_wsi_results)
print(f"📊 Dummy WSI analysis created: {len(wsi_df)} patches")
print("\n👀 Sample WSI patch analysis:")
display(wsi_df)
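
The patch grid depends only on the level dimensions, patch size, and overlap, so it can be checked without opening a slide. Below is a minimal sketch in plain Python. `patch_origins` is a hypothetical helper, and the 1280x896 level size and downsample factor of 4 are made up for illustration.

```python
def patch_origins(level_width, level_height, patch_size, overlap, downsample=1.0):
    """Yield (x, y, x0, y0): level coordinates and integer level-0 coordinates."""
    step = patch_size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than patch_size")
    for y in range(0, level_height - patch_size + 1, step):
        for x in range(0, level_width - patch_size + 1, step):
            # Level-0 coordinates must be integers for a region read
            yield x, y, int(x * downsample), int(y * downsample)


origins = list(patch_origins(1280, 896, patch_size=512, overlap=128, downsample=4.0))
for x, y, x0, y0 in origins:
    print(f"level ({x}, {y}) -> level 0 ({x0}, {y0})")
```

With a 512-pixel patch and a 128-pixel overlap the stride is 384, which matches the `x` and `y` values in the dummy patch table above.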

## 🎯 Feature Fusion and Selection {#fusion}

In [None]:
# Advanced feature fusion techniques
if 'combined_df' in locals():
    print("🎯 Performing feature fusion and selection...")
    
    from sklearn.feature_selection import SelectKBest, f_classif, mutual_info_classif
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler
    
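    # Separate features
    feature_cols = [col for col in combined_df.columns if col != 'ImageID']
    X = combined_df[feature_cols].copy()
    # Recent scikit-learn versions reject DataFrames with mixed str/int column names
    X.columns = X.columns.astype(str)
    
    # 1. Statistical feature selection
    print("\n📊 Statistical Feature Selection:")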
    
    # Remove features with low variance
    variance_threshold = 0.01
    low_variance_features = X.columns[X.var() < variance_threshold].tolist()
    print(f"  Low variance features (< {variance_threshold}): {len(low_variance_features)}")
    
    # Remove highly correlated features
    corr_matrix = X.corr().abs()
    upper_triangle = corr_matrix.where(np.triu(np.ones(corr_matrix.shape), k=1).astype(bool))
    high_corr_features = [col for col in upper_triangle.columns if any(upper_triangle[col] > 0.95)]
    print(f"  Highly correlated features (r > 0.95): {len(high_corr_features)}")
    
    # 2. Dimensionality reduction with PCA
    print("\n🔍 PCA Dimensionality Reduction:")
    
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)
    
    # Determine optimal number of components (95% variance)
    pca_full = PCA()
    pca_full.fit(X_scaled)
    
    cumsum_variance = np.cumsum(pca_full.explained_variance_ratio_)
    n_components_95 = np.argmax(cumsum_variance >= 0.95) + 1
    
    print(f"  Original features: {len(feature_cols)}")
    print(f"  Components for 95% variance: {n_components_95}")
    print(f"  Dimensionality reduction: {(1 - n_components_95/len(feature_cols))*100:.1f}%")
    
    # Apply PCA with optimal components
    pca = PCA(n_components=n_components_95)
    X_pca = pca.fit_transform(X_scaled)
    
    # Create DataFrame with PCA features
    pca_feature_names = [f'PCA_{i+1}' for i in range(n_components_95)]
    pca_df = pd.DataFrame(X_pca, columns=pca_feature_names)
    pca_df['ImageID'] = combined_df['ImageID']
    
    # 3. Feature importance analysis (if we had labels)
    print("\n🎯 Feature Importance Analysis:")
    print("  (Note: Would require class labels for supervised selection)")
    
    # Analyze PCA components
    print("\n📈 PCA Component Analysis (first 5 components):")
    for i in range(min(5, n_components_95)):
        explained_var = pca.explained_variance_ratio_[i]
        print(f"  PCA_{i+1}: {explained_var:.4f} ({explained_var*100:.2f}% variance)")
else:
    print("⚠️  No combined features available for fusion analysis")
    
# Save fused features
if 'pca_df' in locals():
    fused_output = os.path.join(output_dir, 'fused_pathology_features.csv')
    pca_df.to_csv(fused_output, index=False)
    print(f"\n💾 Fused features saved to: {fused_output}")
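
The cell above only counts low-variance and highly correlated features; it does not remove them. Below is a minimal sketch of applying both filters, using the same thresholds on a made-up table. `filter_features` is a hypothetical helper, not part of `onem_path`.

```python
import numpy as np
import pandas as pd


def filter_features(X, variance_threshold=0.01, corr_threshold=0.95):
    """Drop low-variance columns, then one column of each highly correlated pair."""
    X = X.loc[:, X.var() >= variance_threshold]
    corr = X.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > corr_threshold).any()]
    return X.drop(columns=to_drop)


toy = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.0],     # perfectly correlated with "a"
    "c": [1.0, -1.0, -1.0, 1.0],   # uncorrelated with "a"
    "d": [5.0, 5.0, 5.0, 5.0],     # zero variance
})
filtered = filter_features(toy)
print(list(filtered.columns))
```

Of two correlated columns, the later one is dropped. Filtering before scaling and PCA keeps constant columns out of the decomposition.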

## 📊 Comparative Analysis {#comparison}

In [None]:
# Create comprehensive comparative visualizations
if 'cp_df' in locals() and 'titan_df' in locals():
    print("📊 Creating comparative analysis visualizations...")
    
    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
    fig.suptitle('Pathology Feature Analysis Comparison', fontsize=16, fontweight='bold')
    
    # 1. Feature count comparison
    cp_feature_count = len([col for col in cp_df.columns if col != 'ImageID'])
    titan_feature_count = len([col for col in titan_df.columns if col != 'ImageID'])
    
    feature_counts = [cp_feature_count, titan_feature_count]
    feature_labels = ['CellProfiler', 'TITAN']
    
    axes[0, 0].bar(feature_labels, feature_counts, color=['skyblue', 'lightcoral'], alpha=0.7)
    axes[0, 0].set_title('Feature Count Comparison')
    axes[0, 0].set_ylabel('Number of Features')
    axes[0, 0].grid(True, alpha=0.3)
    
    # Add count labels on bars
    for i, count in enumerate(feature_counts):
        axes[0, 0].text(i, count + max(feature_counts)*0.01, str(count), 
                       ha='center', fontweight='bold')
    
    # 2. Feature distribution comparison
    cp_cols = [col for col in cp_df.columns if col != 'ImageID']
    titan_cols = [col for col in titan_df.columns if col != 'ImageID']
    
    if cp_cols and titan_cols:
        # Sample features for visualization
        cp_sample = cp_df[cp_cols[:5]].values.flatten()
        titan_sample = titan_df[titan_cols[:100]].values.flatten()  # Sample more TITAN features
        
        axes[0, 1].hist(cp_sample, bins=30, alpha=0.7, label='CellProfiler', 
                       color='skyblue', density=True)
        axes[0, 1].hist(titan_sample, bins=30, alpha=0.7, label='TITAN', 
                       color='lightcoral', density=True)
        axes[0, 1].set_title('Feature Value Distribution')
        axes[0, 1].set_xlabel('Feature Value')
        axes[0, 1].set_ylabel('Density')
        axes[0, 1].legend()
        axes[0, 1].grid(True, alpha=0.3)
    
    # 3. Correlation heatmap (subset)
    if 'combined_df' in locals():
        # Create correlation matrix with sample features
        sample_features = []
        sample_features.extend(cp_cols[:3])
        sample_features.extend(titan_cols[:7])  # Total 10 features
        
        if len(sample_features) >= 10:
            corr_subset = combined_df[sample_features].corr()
            
            # Create masks for different feature types
            cp_mask = np.zeros_like(corr_subset, dtype=bool)
            titan_mask = np.zeros_like(corr_subset, dtype=bool)
            
            for i, feat1 in enumerate(sample_features):
                for j, feat2 in enumerate(sample_features):
                    if feat1 in cp_cols and feat2 in cp_cols:
                        cp_mask[i, j] = True
                    elif feat1 in titan_cols and feat2 in titan_cols:
                        titan_mask[i, j] = True
            
            sns.heatmap(corr_subset, annot=True, cmap='coolwarm', center=0, 
                       square=True, fmt='.2f', ax=axes[0, 2], cbar_kws={"shrink": .6})
            axes[0, 2].set_title('Feature Correlation Matrix\n(First 10 Features)')
    
    # 4. PCA variance explained
    if 'pca' in locals():
        n_components_show = min(20, len(pca.explained_variance_ratio_))
        component_numbers = range(1, n_components_show + 1)
        
        axes[1, 0].bar(component_numbers, 
                       pca.explained_variance_ratio_[:n_components_show],
                       color='gold', alpha=0.7)
        axes[1, 0].set_title('PCA Explained Variance Ratio\n(First 20 Components)')
        axes[1, 0].set_xlabel('Principal Component')
        axes[1, 0].set_ylabel('Explained Variance Ratio')
        axes[1, 0].grid(True, alpha=0.3)
    
    # 5. Cumulative variance
    if 'pca' in locals():
        cumsum_var = np.cumsum(pca.explained_variance_ratio_[:n_components_show])
        
        axes[1, 1].plot(component_numbers, cumsum_var, 'o-', 
                       color='green', linewidth=2, markersize=6)
        axes[1, 1].axhline(y=0.95, color='red', linestyle='--', alpha=0.7, label='95% variance')
        axes[1, 1].axhline(y=0.90, color='orange', linestyle='--', alpha=0.7, label='90% variance')
        axes[1, 1].set_title('Cumulative Explained Variance')
        axes[1, 1].set_xlabel('Number of Components')
        axes[1, 1].set_ylabel('Cumulative Variance Ratio')
        axes[1, 1].legend()
        axes[1, 1].grid(True, alpha=0.3)
    
    # 6. WSI patch analysis (if available)
    if 'wsi_df' in locals():
        patch_classes = wsi_df['titan_class'].value_counts()
        
        axes[1, 2].pie(patch_classes.values, labels=patch_classes.index, 
                      autopct='%1.1f%%', colors=['lightgreen', 'lightyellow', 'lightpink'])
        axes[1, 2].set_title('WSI Patch Classification\n(TITAN Results)')
    
    plt.tight_layout()
    plt.show()
else:
    print("⚠️  Insufficient data for comparative analysis")
    
# Create a summary comparison table
if 'cp_df' in locals() and 'titan_df' in locals():
    comparison_data = {
        'Method': ['CellProfiler', 'TITAN', 'Combined'],
        'Feature Count': [cp_feature_count, titan_feature_count, cp_feature_count + titan_feature_count],
        'Feature Type': ['Traditional Radiomics', 'Deep Learning', 'Hybrid'],
        'Interpretability': ['High', 'Low', 'Medium'],
        'Computational Cost': ['Low', 'High', 'High'],
        'Domain Specificity': ['High', 'Medium', 'High']
    }
    
    comparison_df = pd.DataFrame(comparison_data)
    
    print("\n📋 Method Comparison Summary:")
    display(comparison_df)
else:
    # Create dummy comparison table
    dummy_comparison = pd.DataFrame({
        'Method': ['CellProfiler', 'TITAN', 'Combined'],
        'Feature Count': [12, 2048, 2060],
        'Feature Type': ['Traditional Radiomics', 'Deep Learning', 'Hybrid'],
        'Interpretability': ['High', 'Low', 'Medium'],
        'Computational Cost': ['Low', 'High', 'High'],
        'Domain Specificity': ['High', 'Medium', 'High']
    })
    
    print("\n📋 Method Comparison Summary (Dummy):")
    display(dummy_comparison)

## 🎯 Summary and Best Practices

### Key Takeaways:
1. **Dual-Mode Analysis**: CellProfiler provides interpretable features; TITAN provides high-dimensional learned features
2. **Feature Complementarity**: Traditional and deep features capture different aspects of pathology
3. **Dimensionality Reduction**: Essential for high-dimensional deep features
4. **WSI Processing**: Enables comprehensive tissue-level analysis
5. **Feature Fusion**: Combines strengths of both approaches
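
For the dimensionality-reduction point, the number of principal components needed for a target share of variance can be read off the singular values of the standardized data. Below is a minimal NumPy sketch on synthetic data (not the notebook's features). `components_for_variance` is a hypothetical helper name.

```python
import numpy as np


def components_for_variance(X, target=0.95):
    """Return the smallest component count reaching `target` cumulative variance."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant columns stay at zero after centering
    Z = (X - X.mean(axis=0)) / std
    singular_values = np.linalg.svd(Z, compute_uv=False)
    variance_ratio = singular_values**2 / np.sum(singular_values**2)
    cumulative = np.cumsum(variance_ratio)
    # First index where the cumulative ratio reaches the target, as a count
    return int(np.argmax(cumulative >= target) + 1), cumulative


# Synthetic data: 20 observed features driven by 2 latent factors plus small noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 20))
X_demo = latent @ mixing + 0.01 * rng.normal(size=(50, 20))

n_components, cumulative = components_for_variance(X_demo)
print(n_components, np.round(cumulative[:3], 4))
```

Because the synthetic features are built from two latent factors, at most two components are needed here. Real deep features typically need more.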

### Method Selection Guidelines:
- **CellProfiler**: Best for interpretability, clinical applications, limited data
- **TITAN**: Best for large datasets, research, maximum performance
- **Combined**: Best for comprehensive analysis, research projects

### Performance Considerations:
- ⚡ **CellProfiler**: Fast processing (~1-2 sec per image)
- ⚡ **TITAN**: Slower processing (~5-10 sec per image, GPU recommended)
- ⚡ **Combined**: Roughly the sum of both processing times
- 💾 **Memory**: TITAN requires more memory, especially for WSI

### Common Issues and Solutions:
- ⚠️ **Low image quality** → Apply preprocessing and quality control
- ⚠️ **Memory issues** → Use smaller patches or batch processing
- ⚠️ **Feature redundancy** → Apply correlation filtering and PCA
- ⚠️ **Domain mismatch** → Fine-tune models on specific tissue types
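
For the memory issue in particular, batch processing can be sketched with a small generator that hands a fixed number of items to the extractor at a time. `batched` is a hypothetical helper and the file names are made up. The per-batch extraction step is left as a placeholder.

```python
def batched(items, batch_size):
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter, batch
        yield batch


patch_files = [f"patch_{i:03d}.png" for i in range(7)]
batches = list(batched(patch_files, batch_size=3))
for batch in batches:
    # Placeholder: run feature extraction on `batch`, then release it
    print(len(batch), batch[0], "...", batch[-1])
```

In a real run the loop would iterate over the generator directly rather than building `batches` as a list, so only one batch is held in memory at a time.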

### Next Steps:
- 🧪 Validate features with clinical outcomes
- 🔗 Combine with radiology features for multi-modal analysis
- 📊 Build predictive models using extracted features
- 🎯 Deploy in clinical workflow with proper validation