# EcoFair: Domain-Adaptive Green AI for Skin Cancer Detection

## Abstract

EcoFair introduces a novel **Green AI** pipeline for skin cancer detection that adapts to domain shifts while maintaining diagnostic accuracy and energy efficiency. The system combines three key innovations:

1. **VFL Architecture**: A split-computing design that pairs a lightweight (Lite) model with a heavier (Heavy) model, enabling efficient edge deployment.

2. **Neurosymbolic Scoring**: Embeds clinical domain knowledge (sun exposure risk based on body localization) into the model's decision-making process.

3. **Resource-Aware Routing**: Switches between Safety-First optimization (for stable domains) and Budget-Constrained routing (for high-entropy domains) based on the observed domain shift.

## Setup & Installation

In [None]:
import sys
import os

# Clone the repository only if it is not already available
if not os.path.exists('./EcoFair') and not os.path.exists('./src'):
    !git clone https://github.com/mociatto/EcoFair.git

# Add the cloned repo to the Python path
if os.path.exists('./EcoFair'):
    sys.path.append('./EcoFair')
elif os.path.exists('./src'):
    # If running from project root
    sys.path.append('.')

# Import EcoFair modules
from src import config, utils, data_loader, models, training, features, routing, fairness, visualization
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score

# Set reproducibility
utils.set_seed(config.RANDOM_STATE)

print(f"EcoFair v{config.VERSION} loaded successfully.")
print(f"Using models: {config.SELECTED_LITE_MODEL} (Lite) and {config.SELECTED_HEAVY_MODEL} (Heavy)")

---

# PART 1: The Benchmark (HAM10000)

We first establish a baseline on the source domain (**Dermoscopy**). The HAM10000 dataset provides high-quality dermoscopic images with consistent imaging conditions.

## 1.1 Data Loading & Feature Engineering

In [None]:
# Load and align HAM10000 data
print("Loading HAM10000 data...")
X_heavy, X_lite, meta_ham = data_loader.load_and_align_ham()
print(f"Loaded {len(meta_ham)} samples")
print(f"Heavy features shape: {X_heavy.shape}")
print(f"Lite features shape: {X_lite.shape}")

In [None]:
# Prepare tabular features with Neurosymbolic Risk Scoring
print("\nPreparing tabular features...")
X_tab, scaler, sex_encoder, loc_encoder, risk_scaler = features.prepare_tabular_features(meta_ham)
print(f"Tabular features shape: {X_tab.shape}")

# Prepare labels
y_ham, dx_to_idx = features.prepare_labels(meta_ham)
print(f"Labels shape: {y_ham.shape}")
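
The neurosymbolic risk score mentioned above maps body localization to a sun-exposure prior. The internals of `features.prepare_tabular_features` are not shown in this notebook, so the following is only a minimal sketch of the idea. The risk values, the `SUN_EXPOSURE_RISK` table, the `add_sun_risk` helper, and the `localization` column name are all illustrative assumptions.

```python
import pandas as pd

# Hypothetical sun-exposure risk per body site (illustrative values only,
# not taken from the EcoFair code base or from clinical literature).
SUN_EXPOSURE_RISK = {
    "face": 1.0,
    "scalp": 0.9,
    "neck": 0.8,
    "hand": 0.7,
    "back": 0.5,
    "abdomen": 0.2,
}
DEFAULT_RISK = 0.5  # fallback for unknown or missing localizations


def add_sun_risk(meta: pd.DataFrame, loc_col: str = "localization") -> pd.DataFrame:
    """Return a copy of `meta` with a numeric `sun_risk` column."""
    out = meta.copy()
    out["sun_risk"] = out[loc_col].map(SUN_EXPOSURE_RISK).fillna(DEFAULT_RISK)
    return out


demo = pd.DataFrame({"localization": ["face", "abdomen", "unknown"]})
risk_demo = add_sun_risk(demo)
print(risk_demo)
```

A symbolic prior like this can be scaled and concatenated with the other tabular features, which is presumably the role of the `risk_scaler` returned above.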

## 1.2 Model Training

In [None]:
# Split data using stratified group K-fold
y_labels_ham = np.argmax(y_ham, axis=1)
splits = data_loader.get_stratified_split(meta_ham, y_labels_ham, n_splits=5)
train_idx, test_idx = list(splits)[0]

# Further split train into train/val
meta_train = meta_ham.iloc[train_idx].reset_index(drop=True)
y_train_labels = y_labels_ham[train_idx]
splits_val = data_loader.get_stratified_split(meta_train, y_train_labels, n_splits=5)
train_idx_final, val_idx = list(splits_val)[0]

# Get absolute indices
train_idx_abs = train_idx[train_idx_final]
val_idx_abs = train_idx[val_idx]

# Split features
X_lite_train = X_lite[train_idx_abs]
X_lite_val = X_lite[val_idx_abs]
X_lite_test = X_lite[test_idx]

X_heavy_train = X_heavy[train_idx_abs]
X_heavy_val = X_heavy[val_idx_abs]
X_heavy_test = X_heavy[test_idx]

X_tab_train = X_tab[train_idx_abs]
X_tab_val = X_tab[val_idx_abs]
X_tab_test = X_tab[test_idx]

y_train = y_ham[train_idx_abs]
y_val = y_ham[val_idx_abs]
y_test = y_ham[test_idx]

meta_test = meta_ham.iloc[test_idx].reset_index(drop=True)
meta_val = meta_train.iloc[val_idx].reset_index(drop=True)

print(f"Train: {len(y_train)}, Val: {len(y_val)}, Test: {len(y_test)}")

In [None]:
# Build models
print("Building VFL models...")
lite_adapter = models.build_image_adapter(feature_dim=X_lite.shape[1], embedding_dim=128)
heavy_adapter = models.build_image_adapter(feature_dim=X_heavy.shape[1], embedding_dim=128)
tab_client = models.build_tabular_client(input_dim=X_tab.shape[1], embedding_dim=128)
server_head = models.build_server_head(input_dim=256, num_classes=len(config.CLASS_NAMES))

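# Note: tab_client and server_head are passed to both models. Unless build_vfl_model
# copies them, their weights are shared, so training the Heavy model also updates the Lite model.
lite_model = models.build_vfl_model(lite_adapter, tab_client, server_head)
heavy_model = models.build_vfl_model(heavy_adapter, tab_client, server_head)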

print("Models built successfully.")
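
The server head takes `input_dim=256` because it presumably receives the 128-dimensional image embedding concatenated with the 128-dimensional tabular embedding. The NumPy sketch below illustrates that data flow only. The layer sizes other than the embedding width, the single-layer adapters, and the random weights are assumptions and do not reflect the actual `models` module.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, img_dim, tab_dim, emb_dim, n_classes = 4, 1280, 12, 128, 7  # illustrative sizes


def dense_relu(x, w):
    """Single linear layer followed by ReLU."""
    return np.maximum(x @ w, 0.0)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


W_img = rng.normal(size=(img_dim, emb_dim)) * 0.01         # image adapter
W_tab = rng.normal(size=(tab_dim, emb_dim)) * 0.01         # tabular client
W_head = rng.normal(size=(2 * emb_dim, n_classes)) * 0.01  # server head

x_img = rng.normal(size=(batch, img_dim))
x_tab = rng.normal(size=(batch, tab_dim))

# Each client produces its own embedding; the server sees only the concatenation
fused = np.concatenate([dense_relu(x_img, W_img), dense_relu(x_tab, W_tab)], axis=1)
probs = softmax(fused @ W_head)

print(fused.shape, probs.shape)
```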

In [None]:
# Get class weights
class_weight_dict = training.get_class_weights(y_train)
print("\nClass weights:")
for i, class_name in enumerate(config.CLASS_NAMES):
    print(f"  {class_name}: {class_weight_dict[i]:.4f}")
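
Class weights of this kind are commonly computed with the "balanced" heuristic, `n_samples / (n_classes * class_count)`, so that rare classes contribute more to the loss. The sketch below shows that formula on one-hot labels. It is an assumption that `training.get_class_weights` uses this exact heuristic, and `balanced_class_weights` is a hypothetical helper.

```python
import numpy as np


def balanced_class_weights(y_onehot):
    """Return {class_index: weight} using n_samples / (n_classes * count)."""
    counts = y_onehot.sum(axis=0)
    n_samples, n_classes = y_onehot.shape
    # Guard against empty classes to avoid division by zero
    weights = n_samples / (n_classes * np.maximum(counts, 1))
    return {i: float(w) for i, w in enumerate(weights)}


# Toy labels: 6 samples of class 0, 3 of class 1, 1 of class 2
y_demo = np.eye(3)[[0, 0, 0, 0, 0, 0, 1, 1, 1, 2]]
weights_demo = balanced_class_weights(y_demo)
print(weights_demo)
```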

In [None]:
# Train Lite Model
print("\nTraining Lite Model...")
lite_history = training.compile_and_train(
    lite_model, X_lite_train, X_tab_train, y_train,
    X_lite_val, X_tab_val, y_val,
    class_weight=class_weight_dict
)
print("Lite model training complete.")

In [None]:
# Train Heavy Model
print("\nTraining Heavy Model...")
heavy_history = training.compile_and_train(
    heavy_model, X_heavy_train, X_tab_train, y_train,
    X_heavy_val, X_tab_val, y_val,
    class_weight=class_weight_dict
)
print("Heavy model training complete.")

## 1.3 Safety-First Optimization & Routing

In [None]:
# Generate predictions on validation set
print("Generating predictions...")
lite_preds_val = lite_model.predict([X_lite_val, X_tab_val], batch_size=config.BATCH_SIZE, verbose=0)
heavy_preds_val = heavy_model.predict([X_heavy_val, X_tab_val], batch_size=config.BATCH_SIZE, verbose=0)

y_true_val = np.argmax(y_val, axis=1)

# Calculate entropy and safe-danger gap
entropy_val = routing.calculate_entropy(lite_preds_val)
safe_indices = [config.CLASS_NAMES.index(c) for c in config.SAFE_CLASSES]
danger_indices = [config.CLASS_NAMES.index(c) for c in config.DANGEROUS_CLASSES]
prob_safe_val = lite_preds_val[:, safe_indices].sum(axis=1)
prob_danger_val = lite_preds_val[:, danger_indices].sum(axis=1)
safe_danger_gap_val = prob_safe_val - prob_danger_val

# Calculate baseline accuracy
heavy_baseline_acc = accuracy_score(y_true_val, np.argmax(heavy_preds_val, axis=1))
print(f"Heavy baseline accuracy: {heavy_baseline_acc:.4f}")
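
The two routing signals computed above can be illustrated on a toy prediction matrix. The sketch assumes Shannon entropy with the natural logarithm. Which base `routing.calculate_entropy` uses is not shown here, and `shannon_entropy` is a hypothetical stand-in.

```python
import numpy as np


def shannon_entropy(probs, eps=1e-12):
    """Row-wise Shannon entropy in nats."""
    p = np.clip(probs, eps, 1.0)
    return -(p * np.log(p)).sum(axis=1)


# Row 0 is confident, row 1 is maximally uncertain
probs = np.array([
    [0.97, 0.01, 0.02],
    [1 / 3, 1 / 3, 1 / 3],
])
safe_idx, danger_idx = [0], [1, 2]  # toy class grouping

ent = shannon_entropy(probs)
gap = probs[:, safe_idx].sum(axis=1) - probs[:, danger_idx].sum(axis=1)

print(ent, gap)
```

A large positive gap means the Lite model puts most of its mass on safe classes. A gap near zero or below signals that dangerous classes cannot be ruled out.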

In [None]:
# Optimize thresholds using SafetyFirstOptimizer
print("\nOptimizing routing thresholds...")
optimizer = routing.SafetyFirstOptimizer(
    lite_preds_val, heavy_preds_val, y_true_val,
    entropy_val, safe_danger_gap_val, heavy_baseline_acc
)

optimal_config, all_results = optimizer.optimize()

print(f"\nOptimal Configuration:")
print(f"  Entropy Threshold: {optimal_config['entropy_t']:.2f}")
print(f"  Gap Threshold: {optimal_config['gap_t']:.2f}")
print(f"  Heavy Weight: {optimal_config['heavy_weight']:.2f}")
print(f"  Accuracy: {optimal_config['accuracy']:.4f}")
print(f"  Intervention Rate: {optimal_config['intervention_rate']:.2f}%")
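
One plausible reading of "Safety-First" is that the optimizer keeps only configurations that match the Heavy baseline accuracy and, among those, picks the one that calls the Heavy model least often. The sketch below implements that selection rule over a list of result dictionaries shaped like the keys printed above. The rule itself, the `tolerance` parameter, and the `pick_safety_first` helper are assumptions and may differ from `SafetyFirstOptimizer.optimize`.

```python
def pick_safety_first(results, baseline_acc, tolerance=0.0):
    """Pick the cheapest config whose accuracy reaches the baseline.

    Falls back to the most accurate config if none is feasible.
    """
    feasible = [r for r in results if r["accuracy"] >= baseline_acc - tolerance]
    if not feasible:
        return max(results, key=lambda r: r["accuracy"])
    return min(feasible, key=lambda r: r["intervention_rate"])


# Toy search results (invented numbers for illustration)
toy_results = [
    {"entropy_t": 0.4, "gap_t": 0.2, "heavy_weight": 0.7, "accuracy": 0.84, "intervention_rate": 55.0},
    {"entropy_t": 0.8, "gap_t": 0.1, "heavy_weight": 0.7, "accuracy": 0.83, "intervention_rate": 30.0},
    {"entropy_t": 1.2, "gap_t": 0.0, "heavy_weight": 0.5, "accuracy": 0.79, "intervention_rate": 10.0},
]

chosen = pick_safety_first(toy_results, baseline_acc=0.82)
print(chosen)
```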

In [None]:
# Apply optimized routing on test set
print("\nApplying routing on test set...")
lite_preds_test = lite_model.predict([X_lite_test, X_tab_test], batch_size=config.BATCH_SIZE, verbose=0)
heavy_preds_test = heavy_model.predict([X_heavy_test, X_tab_test], batch_size=config.BATCH_SIZE, verbose=0)

final_preds_ham, route_mask_ham = routing.apply_threshold_routing(
    lite_preds_test, heavy_preds_test,
    entropy_threshold=optimal_config['entropy_t'],
    gap_threshold=optimal_config['gap_t'],
    heavy_weight=optimal_config['heavy_weight']
)

y_true_test = np.argmax(y_test, axis=1)
y_pred_ham = np.argmax(final_preds_ham, axis=1)

acc_ham = accuracy_score(y_true_test, y_pred_ham)
acc_lite = accuracy_score(y_true_test, np.argmax(lite_preds_test, axis=1))
acc_heavy = accuracy_score(y_true_test, np.argmax(heavy_preds_test, axis=1))

print(f"\nTest Set Results:")
print(f"  Lite Accuracy: {acc_lite:.4f}")
print(f"  Heavy Accuracy: {acc_heavy:.4f}")
print(f"  EcoFair Accuracy: {acc_ham:.4f}")
print(f"  Routing Rate: {route_mask_ham.sum() / len(route_mask_ham) * 100:.2f}%")
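
Threshold routing can be sketched as follows: a sample is escalated to the Heavy model when the Lite prediction is too uncertain or not confidently safe, and the two predictions are then blended with `heavy_weight`. The exact escalation condition and blending used by `routing.apply_threshold_routing` are not shown in this notebook, so the rule below and the `threshold_routing` helper are assumptions.

```python
import numpy as np


def threshold_routing(lite, heavy, safe_idx, danger_idx,
                      entropy_t, gap_t, heavy_weight):
    """Blend Heavy predictions into samples flagged by entropy or safe-danger gap."""
    p = np.clip(lite, 1e-12, 1.0)
    entropy = -(p * np.log(p)).sum(axis=1)
    gap = lite[:, safe_idx].sum(axis=1) - lite[:, danger_idx].sum(axis=1)

    # Escalate when uncertain OR when the safe classes do not clearly dominate
    mask = (entropy > entropy_t) | (gap < gap_t)

    final = lite.copy()
    final[mask] = (1 - heavy_weight) * lite[mask] + heavy_weight * heavy[mask]
    return final, mask


lite = np.array([[0.95, 0.03, 0.02],
                 [0.40, 0.35, 0.25],
                 [0.10, 0.85, 0.05]])
heavy = np.array([[0.90, 0.05, 0.05],
                  [0.10, 0.80, 0.10],
                  [0.05, 0.90, 0.05]])

final, mask = threshold_routing(lite, heavy, safe_idx=[0], danger_idx=[1, 2],
                                entropy_t=0.8, gap_t=0.2, heavy_weight=0.7)
print(mask)
print(final)
```

In this toy case the first sample stays on the Lite model, the second is escalated for high entropy, and the third is escalated because it leans toward a dangerous class.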

## 1.4 Visualization

In [None]:
# Confusion Matrix Comparison
fig_cm = visualization.plot_confusion_matrix_comparison(
    y_true_test, lite_preds_test, heavy_preds_test, final_preds_ham
)
plt.show()

In [None]:
# Value-Added Analysis
fig_va = visualization.plot_value_added_bars(
    y_true_test, lite_preds_test, heavy_preds_test, final_preds_ham,
    route_mask=route_mask_ham
)
plt.show()

---

# PART 2: Domain Generalization (PAD-UFES-20)

We now test robustness on the target domain (**Clinical Images**), where significant domain shift occurs. Clinical images have different lighting, angles, and backgrounds compared to dermoscopic images.

## 2.1 Data Loading & Feature Engineering

In [None]:
# Load PAD-UFES-20 data
# Note: Update path to your PAD metadata location
PAD_METADATA_PATH = '/kaggle/input/skin-cancer/metadata.csv'  # Update this path

print("Loading PAD-UFES-20 data...")
X_heavy_pad, X_lite_pad, meta_pad = data_loader.load_and_align_pad(PAD_METADATA_PATH)
print(f"Loaded {len(meta_pad)} samples")
print(f"Heavy features shape: {X_heavy_pad.shape}")
print(f"Lite features shape: {X_lite_pad.shape}")

In [None]:
# Prepare tabular features (automatically handles column mapping)
print("\nPreparing tabular features...")
X_tab_pad, scaler_pad, sex_encoder_pad, loc_encoder_pad, risk_scaler_pad = features.prepare_tabular_features(meta_pad)
print(f"Tabular features shape: {X_tab_pad.shape}")

# Prepare labels (using PAD class names)
# Note: Update PAD_CLASS_NAMES in config if needed
PAD_CLASS_NAMES = ['bcc', 'scc', 'mel', 'ack', 'nev', 'sek']  # PAD-specific classes
y_pad, dx_to_idx_pad = features.prepare_labels(meta_pad, class_names=PAD_CLASS_NAMES)
print(f"Labels shape: {y_pad.shape}")

## 2.2 Budget-Constrained Routing

**Key Insight**: Lite predictions on clinical images have high entropy, so the threshold-based routing from Part 1 fails. We switch to **Budget-Constrained Routing**, which routes the 35% most uncertain samples to the Heavy model. The share of Heavy inferences, and with it the energy budget, is fixed in advance.

### Model Building & Training

**Important**: PAD-UFES-20 requires separate models due to different tabular feature dimensions and class structure. We build and train PAD-specific models here.

In [None]:
# Build PAD-specific models (required due to different tabular feature dimensions)
# PAD has different number of localization categories, so tabular input dimension differs
print("Building PAD-specific models with correct dimensions...")

# Build PAD models with correct tabular input dimension
lite_adapter_pad = models.build_image_adapter(feature_dim=X_lite_pad.shape[1], embedding_dim=128)
heavy_adapter_pad = models.build_image_adapter(feature_dim=X_heavy_pad.shape[1], embedding_dim=128)
tab_client_pad = models.build_tabular_client(input_dim=X_tab_pad.shape[1], embedding_dim=128)
server_head_pad = models.build_server_head(input_dim=256, num_classes=len(PAD_CLASS_NAMES))

lite_model_pad = models.build_vfl_model(lite_adapter_pad, tab_client_pad, server_head_pad)
heavy_model_pad = models.build_vfl_model(heavy_adapter_pad, tab_client_pad, server_head_pad)

print("PAD models built successfully.")
print(f"Tabular input dimension: {X_tab_pad.shape[1]} (PAD-specific)")
print(f"Number of classes: {len(PAD_CLASS_NAMES)} (PAD-specific)")

In [None]:
# Split PAD data for training and testing
print("\nSplitting PAD data...")
y_labels_pad = np.argmax(y_pad, axis=1)
splits_pad = data_loader.get_stratified_split(meta_pad, y_labels_pad, n_splits=5)
train_idx_pad, test_idx_pad = list(splits_pad)[0]

# Further split train into train/val
meta_train_pad = meta_pad.iloc[train_idx_pad].reset_index(drop=True)
y_train_labels_pad = y_labels_pad[train_idx_pad]
splits_val_pad = data_loader.get_stratified_split(meta_train_pad, y_train_labels_pad, n_splits=5)
train_idx_final_pad, val_idx_pad = list(splits_val_pad)[0]

# Get absolute indices
train_idx_abs_pad = train_idx_pad[train_idx_final_pad]
val_idx_abs_pad = train_idx_pad[val_idx_pad]

# Split features
X_lite_train_pad = X_lite_pad[train_idx_abs_pad]
X_lite_val_pad = X_lite_pad[val_idx_abs_pad]
X_lite_test_pad = X_lite_pad[test_idx_pad]

X_heavy_train_pad = X_heavy_pad[train_idx_abs_pad]
X_heavy_val_pad = X_heavy_pad[val_idx_abs_pad]
X_heavy_test_pad = X_heavy_pad[test_idx_pad]

X_tab_train_pad = X_tab_pad[train_idx_abs_pad]
X_tab_val_pad = X_tab_pad[val_idx_abs_pad]
X_tab_test_pad = X_tab_pad[test_idx_pad]

y_train_pad = y_pad[train_idx_abs_pad]
y_val_pad = y_pad[val_idx_abs_pad]
y_test_pad = y_pad[test_idx_pad]

meta_test_pad = meta_pad.iloc[test_idx_pad].reset_index(drop=True)
meta_val_pad = meta_train_pad.iloc[val_idx_pad].reset_index(drop=True)

print(f"Train: {len(y_train_pad)}, Val: {len(y_val_pad)}, Test: {len(y_test_pad)}")

In [None]:
# Train PAD models
# Note: no epoch count is passed here, so compile_and_train uses its default.
# Use fewer epochs for a quick demo; production runs should use config.EPOCHS (30).
print("\nTraining PAD Lite Model...")
class_weight_dict_pad = training.get_class_weights(y_train_pad, class_names=PAD_CLASS_NAMES)

lite_history_pad = training.compile_and_train(
    lite_model_pad, X_lite_train_pad, X_tab_train_pad, y_train_pad,
    X_lite_val_pad, X_tab_val_pad, y_val_pad,
    class_weight=class_weight_dict_pad
)
print("PAD Lite model training complete.")

print("\nTraining PAD Heavy Model...")
heavy_history_pad = training.compile_and_train(
    heavy_model_pad, X_heavy_train_pad, X_tab_train_pad, y_train_pad,
    X_heavy_val_pad, X_tab_val_pad, y_val_pad,
    class_weight=class_weight_dict_pad
)
print("PAD Heavy model training complete.")

In [None]:
# Generate predictions using PAD-specific models
print("\nGenerating predictions on PAD test set...")
lite_preds_pad = lite_model_pad.predict([X_lite_test_pad, X_tab_test_pad], batch_size=config.BATCH_SIZE, verbose=0)
heavy_preds_pad = heavy_model_pad.predict([X_heavy_test_pad, X_tab_test_pad], batch_size=config.BATCH_SIZE, verbose=0)

print(f"Test set size: {len(y_test_pad)}")

In [None]:
# Apply Budget-Constrained Routing (35% budget)
print("\nApplying Budget-Constrained Routing (35% budget)...")

# Use PAD class names for routing
PAD_SAFE_CLASSES = ['nev', 'sek']
PAD_DANGEROUS_CLASSES = ['bcc', 'scc', 'mel', 'ack']

final_preds_pad, route_mask_pad, confusion_score_pad = routing.apply_budget_routing(
    lite_preds_pad, heavy_preds_pad,
    budget=0.35,
    heavy_weight=0.5,
    class_names=PAD_CLASS_NAMES,
    safe_classes=PAD_SAFE_CLASSES,
    danger_classes=PAD_DANGEROUS_CLASSES
)

y_true_pad = np.argmax(y_test_pad, axis=1)
y_pred_pad = np.argmax(final_preds_pad, axis=1)

acc_pad = accuracy_score(y_true_pad, y_pred_pad)
acc_lite_pad = accuracy_score(y_true_pad, np.argmax(lite_preds_pad, axis=1))
acc_heavy_pad = accuracy_score(y_true_pad, np.argmax(heavy_preds_pad, axis=1))

print(f"\nPAD Test Set Results:")
print(f"  Lite Accuracy: {acc_lite_pad:.4f}")
print(f"  Heavy Accuracy: {acc_heavy_pad:.4f}")
print(f"  EcoFair Accuracy: {acc_pad:.4f}")
print(f"  Routing Rate: {route_mask_pad.sum() / len(route_mask_pad) * 100:.2f}% (Fixed at 35%)")
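
Budget-constrained routing replaces absolute thresholds with a rank-based rule: sort samples by uncertainty and escalate a fixed fraction. The sketch below uses Lite entropy alone as the uncertainty score. The real `routing.apply_budget_routing` also receives safe and dangerous class lists and returns a `confusion_score`, so its scoring very likely differs. The `budget_routing` helper is hypothetical.

```python
import numpy as np


def budget_routing(lite, heavy, budget=0.35, heavy_weight=0.5):
    """Escalate the top `budget` fraction of samples ranked by Lite entropy."""
    p = np.clip(lite, 1e-12, 1.0)
    uncertainty = -(p * np.log(p)).sum(axis=1)

    n_route = int(round(budget * len(lite)))
    order = np.argsort(-uncertainty, kind="stable")  # most uncertain first

    mask = np.zeros(len(lite), dtype=bool)
    mask[order[:n_route]] = True

    final = lite.copy()
    final[mask] = (1 - heavy_weight) * lite[mask] + heavy_weight * heavy[mask]
    return final, mask, uncertainty


rng = np.random.default_rng(0)
lite_demo = rng.dirichlet(np.ones(6), size=20)   # toy Lite probabilities
heavy_demo = rng.dirichlet(np.ones(6), size=20)  # toy Heavy probabilities

final_demo, mask_demo, unc_demo = budget_routing(lite_demo, heavy_demo, budget=0.35)
print(f"Routed {mask_demo.sum()} of {len(mask_demo)} samples")
```

Because the number of escalated samples depends only on the batch size and the budget, the routing rate stays at the chosen fraction no matter how uncertain the target domain is.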

## 2.3 Energy Efficiency Analysis

In [None]:
# Load energy statistics
joules_per_lite = utils.load_energy_stats(config.SELECTED_LITE_MODEL, is_heavy=False)
joules_per_heavy = utils.load_energy_stats(config.SELECTED_HEAVY_MODEL, is_heavy=True)

# Fall back to placeholder defaults (not measured values) if no energy stats are found
if joules_per_lite is None:
    joules_per_lite = 1.0
if joules_per_heavy is None:
    joules_per_heavy = 2.5

print(f"Energy per sample:")
print(f"  Lite: {joules_per_lite:.6f} J")
print(f"  Heavy: {joules_per_heavy:.6f} J")

# Calculate routing rate
routing_rate_pad = route_mask_pad.sum() / len(route_mask_pad)

# Plot battery decay
fig_battery = visualization.plot_battery_decay(
    lite_joules=joules_per_lite,
    heavy_joules=joules_per_heavy,
    routing_rate=routing_rate_pad,
    capacity_joules=10000
)
plt.show()
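
The energy arithmetic behind the battery plot can be worked through by hand. The sketch assumes a cascade cost model in which every sample runs the Lite model and routed samples additionally run the Heavy model. Whether `visualization.plot_battery_decay` uses the same model is an assumption, and the numbers are the fallback defaults from the cell above, not measurements.

```python
def energy_per_sample(lite_joules, heavy_joules, routing_rate):
    """Expected energy per sample when Lite always runs and Heavy runs on routed samples."""
    return lite_joules + routing_rate * heavy_joules


def samples_per_charge(capacity_joules, joules_per_sample):
    """Number of whole samples that fit into a battery of the given capacity."""
    return int(capacity_joules // joules_per_sample)


lite_j, heavy_j, rate, capacity = 1.0, 2.5, 0.35, 10000

routed_cost = energy_per_sample(lite_j, heavy_j, rate)  # 1.0 + 0.35 * 2.5
routed_samples = samples_per_charge(capacity, routed_cost)
heavy_only_samples = samples_per_charge(capacity, heavy_j)
saving = 1 - routed_cost / heavy_j

print(f"Routed: {routed_cost:.3f} J/sample, {routed_samples} samples per charge")
print(f"Heavy-only: {heavy_j:.3f} J/sample, {heavy_only_samples} samples per charge")
print(f"Energy saving vs Heavy-only: {saving:.0%}")
```

With these placeholder values the routed pipeline costs about 1.875 J per sample, against 2.5 J for Heavy-only inference.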

---

# PART 3: Fairness Audit

We empirically audit the model's fairness across **Age** and **Sex** subgroups to ensure equitable performance across demographic groups.

## 3.1 HAM10000 Fairness Analysis

In [None]:
# Generate fairness report for HAM10000
print("HAM10000 Fairness Report:")
ham_fairness = fairness.generate_fairness_report(
    y_true_test, y_pred_ham, meta_test
)
print("\n")
display(ham_fairness)
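
A subgroup audit of this kind boils down to grouping per-sample correctness by a demographic column and comparing accuracies. The sketch below shows one way to do that with pandas. The output format of `fairness.generate_fairness_report` is not shown in this notebook, so the `subgroup_accuracy` helper, its column names, and the toy data are assumptions.

```python
import numpy as np
import pandas as pd


def subgroup_accuracy(y_true, y_pred, meta, column):
    """Accuracy and sample count per subgroup, plus the gap to the best subgroup."""
    df = pd.DataFrame({
        "group": meta[column].to_numpy(),
        "correct": np.asarray(y_true) == np.asarray(y_pred),
    })
    report = df.groupby("group")["correct"].agg(accuracy="mean", n="size")
    report["gap_vs_best"] = report["accuracy"].max() - report["accuracy"]
    return report


# Toy example with invented labels and predictions
meta_demo = pd.DataFrame({"sex": ["male", "male", "female", "female", "female"]})
y_true_demo = [0, 1, 1, 0, 1]
y_pred_demo = [0, 1, 0, 0, 1]

fair_demo = subgroup_accuracy(y_true_demo, y_pred_demo, meta_demo, "sex")
print(fair_demo)
```

Reporting the sample count next to each accuracy matters because small subgroups produce noisy estimates. For a continuous attribute such as age, the column can first be binned, for example with `pd.cut`.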

## 3.2 PAD-UFES-20 Fairness Analysis

In [None]:
# Generate fairness report for PAD-UFES-20
print("PAD-UFES-20 Fairness Report:")
pad_fairness = fairness.generate_fairness_report(
    y_true_pad, y_pred_pad, meta_test_pad,
    class_names=PAD_CLASS_NAMES
)
print("\n")
display(pad_fairness)

---

# Summary

This notebook covers three aspects of EcoFair:

1. **Domain Adaptation**: The same pipeline is trained and evaluated on dermoscopic (HAM10000) and clinical (PAD-UFES-20) images, using PAD-specific models and class names for the target domain.

2. **Energy Efficiency**: Budget-constrained routing caps Heavy-model usage at 35% of samples, and the resulting accuracy is reported next to the Lite-only and Heavy-only baselines.

3. **Fairness**: Performance is audited across Age and Sex subgroups on both datasets.

The modular architecture enables easy extension to new domains and routing strategies.