## 1. Environment Setup and Data Preparation (MinMax Approach, Approach 1)

In this section, we initialize the working environment by importing the necessary Julia packages (`Flux`, `JLD2`, etc.) and including our custom utility modules:
* `unit2-multilayer-perceptron.jl`: Contains functions for building the ANN and handling encodings.
* `unit4-metrics.jl`: Contains the evaluation metrics (Confusion Matrix, F1-Score, etc.).

### Data Preprocessing for Flux.jl
We load the **Approach 1 (MinMax Normalized)** dataset. A critical step involves reshaping the data to match **Flux.jl** requirements:

1.  **Inputs ($X$):** The raw data comes in `(Samples × Features)` format. We transpose it using `permutedims` to `(Features × Samples)` and convert it to `Float32` for computational efficiency.
2.  **Targets ($Y$):** Since this is a multi-class problem (Classes 0-4), we transform the integer labels into **One-Hot Encoded** vectors (e.g., class 2 becomes `[0, 0, 1, 0, 0]`).
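
The two steps above can be sketched in plain Julia. The helper `onehot_rows` is a hypothetical stand-in for the project's `oneHotEncoding` (defined in `unit2-multilayer-perceptron.jl` and not shown here), and the small matrices are made-up data used only to illustrate the shapes.

```julia
# Toy data: 3 samples x 2 features (made-up values, Samples x Features layout).
x_raw = [0.1 0.9;
         0.4 0.6;
         0.8 0.2]
y_raw = [0, 2, 4]
classes = [0, 1, 2, 3, 4]

# Hypothetical stand-in for the project's oneHotEncoding:
# returns a (Samples x Classes) Boolean matrix.
onehot_rows(labels, classes) = labels .== permutedims(classes)

# Flux expects (Features x Samples), stored as Float32.
x = Float32.(permutedims(x_raw))
y = Float32.(permutedims(onehot_rows(y_raw, classes)))

println(size(x))   # (2, 3)
println(size(y))   # (5, 3)
println(y[:, 2])   # Float32[0.0, 0.0, 1.0, 0.0, 0.0]
```

Each column of `y` is the one-hot vector of one sample, so column 2 (label `2`) has its single `1` in the third row.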

In [29]:
using Flux
using JLD2, FileIO
using Statistics
using Random
using Printf

include("unit2-multilayer-perceptron.jl")  
include("unit4-metrics.jl")             
println("--- STARTING SINGLE EXECUTION (APPROACH 1) ---")

println("\n1. Loading and Preprocessing Data...")
data = load("data_checkpoints/approach_1_minmax.jld2")

# Extract matrices (Samples x Features)
x_train_raw = copy(data["x_train"])
y_train_raw = copy(data["y_train"])
x_val_raw   = copy(data["x_val"])
y_val_raw   = copy(data["y_val"])
x_test_raw  = copy(data["x_test"])
y_test_raw  = copy(data["y_test"])
# Flux needs (Features x Samples)
x_train = Float32.(permutedims(x_train_raw))
x_val   = Float32.(permutedims(x_val_raw))
x_test  = Float32.(permutedims(x_test_raw))

# 5 Classes
classes = [0, 1, 2, 3, 4]

y_train_encoded = oneHotEncoding(y_train_raw, classes)
y_val_encoded   = oneHotEncoding(y_val_raw, classes)
y_test_encoded  = oneHotEncoding(y_test_raw, classes)


y_train_flux = Float32.(permutedims(y_train_encoded))
y_val_flux   = Float32.(permutedims(y_val_encoded))

println("   > Input Features: $(size(x_train, 1))")
println("   > Output Classes: $(size(y_train_flux, 1))")


--- STARTING SINGLE EXECUTION (APPROACH 1) ---

1. Loading and Preprocessing Data...
   > Input Features: 31
   > Output Classes: 5


## 2. Experimental Loop: ANN Architecture Search (MinMax Approach, Approach 1)

In this section, we perform a systematic evaluation of the **10 defined ANN topologies** to identify the optimal architecture for the **MinMax Normalized dataset**.

**Methodology:**
For each topology in our list, the following steps are executed:
1.  **Initialization:** A fresh ANN model is built to ensure no weight contamination between iterations.
2.  **Training:** The model is trained for **1000 epochs**, each a single full-batch update over the entire training set. We record the **wall-clock time** (`elapsed_seconds`), which also covers model construction, to assess efficiency.
3.  **Evaluation:** The model is tested against the unseen **Test Set**. Predictions are converted to boolean values using a winner-takes-all approach (`classifyOutputs`).
4.  **Metrics Calculation:** We compute the full suite of metrics: Accuracy, F1-Score, Specificity, Precision, and Sensitivity.
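
As an illustration of step 3, a winner-takes-all conversion can be sketched as below. `winner_takes_all` is a hypothetical helper written for this example. The project's `classifyOutputs` may differ in details such as tie handling.

```julia
# Hypothetical helper: for a (Samples x Classes) score matrix, mark the
# highest-scoring class in each row as true and all others as false.
function winner_takes_all(scores::AbstractMatrix{<:Real})
    result = falses(size(scores))
    for (row, col) in enumerate(map(argmax, eachrow(scores)))
        result[row, col] = true
    end
    return result
end

# Made-up softmax-like outputs for 2 samples and 5 classes.
scores = Float32[0.10 0.20 0.50 0.10 0.10;
                 0.70 0.10 0.10 0.05 0.05]

pred = winner_takes_all(scores)
println(findfirst(pred[1, :]))  # 3 (third column, i.e. class label 2)
println(findfirst(pred[2, :]))  # 1 (first column, i.e. class label 0)
```

Because exactly one entry per row is `true`, the result can be compared directly against one-hot encoded targets.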

**Selection Criteria:**
Following the project guidelines regarding **patient safety**, the results are sorted by **Sensitivity (Recall)**.
* **Justification:** In medical diagnosis, minimizing **False Negatives** (failing to detect a sick patient) is the priority. Therefore, the "Winner Topology" will be the one that maximizes the detection of the disease classes, even if it slightly compromises overall accuracy or training time.
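
To make the ranking metric concrete: for each class, sensitivity is TP / (TP + FN), the fraction of that class's true samples that the model recovers. The sketch below computes it per class and takes the unweighted (macro) average. Whether the project's `confusionMatrix` aggregates per-class values in exactly this way is an assumption, and `macro_sensitivity` is a hypothetical helper.

```julia
using Statistics

# Hypothetical helper: per-class recall from (Samples x Classes) Boolean
# matrices, followed by an unweighted (macro) average over classes.
function macro_sensitivity(pred::AbstractMatrix{Bool}, truth::AbstractMatrix{Bool})
    recalls = Float64[]
    for c in axes(truth, 2)
        tp = count(pred[:, c] .& truth[:, c])     # predicted c and truly c
        fn = count(.!pred[:, c] .& truth[:, c])   # truly c but missed
        tp + fn > 0 && push!(recalls, tp / (tp + fn))  # skip absent classes
    end
    return mean(recalls)
end

# Made-up example: 4 samples, 2 classes.
truth = Bool[1 0; 1 0; 0 1; 0 1]
pred  = Bool[1 0; 0 1; 0 1; 0 1]

println(macro_sensitivity(pred, truth))  # 0.75
```

Here class 1 has one hit and one miss (recall 0.5) and class 2 has two hits (recall 1.0), giving a macro average of 0.75.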

In [37]:

experiment_results = []
n_inputs  = size(x_train, 1)
n_outputs = 5
epochs    = 1000 

println("\n--- STARTING LOOP (Goal: Maximize Recall/Sensitivity) ---")

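# `topologies_to_test` must already be defined (it is declared in the Section 4 cell).
for (i, topology) in enumerate(topologies_to_test)
    println("\n Testing Topology $i/$(length(topologies_to_test)): $topology ...")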
    
    start_time = time()
    
    # Build & Config
    model = buildClassANN(n_inputs, topology, n_outputs)
    loss(m, x, y) = Flux.crossentropy(m(x), y)
    opt_state = Flux.setup(Flux.Adam(0.01), model)
    
    # Train
    for epoch in 1:epochs
        Flux.train!(loss, model, [(x_train, y_train_flux)], opt_state)
    end
    
    end_time = time()
    elapsed_seconds = end_time - start_time
    
    # Evaluate
    raw_preds = model(x_test)
    preds_transposed = permutedims(raw_preds)
    y_pred_bool = classifyOutputs(preds_transposed)
    y_true_bool = y_test_encoded 
    
    # Calculate Metrics
    metrics = confusionMatrix(y_pred_bool, y_true_bool)
    
    # Extract Values
    acc      = metrics.accuracy * 100
    f1_mean  = metrics.f_score
    sens_mean = metrics.sensitivity
    spec_mean = metrics.specificity
    prec_mean = metrics.ppv
    
    println(" Done. Sens (Recall): $(round(sens_mean, digits=4)) | Time: $(round(elapsed_seconds, digits=2))s")
    
    push!(experiment_results, (topology, acc, f1_mean, sens_mean, spec_mean, prec_mean, elapsed_seconds))
end

println("\n\n========================================================================================")
println("     FINAL RESULTS (SORTED BY SENSITIVITY/RECALL)  Approach minmax")
println("========================================================================================")
@printf(" %-2s | %-14s | %-6s | %-6s | %-6s | %-8s | %-6s | %-6s\n", 
        "ID", "Topology", "Sens", "Acc(%)", "F1", "Time(s)", "Spec", "Prec")
println("----|----------------|--------|--------|--------|----------|--------|--------")

# Sort by sensitivity
sort!(experiment_results, by = x -> x[4], rev = true)

for (i, res) in enumerate(experiment_results)
    (topo, acc, f1, sens, spec, prec, time_sec) = res
    
    topo_str = string(topo)
    
    @printf(" %-2d | %-14s | %-6.4f | %-6.2f | %-6.4f | %-8.3f | %-6.4f | %-6.4f\n", 
            i, topo_str, sens, acc, f1, time_sec, spec, prec)
end
println("========================================================================================")

best_topo = experiment_results[1][1]
best_sens = experiment_results[1][4]

println("\n WINNER TOPOLOGY (Best Recall): $best_topo")
println("   With Sensitivity: $(round(best_sens, digits=4))")
println("   Justification: Maximizing detection of sick patients (Minimizing False Negatives).")


--- STARTING LOOP (Goal: Maximize Recall/Sensitivity) ---

 Testing Topology 1/10: [8] ...
 Done. Sens (Recall): 0.7258 | Time: 1.26s

 Testing Topology 2/10: [16] ...
 Done. Sens (Recall): 0.7823 | Time: 0.59s

 Testing Topology 3/10: [32] ...
 Done. Sens (Recall): 0.8306 | Time: 1.89s

 Testing Topology 4/10: [64] ...
 Done. Sens (Recall): 0.8629 | Time: 2.07s

 Testing Topology 5/10: [16, 8] ...
 Done. Sens (Recall): 0.8468 | Time: 1.66s

 Testing Topology 6/10: [32, 16] ...
 Done. Sens (Recall): 0.8548 | Time: 1.99s

 Testing Topology 7/10: [64, 32] ...
 Done. Sens (Recall): 0.8629 | Time: 3.21s

 Testing Topology 8/10: [16, 16] ...
 Done. Sens (Recall): 0.8548 | Time: 0.85s

 Testing Topology 9/10: [32, 32] ...
 Done. Sens (Recall): 0.8548 | Time: 2.99s

 Testing Topology 10/10: [8, 16] ...
 Done. Sens (Recall): 0.7742 | Time: 0.69s


     FINAL RESULTS (SORTED BY SENSITIVITY/RECALL)  Approach minmax
 ID | Topology       | Sens   | Acc(%) | F1     | Time(s)  | Spec   | Prec  
---

## 3. Data Loading and Preprocessing Pipeline

To ensure consistency across all experiments and avoid code redundancy, we implemented a reusable function `load_approach_data`. This function streamlines the ingestion of the three different data approaches (MinMax, PCA, and ICA) prepared in the previous stage.

**Key Preprocessing Steps:**
* **Data Ingestion:** Loads the `.jld2` checkpoints containing the train/validation/test splits.
* **Flux.jl Adaptation:** Flux expects input matrices in a `(Features × Samples)` orientation. Since the checkpoints store them as `(Samples × Features)`, we transpose them with `permutedims`.
* **Target Encoding:** The problem is defined as a **multi-class classification (5 classes)**. The integer targets are converted into **One-Hot Encoded** vectors. For example, class `2` is transformed into `[0, 0, 1, 0, 0]`.
* **Optimization:** All data is cast to `Float32`, which matches the default parameter type of Flux layers and halves memory use relative to `Float64`.
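
A quick sanity check of the layout described above can catch orientation mistakes before training. The following is a minimal sketch: `check_flux_layout` is a hypothetical helper, not part of the project code, and the arrays are made-up stand-ins for a loaded dataset.

```julia
# Hypothetical helper: verify the (Features x Samples) / one-hot conventions.
function check_flux_layout(x::AbstractMatrix, y::AbstractMatrix; n_classes::Int = 5)
    eltype(x) == Float32 || error("inputs should be Float32, got $(eltype(x))")
    size(x, 2) == size(y, 2) || error("x and y disagree on the number of samples")
    size(y, 1) == n_classes || error("expected $n_classes target rows, got $(size(y, 1))")
    all(sum(y, dims = 1) .== 1) || error("each target column should be one-hot")
    return true
end

# Made-up stand-ins: 3 features x 4 samples, labels 0..4 one-hot encoded.
x = rand(Float32, 3, 4)
labels = [0, 3, 4, 1]
y = Float32.(permutedims(labels .== permutedims(0:4)))

println(check_flux_layout(x, y))  # true
```

Passing an untransposed input (for example `permutedims(x)`) would fail the sample-count check, which is the typical symptom of a forgotten transposition.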

In [45]:
function load_approach_data(filename::String)
    
    data = load(filename)

    x_train_raw = copy(data["x_train"])
    y_train_raw = copy(data["y_train"])
    x_val_raw   = copy(data["x_val"])
    y_val_raw   = copy(data["y_val"])
    x_test_raw  = copy(data["x_test"])
    y_test_raw  = copy(data["y_test"])

    x_train = Float32.(permutedims(x_train_raw))
    x_val   = Float32.(permutedims(x_val_raw))
    x_test  = Float32.(permutedims(x_test_raw))

    classes = [0, 1, 2, 3, 4]
    
    y_train_encoded = oneHotEncoding(y_train_raw, classes)
    y_val_encoded   = oneHotEncoding(y_val_raw, classes)
    y_test_encoded  = oneHotEncoding(y_test_raw, classes)
    
    y_train_flux = Float32.(permutedims(y_train_encoded))
    y_val_flux   = Float32.(permutedims(y_val_encoded))
    
    n_inputs = size(x_train, 1)
    println(" Loaded. Inputs (Features): $n_inputs | Samples: $(size(x_train, 2))")
    
    return (
        x_train=x_train, y_train_flux=y_train_flux, y_train_encoded=y_train_encoded,
        x_val=x_val,     y_val_flux=y_val_flux,     y_val_encoded=y_val_encoded,
        x_test=x_test,   y_test_encoded=y_test_encoded
    )
end


println("LOADING ALL DATASETS")

# Load Approach 1: MinMax
data_minmax = load_approach_data("data_checkpoints/approach_1_minmax.jld2")

# Load Approach 2: PCA
data_pca    = load_approach_data("data_checkpoints/approach_2_pca.jld2")

# Load Approach 3: ICA
data_ica    = load_approach_data("data_checkpoints/approach_3_ica.jld2")

println("\nREADY TO EXPERIMENT")

LOADING ALL DATASETS
 Loaded. Inputs (Features): 31 | Samples: 579
 Loaded. Inputs (Features): 17 | Samples: 579
 Loaded. Inputs (Features): 28 | Samples: 579

READY TO EXPERIMENT


## 4. Cross-Approach Experimental Evaluation: MinMax vs. PCA vs. ICA

In this section, we execute the comprehensive experimental pipeline to compare the three data processing strategies: **MinMax Normalization**, **Principal Component Analysis (PCA)**, and **Independent Component Analysis (ICA)**.

### Objective
To determine the optimal combination of **Data Representation** and **ANN Topology**. We aim to answer: *Does dimensionality reduction (PCA/ICA) offer a significant reduction in training time without compromising the model's ability to detect sick patients?*

### Methodology
We utilize the helper function `run_topology_experiment` to automate the process for each dataset. For each approach, the system:
1.  **Adapts the Input Layer:** Automatically detects the number of features (31 for MinMax, 17 for PCA, 28 for ICA, as reported when the datasets were loaded).
2.  **Grid Search:** Iterates through the 10 defined ANN architectures (varying depth and width).
3.  **Performance Benchmarking:** Measures **Training Time (seconds)** to evaluate computational efficiency.
4.  **Clinical Evaluation:** Calculates **Sensitivity (Recall)** on the Test Set as the primary ranking metric.
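
One caveat when benchmarking in Julia: the first call to a function includes just-in-time compilation, so the first timed topology can look slower than it really is. The sketch below separates a warm-up call from the timed call using a stand-in workload. `fake_training` is hypothetical and only simulates work; it is not the training loop used in this notebook.

```julia
# Hypothetical stand-in for a training run: some arithmetic-heavy work.
function fake_training(n::Int)
    acc = 0.0
    for i in 1:n
        acc += sin(i) * cos(i)
    end
    return acc
end

first_call  = @elapsed fake_training(1_000_000)  # may include compilation time
second_call = @elapsed fake_training(1_000_000)  # compiled code only

println("first: $(round(first_call, digits = 4))s, second: $(round(second_call, digits = 4))s")
```

Applied to the experiment, the same idea would mean running one short untimed training pass before the timed loop, so that the reported seconds reflect training rather than compilation.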

### Selection Criteria
Given the safety-critical nature of medical diagnosis, results are sorted by **Sensitivity**:
* **Primary Goal:** Maximize the detection of true positive cases (Minimize False Negatives).
* **Secondary Goal:** Minimize computational cost (Time) and maximize F1-Score.
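
The primary and secondary goals can be expressed as a single sort key. The rows below are taken from the MinMax log in this notebook. Note that the experiment code itself sorts by sensitivity only, so the tie-breaking shown here is an illustrative extension under that assumption, not the method actually used.

```julia
# Rows taken from the MinMax log in this notebook (sensitivity, F1, seconds).
results = [
    (topology = [64],     sens = 0.871, f1 = 0.870, time = 2.10),
    (topology = [16, 8],  sens = 0.871, f1 = 0.870, time = 1.65),
    (topology = [32, 16], sens = 0.879, f1 = 0.876, time = 2.34),
]

# Higher sensitivity first, then higher F1, then lower training time.
ranked = sort(results, by = r -> (-r.sens, -r.f1, r.time))

for r in ranked
    println(r.topology, "  sens=", r.sens, "  time=", r.time)
end
println(ranked[1].topology)  # [32, 16]
```

With this key, `[16, 8]` ranks above `[64]`: both reach the same sensitivity and F1, but `[16, 8]` trains faster.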


In [46]:
using Flux
using JLD2, FileIO
using Statistics
using Random
using Printf

# Architectures to test
topologies_to_test = [
    [8], [16], [32], [64],          # 1 hidden layer
    [16, 8], [32, 16], [64, 32],    # 2 hidden layers (funnel)
    [16, 16], [32, 32], [8, 16]     # 2 hidden layers (other shapes)
]


function run_topology_experiment(dataset, approach_name, topologies)
    
    println("Experiment for : $approach_name")

    x_train = dataset.x_train
    y_train_flux = dataset.y_train_flux
    x_test = dataset.x_test
    y_test_encoded = dataset.y_test_encoded
    
    n_inputs  = size(x_train, 1) 
    n_outputs = 5
    epochs    = 1000
    
    println("   > Input Neurons detected: $n_inputs")
    
    results = []

    for (i, topology) in enumerate(topologies)
        println("\n  [$approach_name] Testing Topology $i/$(length(topologies)): $topology ...")
        
        start_time = time()
        
        # Build Model
        model = buildClassANN(n_inputs, topology, n_outputs)
        loss(m, x, y) = Flux.crossentropy(m(x), y)
        opt_state = Flux.setup(Flux.Adam(0.01), model)
        
        # Train 
        for epoch in 1:epochs
            Flux.train!(loss, model, [(x_train, y_train_flux)], opt_state)
        end
        
        end_time = time()
        elapsed = end_time - start_time
        
        # Evaluate
        raw_preds = model(x_test)
        preds_transposed = permutedims(raw_preds)
        y_pred_bool = classifyOutputs(preds_transposed)
        
        # Metrics
        metrics = confusionMatrix(y_pred_bool, y_test_encoded)
        
        acc  = metrics.accuracy * 100
        f1   = metrics.f_score
        sens = metrics.sensitivity
        spec = metrics.specificity
        prec = metrics.ppv
        
        println("Done. Sens: $(round(sens, digits=4)) | F1: $(round(f1, digits=3)) | Time: $(round(elapsed, digits=2))s")
        
        push!(results, (topology, acc, f1, sens, spec, prec, elapsed))
    end
    
    # Print summary table for this approach
    println("\n   ========================================================================================")
    println("       RESULTS FOR $approach_name (SORTED BY SENSITIVITY)")
    println("   ========================================================================================")
    @printf("    %-2s | %-14s | %-6s | %-6s | %-6s | %-8s | %-6s | %-6s\n", 
            "ID", "Topology", "Sens", "Acc(%)", "F1", "Time(s)", "Spec", "Prec")
    println("   ----|----------------|--------|--------|--------|----------|--------|--------")

    # Sort by Sensitivity 
    sort!(results, by = x -> x[4], rev = true)

    for (i, res) in enumerate(results)
        (topo, acc, f1, sens, spec, prec, time_sec) = res
        topo_str = string(topo)
        @printf("    %-2d | %-14s | %-6.4f | %-6.2f | %-6.4f | %-8.3f | %-6.4f | %-6.4f\n", 
                i, topo_str, sens, acc, f1, time_sec, spec, prec)
    end
    println("   ========================================================================================")
    
    return results[1] 
end


# Run MinMax
best_minmax = run_topology_experiment(data_minmax, "APPROACH 1: MINMAX", topologies_to_test)

# Run PCA
best_pca = run_topology_experiment(data_pca, "APPROACH 2: PCA", topologies_to_test)

# Run ICA
best_ica = run_topology_experiment(data_ica, "APPROACH 3: ICA", topologies_to_test)

# Compare
println("\n\n****************************************************************")
println("      COMPARISON OF WINNERS PER APPROACH")
println("****************************************************************")
println(" Approach | Best Topology  | Sens   | Acc(%) | F1-Score | Time(s)")
println("----------|----------------|--------|--------|----------|---------")

approaches = [("MinMax", best_minmax), ("PCA", best_pca), ("ICA", best_ica)]

for (name, res) in approaches
    (topo, acc, f1, sens, spec, prec, time_sec) = res
    @printf(" %-8s | %-14s | %.4f | %-6.2f | %-8.4f | %.3f\n", 
            name, string(topo), sens, acc, f1, time_sec)
end

Experiment for : APPROACH 1: MINMAX
   > Input Neurons detected: 31

  [APPROACH 1: MINMAX] Testing Topology 1/10: [8] ...
Done. Sens: 0.7097 | F1: 0.699 | Time: 1.3s

  [APPROACH 1: MINMAX] Testing Topology 2/10: [16] ...
Done. Sens: 0.8306 | F1: 0.825 | Time: 0.57s

  [APPROACH 1: MINMAX] Testing Topology 3/10: [32] ...
Done. Sens: 0.8387 | F1: 0.834 | Time: 2.1s

  [APPROACH 1: MINMAX] Testing Topology 4/10: [64] ...
Done. Sens: 0.871 | F1: 0.87 | Time: 2.1s

  [APPROACH 1: MINMAX] Testing Topology 5/10: [16, 8] ...
Done. Sens: 0.871 | F1: 0.87 | Time: 1.65s

  [APPROACH 1: MINMAX] Testing Topology 6/10: [32, 16] ...
Done. Sens: 0.879 | F1: 0.876 | Time: 2.34s

  [APPROACH 1: MINMAX] Testing Topology 7/10: [64, 32] ...
Done. Sens: 0.8548 | F1: 0.852 | Time: 3.83s

  [APPROACH 1: MINMAX] Testing Topology 8/10: [16, 16] ...
Done. Sens: 0.8548 | F1: 0.851 | Time: 0.85s

  [APPROACH 1: MINMAX] Testing Topology 9/10: [32, 32] ...
Done. Sens: 0.8548 | F1: 0.856 | Time: 3.0s

  [APPROACH 1