# Support Vector Machines (SVM)
## Heart Disease Detection

In this notebook, we explore the use of **SVM (Support Vector Machines)** to classify patients with heart disease. The main objective is to find the model that maximizes **Sensitivity (Recall)**, as in medical diagnosis, it is critical to minimize **False Negatives** (failing to detect a sick patient).

We will evaluate performance across five approaches, built on three preprocessed versions of the dataset:
1.  **MinMax:** Min-max normalization of the numerical features to $[0, 1]$.
2.  **PCA:** Linear dimensionality reduction.
3.  **ICA:** Independent Component Analysis.
4.  **MinMax + Cross-Validation:** Robust validation for the baseline.
5.  **PCA + Cross-Validation:** Robust validation for the reduced set.

**Optimization:**
We employ **Class Weighting** to handle class imbalance, forcing the SVM to penalize errors on sick patients more heavily.
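
As a rough illustration of class weighting, the sketch below derives inverse-frequency weights from a label vector. The helper name `inverse_frequency_weights` and the toy labels are assumptions made for this example; the weights actually applied inside `run_svm_experiment` live in `SVM_final_utils.jl` and may be computed differently.

```julia
# Hypothetical helper: inverse-frequency class weights, w_c = N / (K * n_c),
# where N is the number of samples, K the number of classes and n_c the class count.
function inverse_frequency_weights(y::AbstractVector{<:Integer})
    counts = Dict{Int,Int}()
    for label in y
        counts[label] = get(counts, label, 0) + 1
    end
    n, k = length(y), length(counts)
    return Dict(label => n / (k * c) for (label, c) in counts)
end

y_toy = [0, 0, 0, 0, 0, 0, 1, 1]   # toy labels: 6 healthy, 2 sick
weights = inverse_frequency_weights(y_toy)
println(weights)                   # the minority class receives the larger weight
```

A dictionary of this shape can be handed to the trainer: LIBSVM.jl's `svmtrain` exposes a `weights` keyword for per-class penalties, so errors on the rarer class cost more during optimization.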

In [6]:
#load libraries and previous functions:
using Downloads
using DelimitedFiles
using Plots
using MLJ
using MLJModels
using MLJMultivariateStatsInterface
using MLJLinearModels
using MLJDecisionTreeInterface
using MLJNaiveBayesInterface
using MLJLIBSVMInterface
using Statistics
using Flux
using Flux: Losses
using Printf
using Random
using NearestNeighborModels
using CSV
using DataFrames
using DataFramesMeta
import MultivariateStats
using LIBSVM
using JLD2, FileIO
using CategoricalArrays

include("unit2-multilayer-perceptron.jl")
include("unit3-overfitting.jl")
include("unit4-metrics.jl")
include("unit5-crossvalidation.jl")
include("unit6-modelcrossvalidation.jl")
include("SVM_final_utils.jl")

# Evaluating the name confirms that the helper was defined by the includes above
universalCrossValidation_PCA

## 2. Hyperparameter Grid Search
SVMs are powerful but highly sensitive to their hyperparameters. We perform a search over different **Kernels** to find the best decision boundary geometry.

### Parameters Tested:
* **Kernels:**
    * **Linear:** For simple, linearly separable data.
    * **RBF (Radial Basis Function):** The usual default for non-linear data. It implicitly maps samples into an infinite-dimensional feature space.
    * **Polynomial:** Captures specific degree interactions between features.
    * **Sigmoid:** Uses $\tanh$ as the kernel function, resembling a neural-network activation.
    
* **C (Cost/Regularization):**
    * Controls the trade-off between a wide, smooth margin and classifying every training point correctly.
    * **High C:** Misclassifications are penalized heavily (approaches a hard margin); risks overfitting.
    * **Low C:** More margin violations are tolerated (softer margin); usually generalizes better but may misclassify edge cases.

* **Gamma ($\gamma$):** (For RBF/Poly/Sigmoid)
    * Defines the "reach" of a single training example.
    * **High $\gamma$:** Only close points affect the boundary (complex islands).
    * **Low $\gamma$:** Distant points have influence (smooth curvature).
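
To make the role of $\gamma$ concrete, the following sketch evaluates the RBF kernel $k(x, z) = \exp(-\gamma \lVert x - z \rVert^2)$ for a nearby and a distant point. The function name `rbf` and the sample points are illustrative assumptions; the two $\gamma$ values are taken from the grid tested in this notebook.

```julia
# RBF kernel: k(x, z) = exp(-γ * ||x - z||²)
rbf(x, z, γ) = exp(-γ * sum(abs2, x .- z))

x    = [0.0, 0.0]
near = [0.5, 0.0]   # close neighbour
far  = [3.0, 0.0]   # distant point

for γ in (0.03, 0.5)
    println("γ = $γ  near: ", round(rbf(x, near, γ), digits=3),
            "  far: ", round(rbf(x, far, γ), digits=3))
end
```

With the larger $\gamma$, the similarity to the distant point drops towards zero, so only close neighbours shape the boundary; with the smaller $\gamma$, the distant point still contributes noticeably.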

In [7]:
# Define the SVM configurations (hyperparameters) to test.
# Tuple format: (Name, Kernel, Cost, Gamma, Degree)
# Notes on the labels:
#   * In the "Poly" entries, the 1.0 labelled "Coef" sits in the Gamma slot;
#     the tuples carry no coef0 value.
#   * The two entries labelled "Sigmoid" use Kernel.Linear and Kernel.RadialBasis,
#     so no sigmoid kernel is evaluated; use Kernel.Sigmoid to test one.
svm_configs = [
    ("RBF (C=1000000, G=0.03)",           Kernel.RadialBasis, 1000000.0, 0.03, 0),
    ("RBF (C=5000, G=0.07)",              Kernel.RadialBasis, 5000.0,    0.07, 0),
    ("RBF (C=100000.0, G=0.5)",           Kernel.RadialBasis, 100000.0,  0.5,  0),
    ("Linear (C=1000)",                   Kernel.Linear,      1000.0,    0.0,  0),
    ("Poly (Deg=2, C=10000.0, Coef=1.0)", Kernel.Polynomial,  10000.0,   1.0,  2),
    ("Poly (Deg=5, C=10000.0, Coef=1.0)", Kernel.Polynomial,  10000.0,   1.0,  5),
    ("Sigmoid (C=100.0, G=0.01)",         Kernel.Linear,      100.0,     0.01, 0),
    ("Sigmoid (C=1000.0, G=0.01)",        Kernel.RadialBasis, 1000.0,    0.01, 0),
]

8-element Vector{Tuple{String, LIBSVM.Kernel.KERNEL, Float64, Float64, Int64}}:
 ("RBF (C=1000000, G=0.03)", LIBSVM.Kernel.RadialBasis, 1.0e6, 0.03, 0)
 ("RBF (C=5000, G=0.07)", LIBSVM.Kernel.RadialBasis, 5000.0, 0.07, 0)
 ("RBF (C=100000.0, G=0.5)", LIBSVM.Kernel.RadialBasis, 100000.0, 0.5, 0)
 ("Linear (C=1000)", LIBSVM.Kernel.Linear, 1000.0, 0.0, 0)
 ("Poly (Deg=2, C=10000.0, Coef=1.0)", LIBSVM.Kernel.Polynomial, 10000.0, 1.0, 2)
 ("Poly (Deg=5, C=10000.0, Coef=1.0)", LIBSVM.Kernel.Polynomial, 10000.0, 1.0, 5)
 ("Sigmoid (C=100.0, G=0.01)", LIBSVM.Kernel.Linear, 100.0, 0.01, 0)
 ("Sigmoid (C=1000.0, G=0.01)", LIBSVM.Kernel.RadialBasis, 1000.0, 0.01, 0)

# MULTICLASS

## Approach 1: MinMax Normalization (Baseline)

**Objective:** Establish a baseline performance using the full set of features.

**Methodology:**
1.  **Preprocessing:** We apply **MinMax Normalization** to scale all numerical features to the range $[0, 1]$. This is crucial for SVMs, as they are distance-based algorithms and sensitive to features with different scales.
2.  **Validation:** We use a simple stratified **Holdout** strategy (train/validation/test split) and report results on the test set.
3.  **Training:** We train SVMs with various kernels and class weights to handle imbalance.
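
The normalization step can be sketched as follows. This is a minimal illustration, not the code inside `prepare_data`: the helper names `minmax_params` and `minmax_apply` and the toy matrices are assumptions. The point it shows is that the minimum and maximum are learned on the training rows only and then reused for the test rows.

```julia
# Learn per-column min and max on the training matrix (rows = samples).
minmax_params(X) = (minimum(X, dims=1), maximum(X, dims=1))

# Scale with the learned parameters; constant columns are divided by 1 to avoid 0/0.
function minmax_apply(X, (lo, hi))
    span = hi .- lo
    return (X .- lo) ./ ifelse.(span .== 0, one(eltype(span)), span)
end

X_train = [1.0 10.0; 3.0 20.0; 5.0 30.0]
X_test  = [7.0 15.0]

params         = minmax_params(X_train)
X_train_scaled = minmax_apply(X_train, params)
X_test_scaled  = minmax_apply(X_test, params)   # may fall outside [0, 1]
println(X_train_scaled)
println(X_test_scaled)
```

Test values outside the training range land outside $[0, 1]$, which is expected: re-fitting the scaler on the test set would leak information from it.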

In [8]:
# -----------------------------------------------------------------
# (Approach 1) MinMax
# -----------------------------------------------------------------
data_path = "heart_disease_uci.csv"

data, num_col, cat_col, target_col = load_and_clean_data(data_path)

println("\n Init aproach 1 (MinMax)...")
approach_1 = prepare_data(
    data,          # Clean DataFrame without nulls
    num_col,       # numerical features
    cat_col,       # categorical features
    target_col,    # target feature
    norm_method=:minmax  # normalization method, either :minmax or :zscore
)

# Access the prepared data through the fields of `approach_1`:
#   approach_1.x_train
#   approach_1.y_train_cat   (for SVM/DT/kNN)
#   approach_1.y_train_ohe   (for ANN)
#   ...
#---------------------------------------------------------------------------------------------------------------------------------------
println(">> Ejecutando SVM - Approach 1 (MinMax)")

winner_minmax = run_svm_experiment(
    approach_1.x_train,      # Training matrix
    approach_1.y_train_cat,  # Training labels (converted by the function if categorical)
    approach_1.x_test,       # Test matrix
    approach_1.y_test_cat,   # Test labels
    "APPROACH 1: MINMAX",
    svm_configs
)

# Final Summary
println("\n\n****************************************************************")
println("      WINNER FOR APPROACH 1: MINMAX")
println("****************************************************************")
println("Approach: MinMax")
println("Best Config:",winner_minmax.config)
println("Sensib:",winner_minmax.sens)
println("Acc(%):",winner_minmax.acc)
println("Time(s):",winner_minmax.time)


>>> Loading data from: heart_disease_uci.csv
  Original Size: (920, 14)
 Categorical Null values replaced with ---> 'missingval'.
  Deleted rows in features: [:trestbps, :chol, :thalch, :oldpeak]
  Final shape: (827, 14)
------------------------

 Init aproach 1 (MinMax)...

--- init Preprocess ---
   Normalization: minmax
    Stratigfied HoldOut split: 577 train, 125 val, 125 test
    Normalizing numerical features...
    ...Normalization completed.
    Encoding categorical features (OHE)...
    ...OHE completed.
    Concatenate numerical and categorical matrices...
    Classes stored for the target: [0, 1, 2, 3, 4]
--- PREPROCESS END SUCCESFULLY ---
>> Ejecutando SVM - Approach 1 (MinMax)

 EXPERIMENT: APPROACH 1: MINMAX (Manual Train/Test)
   [Auto-Fix] Transponiendo matriz para formato LIBSVM (Features x Samples)...
   Dimensiones finales X_train: (31, 577)
 Testing Config 1/8: RBF (C=1000000, G=0.03) ... Done. Sens: 0.3394 | Acc: 52.0%
 Testing Config 2/8: RBF (C=5000, G=0.07) ...

## Approach 2: Principal Component Analysis (PCA)

**Objective:** Reduce the dimensionality of the dataset to remove noise and improve training speed, while retaining the most important information.

**Methodology:**
1.  **Z-Score Normalization:** Data is centered (mean=0) and scaled (std=1), which is a prerequisite for PCA.
2.  **Dimensionality Reduction:** We apply **PCA** to project the original features into a smaller set of **Principal Components** that explain 95% of the variance (here, 31 features are reduced to 17 components).
3.  **SVM Training:** The SVM is trained on these transformed components. Working in fewer dimensions can yield smoother decision boundaries and reduce the risk of overfitting, although the gain is not guaranteed.
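
How a 95% variance threshold translates into a number of components can be sketched with a plain SVD. The function `n_components_for` and the toy matrix are assumptions for illustration; the notebook itself delegates this choice to the MLJ `PCA` model through `variance_ratio=0.95`.

```julia
using LinearAlgebra, Statistics

# Smallest number of components whose cumulative explained variance reaches `ratio`.
function n_components_for(X::AbstractMatrix, ratio::Real)
    Xc = X .- mean(X, dims=1)                    # center each column
    s = svdvals(Xc)                              # singular values, sorted descending
    explained = cumsum(s .^ 2) ./ sum(s .^ 2)    # cumulative variance ratio
    return findfirst(>=(ratio), explained)
end

# Toy data: the first column carries almost all of the variance.
X = [2.0 0.1; -2.0 -0.1; 1.0 -0.1; -1.0 0.1]
println(n_components_for(X, 0.95))
println(n_components_for(X, 0.999))
```

Raising the threshold keeps more components, so the value trades compression against retained information.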

In [9]:
using MLJMultivariateStatsInterface
PCA = MLJ.@load PCA pkg=MultivariateStats

data_path = "heart_disease_uci.csv"

data, num_col, cat_col, target_col = load_and_clean_data(data_path)

approach_2 = prepare_data(
    data,         
    num_col,       
    cat_col,       
    target_col,    
    norm_method=:zscore 
)

println("\n---data preprocessed---")

println("\n---Init PCA transformation---")


#Unpack variables for MLJ
x_train = approach_2.x_train
x_val = approach_2.x_val
x_test = approach_2.x_test

y_train_pca = approach_2.y_train_cat 
y_val_pca = approach_2.y_val_cat     
y_test_pca = approach_2.y_test_cat     

# Combine train + validation to fit PCA. This applies to models other than the ANN, which needs a separate validation set.
x_train_val_combined = vcat(x_train, x_val)
y_train_val_combined = vcat(y_train_pca, y_val_pca)

println(" Train set size: ", size(x_train_val_combined))


# Use PCA to select the components that explain 95% of the variance
pca_model = PCA(variance_ratio=0.95)

# 1. Fit PCA on the combined train + validation data only (never on the test set)
pca_machine = machine(pca_model, MLJ.table(x_train_val_combined))
MLJ.fit!(pca_machine, verbosity=0)

# 2. Transform the data
x_train_val_pca = MLJ.transform(pca_machine, MLJ.table(x_train_val_combined))
x_test_pca = MLJ.transform(pca_machine, MLJ.table(x_test))

# MLJ works best with tables; convert back to matrices for LIBSVM
mat_train_pca = MLJ.matrix(x_train_val_pca)
mat_test_pca = MLJ.matrix(x_test_pca)
println("Train set size: after PCA: ", size(mat_train_pca))


println(">> Ejecutando SVM - Approach 2 (PCA)")

winner_pca = run_svm_experiment(
    mat_train_pca,         # Training matrix
    y_train_val_combined,  # Training labels (converted by the function if categorical)
    mat_test_pca,          # Test matrix
    y_test_pca,            # Test labels
    "APPROACH 2: PCA",
    svm_configs
)

# Final Summary
println("\n\n****************************************************************")
println("      WINNER FOR APPROACH 2: PCA")
println("****************************************************************")
println("Approach: PCA")
println("Best Config:",winner_pca.config)
println("Sensib:",winner_pca.sens)
println("Acc(%):",winner_pca.acc)
println("Time(s):",winner_pca.time)

import MLJMultivariateStatsInterface ✔
>>> Loading data from: heart_disease_uci.csv
  Original Size: (920, 14)
 Categorical Null values replaced with ---> 'missingval'.
  Deleted rows in features: [:trestbps, :chol, :thalch, :oldpeak]
  Final shape: (827, 14)
------------------------

--- init Preprocess ---
   Normalization: zscore
    Stratigfied HoldOut split: 577 train, 125 val, 125 test
    Normalizing numerical features...
    ...Normalization completed.
    Encoding categorical features (OHE)...
    ...OHE completed.
    Concatenate numerical and categorical matrices...
    Classes stored for the target: [0, 1, 2, 3, 4]
--- PREPROCESS END SUCCESFULLY ---

---data preprocessed---

---Init PCA transformation---
[ Info: For silent loading, specify `verbosity=0`.
 Train set size: (702, 31)
Train set size: after PCA: (702, 17)
>> Ejecutando SVM - Approach 2 (PCA)

 EXPERIMENT: APPROACH 2: PCA (Manual Train/Test)
   [Auto-Fix] Transponiendo matriz para formato LIBSVM (Features x Samples)...
   Dimensiones finales X_train: (17, 702)
 Testing Config 1/8: RBF (C=1000000, G=0.03) ... Done. Sens: 0.3207 | Acc: 50.4%
 Testing Config 2/8: RBF (C=5000, G=0.07) ... Done. Sens: 0.383 | Acc: 53.6%
 Testing Config 3/8: RBF (C=100000.0, G=0.5) ... Done. Sens: 0.3511 | Acc: 56.8%
 Testing Config 4/8: Linear (C=1000) ... Done. Sens: 0.4201 | Acc: 50.4%
 Testing Config 5/8: Poly (Deg=2, C=10000.0, Coef=1.0) ... Done. Sens: 0.2725 | Acc: 44.8%
 Testing Config 6/8: Poly (Deg=5, C=10000.0, Coef=1.0) ... Done. Sens: 0.2984 | Acc: 48.0%
 Testing Config 7/8: Sigmoid (C=100.0, G=0.01) ... Done. Sens: 0.4201 | Acc: 50.4%
 Testing Config 8/8: Sigmoid (C=1000.0, G=0.01) ... Done. Sens: 0.3358 | Acc: 52.0%


****************************************************************
      WINNER FOR APPROACH 2: PCA
****************************************************************
Approach: PCA
Best Config:("Linear (C=1000)", LIBSVM.Kernel.Linear, 1000.0, 0.0, 0)
Sensib:0.42007128007128003
Acc(%):50.4
Time(s):14.636384963989258


## Approach 3: Independent Component Analysis (ICA)

**Objective:** Transform the feature space to find statistically independent sources, which might correspond to underlying physiological factors of heart disease.

**Methodology:**
1.  **Transformation:** We use **ICA (FastICA)** to decompose the numerical features into independent non-Gaussian components (here $k=2$); the one-hot encoded categorical features are passed through unchanged.
2.  **Rationale:** Unlike PCA (which focuses on variance/correlation), ICA focuses on independence. In medical datasets, this can sometimes reveal hidden structures or "causes" that are useful for classification.
3.  **Kernel Selection:** Since ICA transforms the geometry of the data significantly, we test **Linear** and **Polynomial** kernels in addition to RBF. The configurations labelled "Sigmoid" in `svm_configs` are defined with Linear and RBF kernels, so a true sigmoid kernel is not part of this grid.

In [10]:
# -----------------------------------------------------------------
# (Approach 3) ICA and z-score. ICA is applied to the numerical features only;
# with the categorical (one-hot) features included it would not find a solution.
# -----------------------------------------------------------------
using MLJMultivariateStatsInterface
ICA = MLJ.@load ICA pkg=MultivariateStats

data_path = "heart_disease_uci.csv"

data, num_col, cat_col, target_col = load_and_clean_data(data_path)


println("\nInit approach 3, ICA (Numerical features only)...")

# 1. Preprocess as in the previous approach (z-score, well suited to ICA)
approach_ica = prepare_data(
    data,          
    num_col,       
    cat_col,       
    target_col,    
    norm_method=:zscore 
)

# 2. Unpack results
x_train = approach_ica.x_train
x_val = approach_ica.x_val
x_test = approach_ica.x_test

y_train_ica = approach_ica.y_train_cat 
y_val_ica = approach_ica.y_val_cat     
y_test_ica = approach_ica.y_test_cat;

# `prepare_data` places the numerical features first, followed by the categorical (OHE) ones.

n_num = length(num_col) # Expected numerical features: age, trestbps, chol, thalch, oldpeak

# Split numerical and categorical columns
x_num_train = x_train[:, 1:n_num]      # numerical only
x_cat_train = x_train[:, n_num+1:end]  # categorical (OHE) only

x_num_val = x_val[:, 1:n_num]
x_cat_val = x_val[:, n_num+1:end]
x_num_test = x_test[:, 1:n_num]
x_cat_test = x_test[:, n_num+1:end]


# --- ICA on the numerical features only ---

# k must be less than or equal to the number of numerical features (5)
k_components = 2

# ICA is non-deterministic, so we tried fixing the seed for reproducibility, but that somehow failed:
#Random.seed!(1234)
# Use a loose tolerance so the solver can converge
ica_model = ICA(outdim=k_components, maxiter=100000, tol=0.2) 

println(" ICA with k=$k_components ...")

# Fit on the numerical training data only (the split required for the ANN)
ica_machine = machine(ica_model, MLJ.table(x_num_train))
MLJ.fit!(ica_machine, verbosity=1) # verbosity=1 for debugging

#=
For models other than the ANN, fit on training and validation together:
x_train_val_num = vcat(x_num_train, x_num_val)
ica_machine = machine(ica_model, MLJ.table(x_train_val_num))
MLJ.fit!(ica_machine, verbosity=1) # verbosity=1 for debugging
=#

# Transform and return to matrix
x_num_train_ica = MLJ.transform(ica_machine, MLJ.table(x_num_train))
x_num_val_ica  = MLJ.transform(ica_machine, MLJ.table(x_num_val))
x_num_test_ica  = MLJ.transform(ica_machine, MLJ.table(x_num_test))


mat_train_ica = MLJ.matrix(x_num_train_ica)
mat_val_ica  = MLJ.matrix(x_num_val_ica)
mat_test_ica  = MLJ.matrix(x_num_test_ica)

#Add the categorical OHE 
x_train_ica = hcat(mat_train_ica, x_cat_train)
x_val_ica = hcat(mat_val_ica, x_cat_val)
x_test_ica     = hcat(mat_test_ica, x_cat_test)



#---------------------------------------------------------------------------------------------------------------------------------------


# ==============================================================================
# FINAL PREPARATION FOR SVM (APPROACH 3: ICA)
# ==============================================================================

println("\n>> Preparando datos ICA para LIBSVM...")

# 1. Combine train + validation
# SVMs benefit from more data, so we join the matrices processed above.
x_train_val_ica_combined = vcat(x_train_ica, x_val_ica)
y_train_val_ica_combined = vcat(y_train_ica, y_val_ica)

# 2. Transpose the matrices (features x samples)
# LIBSVM is strict about this layout. Convert to Float64 and transpose.
x_train_svm_ica = Float64.(permutedims(x_train_val_ica_combined))
x_test_svm_ica  = Float64.(permutedims(x_test_ica))

# 3. Clean the labels
# Make sure they are plain integers (no CategoricalValue wrappers)
get_val(x) = x isa CategoricalValue ? unwrap(x) : x

y_train_final_ica = vec(Int.(get_val.(y_train_val_ica_combined)))
y_test_final_ica  = vec(Int.(get_val.(y_test_ica)))


# ==============================================================================
# CONFIGURATION AND EXECUTION
# ==============================================================================

println("\n>> Ejecutando SVM - Approach 3 (ICA)")

winner_ica = run_svm_experiment(
    x_train_svm_ica,      # X train (combined and transposed)
    y_train_final_ica,    # y train (combined and cleaned)
    x_test_svm_ica,       # X test (transposed)
    y_test_final_ica,     # y test (cleaned)
    "APPROACH 3: ICA + ZSCORE",
    svm_configs
)

# Final Summary
println("\n\n****************************************************************")
println("      WINNER FOR APPROACH 3: ICA")
println("****************************************************************")
println("Approach: ICA")
println("Best Config:",winner_ica.config)
println("Sensib:",winner_ica.sens)
println("Acc(%):",winner_ica.acc)
println("Time(s):",winner_ica.time)

import MLJMultivariateStatsInterface ✔
>>> Loading data from: heart_disease_uci.csv
  Original Size: (920, 14)
[ Info: For silent loading, specify `verbosity=0`.
 Categorical Null values replaced with ---> 'missingval'.
  Deleted rows in features: [:trestbps, :chol, :thalch, :oldpeak]
  Final shape: (827, 14)
------------------------

Init approach 3, ICA (Numerical features only)...

--- init Preprocess ---
   Normalization: zscore
    Stratigfied HoldOut split: 577 train, 125 val, 125 test
    Normalizing numerical features...
    ...Normalization completed.
    Encoding categorical features (OHE)...
    ...OHE completed.
    Concatenate numerical and categorical matrices...
    Classes stored for the target: [0, 1, 2, 3, 4]
--- PREPROCESS END SUCCESFULLY ---
 ICA with k=2 ...
[ Info: Training machine(ICA(outdim = 2, …), …).

>> Preparando datos ICA para LIBSVM...

>> Ejecutando SVM - Approach 3 (ICA)

 EXPERIMENT: APPROACH 3: ICA + ZSCORE (Manual Train/Test)
   Dimensiones finales X_train: (28, 702)
 Testing Config 1/8: RBF (C=1000000, G=0.03) ... Done. Sens: 0.3123 | Acc: 51.2%
 Testing Config 2/8: RBF (C=5000, G=0.07) ... Done. Sens: 0.3158 | Acc: 52.8%
 Testing Config 3/8: RBF (C=100000.0, G=0.5) ... Done. Sens: 0.2994 | Acc: 49.6%
 Testing Config 4/8: Linear (C=1000) ... Done. Sens: 0.3546 | Acc: 55.2%
 Testing Config 5/8: Poly (Deg=2, C=10000.0, Coef=1.0) ... Done. Sens: 0.3015 | Acc: 49.6%
 Testing Config 6/8: Poly (Deg=5, C=10000.0, Coef=1.0) ... Done. Sens: 0.2817 | Acc: 42.4%
 Testing Config 7/8: Sigmoid (C=100.0, G=0.01) ... Done. Sens: 0.3546 | Acc: 55.2%
 Testing Config 8/8: Sigmoid (C=1000.0, G=0.01) ... Done. Sens: 0.3545 | Acc: 54.4%


****************************************************************
      WINNER FOR APPROACH 3: ICA
****************************************************************
Approach: ICA
Best Config:("Linear (C=1000)", LIBSVM.Kernel.Linear, 1000.0, 0.0, 0)
Sensib:0.3545792495792496
Acc(%):55.2
Time(s):16.113850831985474


## Approach 4: MinMax + Stratified Cross-Validation

**Objective:** Validate the robustness of our Baseline (Approach 1) and ensure the results are not due to a lucky train/test split.

**Methodology:**
* **Stratified K-Fold CV ($k=5$):** The development set (train + validation, 85% of the data) is divided into 5 folds, each preserving the class proportions of the target.
* **Process:** For each fold, we re-apply MinMax normalization and One-Hot Encoding from scratch to avoid data leakage.
* **Selection:** We average accuracy across all folds (mean ± standard deviation) to find the most stable hyperparameter configuration.
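
The fold assignment described above can be sketched as below. This is an assumed, simplified stand-in for the `crossvalidation` helper loaded from the earlier units, whose internals are not shown here; the name `stratified_folds` and the toy labels are hypothetical.

```julia
using Random

# Hypothetical helper: give every sample a fold id in 1:k, class by class,
# so that each fold keeps roughly the original class proportions.
function stratified_folds(y::AbstractVector, k::Integer; rng=Random.default_rng())
    folds = zeros(Int, length(y))
    for class in unique(y)
        idx = shuffle(rng, findall(==(class), y))   # samples of this class, shuffled
        for (pos, i) in enumerate(idx)
            folds[i] = mod1(pos, k)                 # deal them out round-robin
        end
    end
    return folds
end

y = vcat(fill(0, 10), fill(1, 5))                   # toy labels: 10 vs 5
folds = stratified_folds(y, 5; rng=MersenneTwister(1234))
for f in 1:5
    println("fold $f: ", count(==(0), y[folds .== f]), " of class 0, ",
            count(==(1), y[folds .== f]), " of class 1")
end
```

Dealing out each class separately is what keeps rare classes represented in every fold, which matters for a small, imbalanced dataset like this one.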

In [11]:
# -----------------------------------------------------------------
# (Approach 4) Same as approach 1, with cross-validation
# -----------------------------------------------------------------
data_path = "heart_disease_uci.csv"
data, num_col, cat_col, target_col = load_and_clean_data(data_path)
Random.seed!(1234)
#---split training data and final test data---
Pval = 0.15
Ptest = 0.15
rows, columns = size(data)
N = rows
(train_indices, val_indices, test_indices) = stratified_holdOut(data[!,target_col], Pval, Ptest)
train_data = data[train_indices, :]
val_data = data[val_indices, :]
dev_data = vcat(train_data, val_data)
test_data = data[test_indices, :]
println("  Data split: $(size(dev_data,1)) dev(85%), $(size(test_data,1)) test(15%)")

#---split for crossvalidation---
dev_num = select(dev_data, num_col)
dev_cat = select(dev_data, cat_col)
dev_targets = dev_data[!, target_col];

#---make cv indices---
k_folds=5 # number of folds, set to 5 because our dataset is small
cv_indices = crossvalidation(dev_targets, k_folds);
println("Indices generated for $k_folds stratified folds.")






results_grid = DataFrame(
    ModelType = String[], 
    Hyperparams = String[], 
    Mean_Accuracy = Float64[], 
    Std_Accuracy = Float64[]
)

println("Evaluando $(length(svm_configs)) configuraciones...")

for (i, conf) in enumerate(svm_configs)
    # Unpack the configuration tuple.
    # Expected format: (Name, Kernel, C, Gamma, Degree);
    # a sixth element, if present, is taken as coef0.
    if length(conf) == 5
        (name, k_type, c_val, g_val, deg_val) = conf
        coef_val = 0.0
    else
        (name, k_type, c_val, g_val, deg_val, coef_val) = conf
    end
    
    # Build the Dict that the cross-validation function expects
    # (keys :C, :kernel, :gamma, :degree, :coef0).

    # Map the LIBSVM kernel type to the string name passed in the Dict
    k_str = if k_type == Kernel.Linear; "linear"
            elseif k_type == Kernel.RadialBasis; "rbf"
            elseif k_type == Kernel.Polynomial; "poly"
            elseif k_type == Kernel.Sigmoid; "sigmoid"
            else; "rbf"; end

    params_dict = Dict(
        :C => c_val,
        :kernel => k_str,
        :gamma => g_val,
        :degree => deg_val,
        :coef0 => coef_val
    )
    
    param_str = string(params_dict)
    print("[$i/$(length(svm_configs))] Probando SVC con $name ... ")
    
    # Call the cross-validation helper
    try
        mu, sigma = universalCrossValidation1(
            :SVC,          # model type, passed as a Symbol (not a String)
            params_dict,   # hyperparameters, passed as a Dict (not a tuple)
            dev_num, 
            dev_cat, 
            dev_targets, 
            cv_indices
        )
        
        push!(results_grid, ("SVC", param_str, mu * 100, sigma * 100))
        println("-> Acc: $(round(mu*100, digits=2))%")
        
    catch e
        println(" Failed: $e")
    end
end

# 3. RESULTS AND BEST MODEL
println("\n--- TOP 10 MEJORES MODELOS ---")
sort!(results_grid, :Mean_Accuracy, rev=true)
display(first(results_grid, 10))

# Extract the winner
best_row = results_grid[1, :]
println("\n GANADOR DEL APPROACH 4:")
println("Modelo: $(best_row.ModelType)")
println("Config: $(best_row.Hyperparams)")
println("Accuracy CV: $(round(best_row.Mean_Accuracy, digits=2))% ± $(round(best_row.Std_Accuracy, digits=2))")

>>> Loading data from: heart_disease_uci.csv
  Original Size: (920, 14)
 Categorical Null values replaced with ---> 'missingval'.
  Deleted rows in features: [:trestbps, :chol, :thalch, :oldpeak]
  Final shape: (827, 14)
------------------------
  Data split: 702 dev(85%), 125 test(15%)
Indices generated for 5 stratified folds.
Evaluando 8 configuraciones...
[1/8] Probando SVC con RBF (C=1000000, G=0.03) ... -> Acc: 54.73%
[2/8] Probando SVC con RBF (C=5000, G=0.07) ... -> Acc: 54.57%
[3/8] Probando SVC con RBF (C=100000.0, G=0.5) ... -> Acc: 55.99%
[4/8] Probando SVC con Linear (C=1000) ... -> Acc: 55.0%
[5/8] Probando SVC con Poly (Deg=2, C=10000.0, Coef=1.0) ... -> Acc: 54.15%
[6/8] Probando SVC con Poly (Deg=5, C=10000.0, Coef=1.0) ... -> Acc: 55.13%
[7/8] Probando SVC con Sigmoid (C=100.0, G=0.01) ... -> Acc: 55.0%
[8/8] Probando SVC con Sigmoid (C=1000.0, G=0.01) ... -> Acc: 55.14%

--- TOP 10 MEJORES MODELOS ---


| Row | ModelType (String) | Hyperparams (String) | Mean_Accuracy (Float64) | Std_Accuracy (Float64) |
|---|---|---|---|---|
| 1 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.5, :degree => 0, :kernel => "rbf", :C => 100000.0)` | 55.986 | 3.23921 |
| 2 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.01, :degree => 0, :kernel => "rbf", :C => 1000.0)` | 55.139 | 1.85978 |
| 3 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 1.0, :degree => 5, :kernel => "poly", :C => 10000.0)` | 55.1318 | 3.90574 |
| 4 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.01, :degree => 0, :kernel => "linear", :C => 100.0)` | 55.0022 | 2.73517 |
| 5 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.0, :degree => 0, :kernel => "linear", :C => 1000.0)` | 55.0002 | 2.53162 |
| 6 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.03, :degree => 0, :kernel => "rbf", :C => 1.0e6)` | 54.7285 | 5.85846 |
| 7 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.07, :degree => 0, :kernel => "rbf", :C => 5000.0)` | 54.5705 | 3.58129 |
| 8 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 1.0, :degree => 2, :kernel => "poly", :C => 10000.0)` | 54.15 | 4.97851 |



 GANADOR DEL APPROACH 4:
Modelo: SVC
Config: Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.5, :degree => 0, :kernel => "rbf", :C => 100000.0)
Accuracy CV: 55.99% ± 3.24


## Approach 5: PCA + Stratified Cross-Validation

**Objective:** Combine the benefits of dimensionality reduction (PCA) with the statistical robustness of Cross-Validation.

**Methodology:**
* **Pipeline:** Inside each step of the Cross-Validation loop:
    1.  **Z-Score** parameters are learned from the training fold.
    2.  **PCA** projection matrix is calculated on the training fold.
    3.  The validation fold is transformed using these learned parameters.
* **Goal:** This approach provides the most rigorous evaluation of whether dimensionality reduction truly helps the SVM generalize better to unseen patients.
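
The per-fold pipeline can be sketched as a function that learns z-score and PCA parameters on the training fold and returns a transformation to reuse on the validation fold. This is a minimal sketch under stated assumptions: `fit_zscore_pca` and the toy matrices are hypothetical, the number of components is passed explicitly rather than chosen by a variance ratio, and the real logic lives in `universalCrossValidation_PCA`.

```julia
using LinearAlgebra, Statistics

# Fit z-score and PCA on the training fold only; return a closure that applies
# the same learned transformation to any other matrix (rows = samples).
function fit_zscore_pca(X_train::AbstractMatrix, n_components::Integer)
    μ = mean(X_train, dims=1)
    σ = map(s -> s == 0 ? one(s) : s, std(X_train, dims=1))  # guard constant columns
    Z = (X_train .- μ) ./ σ
    V = svd(Z).V[:, 1:n_components]          # principal directions as columns
    return X -> ((X .- μ) ./ σ) * V
end

X_train_fold = [1.0 2.0; 2.0 4.1; 3.0 5.9; 4.0 8.0]
X_val_fold   = [10.0 20.0]

transform = fit_zscore_pca(X_train_fold, 1)
train_pcs = transform(X_train_fold)   # projected training fold
val_pcs   = transform(X_val_fold)     # validation fold, using training parameters
println(size(train_pcs), " ", size(val_pcs))
```

Because the validation fold never influences the mean, standard deviation, or projection matrix, the fold score is an honest estimate of performance on unseen patients.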

In [12]:
# -----------------------------------------------------------------
# (Approach 5) PCA with cross-validation
# -----------------------------------------------------------------
data_path = "heart_disease_uci.csv"
data, num_col, cat_col, target_col = load_and_clean_data(data_path)
Random.seed!(1234)
#---split training data and final test data---
Pval = 0.15
Ptest = 0.15
rows, columns = size(data)
N = rows
(train_indices, val_indices, test_indices) = stratified_holdOut(data[!,target_col], Pval, Ptest)
train_data = data[train_indices, :]
val_data = data[val_indices, :]
dev_data = vcat(train_data, val_data)
test_data = data[test_indices, :]
println("  Data split: $(size(dev_data,1)) dev(85%), $(size(test_data,1)) test(15%)")

#---split for crossvalidation---
dev_num = select(dev_data, num_col)
dev_cat = select(dev_data, cat_col)
dev_targets = dev_data[!, target_col];

#---make cv indices---
k_folds=5 # number of folds, set to 5 because our dataset is small
cv_indices = crossvalidation(dev_targets, k_folds);
println("Indices generated for $k_folds stratified folds.")



results_pca = DataFrame(Model=[], Params=[], Acc_Mean=[], Acc_Std=[])

for (i, conf) in enumerate(svm_configs)
    # Unpack the configuration tuple.
    # Expected format: (Name, Kernel, C, Gamma, Degree);
    # a sixth element, if present, is taken as coef0.
    if length(conf) == 5
        (name, k_type, c_val, g_val, deg_val) = conf
        coef_val = 0.0
    else
        (name, k_type, c_val, g_val, deg_val, coef_val) = conf
    end
    
    # Build the Dict that the cross-validation function expects
    # (keys :C, :kernel, :gamma, :degree, :coef0).

    # Map the LIBSVM kernel type to the string name passed in the Dict
    k_str = if k_type == Kernel.Linear; "linear"
            elseif k_type == Kernel.RadialBasis; "rbf"
            elseif k_type == Kernel.Polynomial; "poly"
            elseif k_type == Kernel.Sigmoid; "sigmoid"
            else; "rbf"; end

    params_dict = Dict(
        :C => c_val,
        :kernel => k_str,
        :gamma => g_val,
        :degree => deg_val,
        :coef0 => coef_val
    )
    
    param_str = string(params_dict)
    print("[$i/$(length(svm_configs))] Probando SVC con $name ... ")
    
    mu, sigma = universalCrossValidation_PCA(
        :SVC, params_dict, 
        dev_num, dev_cat, dev_targets, 
        cv_indices
    )
    
    push!(results_pca, (string(:SVC), param_str, mu*100, sigma*100))
    println("-> Acc: $(round(mu*100, digits=2))%")
end

println("\n--- Ranking Approach 5 ---")
sort!(results_pca, :Acc_Mean, rev=true)
display(results_pca)

# Extract the winner
best_row = results_pca[1, :]
println("\n GANADOR DEL APPROACH 5:")
println("Modelo: $(best_row.Model)")
println("Config: $(best_row.Params)")
println("Accuracy CV: $(round(best_row.Acc_Mean, digits=2))% ± $(round(best_row.Acc_Std, digits=2))")

>>> Loading data from: heart_disease_uci.csv
  Original Size: (920, 14)
 Categorical Null values replaced with ---> 'missingval'.
  Deleted rows in features: [:trestbps, :chol, :thalch, :oldpeak]
  Final shape: (827, 14)
------------------------
  Data split: 702 dev(85%), 125 test(15%)
Indices generated for 5 stratified folds.
[1/8] Probando SVC con RBF (C=1000000, G=0.03) ... -> Acc: 52.44%
[2/8] Probando SVC con RBF (C=5000, G=0.07) ... -> Acc: 53.29%
[3/8] Probando SVC con RBF (C=100000.0, G=0.5) ... -> Acc: 54.99%
[4/8] Probando SVC con Linear (C=1000) ... -> Acc: 56.41%
[5/8] Probando SVC con Poly (Deg=2, C=10000.0, Coef=1.0) ... -> Acc: 49.6%
[6/8] Probando SVC con Poly (Deg=5, C=10000.0, Coef=1.0) ... -> Acc: 54.99%
[7/8] Probando SVC con Sigmoid (C=100.0, G=0.01) ... -> Acc: 56.41%
[8/8] Probando SVC con Sigmoid (C=1000.0, G=0.01) ... -> Acc: 56.27%

--- Ranking Approach 5 ---


| Row | Model (Any) | Params (Any) | Acc_Mean (Any) | Acc_Std (Any) |
|---|---|---|---|---|
| 1 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.0, :degree => 0, :kernel => "linear", :C => 1000.0)` | 56.4106 | 2.29659 |
| 2 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.01, :degree => 0, :kernel => "linear", :C => 100.0)` | 56.4106 | 2.29659 |
| 3 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.01, :degree => 0, :kernel => "rbf", :C => 1000.0)` | 56.2687 | 3.56011 |
| 4 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 1.0, :degree => 5, :kernel => "poly", :C => 10000.0)` | 54.9911 | 3.65772 |
| 5 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.5, :degree => 0, :kernel => "rbf", :C => 100000.0)` | 54.9879 | 3.20739 |
| 6 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.07, :degree => 0, :kernel => "rbf", :C => 5000.0)` | 53.2858 | 2.63328 |
| 7 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.03, :degree => 0, :kernel => "rbf", :C => 1.0e6)` | 52.4377 | 4.51294 |
| 8 | SVC | `Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 1.0, :degree => 2, :kernel => "poly", :C => 10000.0)` | 49.5977 | 4.37313 |



 GANADOR DEL APPROACH 5:
Modelo: SVC
Config: Dict{Symbol, Any}(:coef0 => 0.0, :gamma => 0.0, :degree => 0, :kernel => "linear", :C => 1000.0)
Accuracy CV: 56.41% ± 2.3
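
The `Acc_Mean` and `Acc_Std` columns summarize the per-fold accuracies of each configuration. The following is a minimal sketch of that aggregation and ranking, not the notebook's own helper. The fold values are made-up numbers for illustration only, and `fold_acc` and `ranking` are hypothetical names. It also assumes the sample standard deviation (Julia's default `std`), which may differ from what the utility functions use.

```julia
using Statistics, Printf

# Hypothetical per-fold accuracies (%) for two configurations, 5 folds each.
# These numbers are illustrative and are NOT results from this notebook.
fold_acc = [
    "Linear (C=1000)"      => [55.0, 58.0, 54.0, 59.0, 56.0],
    "RBF (C=5000, G=0.07)" => [50.0, 55.0, 52.0, 56.0, 53.0],
]

# Aggregate each configuration into mean and standard deviation across folds.
ranking = [(name = name, acc_mean = mean(acc), acc_std = std(acc))
           for (name, acc) in fold_acc]

# Rank by mean accuracy, best first (same idea as sort!(..., rev=true) above).
sort!(ranking, by = r -> r.acc_mean, rev = true)

best = first(ranking)
@printf("Best: %s -> %.2f%% ± %.2f\n", best.name, best.acc_mean, best.acc_std)
```

Reporting the spread next to the mean matters here: several configurations in the ranking differ by less than one standard deviation, so their ordering should not be over-interpreted.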


# Binary

## 3. Problem Transformation: Binary Classification
The original dataset contains 5 classes:
* **0:** Healthy
* **1-4:** Different severity levels of heart disease.

Distinguishing between adjacent severity levels (for example, Severity 1 versus Severity 2) is difficult because the classes are noisy and overlap. From a clinical triage perspective, however, the crucial question is: **"Is the patient sick?"**

**Strategy:**
We transform the problem into a **Binary Classification** task:
* **Class 0 (Healthy) $\rightarrow$ 0**
* **Classes 1, 2, 3, 4 (Sick) $\rightarrow$ 1**

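This reduces the task to a single healthy-versus-sick decision boundary and is expected to improve both **Accuracy** and **Sensitivity** substantially over the 5-class setting, where the best cross-validated accuracy above was 56.41%.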
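
To make the two target metrics concrete, here is a small sketch of how sensitivity and accuracy follow from binary predictions once the labels are binarized. The function `binary_metrics` and the example vectors are hypothetical and only illustrate the definitions. The notebook's own metrics come from its utility files, which are not shown here.

```julia
# Confusion counts for a binary problem, treating label 1 (sick) as positive.
function binary_metrics(y_true, y_pred)
    length(y_true) == length(y_pred) ||
        throw(DimensionMismatch("label vectors differ in length"))
    zipped = zip(y_true, y_pred)
    tp = count(((t, p),) -> t == 1 && p == 1, zipped)  # sick, detected
    fn = count(((t, p),) -> t == 1 && p == 0, zipped)  # sick, missed
    tn = count(((t, p),) -> t == 0 && p == 0, zipped)  # healthy, cleared
    fp = count(((t, p),) -> t == 0 && p == 1, zipped)  # healthy, flagged
    # Sensitivity (recall): share of sick patients that were detected.
    sensitivity = tp + fn == 0 ? 0.0 : tp / (tp + fn)
    accuracy = (tp + tn) / length(y_true)
    return (; tp, fn, tn, fp, sensitivity, accuracy)
end

# Binarize severity labels 0-4: any value > 0 counts as sick.
severity = [0, 2, 1, 0, 4, 3, 0, 1]
y_true = map(y -> y > 0 ? 1 : 0, severity)
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]   # illustrative predictions, not model output

m = binary_metrics(y_true, y_pred)
println("Sensitivity: ", m.sensitivity, " | Accuracy: ", m.accuracy)
```

A false negative lowers sensitivity directly, which is why the experiments below report `Sens` alongside `Acc` for every configuration.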

In [13]:
# ==============================================================================
# 1. LOADING, CLEANING AND INITIAL BINARIZATION
# ==============================================================================
using LIBSVM, JLD2, FileIO, Statistics, Printf, Random, CategoricalArrays
using DataFrames, CSV, MLJ, MultivariateStats # plus the rest of the imports

# Include the helper functions (assumed to exist and to behave as before)
include("SVM_final_utils.jl")
# include("preprocess_utils.jl") # if needed

# --- Configuration ---
Random.seed!(1234)
data_path = "heart_disease_uci.csv"

# --- Load the original data ---
println(">> Cargando y limpiando datos originales...")
# This function returns the DataFrame with classes [0,1,2,3,4] in target_col
data, num_col, cat_col, target_col = load_and_clean_data(data_path)
println("Clases originales encontradas: ", sort(unique(data[!, target_col])))

# ==============================================================================
# --- NEW BLOCK: BINARY TRANSFORMATION ---
# ==============================================================================
println("\n>> BINARIZANDO EL DATASET (Sano=0 vs Enfermo=1)...")

# Binarization rule: any value > 0 becomes 1 (sick), otherwise 0 (healthy).
# The target column is modified in place in the main DataFrame.
data[!, target_col] = map(y -> y > 0 ? 1 : 0, data[!, target_col])

# Sanity check: list the new classes and count the samples in each
classes_bin = sort(unique(data[!, target_col]))
println("Nuevas clases binarias: ", classes_bin)
println("Conteo de muestras: ")
println("  Clase 0 (Sano):   ", count(==(0), data[!, target_col]))
println("  Clase 1 (Enfermo): ", count(==(1), data[!, target_col]))
# ==============================================================================

# SVM configurations as tuples: (label, kernel, C, gamma, degree, coef0)
svm_configs = [
    ("RBF (C=1, G=0.1)",    Kernel.RadialBasis, 1.0,    0.1, 0, 0.0),
    ("RBF (C=10, G=0.1)",   Kernel.RadialBasis, 10.0,   0.1, 0, 0.0),
    ("RBF (C=100, G=0.01)", Kernel.RadialBasis, 100.0,  0.01, 0, 0.0),
    ("RBF (C=1000, G=0.001)", Kernel.RadialBasis, 1000.0, 0.001, 0, 0.0),
    ("Linear (C=1)",        Kernel.Linear,      1.0,    0.0, 0, 0.0),
    ("Linear (C=10)",       Kernel.Linear,      10.0,   0.0, 0, 0.0),
    ("Poly (D=2, C=1)",     Kernel.Polynomial,  1.0,    0.1, 2, 1.0),
    ("Sigmoid (C=1, G=0.1)", Kernel.Sigmoid,    1.0,    0.1, 0, 0.0)
]

>> Cargando y limpiando datos originales...




>>> Loading data from: heart_disease_uci.csv
  Original Size: (920, 14)
 Categorical Null values replaced with ---> 'missingval'.
  Deleted rows in features: [:trestbps, :chol, :thalch, :oldpeak]
  Final shape: (827, 14)
------------------------
Clases originales encontradas: [0, 1, 2, 3, 4]

>> BINARIZANDO EL DATASET (Sano=0 vs Enfermo=1)...
Nuevas clases binarias: [0, 1]
Conteo de muestras: 
  Clase 0 (Sano):   371
  Clase 1 (Enfermo): 456


8-element Vector{Tuple{String, LIBSVM.Kernel.KERNEL, Float64, Float64, Int64, Float64}}:
 ("RBF (C=1, G=0.1)", LIBSVM.Kernel.RadialBasis, 1.0, 0.1, 0, 0.0)
 ("RBF (C=10, G=0.1)", LIBSVM.Kernel.RadialBasis, 10.0, 0.1, 0, 0.0)
 ("RBF (C=100, G=0.01)", LIBSVM.Kernel.RadialBasis, 100.0, 0.01, 0, 0.0)
 ("RBF (C=1000, G=0.001)", LIBSVM.Kernel.RadialBasis, 1000.0, 0.001, 0, 0.0)
 ("Linear (C=1)", LIBSVM.Kernel.Linear, 1.0, 0.0, 0, 0.0)
 ("Linear (C=10)", LIBSVM.Kernel.Linear, 10.0, 0.0, 0, 0.0)
 ("Poly (D=2, C=1)", LIBSVM.Kernel.Polynomial, 1.0, 0.1, 2, 1.0)
 ("Sigmoid (C=1, G=0.1)", LIBSVM.Kernel.Sigmoid, 1.0, 0.1, 0, 0.0)

### Approach 1: MinMax Binary

In [14]:
println("\n=================================================================")
println(" APPROACH 1: MINMAX (BINARY)")
println("=================================================================")

# prepare_data now receives the binarized data and splits it
# norm_method=:minmax is assumed to apply MinMax scaling and one-hot encoding (OHE)
approach_1 = prepare_data(data, num_col, cat_col, target_col, norm_method=:minmax)

# Train (run_svm_experiment is expected to handle the binary case automatically)
winner_a1 = run_svm_experiment(
    approach_1.x_train, approach_1.y_train_cat,
    approach_1.x_test, approach_1.y_test_cat,
    "A1-MinMax-Binary", svm_configs
)


 APPROACH 1: MINMAX (BINARY)

--- init Preprocess ---
   Normalization: minmax
    Stratigfied HoldOut split: 579 train, 124 val, 124 test
    Normalizing numerical features...
    ...Normalization completed.
    Encoding categorical features (OHE)...
    ...OHE completed.
    Concatenate numerical and categorical matrices...
    Classes stored for the target: [0, 1]
--- PREPROCESS END SUCCESFULLY ---

 EXPERIMENT: A1-MinMax-Binary (Manual Train/Test)
   [Auto-Fix] Transponiendo matriz para formato LIBSVM (Features x Samples)...
   Dimensiones finales X_train: (31, 579)
 Testing Config 1/8: RBF (C=1, G=0.1) ... Done. Sens: 0.7857 | Acc: 81.45%
 Testing Config 2/8: RBF (C=10, G=0.1) ... Done. Sens: 0.75 | Acc: 80.65%
 Testing Config 3/8: RBF (C=100, G=0.01) ... Done. Sens: 0.8036 | Acc: 81.45%
 Testing Config 4/8: RBF (C=1000, G=0.001) ... Done. Sens: 0.8214 | Acc: 83.06%
 Testing Config 5/8: Linear (C=1) ... Done. Sens: 0.8214 | Acc: 83.06%
 Testing Config 6/8: Linear (C=10) ... Done.

(config = ("RBF (C=1000, G=0.001)", LIBSVM.Kernel.RadialBasis, 1000.0, 0.001, 0, 0.0),
 sens = 0.8214285714285714,
 acc = 83.06451612903226,
 f1 = 0.8141592920353982,
 time = 0.012856006622314453,
 model = LIBSVM.SVM{Int64, LIBSVM.Kernel.KERNEL}(LIBSVM.SVC, LIBSVM.Kernel.RadialBasis, Dict(0 => 1.1177606177606179, 1 => 0.9046875), 31, 579, 2, [1, 0], Int32[1, 2], [1.1177606177606179, 0.9046875], Int32[2, 1], LIBSVM.SupportVectors{Vector{Int64}, Matrix{Float64}}(238, Int32[134, 104], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1  …  0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0.0625 0.22916666666666666 … 0.6458333333333334 0.6875; 0.125 0.125 … 0.3333333333333333 0.6666666666666666; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], Int32[1, 4, 5, 11, 12, 17, 18, 29, 31, 32  …  527, 531, 532, 534, 550, 559, 561, 567, 570, 577], LIBSVM.SVMNode[LIBSVM.SVMNode(1, 0.0625), LIBSVM.SVMNode(1, 0.22916666666666666), LIBSVM.SVMNode(1, 0.3541666666666667), LIBSVM.SVMNode(1, 0.22916666666666666), LIBSVM.SVMNode(1, 0.7916666666666666)
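
The `Dict(0 => 1.1177606177606179, 1 => 0.9046875)` inside the model dump above holds the class weights. Those values are consistent with the common "balanced" rule `n / (k * n_c)`, where `n` is the number of training samples, `k` the number of classes and `n_c` the size of class `c`. The sketch below reproduces them under the assumption that the 579 training samples split into 259 healthy and 320 sick patients. That split is inferred from the printed weights and does not appear in the output, and `balanced_class_weights` is a hypothetical helper rather than the function used in `SVM_final_utils.jl`.

```julia
# Balanced class weights: n / (k * n_c) for each class c.
# Rarer classes get a weight above 1, more frequent ones a weight below 1.
function balanced_class_weights(y::AbstractVector)
    classes = sort(unique(y))
    n, k = length(y), length(classes)
    return Dict(c => n / (k * count(==(c), y)) for c in classes)
end

# Assumed training label distribution (inferred, not printed by the notebook).
y_train = vcat(fill(0, 259), fill(1, 320))

w = balanced_class_weights(y_train)
println(w)
```

A dictionary of this shape can be handed to a solver that supports per-class penalties. In LIBSVM.jl this should correspond to the `weights` keyword of `svmtrain`, which scales `C` per class.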

### Approach 2: PCA Binary

In [15]:
println("\n=================================================================")
println(" APPROACH 2: PCA (BINARY)")
println("=================================================================")

# PCA is now computed on the binarized dataset (z-score normalization)
approach_2 = prepare_data(data, num_col, cat_col, target_col, norm_method=:zscore)

using MLJMultivariateStatsInterface
PCA = MLJ.@load PCA pkg=MultivariateStats

data_path = "heart_disease_uci.csv"

println("\n---data preprocessed---")

println("\n---Init PCA transformation---")


#Unpack variables for MLJ
x_train = approach_2.x_train
x_val = approach_2.x_val
x_test = approach_2.x_test

y_train_pca = approach_2.y_train_cat 
y_val_pca = approach_2.y_val_cat     
y_test_pca = approach_2.y_test_cat     

# Combine train + validation to fit the PCA (for non-ANN models; ANNs need separate handling)
x_train_val_combined = vcat(x_train, x_val)
y_train_val_combined = vcat(y_train_pca, y_val_pca)

println(" Train set size: ", size(x_train_val_combined))


# Use PCA to select the components that explain 95% of the variance
pca_model = PCA(variance_ratio=0.95)

# 1. Fit the PCA on the combined train + validation data only (never on the test set)
pca_machine = machine(pca_model, MLJ.table(x_train_val_combined))
MLJ.fit!(pca_machine, verbosity=0)

# 2. Transform the data
x_train_val_pca = MLJ.transform(pca_machine, MLJ.table(x_train_val_combined))

x_test_pca = MLJ.transform(pca_machine, MLJ.table(x_test))

# MLJ works best with tables; convert back to matrices for LIBSVM
mat_train_pca = MLJ.matrix(x_train_val_pca)
mat_test_pca = MLJ.matrix(x_test_pca)
println("Train set size: after PCA: ", size(mat_train_pca))

# y_train_val_combined (built above) now contains only 0s and 1s

println(">> Ejecutando SVM - Approach 2 (PCA)")

winner_a2 = run_svm_experiment(
    mat_train_pca, y_train_val_combined,
    mat_test_pca, approach_2.y_test_cat,
    "A2-PCA-Binary", svm_configs
)

# Final Summary
println("\n\n****************************************************************")
println("      WINNER FOR APPROACH 2: PCA")
println("****************************************************************")
println("Approach: PCA")
println("Best Config:",winner_a2.config)
println("Sensib:",winner_a2.sens)
println("Acc(%):",winner_a2.acc)
println("Time(s):",winner_a2.time)


 APPROACH 2: PCA (BINARY)

--- init Preprocess ---
   Normalization: zscore
    Stratigfied HoldOut split: 579 train, 124 val, 124 test
    Normalizing numerical features...
    ...Normalization completed.
    Encoding categorical features (OHE)...
    ...OHE completed.
    Concatenate numerical and categorical matrices...
    Classes stored for the target: [0, 1]
--- PREPROCESS END SUCCESFULLY ---
import MLJMultivariateStatsInterface ✔

---data preprocessed---

---Init PCA transformation---
 Train set size: (703, 31)
Train set size: after PCA: (703, 17)
>> Ejecutando SVM - Approach 2 (PCA)

 EXPERIMENT: A2-PCA-Binary (Manual Train/Test)
   [Auto-Fix] Transponiendo matriz para formato LIBSVM (Features x Samples)...
   Dimensiones finales X_train: (17, 703)
 Testing Config 1/8: RBF (C=1, G=0.1) ... Done. Sens: 0.8214 | Acc: 86.29%
 Testing Config 2/8: RBF (C=10, G=0.1) ... Done. Sens: 0.8036 | Acc: 80.65%
 Testing Config 3/8: RBF (C=100, G=0.01) ... Done. Sens: 0.8571 | Acc: 83.87%
[ Info: For silent loading, specify `verbosity=0`.
 Testing Config 4/8: RBF (C=1000, G=0.001) ... Done. Sens: 0.8393 | Acc: 86.29%
 Testing Config 5/8: Linear (C=1) ... Done. Sens: 0.8214 | Acc: 85.48%
 Testing Config 6/8: Linear (C=10) ... Done. Sens: 0.7857 | Acc: 84.68%
 Testing Config 7/8: Poly (D=2, C=1) ... Done. Sens: 0.8393 | Acc: 83.87%
 Testing Config 8/8: Sigmoid (C=1, G=0.1) ... Done. Sens: 0.7857 | Acc: 85.48%


****************************************************************
      WINNER FOR APPROACH 2: PCA
****************************************************************
Approach: MinMax
Best Config:("RBF (C=100, G=0.01)", LIBSVM.Kernel.RadialBasis, 100.0, 0.01, 0, 0.0)
Sensib:0.8571428571428571
Acc(%):83.87096774193549
Time(s):0.03200793266296387
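
The output above shows PCA shrinking the feature space from 31 to 17 columns for `variance_ratio=0.95`. As a rough illustration of how such a threshold selects the number of components, the sketch below keeps the smallest number of components whose cumulative explained variance reaches the requested ratio. It uses plain `LinearAlgebra` on synthetic data. `components_for_variance` is a hypothetical helper that mirrors the idea, not necessarily the exact procedure inside MultivariateStats.

```julia
using LinearAlgebra, Statistics, Random

# Smallest number of principal components whose cumulative explained
# variance reaches `ratio` (rows = samples, columns = features).
function components_for_variance(X::AbstractMatrix{<:Real}, ratio::Real)
    Xc = X .- mean(X, dims=1)               # center each feature
    s = svdvals(Xc)                         # singular values, descending
    explained = (s .^ 2) ./ sum(s .^ 2)     # variance share per component
    k = findfirst(>=(ratio), cumsum(explained))
    # Fall back to all components if rounding keeps the sum just below `ratio`.
    return something(k, length(s)), explained
end

# Synthetic data: two high-variance directions and three near-constant ones.
rng = MersenneTwister(1234)
X = randn(rng, 200, 5) .* [10.0 5.0 1.0 0.1 0.1]

k, explained = components_for_variance(X, 0.95)
println("components kept: ", k)
println("variance shares: ", round.(explained, digits=4))
```

In the notebook the same threshold is fitted on the combined train and validation rows and then applied unchanged to the test set, which keeps test information out of the projection.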


### Approach 3: ICA Binary

In [16]:
println("\n=================================================================")
println(" APPROACH 3: ICA (BINARY)")
println("=================================================================")
# -----------------------------------------------------------------
# (Approach 3) ICA and z-score (ICA is applied to the numerical features only; with the categorical ones included it would not find a solution)
# -----------------------------------------------------------------
using MLJMultivariateStatsInterface
ICA = MLJ.@load ICA pkg=MultivariateStats

data_path = "heart_disease_uci.csv"


println("\nInit approach 3, ICA (Numerical features only)...")

# 1. Preprocess as in the previous approach (z-score, well suited to ICA)
approach_ica = prepare_data(data, num_col, cat_col, target_col, norm_method=:zscore)

# 2. Unpack results
x_train = approach_ica.x_train
x_val = approach_ica.x_val
x_test = approach_ica.x_test

y_train_ica = approach_ica.y_train_cat 
y_val_ica = approach_ica.y_val_cat     
y_test_ica = approach_ica.y_test_cat;

# Our function `prepare_data` places the numerical features first and the categorical ones after them.

n_num = length(num_col) # expected to be 5: age, trestbps, chol, thalch, oldpeak

#Split 
x_num_train = x_train[:, 1:n_num]      # Just numerical 
x_cat_train = x_train[:, n_num+1:end]  # Just categorical OHE

x_num_val = x_val[:, 1:n_num]
x_cat_val = x_val[:, n_num+1:end]
x_num_test = x_test[:, 1:n_num]
x_cat_test = x_test[:, n_num+1:end]


# --- ICA just for numerical ---

# k must be less than or equal to the number of numerical features (5)
k_components = 2

# Random.seed!(1234)  # ICA is non-deterministic; fixing the seed should make it reproducible, but this attempt failed
# Use a loose tolerance so that the solver converges
ica_model = ICA(outdim=k_components, maxiter=100000, tol=0.2) 

println(" ICA with k=$k_components ...")

# Fit ICA on the numerical features of the training set only (the setup needed for ANNs)
ica_machine = machine(ica_model, MLJ.table(x_num_train))
MLJ.fit!(ica_machine, verbosity=1) # verbosity=1 for debugging

#=
For models other than ANNs, fit on training and validation together:
x_train_val_num = vcat(x_num_train, x_num_val)
ica_machine = machine(ica_model, MLJ.table(x_train_val_num))
MLJ.fit!(ica_machine, verbosity=1) # verbosity=1 for debugging
=#

# Transform and return to matrix
x_num_train_ica = MLJ.transform(ica_machine, MLJ.table(x_num_train))
x_num_val_ica  = MLJ.transform(ica_machine, MLJ.table(x_num_val))
x_num_test_ica  = MLJ.transform(ica_machine, MLJ.table(x_num_test))


mat_train_ica = MLJ.matrix(x_num_train_ica)
mat_val_ica  = MLJ.matrix(x_num_val_ica)
mat_test_ica  = MLJ.matrix(x_num_test_ica)

#Add the categorical OHE 
x_train_ica = hcat(mat_train_ica, x_cat_train)
x_val_ica = hcat(mat_val_ica, x_cat_val)
x_test_ica     = hcat(mat_test_ica, x_cat_test)

#---------------------------------------------------------------------------------------------------------------------------------------


# ==============================================================================
# FINAL PREPARATION FOR SVM (APPROACH 3: ICA)
# ==============================================================================

println("\n>> Preparando datos ICA para LIBSVM...")

# 1. Combine train + validation
# SVMs benefit from more data, so the already-processed matrices are merged.
x_train_val_ica_combined = vcat(x_train_ica, x_val_ica)
y_train_val_ica_combined = vcat(y_train_ica, y_val_ica)

# 2. Transpose the matrices (features x samples)
# LIBSVM requires this layout: convert to Float64 and transpose.
x_train_svm_ica = Float64.(permutedims(x_train_val_ica_combined))
x_test_svm_ica  = Float64.(permutedims(x_test_ica))

# 3. Clean the labels
# Make sure they are plain integers (no CategoricalValue wrappers)
get_val(x) = x isa CategoricalValue ? unwrap(x) : x

y_train_final_ica = vec(Int.(get_val.(y_train_val_ica_combined)))
y_test_final_ica  = vec(Int.(get_val.(y_test_ica)))


# ==============================================================================
# CONFIGURATION AND EXECUTION
# ==============================================================================

println("\n>> Ejecutando SVM - Approach 3 (ICA)")
winner_a3 = run_svm_experiment(
    x_train_svm_ica, y_train_final_ica, # outputs of the ICA pipeline above
    x_test_svm_ica, y_test_final_ica,
    "A3-ICA-Binary", svm_configs # possibly add more Sigmoid/Poly configs here
)

# Final Summary
println("\n\n****************************************************************")
println("      WINNER FOR APPROACH 3: ICA")
println("****************************************************************")
println("Approach: ICA")
println("Best Config:",winner_a3.config)
println("Sensib:",winner_a3.sens)
println("Acc(%):",winner_a3.acc)
println("Time(s):",winner_a3.time)


 APPROACH 3: ICA (BINARY)
import MLJMultivariateStatsInterface ✔

Init approach 3, ICA (Numerical features only)...

--- init Preprocess ---
   Normalization: zscore
    Stratigfied HoldOut split: 579 train, 124 val, 124 test
    Normalizing numerical features...
    ...Normalization completed.
    Encoding categorical features (OHE)...
    ...OHE completed.
    Concatenate numerical and categorical matrices...
    Classes stored for the target: [0, 1]
--- PREPROCESS END SUCCESFULLY ---
 ICA with k=2 ...

>> Preparando datos ICA para LIBSVM...

>> Ejecutando SVM - Approach 3 (ICA)


[ Info: For silent loading, specify `verbosity=0`.
[ Info: Training machine(ICA(outdim = 2, …), …).



 EXPERIMENT: A3-ICA-Binary (Manual Train/Test)
   Dimensiones finales X_train: (28, 703)
 Testing Config 1/8: RBF (C=1, G=0.1) ... Done. Sens: 0.8036 | Acc: 85.48%
 Testing Config 2/8: RBF (C=10, G=0.1) ... Done. Sens: 0.8036 | Acc: 80.65%
 Testing Config 3/8: RBF (C=100, G=0.01) ... Done. Sens: 0.8214 | Acc: 83.87%
 Testing Config 4/8: RBF (C=1000, G=0.001) ... Done. Sens: 0.8036 | Acc: 85.48%
 Testing Config 5/8: Linear (C=1) ... Done. Sens: 0.8393 | Acc: 85.48%
 Testing Config 6/8: Linear (C=10) ... Done. Sens: 0.8393 | Acc: 84.68%
 Testing Config 7/8: Poly (D=2, C=1) ... Done. Sens: 0.8214 | Acc: 83.87%
 Testing Config 8/8: Sigmoid (C=1, G=0.1) ... Done. Sens: 0.7679 | Acc: 78.23%


****************************************************************
      WINNER FOR APPROACH 3: ICA
****************************************************************
Approach: MinMax
Best Config:("Linear (C=1)", LIBSVM.Kernel.Linear, 1.0, 0.0, 0, 0.0)
Sensib:0.8392857142857143
Acc(%):85.48387096774194
Tim