# Part I

Implement the KNN classifier and validate it on three datasets (Iris, Wine, and Digits) using the following validation methods:

Stratified 70/30 Hold-Out
Stratified 10-Fold Cross-Validation
Leave-One-Out
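To make the difference between the three schemes concrete, the following minimal sketch counts how many test samples each one produces on Iris. It assumes the same settings used later in this notebook (`test_size=0.3`, `n_splits=10`, `random_state=42`); the variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import LeaveOneOut, StratifiedKFold, train_test_split

X, y = load_iris(return_X_y=True)  # 150 samples, 50 per class

# Hold-out: a single stratified 70/30 split
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
holdout_test_counts = np.bincount(y_te)  # test samples per class

# Stratified 10-fold: ten train/test splits, each sample tested exactly once
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
fold_sizes = [len(test_idx) for _, test_idx in skf.split(X, y)]

# Leave-One-Out: one split per sample, each with a single test sample
n_loo_splits = LeaveOneOut().get_n_splits(X)

print("Hold-out test samples per class:", holdout_test_counts.tolist())
print("10-fold test sizes:", fold_sizes)
print("Leave-One-Out splits:", n_loo_splits)
```

Hold-out fits the model once, 10-fold fits it ten times, and Leave-One-Out fits it once per sample, which is why the last method is the most expensive on a larger dataset such as Digits.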

Steps I will carry out in this exercise:

Implement the KNN classifier using Scikit-learn (KNeighborsClassifier).

Evaluate it with different values of k (for example: 1, 3, 5, 7, 9).

Apply the three validation methods (Hold-Out, 10-Fold CV, and LOO).

Analyze the results to determine the best value of k.
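The steps above can be sketched compactly with scikit-learn's `cross_val_score`. This is only an illustration under stated assumptions: it covers Iris and the 10-fold scheme alone, and the names `mean_acc` and `best_k` are hypothetical, not part of the notebook's own code.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Same fold configuration for every k, so the comparison is fair
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

mean_acc = {}
for k in [1, 3, 5, 7, 9]:
    knn = KNeighborsClassifier(n_neighbors=k)
    # cross_val_score clones and refits the estimator on each fold
    scores = cross_val_score(knn, X, y, cv=cv, scoring="accuracy")
    mean_acc[k] = scores.mean()

# Pick the k with the highest mean accuracy (the first one wins on ties)
best_k = max(mean_acc, key=mean_acc.get)

for k, acc in mean_acc.items():
    print(f"k={k}: mean 10-fold accuracy = {acc:.4f}")
print("Best k by 10-fold CV:", best_k)
```

Passing `LeaveOneOut()` as `cv` would give the Leave-One-Out estimate with the same call.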


In [1]:
# Import the required libraries
from sklearn.datasets import load_iris, load_wine, load_digits
from sklearn.model_selection import train_test_split, StratifiedKFold, LeaveOneOut
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.neighbors import KNeighborsClassifier
import numpy as np

# Evaluate KNN for several values of k using all three validation methods
def evaluate_knn(X, y, k_values, dataset_name):
    print(f"\nResultados para el Dataset: {dataset_name}")
    results = {}
    for k in k_values:
        print(f"\nEvaluando KNN con k={k}")
        knn = KNeighborsClassifier(n_neighbors=k)

        # Stratified 70/30 hold-out split
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)
        knn.fit(X_train, y_train)
        y_pred = knn.predict(X_test)
        acc_holdout = accuracy_score(y_test, y_pred)
        cm_holdout = confusion_matrix(y_test, y_pred)
        print(f"Hold-Out Accuracy: {acc_holdout:.4f}")
        print(f"Hold-Out Confusion Matrix:\n{cm_holdout}")

        # Stratified 10-fold cross-validation
        skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
        cv_scores = []
        for train_idx, test_idx in skf.split(X, y):
            X_train, X_test = X[train_idx], X[test_idx]
            y_train, y_test = y[train_idx], y[test_idx]
            knn.fit(X_train, y_train)
            y_pred = knn.predict(X_test)
            cv_scores.append(accuracy_score(y_test, y_pred))
        acc_cv = np.mean(cv_scores)
        print(f"10-Fold CV Accuracy: {acc_cv:.4f}")

        # Leave-One-Out
        loo = LeaveOneOut()
        loo_scores = []
        for train_idx, test_idx in loo.split(X):
            X_train, X_test = X[train_idx], X[test_idx]
            y_train, y_test = y[train_idx], y[test_idx]
            knn.fit(X_train, y_train)
            y_pred = knn.predict(X_test)
            loo_scores.append(accuracy_score(y_test, y_pred))
        acc_loo = np.mean(loo_scores)
        print(f"Leave-One-Out Accuracy: {acc_loo:.4f}")

        # Store the results for this k
        results[k] = {
            "Hold-Out Accuracy": acc_holdout,
            "10-Fold CV Accuracy": acc_cv,
            "Leave-One-Out Accuracy": acc_loo,
            "Confusion Matrix (Hold-Out)": cm_holdout
        }
    return results

# Load the datasets and evaluate KNN
datasets = {
    "Iris": load_iris(),
    "Wine": load_wine(),
    "Digits": load_digits()
}

k_values = [1, 3, 5, 7, 9]  # Values of k to evaluate

# Collect the results for the three datasets
all_results = {}
for name, data in datasets.items():
    X, y = data.data, data.target
    results = evaluate_knn(X, y, k_values, name)
    all_results[name] = results

# Print a summary of the accuracies for every dataset and every k
print("\nResumen de Resultados:")
for dataset_name, dataset_results in all_results.items():
    print(f"\nDataset: {dataset_name}")
    for k, metrics in dataset_results.items():
        print(f"k={k} -> Hold-Out: {metrics['Hold-Out Accuracy']:.4f}, "
              f"10-Fold CV: {metrics['10-Fold CV Accuracy']:.4f}, "
              f"Leave-One-Out: {metrics['Leave-One-Out Accuracy']:.4f}")



Resultados para el Dataset: Iris

Evaluando KNN con k=1
Hold-Out Accuracy: 0.9333
Hold-Out Confusion Matrix:
[[15  0  0]
 [ 0 15  0]
 [ 0  3 12]]
10-Fold CV Accuracy: 0.9600
Leave-One-Out Accuracy: 0.9600

Evaluando KNN con k=3
Hold-Out Accuracy: 0.9556
Hold-Out Confusion Matrix:
[[15  0  0]
 [ 0 15  0]
 [ 0  2 13]]
10-Fold CV Accuracy: 0.9600
Leave-One-Out Accuracy: 0.9600

Evaluando KNN con k=5
Hold-Out Accuracy: 0.9778
Hold-Out Confusion Matrix:
[[15  0  0]
 [ 0 15  0]
 [ 0  1 14]]
10-Fold CV Accuracy: 0.9533
Leave-One-Out Accuracy: 0.9667

Evaluando KNN con k=7
Hold-Out Accuracy: 0.9556
Hold-Out Confusion Matrix:
[[15  0  0]
 [ 0 15  0]
 [ 0  2 13]]
10-Fold CV Accuracy: 0.9733
Leave-One-Out Accuracy: 0.9667

Evaluando KNN con k=9
Hold-Out Accuracy: 0.9556
Hold-Out Confusion Matrix:
[[15  0  0]
 [ 0 15  0]
 [ 0  2 13]]
10-Fold CV Accuracy: 0.9600
Leave-One-Out Accuracy: 0.9667
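
The Iris figures above can be compared per validation method. The sketch below copies the accuracies by hand from that output and reports which k scores highest under each method; the helper `best_k_by_method` is hypothetical and exists only for this illustration.

```python
# Accuracies copied from the Iris output: k -> (hold-out, 10-fold CV, LOO)
iris_results = {
    1: (0.9333, 0.9600, 0.9600),
    3: (0.9556, 0.9600, 0.9600),
    5: (0.9778, 0.9533, 0.9667),
    7: (0.9556, 0.9733, 0.9667),
    9: (0.9556, 0.9600, 0.9667),
}
methods = ("Hold-Out", "10-Fold CV", "Leave-One-Out")


def best_k_by_method(results, methods):
    """Return, for each method, every k that reaches the top accuracy."""
    best = {}
    for i, name in enumerate(methods):
        top = max(acc[i] for acc in results.values())
        best[name] = [k for k, acc in results.items() if acc[i] == top]
    return best


best = best_k_by_method(iris_results, methods)
for name, ks in best.items():
    print(f"{name}: best k = {ks}")
```

On these numbers, Hold-Out favors k=5, 10-Fold CV favors k=7, and Leave-One-Out ties k=5, 7, and 9, so the three methods do not agree on a single best k for Iris.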

Resultados para el Dataset: Wine

Evaluando KNN con k=1
Hold-Out Accuracy: 0.7037
Hold-Out Confusion Matr