
# 🧠 Multiclass Classification Project: Forest Cover Type

This notebook covers the project's complete workflow: data loading, exploratory analysis, preprocessing, modeling, optimization, evaluation, and an interactive Gradio demo.



## 0. Installing and importing libraries


In [None]:

# Install dependencies (shell magic; works in Colab/Jupyter cells)
!pip install --quiet scikit-learn pandas numpy seaborn matplotlib joblib gradio xgboost
# Standard library, data handling, and plotting
import os, time, warnings
import numpy as np, pandas as pd, matplotlib.pyplot as plt, seaborn as sns
sns.set(style="whitegrid")
# scikit-learn: splitting, cross-validation, preprocessing, model, and metrics
from sklearn.model_selection import train_test_split, StratifiedKFold, cross_val_score, RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix, ConfusionMatrixDisplay
# Model persistence, gradient boosting, and the demo UI
import joblib
from xgboost import XGBClassifier
import gradio as gr
# Silence all warnings to keep notebook output compact
warnings.filterwarnings('ignore')
# Output folder for saved artifacts
os.makedirs("results", exist_ok=True)
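
The imports above include `XGBClassifier`, and the UCI target `Cover_Type` is coded 1 to 7. Recent XGBoost releases expect class labels coded 0 to K-1, so the target may need re-encoding before fitting. The snippet below is a minimal sketch of one way to do that with scikit-learn's `LabelEncoder`; the toy label array is illustrative, not real data, and whether the original notebook handles this the same way is an assumption.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Toy labels in the dataset's 1..7 coding (illustrative values, not real data).
y_raw = np.array([1, 2, 7, 5, 2, 3, 4, 6, 1])

encoder = LabelEncoder()
y_enc = encoder.fit_transform(y_raw)       # consecutive codes starting at 0
y_back = encoder.inverse_transform(y_enc)  # original labels, e.g. for reports

print("encoded:", y_enc)
print("classes:", encoder.classes_)
```

Keeping the fitted encoder around (for example, saving it with `joblib` next to the model) makes it possible to map predictions back to the original cover-type codes.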



## 1. Loading the dataset


In [None]:

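from google.colab import files
# Upload the raw archive through the Colab file picker
print("Please upload the file 'covtype.data.gz'")
uploaded = files.upload()
fname = list(uploaded.keys())[0]  # name of the first uploaded file
# The raw file has no header row; pandas infers gzip compression from the .gz extension
df = pd.read_csv(fname, header=None)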
cols = [
    'Elevation', 'Aspect', 'Slope',
    'Horizontal_Distance_To_Hydrology', 'Vertical_Distance_To_Hydrology',
    'Horizontal_Distance_To_Roadways', 'Hillshade_9am', 'Hillshade_Noon',
    'Hillshade_3pm', 'Horizontal_Distance_To_Fire_Points'
] + [f'Wilderness_Area_{i}' for i in range(4)] + [f'Soil_Type_{i}' for i in range(40)] + ['Cover_Type']
df.columns = cols
df.head()
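
After loading, it can help to confirm that the frame has the expected 55 columns (10 numeric, 4 wilderness flags, 40 soil flags, 1 target) and that the indicator groups behave like one-hot encodings. The following is a minimal sketch, not part of the original notebook: `check_covtype_frame` and `make_toy_frame` are hypothetical helpers, and the one-hot check assumes every row belongs to exactly one wilderness area and one soil type.

```python
import numpy as np
import pandas as pd

# Column layout taken from the loading cell above.
NUMERIC = [
    "Elevation", "Aspect", "Slope",
    "Horizontal_Distance_To_Hydrology", "Vertical_Distance_To_Hydrology",
    "Horizontal_Distance_To_Roadways", "Hillshade_9am", "Hillshade_Noon",
    "Hillshade_3pm", "Horizontal_Distance_To_Fire_Points",
]
WILDERNESS = [f"Wilderness_Area_{i}" for i in range(4)]
SOIL = [f"Soil_Type_{i}" for i in range(40)]
TARGET = "Cover_Type"
EXPECTED_COLS = NUMERIC + WILDERNESS + SOIL + [TARGET]


def check_covtype_frame(df: pd.DataFrame) -> dict:
    """Return a small report of structural checks (hypothetical helper)."""
    if list(df.columns) != EXPECTED_COLS:
        # Stop early: the remaining checks rely on the expected column names.
        return {"columns_ok": False, "n_columns": df.shape[1]}
    return {
        "columns_ok": True,
        "n_rows": len(df),
        "n_missing": int(df.isna().sum().sum()),
        # Assumption: each row has exactly one active flag per group.
        "wilderness_one_hot": bool((df[WILDERNESS].sum(axis=1) == 1).all()),
        "soil_one_hot": bool((df[SOIL].sum(axis=1) == 1).all()),
        "class_counts": df[TARGET].value_counts().sort_index().to_dict(),
    }


def make_toy_frame(n: int = 14, seed: int = 0) -> pd.DataFrame:
    """Build a small synthetic frame with the same layout (illustrative only)."""
    rng = np.random.default_rng(seed)
    numeric = rng.integers(0, 255, size=(n, len(NUMERIC)))
    wild = np.eye(len(WILDERNESS), dtype=int)[rng.integers(0, len(WILDERNESS), n)]
    soil = np.eye(len(SOIL), dtype=int)[rng.integers(0, len(SOIL), n)]
    target = (np.arange(n) % 7 + 1).reshape(-1, 1)  # labels cycle through 1..7
    return pd.DataFrame(np.hstack([numeric, wild, soil, target]), columns=EXPECTED_COLS)


toy = make_toy_frame()
report = check_covtype_frame(toy)
print(report)
```

On the real data, calling `check_covtype_frame(df)` would also expose the class imbalance through `class_counts`, which is one reason to prefer stratified splits later on.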



---
## 🔧 Rest of the Notebook

To keep this export concise, the remaining cells (EDA, preprocessing, modeling, optimization, evaluation, and the Gradio app) are omitted here. They are unchanged from the full version shared previously.

Paste those code blocks below to get the full runnable notebook.

---
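
Since the preprocessing and modeling cells are not part of this export, the snippet below is only a hedged sketch of how the classes imported in section 0 might fit together: a stratified split, a `ColumnTransformer` that imputes and scales numeric columns, and a `RandomForestClassifier` inside a `Pipeline`. The synthetic frame, the choice of columns, and all hyperparameters are assumptions for illustration, not the original notebook's settings.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real frame (illustrative names and values only).
rng = np.random.default_rng(42)
n = 300
toy = pd.DataFrame({
    "Elevation": rng.normal(2800, 300, n),
    "Slope": rng.normal(15, 5, n),
    "Wilderness_Area_0": rng.integers(0, 2, n),
})
# Toy 3-class target derived from elevation terciles.
toy["Cover_Type"] = pd.qcut(toy["Elevation"], q=3, labels=[1, 2, 3]).astype(int)

numeric_cols = ["Elevation", "Slope"]
X = toy.drop(columns="Cover_Type")
y = toy["Cover_Type"]

# Stratify so each class keeps its proportion in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Impute and scale numeric columns; binary flags pass through unchanged.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
preprocess = ColumnTransformer(
    [("num", numeric_steps, numeric_cols)],
    remainder="passthrough",
)

model = Pipeline([
    ("prep", preprocess),
    ("clf", RandomForestClassifier(n_estimators=50, random_state=42)),
])
model.fit(X_train, y_train)

acc = accuracy_score(y_test, model.predict(X_test))
print(f"toy hold-out accuracy: {acc:.3f}")
```

Tree ensembles do not need scaled inputs, so the scaler mainly matters if the same preprocessing is reused with a scale-sensitive model. Wrapping everything in one `Pipeline` also means a single `joblib.dump(model, ...)` call would persist preprocessing and classifier together.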
