# ✅ Model Validation in Machine Learning

This notebook covers:
- Data splitting techniques
- Metrics for regression and classification
- Cross-validation with K-Fold and StratifiedKFold
- The importance of both technical and business metrics


In [None]:
from sklearn.datasets import load_iris, make_regression
from sklearn.model_selection import train_test_split, KFold, StratifiedKFold, cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score, accuracy_score
from sklearn.metrics import mean_squared_error, r2_score
import numpy as np

# Classification dataset
iris = load_iris()
X_class, y_class = iris.data, iris.target

# Regression dataset
X_reg, y_reg = make_regression(n_samples=200, n_features=5, noise=10, random_state=42)


In [None]:
# Holdout
X_train, X_test, y_train, y_test = train_test_split(X_class, y_class, test_size=0.3, random_state=42)

# Fix the seed so the reported scores are reproducible across runs
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Holdout - Accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
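
The imports above include `roc_auc_score`, but the holdout cell reports only accuracy and the confusion matrix. A minimal sketch of how ROC AUC could be added for the three-class Iris problem follows. The stratified split, the one-vs-rest averaging, and the names `clf`, `proba`, and `auc_ovr` are assumptions made for illustration, not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions the same in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

clf = RandomForestClassifier(random_state=42)
clf.fit(X_tr, y_tr)

# ROC AUC is computed from class probabilities, not from hard labels
proba = clf.predict_proba(X_te)

# Multi-class targets need an explicit strategy; "ovr" scores each class
# against the rest and macro-averages the results
auc_ovr = roc_auc_score(y_te, proba, multi_class="ovr")
print("Holdout - ROC AUC (one-vs-rest):", auc_ovr)
```

Unlike accuracy, this score reflects how well the predicted probabilities rank the classes, so it can separate two models that make the same hard predictions.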


In [None]:
# Stratified K-Fold
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
model = RandomForestClassifier(random_state=42)

scores = cross_val_score(model, X_class, y_class, cv=skf, scoring='accuracy')
print("Stratified K-Fold mean accuracy:", scores.mean())
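
`KFold` is imported above but never used. Stratification is defined only for class labels, so plain K-Fold is the natural counterpart for the regression dataset. The sketch below assumes the same `make_regression` settings as the first cell; the names `kf` and `r2_scores` are illustrative.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=42)

# shuffle=True randomizes fold membership; random_state makes it repeatable
kf = KFold(n_splits=5, shuffle=True, random_state=42)

# One R² value per fold, each computed on data the model did not see in training
r2_scores = cross_val_score(LinearRegression(), X, y, cv=kf, scoring="r2")
print("K-Fold R² per fold:", r2_scores)
print("K-Fold R² mean:", r2_scores.mean(), "std:", r2_scores.std())
```

Reporting the standard deviation alongside the mean gives a rough sense of how much the estimate depends on the particular split.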


In [None]:
# Holdout for the regression dataset
X_train_r, X_test_r, y_train_r, y_test_r = train_test_split(X_reg, y_reg, test_size=0.3, random_state=42)

reg = LinearRegression()
reg.fit(X_train_r, y_train_r)
y_pred_r = reg.predict(X_test_r)

print("MSE:", mean_squared_error(y_test_r, y_pred_r))
print("R²:", r2_score(y_test_r, y_pred_r))
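
MSE is expressed in squared target units, which makes it hard to read directly. As a sketch, RMSE and MAE can be reported next to it, since both are in the same units as the target. The variable names below are illustrative, and the data is regenerated so the block runs on its own.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=42)

y_hat = LinearRegression().fit(X_tr, y_tr).predict(X_te)

mse = mean_squared_error(y_te, y_hat)
rmse = np.sqrt(mse)                      # same units as the target
mae = mean_absolute_error(y_te, y_hat)   # less sensitive to large errors than RMSE

print("MSE: ", mse)
print("RMSE:", rmse)
print("MAE: ", mae)
```

MAE never exceeds RMSE, and a large gap between the two suggests that a few large errors dominate the squared loss.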


## 🧭 Technical Metrics vs. Business Metrics

| Technical              | Business                          |
|------------------------|-----------------------------------|
| Accuracy, MSE, ROC AUC | Churn reduction, sales growth     |
| Inference time         | Response-time SLA                 |
| Model interpretability | Clinical or regulatory decisions  |
| Robustness             | Safety in production              |
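
One way to connect the two columns is to weight each kind of error in the confusion matrix by what it costs the business. The sketch below is illustrative only: the cost values are hypothetical, and the breast cancer dataset is an assumption chosen because it is a binary problem that ships with scikit-learn. Neither comes from the original notebook.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Hypothetical per-error costs in arbitrary currency units (assumed values)
COST_FALSE_POSITIVE = 50
COST_FALSE_NEGATIVE = 500

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

clf = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)

# For binary labels {0, 1}, ravel() yields tn, fp, fn, tp in that order
tn, fp, fn, tp = confusion_matrix(y_te, y_pred).ravel()

accuracy = (tp + tn) / len(y_te)
total_cost = fp * COST_FALSE_POSITIVE + fn * COST_FALSE_NEGATIVE

print("Accuracy (technical metric):", accuracy)
print("Estimated error cost (business metric):", total_cost)
```

Two models with the same accuracy can have very different costs when false negatives and false positives are priced differently.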


## 🛡️ Evaluation Beyond the Metric

- **Interpretability**: SHAP, LIME
- **Fairness**: bias analysis
- **Computational efficiency**
- **Security**: models robust to adversarial data
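
Two of these checks can be sketched with scikit-learn and the standard library alone. SHAP and LIME are separate third-party packages, so the example below uses permutation importance as a simpler stand-in for interpretability, and a wall-clock timer as a rough proxy for computational efficiency. The choice of technique and all variable names are assumptions made for illustration.

```python
import time

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

iris = load_iris()
X_tr, X_te, y_tr, y_te = train_test_split(
    iris.data, iris.target, test_size=0.3, stratify=iris.target, random_state=42
)
clf = RandomForestClassifier(random_state=42).fit(X_tr, y_tr)

# Interpretability: shuffle one feature at a time on held-out data and
# record how much the score drops
result = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=42)
ranked = sorted(
    zip(iris.feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")

# Computational efficiency: time one batch prediction
start = time.perf_counter()
clf.predict(X_te)
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"Batch inference time: {elapsed_ms:.2f} ms for {len(X_te)} samples")
```

A single timing is noisy; averaging over repeated calls gives a steadier figure to compare against a latency target.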
