# Activity: Training Neural Networks with Keras (Tabular Data)

This notebook contains the two requested exercises:

- Exercise 1: Multiclass Classification (Wine dataset)
- Exercise 2: Regression (California Housing dataset)

This material was adapted and organized from the notebook `FINAL_Exemplo_Redes_Neurais_Com_Keras.ipynb` (an introductory example of neural network training).

Useful references (from the base material):

- https://www.deeplearningbook.com.br/algoritmo-backpropagation-parte-2-treinamento-de-redes-neurais/
- https://keras.io/api/losses/
- https://www.deeplearningbook.com.br/aprendizado-com-a-descida-do-gradiente/
- https://towardsdatascience.com/gradient-descent-algorithm-a-deep-dive-cf04e8115f21
- https://www.deeplearningbook.com.br/funcao-de-ativacao/
- https://keras.io/api/optimizers/

Run the cells in order. Each section contains executable code and explanations. Using a virtual environment with the dependencies listed in `requirements.txt` is recommended.
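
As a quick sanity check before running the cells, the sketch below reports which interpreter the kernel uses and whether it appears to be inside a virtual environment. The helper name `in_virtual_env` is hypothetical, and the check only covers `venv`/`virtualenv` style environments; a conda environment may report `False` here.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix points at the base installation; otherwise they match.
    return sys.prefix != sys.base_prefix


print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtual_env())
```

If the interpreter path is not the one expected, packages installed from the terminal may be going to a different Python than the notebook kernel.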

In [8]:
# Section: Install (optional) and import libraries
# Inspired by FINAL_Exemplo_Redes_Neurais_Com_Keras.ipynb

# If you need to install packages from within the notebook, uncomment the pip lines below.
# Note: in many setups it's better to install packages from the terminal.

# !pip install -r atividade/requirements.txt

import platform
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import sklearn
import tensorflow as tf

# Additional imports shown in the example notebook
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

print(f"Python: {platform.python_version()}")
print(f"NumPy: {np.__version__}")
print(f"pandas: {pd.__version__}")
print(f"scikit-learn: {sklearn.__version__}")
print(f"TensorFlow: {tf.__version__}")

# Set plotting defaults
%matplotlib inline
sns.set_theme(style='whitegrid')

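# Reproducibility seed: used for scikit-learn splits/models and to seed the
# Python, NumPy and TensorFlow RNGs (set_random_seed needs TensorFlow >= 2.7)
RANDOM_STATE = 42
tf.keras.utils.set_random_seed(RANDOM_STATE)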


ModuleNotFoundError: No module named 'sklearn'

In [None]:
# Ensure matplotlib and seaborn are installed and importable
# This installs into the same Python interpreter used by the notebook kernel.
import sys
import subprocess
import importlib

def ensure_pkg(pkg_name):
    try:
        return importlib.import_module(pkg_name)
    except ImportError:
        print(f"Package '{pkg_name}' not found — installing...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", pkg_name])
        return importlib.import_module(pkg_name)

matplotlib = ensure_pkg('matplotlib')
seaborn = ensure_pkg('seaborn')
print('Installed/loaded:', 'matplotlib', matplotlib.__version__, 'seaborn', seaborn.__version__)

# After running this cell, if imports still fail, restart the kernel and re-run the notebook cells.

Package 'matplotlib' not found — installing...
Package 'seaborn' not found — installing...
Installed/loaded: matplotlib 3.10.6 seaborn 0.13.2


In [None]:
# Section: Utility functions and constants
import time
from typing import Tuple
from sklearn.metrics import mean_squared_error, mean_absolute_error, accuracy_score


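def rmse(y_true, y_pred):
    # Take the square root explicitly: the `squared=False` argument was
    # deprecated in scikit-learn 1.4 and dropped in later releases.
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))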


def plot_history(history, title: str = 'Training history'):
    plt.figure(figsize=(10,4))
    # loss
    plt.subplot(1,2,1)
    plt.plot(history.history['loss'], label='train_loss')
    if 'val_loss' in history.history:
        plt.plot(history.history['val_loss'], label='val_loss')
    plt.title(title + ' - loss')
    plt.legend()

    # metric (accuracy or mae) - try common keys
    plt.subplot(1,2,2)
    if 'accuracy' in history.history:
        plt.plot(history.history['accuracy'], label='train_acc')
        if 'val_accuracy' in history.history:
            plt.plot(history.history['val_accuracy'], label='val_acc')
        plt.title(title + ' - accuracy')
        plt.legend()
    elif 'mae' in history.history:
        plt.plot(history.history['mae'], label='train_mae')
        if 'val_mae' in history.history:
            plt.plot(history.history['val_mae'], label='val_mae')
        plt.title(title + ' - MAE')
        plt.legend()

    plt.tight_layout()
    plt.show()


# Constants
CLASSIFICATION_INPUTS = None
REGRESSION_INPUTS = None

print('Utility functions defined.')


In [None]:
# Section: Load and inspect data (Wine dataset for classification)
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

wine = load_wine()
X = wine.data
y = wine.target

print('Wine dataset loaded:')
print('  Shape:', X.shape)
print('  Classes:', np.unique(y))
print('  Feature names:', wine.feature_names)

# quick head using pandas DataFrame
df_wine = pd.DataFrame(X, columns=wine.feature_names)
df_wine['target'] = y

display(df_wine.head())

wine_summary = df_wine.describe().T
print('\nFeature summary (first rows):')
print(wine_summary.head())

# Train/test split and scaling
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=RANDOM_STATE, stratify=y)
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

print('\nTrain/test sizes:', X_train_s.shape, X_test_s.shape)
CLASSIFICATION_INPUTS = X_train_s.shape[1]


ModuleNotFoundError: No module named 'sklearn'

In [None]:
# Section: Build, train and evaluate Keras model (Classification)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

# Build model
model_clf = Sequential([
    Dense(32, activation='relu', input_shape=(CLASSIFICATION_INPUTS,)),
    Dense(32, activation='relu'),
    Dense(3, activation='softmax')
])
model_clf.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model_clf.summary()

# Prepare labels
y_train_cat = tf.keras.utils.to_categorical(y_train, num_classes=3)
y_test_cat = tf.keras.utils.to_categorical(y_test, num_classes=3)

es = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
history_clf = model_clf.fit(X_train_s, y_train_cat, validation_split=0.1, epochs=200, batch_size=16, callbacks=[es], verbose=2)

plot_history(history_clf, title='Keras Classification')

# Evaluate
keras_preds = np.argmax(model_clf.predict(X_test_s), axis=1)
from sklearn.metrics import classification_report, accuracy_score
print('Keras accuracy:', accuracy_score(y_test, keras_preds))
print('\nClassification report (Keras):')
print(classification_report(y_test, keras_preds, target_names=wine.target_names))

# scikit-learn baseline: RandomForestClassifier
from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(n_estimators=200, random_state=RANDOM_STATE)
rf.fit(X_train_s, y_train)
rf_preds = rf.predict(X_test_s)
print('\nRandomForest accuracy:', accuracy_score(y_test, rf_preds))
print('\nClassification report (RandomForest):')
print(classification_report(y_test, rf_preds, target_names=wine.target_names))

print('\nSummary:')
print('  Keras accuracy:', accuracy_score(y_test, keras_preds))
print('  RandomForest accuracy:', accuracy_score(y_test, rf_preds))


In [None]:
# Section: Load and inspect data (California Housing for regression)
from sklearn.datasets import fetch_california_housing

cal = fetch_california_housing()
Xr = cal.data
yr = cal.target

print('California Housing dataset loaded:')
print('  Shape:', Xr.shape)
print('  Feature names:', cal.feature_names)

df_cal = pd.DataFrame(Xr, columns=cal.feature_names)
df_cal['target'] = yr

display(df_cal.head())
print(df_cal.describe().T.head())

# Split and scale
Xr_train, Xr_test, yr_train, yr_test = train_test_split(Xr, yr, test_size=0.2, random_state=RANDOM_STATE)
scaler_r = StandardScaler()
Xr_train_s = scaler_r.fit_transform(Xr_train)
Xr_test_s = scaler_r.transform(Xr_test)

REGRESSION_INPUTS = Xr_train_s.shape[1]
print('Train/test sizes:', Xr_train_s.shape, Xr_test_s.shape)


In [None]:
# Section: Build, train and evaluate Keras model (Regression)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.callbacks import EarlyStopping

model_reg = Sequential([
    Dense(64, activation='relu', input_shape=(REGRESSION_INPUTS,)),
    Dense(32, activation='relu'),
    Dense(16, activation='relu'),
    Dense(1, activation='linear')
])
model_reg.compile(optimizer='adam', loss='mse', metrics=['mae'])
model_reg.summary()

es_r = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
history_reg = model_reg.fit(Xr_train_s, yr_train, validation_split=0.1, epochs=200, batch_size=32, callbacks=[es_r], verbose=2)

plot_history(history_reg, title='Keras Regression')

# Evaluate Keras
keras_r_preds = model_reg.predict(Xr_test_s).ravel()
keras_r_rmse = rmse(yr_test, keras_r_preds)
keras_r_mae = mean_absolute_error(yr_test, keras_r_preds)
print('Keras Regression RMSE:', keras_r_rmse)
print('Keras Regression MAE:', keras_r_mae)

# scikit-learn baselines
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

lr = LinearRegression()
lr.fit(Xr_train_s, yr_train)
lr_preds = lr.predict(Xr_test_s)
lr_rmse = rmse(yr_test, lr_preds)

rf = RandomForestRegressor(n_estimators=100, random_state=RANDOM_STATE)
rf.fit(Xr_train_s, yr_train)
rf_preds = rf.predict(Xr_test_s)
rf_rmse = rmse(yr_test, rf_preds)

print('\nLinearRegression RMSE:', lr_rmse)
print('RandomForestRegressor RMSE:', rf_rmse)

print('\nSummary RMSEs:')
print('  Keras:', keras_r_rmse)
print('  LinearRegression:', lr_rmse)
print('  RandomForest:', rf_rmse)

# Plot predicted vs actual for best model (by RMSE)
best_name, best_rmse = min((('Keras', keras_r_rmse), ('LinearRegression', lr_rmse), ('RandomForest', rf_rmse)), key=lambda x: x[1])
print('\nBest model by RMSE:', best_name)

if best_name == 'Keras':
    y_pred_best = keras_r_preds
elif best_name == 'LinearRegression':
    y_pred_best = lr_preds
else:
    y_pred_best = rf_preds

plt.figure(figsize=(6,6))
plt.scatter(yr_test, y_pred_best, alpha=0.4)
plt.plot([yr_test.min(), yr_test.max()], [yr_test.min(), yr_test.max()], 'r--')
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.title(f'Predicted vs Actual ({best_name})')
plt.show()


## Discussion and next steps

- The results above compare Keras neural networks with classical scikit-learn models.
- For more robust comparisons, run with multiple seeds or use cross-validation.
- Save models and pipelines with `joblib` (scikit-learn) and `model.save()` (Keras) before submitting.
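
The point about multiple seeds and cross-validation could be sketched as follows. This is only an illustration, not part of the original exercises: the choice of 5 stratified folds, the seed list `[0, 1, 2]`, and the use of a random forest on the unscaled Wine data are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_wine(return_X_y=True)

# Assumed setup: 5 stratified folds, repeated with 3 different seeds.
seeds = [0, 1, 2]
all_scores = []
for seed in seeds:
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    # One accuracy value per fold for this seed.
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    all_scores.extend(scores)

all_scores = np.array(all_scores)
print(f"Accuracy: {all_scores.mean():.3f} +/- {all_scores.std():.3f} "
      f"over {all_scores.size} folds")
```

Reporting the mean and standard deviation across folds and seeds gives a sense of how much of the gap between two models is noise from a single train/test split. A Keras model can be compared the same way by looping over the fold indices from `StratifiedKFold.split` and rebuilding the model for each fold.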

Quick example for saving models:

```python
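import joblib

# scikit-learn: persist the fitted estimator
joblib.dump(rf, 'rf_model.joblib')

# Keras: '.h5' is the legacy HDF5 format; recent Keras versions recommend '.keras'
model_reg.save('keras_reg_model.h5')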
```
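
Because the notebook scales the features with a separate `StandardScaler`, saving only the estimator would lose the preprocessing step. One possible approach, shown here as a self-contained sketch, is to bundle the scaler and the model in a scikit-learn pipeline and persist that single object. The file name `wine_pipeline.joblib` and the temporary directory are illustrative choices, not something the original notebook prescribes.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

# Bundle the scaler and the estimator so both are persisted together.
pipeline = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=50, random_state=42),
)
pipeline.fit(X, y)

# Hypothetical file name; a temporary directory keeps the sketch free of side effects.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "wine_pipeline.joblib")
    joblib.dump(pipeline, path)
    restored = joblib.load(path)

# The restored pipeline should reproduce the original predictions.
same_predictions = np.array_equal(pipeline.predict(X), restored.predict(X))
print("Restored pipeline reproduces predictions:", same_predictions)
```

Files written by `joblib` are pickles, so they should be loaded with the same (or a compatible) scikit-learn version and only from trusted sources.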

Good practice: document the package versions and capture the environment (`pip freeze > requirements.txt`) for reproducibility.
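
For documenting versions from inside the notebook, a small sketch using only the standard library is shown below. The helper name `collect_versions` and the package list are assumptions for illustration; note that the lookup uses distribution names as installed by pip (for example `scikit-learn`, not `sklearn`).

```python
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Assumed list, based on the libraries imported in this notebook.
versions = collect_versions(
    ["numpy", "pandas", "scikit-learn", "tensorflow", "matplotlib", "seaborn"]
)
for name, ver in versions.items():
    print(f"{name}=={ver}" if ver else f"# {name}: not installed")
```

The printed lines follow the `name==version` convention used in `requirements.txt`, so they can be pasted into the report next to the results.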