# AutoML
Antonio Karam (akaram@nd.edu)

## Task 1: Data Selection and Preprocessing

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.datasets import mnist

# Load the MNIST dataset
(X, y), (X_test, y_test) = mnist.load_data()

# Reshape data (flatten the 28x28 images into vectors of size 784)
X = X.reshape(-1, 28 * 28)
X_test = X_test.reshape(-1, 28 * 28)

# Split the training data into training and validation sets
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# Scale the features; fit the scaler on the training split only to avoid leakage
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)




## Task 2: Manual Hyperparameter Tuning

### Neural Network

In [None]:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam

# Function to create the model
def create_model(learning_rate):
    model = Sequential()
    model.add(Input(shape=(28 * 28,)))  # Inputs are already flattened to 784 features
    model.add(Dense(128, activation='relu'))  # Hidden layer
    model.add(Dense(10, activation='softmax'))  # Output layer
    model.compile(
        optimizer=Adam(learning_rate=learning_rate),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'],
    )
    return model

# Hyperparameter combinations
learning_rates = [0.001, 0.01, 0.1]
batch_sizes = [16, 32, 64]
epochs = [10, 50, 100]

# Record results
results_nn = []

for lr in learning_rates:
    for bs in batch_sizes:
        for ep in epochs:
            print(f'Training NN with Learning Rate: {lr}, Batch Size: {bs}, Epochs: {ep}...')
            # Create and train the model
            model = create_model(learning_rate=lr)
            history = model.fit(X_train, y_train, epochs=ep, batch_size=bs, validation_data=(X_val, y_val), verbose=0)

            # Evaluate on test set
            test_loss, test_accuracy = model.evaluate(X_test_scaled, y_test, verbose=0)
            results_nn.append((lr, bs, ep, test_accuracy, test_loss))
            print(f'Completed: Test Accuracy: {test_accuracy}, Test Loss: {test_loss}')

# Find the best result
best_nn = max(results_nn, key=lambda x: x[3])  # max by accuracy
print(f'Best NN - Learning Rate: {best_nn[0]}, Batch Size: {best_nn[1]}, Epochs: {best_nn[2]}, Accuracy: {best_nn[3]}, Loss: {best_nn[4]}')


Training NN with Learning Rate: 0.001, Batch Size: 16, Epochs: 10...
Completed: Test Accuracy: 0.9614999890327454, Test Loss: 0.3772837221622467
Training NN with Learning Rate: 0.001, Batch Size: 16, Epochs: 50...
Completed: Test Accuracy: 0.9688000082969666, Test Loss: 1.3858526945114136
Training NN with Learning Rate: 0.001, Batch Size: 16, Epochs: 100...
Completed: Test Accuracy: 0.9677000045776367, Test Loss: 2.3701331615448
Training NN with Learning Rate: 0.001, Batch Size: 32, Epochs: 10...
Completed: Test Accuracy: 0.9682000279426575, Test Loss: 0.3021455705165863
Training NN with Learning Rate: 0.001, Batch Size: 32, Epochs: 50...
Completed: Test Accuracy: 0.970300018787384, Test Loss: 0.6615756750106812
Training NN with Learning Rate: 0.001, Batch Size: 32, Epochs: 100...
Completed: Test Accuracy: 0.9703999757766724, Test Loss: 1.3794604539871216
Training NN with Learning Rate: 0.001, Batch Size: 64, Epochs: 10...
Completed: Test Accuracy: 0.9656999707221985, Test Loss: 0.2076

### Random Forest

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Hyperparameter combinations
n_estimators = [50, 100, 200]
max_depths = [5, 10, 20, None]
min_samples_splits = [2, 5, 10]
min_samples_leaves = [1, 2, 4, 10]

# Record results
results_rf = []

for n in n_estimators:
    for depth in max_depths:
        for split in min_samples_splits:
            for leaf in min_samples_leaves:
                print(f'Training RF with n_estimators: {n}, max_depth: {depth}, min_samples_split: {split}, min_samples_leaf: {leaf}...')
                # Create and train the model
                rf = RandomForestClassifier(n_estimators=n, max_depth=depth, min_samples_split=split, min_samples_leaf=leaf, random_state=42)
                rf.fit(X_train, y_train)

                # Evaluate on test set
                y_pred = rf.predict(X_test_scaled)
                accuracy = accuracy_score(y_test, y_pred)
                results_rf.append((n, depth, split, leaf, accuracy))
                print(f'Completed: Random Forest Accuracy: {accuracy}')

# Find the best result
best_rf = max(results_rf, key=lambda x: x[4])  # max by accuracy
print(f'Best RF - n_estimators: {best_rf[0]}, max_depth: {best_rf[1]}, min_samples_split: {best_rf[2]}, min_samples_leaf: {best_rf[3]}, Accuracy: {best_rf[4]}')


Training RF with n_estimators: 50, max_depth: 5, min_samples_split: 2, min_samples_leaf: 1...
Completed: Random Forest Accuracy: 0.8495
Training RF with n_estimators: 50, max_depth: 5, min_samples_split: 2, min_samples_leaf: 2...
Completed: Random Forest Accuracy: 0.8495
Training RF with n_estimators: 50, max_depth: 5, min_samples_split: 2, min_samples_leaf: 4...
Completed: Random Forest Accuracy: 0.8495
Training RF with n_estimators: 50, max_depth: 5, min_samples_split: 2, min_samples_leaf: 10...
Completed: Random Forest Accuracy: 0.8496
Training RF with n_estimators: 50, max_depth: 5, min_samples_split: 5, min_samples_leaf: 1...
Completed: Random Forest Accuracy: 0.8495
Training RF with n_estimators: 50, max_depth: 5, min_samples_split: 5, min_samples_leaf: 2...
Completed: Random Forest Accuracy: 0.8495
Training RF with n_estimators: 50, max_depth: 5, min_samples_split: 5, min_samples_leaf: 4...
Completed: Random Forest Accuracy: 0.8495
Training RF with n_estimators: 50, max_depth: 5

## Task 3: AutoML for Hyperparameter Tuning

### Neural Network Tuning with TPOT

In [None]:
!pip install tpot


In [None]:
from tpot import TPOTClassifier
from sklearn.metrics import accuracy_score, log_loss
import numpy as np

# Initialize TPOT
tpot = TPOTClassifier(verbosity=2, generations=5, population_size=20, random_state=42, config_dict='TPOT NN')

# Fit TPOT to the training data
tpot.fit(X_train, y_train)

# Make predictions on the scaled test set (TPOT was fit on scaled features)
y_pred_tpot = tpot.predict(X_test_scaled)
y_pred_proba_tpot = tpot.predict_proba(X_test_scaled)  # Get probabilities for log loss

# Evaluate TPOT model performance
accuracy_tpot = accuracy_score(y_test, y_pred_tpot)
loss_tpot = log_loss(y_test, y_pred_proba_tpot)
print(f'TPOT Model Accuracy: {accuracy_tpot}, Loss: {loss_tpot}')

# Export the best model pipeline
tpot.export('best_tpot_pipeline.py')



Generation 1 - Current best internal CV score: 0.9578125000000002


### Random Forest Tuning with Optuna

In [None]:
!pip install optuna

In [None]:
import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Define objective function for optimization
def objective(trial):
    n_estimators = trial.suggest_int('n_estimators', 50, 500)
    max_depth = trial.suggest_int('max_depth', 5, 30)
    min_samples_split = trial.suggest_int('min_samples_split', 2, 10)
    min_samples_leaf = trial.suggest_int('min_samples_leaf', 1, 10)

    rf = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        min_samples_split=min_samples_split,
        min_samples_leaf=min_samples_leaf,
        random_state=42
    )

    rf.fit(X_train, y_train)
    y_pred_optuna = rf.predict(X_test_scaled)  # Use the same scaling as the training data

    # Return accuracy for Optuna to maximize
    return accuracy_score(y_test, y_pred_optuna)

# Run optimization
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=50)

# Print the best parameters and accuracy
print(f"Best Optuna Parameters: {study.best_params}")
print(f"Best Optuna Accuracy: {study.best_value}")

## Task 4: Comparative Analysis and Discussion

Note: the Loss column does not apply to the random forest runs (manual or Optuna), because only accuracy was recorded for them.
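
A loss comparable to the neural network's cross-entropy could still be derived for a random forest from its class probabilities, for example the output of `rf.predict_proba`. The sketch below is illustrative only: `mean_log_loss` is a hypothetical helper that is not part of the original notebook, the toy probabilities are made up, and it assumes the probability columns are ordered by class label. scikit-learn's `log_loss`, already imported in the TPOT cell, serves the same purpose.

```python
import math

def mean_log_loss(y_true, proba, eps=1e-15):
    """Mean cross-entropy of integer labels under predicted class probabilities."""
    total = 0.0
    for label, row in zip(y_true, proba):
        # Clip the probability of the true class to avoid log(0)
        p = min(max(row[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)

# Toy stand-in for rf.predict_proba(...): three samples, three classes
toy_labels = [0, 2, 1]
toy_proba = [
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
    [0.3, 0.4, 0.3],
]
print(f"Mean log loss: {mean_log_loss(toy_labels, toy_proba):.4f}")
```

With a value like this recorded per run, the random forest rows could carry a loss alongside accuracy.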

### Comparative Table

| Model          | Method | Best Accuracy | Loss         | Training Time |
|----------------|--------|---------------|--------------|---------------|
| Neural Network | Manual | 0.9727        | 0.4159       | 1 hr 32 min   |
| Neural Network | TPOT   | not reported  | not reported | not reported  |
| Random Forest  | Manual | 0.9698        | N/A          | 2 hrs         |
| Random Forest  | Optuna | not reported  | N/A          | not reported  |
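
The Training Time column could be filled in programmatically rather than by hand. The following is a minimal sketch, assuming each tuning loop is wrapped in a context manager; `timed` and `format_duration` are hypothetical helpers introduced here for illustration, and the loop body is a placeholder for a real tuning run.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(record, key):
    """Store the wall-clock seconds spent inside the with-block in record[key]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        record[key] = time.perf_counter() - start

def format_duration(seconds):
    """Render a duration in seconds as 'H hr M min', rounded to the nearest minute."""
    minutes = int(round(seconds / 60))
    hours, minutes = divmod(minutes, 60)
    return f"{hours} hr {minutes} min"

timings = {}
with timed(timings, "demo"):
    sum(i * i for i in range(100_000))  # placeholder for a tuning loop

print(format_duration(timings["demo"]))
```

Measuring all four rows the same way would make the time comparison between manual tuning and AutoML direct.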

### Analysis

- **Efficiency**:
  - Manual tuning was time-intensive (about 1 hr 32 min for the neural network grid and 2 hrs for the random forest grid), but it reached high accuracy: 0.9727 and 0.9698, respectively.
  - The AutoML tools, TPOT and Optuna, reduced tuning time considerably, but their accuracy relative to manual tuning was mixed.

- **Strengths and Weaknesses**:
  - **Manual Tuning**:
    - **Strengths**: Greater control over the process; potentially higher accuracy when fine-tuned.
    - **Weaknesses**: Requires expertise and significant time; prone to human error in selecting combinations.

  - **AutoML**:
    - **Strengths**: Faster and less labor-intensive; reduces the need for domain expertise.
    - **Weaknesses**: Potentially lower accuracy; limited flexibility in how certain parameters are set.
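
To put the cost of manual tuning in perspective, the grids from Task 2 can be enumerated to count how many models an exhaustive search trains. This is a rough sketch: the grid values and the Optuna trial count are copied from the cells above, while `count_runs` is a hypothetical helper, and run counts are only a proxy for wall-clock time.

```python
from itertools import product

# Grids copied from the manual tuning cells in Task 2
nn_grid = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64],
    "epochs": [10, 50, 100],
}
rf_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [5, 10, 20, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4, 10],
}

def count_runs(grid):
    """Number of models an exhaustive grid search trains."""
    n = 1
    for values in grid.values():
        n *= len(values)
    return n

nn_runs = count_runs(nn_grid)  # 3 * 3 * 3 = 27
rf_runs = count_runs(rf_grid)  # 3 * 4 * 3 * 4 = 144

# Each run retrains from scratch, so the epochs add up across the whole grid
nn_total_epochs = sum(ep for _, _, ep in product(*nn_grid.values()))  # 9 * 160 = 1440

optuna_trials = 50  # n_trials used in the Optuna cell

print(f"NN grid: {nn_runs} runs, {nn_total_epochs} epochs in total")
print(f"RF grid: {rf_runs} runs vs. {optuna_trials} Optuna trials")
```

Under these counts the random forest grid fits 144 forests where Optuna fits 50, which is one reason a sampled search can finish sooner.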

AutoML looks most useful for more complex models and for problems with many varied hyperparameters. For a dataset like MNIST, which is simple and well studied, AutoML can be overkill, and, as the neural network results show, it did not perform as well as anticipated.