# Model fitting

This notebook trains a simple neural network and assesses its performance.

In [None]:
import numpy as np
import scipy.stats as st
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import RandomizedSearchCV
from sklearn.metrics import accuracy_score

This notebook is parameterized to work with [Papermill](https://papermill.readthedocs.io).
The following cell contains the default values of the parameters.

In [None]:
input_file = "test_dataset.npz"
max_units = 15
n_budget = 1
n_jobs = 1
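
As a rough sketch of how these defaults could be overridden, the helper below builds a `papermill` command line using the CLI's `-p NAME VALUE` option. The notebook file names and the helper `papermill_command` are hypothetical, and the sketch assumes the cell above carries the `parameters` tag that Papermill looks for. The command is only printed, not executed.

```python
import shlex


def papermill_command(input_nb, output_nb, params):
    """Build the argument list for a `papermill` CLI call (hypothetical helper)."""
    cmd = ["papermill", input_nb, output_nb]
    for name, value in params.items():
        # Each `-p NAME VALUE` pair overrides one parameter of the tagged cell.
        cmd += ["-p", name, str(value)]
    return cmd


# Hypothetical notebook names; the parameter names come from the cell above.
cmd = papermill_command(
    "model_fitting.ipynb",
    "model_fitting_out.ipynb",
    {"input_file": "test_dataset.npz", "max_units": 30, "n_budget": 10, "n_jobs": 4},
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` in an environment where Papermill is installed.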

First, we load the dataset, which holds both the training and the test splits.

In [None]:
dset = np.load(input_file)
print(f"Number of training samples: {len(dset['X_train'])}.")
print(f"Number of test samples: {len(dset['X_test'])}.")
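
The notebook expects an `.npz` archive with the keys `X_train`, `y_train`, `X_test` and `y_test`. A minimal sketch of how such a file could be produced and read back is shown below; the random data, the array shapes and the file name `example_dataset.npz` are illustrative assumptions, not the actual dataset.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 80 training and 20 test samples, 4 features, 2 classes.
X_train, y_train = rng.normal(size=(80, 4)), rng.integers(0, 2, size=80)
X_test, y_test = rng.normal(size=(20, 4)), rng.integers(0, 2, size=20)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_dataset.npz")
    # Keyword names become the keys of the archive.
    np.savez(path, X_train=X_train, y_train=y_train, X_test=X_test, y_test=y_test)

    # np.load returns a lazy NpzFile; the context manager closes the file.
    with np.load(path) as dset:
        keys = sorted(dset.files)
        n_train, n_test = len(dset["X_train"]), len(dset["X_test"])

print(keys, n_train, n_test)
```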

Then we define an MLP model and use a randomized search to optimize some of its hyperparameters.

In [None]:
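param_space = {
    # Number of units in the single hidden layer; randint excludes the upper
    # bound, so values range from 10 to max_units - 1 (max_units must be > 10).
    "hidden_layer_sizes": st.randint(10, max_units),
    # L2 regularization strength, sampled on a log scale.
    "alpha": st.loguniform(1e-5, 1e-2),
    # Initial learning rate, sampled on a log scale.
    "learning_rate_init": st.loguniform(1e-4, 1e-1),
}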
mlp = RandomizedSearchCV(
    MLPClassifier(random_state=42, max_iter=1000),
    param_space,
    n_iter=n_budget,
    random_state=42,
    verbose=1,
    n_jobs=n_jobs,
    cv=3,
)
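
To get a feel for what the search will draw from this parameter space, the sketch below samples two of the distributions directly. The sample size and the seed are arbitrary choices for illustration; `max_units` is set to the notebook default.

```python
import scipy.stats as st

max_units = 15  # default from the parameters cell

units = st.randint(10, max_units).rvs(size=1000, random_state=0)
alphas = st.loguniform(1e-5, 1e-2).rvs(size=1000, random_state=0)

# randint excludes its upper bound, so max_units itself is never drawn.
print(units.min(), units.max())
# loguniform draws stay within [1e-5, 1e-2] and are uniform in log space.
print(alphas.min(), alphas.max())
```

With the default `max_units = 15`, the hidden layer therefore has between 10 and 14 units.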

Fitting time grows with the optimization budget (`n_budget`, the number of configurations tested, each evaluated on 3 cross-validation folds) and shrinks as more parallel jobs (`n_jobs`) are used.
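
As a back-of-the-envelope illustration, the number of model fits can be estimated as follows. The helper `total_fits` is hypothetical, and the sketch assumes the default `refit=True`, which adds one final fit on the full training set.

```python
def total_fits(n_iter, cv, refit=True):
    """Number of model fits performed by a randomized search (hypothetical helper)."""
    # Each sampled configuration is fitted once per cross-validation fold,
    # plus one final refit of the best configuration on all training data.
    return n_iter * cv + (1 if refit else 0)


print(total_fits(n_iter=1, cv=3))   # notebook defaults
print(total_fits(n_iter=20, cv=3))  # a larger budget
```

Only the cross-validation fits are distributed over `n_jobs` workers, so increasing `n_jobs` beyond `n_iter * cv` should not bring further speed-up.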

In [None]:
%%time
_ = mlp.fit(dset["X_train"], dset["y_train"])

In [None]:
mlp.best_params_
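
Beyond `best_params_`, the fitted search object also exposes the score of every configuration through `cv_results_`. The sketch below shows one way to list them by rank; it runs on a synthetic dataset from `make_classification` with a reduced parameter space, both of which are assumptions made to keep the example self-contained.

```python
import numpy as np
import scipy.stats as st
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for dset["X_train"] and dset["y_train"].
X, y = make_classification(n_samples=150, n_features=8, random_state=0)

search = RandomizedSearchCV(
    MLPClassifier(random_state=42, max_iter=1000),
    {"hidden_layer_sizes": st.randint(10, 15), "alpha": st.loguniform(1e-5, 1e-2)},
    n_iter=3,
    random_state=42,
    cv=3,
)
search.fit(X, y)

results = search.cv_results_
# List the configurations from best (rank 1) to worst.
for idx in np.argsort(results["rank_test_score"]):
    score = results["mean_test_score"][idx]
    print(results["rank_test_score"][idx], f"{score:.3f}", results["params"][idx])

# Mean cross-validated accuracy of the best configuration.
print(f"{search.best_score_:.3f}")
```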

Finally, we check the accuracy on the test dataset.

In [None]:
y_pred = mlp.predict(dset["X_test"])
mlp_acc = accuracy_score(dset["y_test"], y_pred)
print(f"MLP test accuracy is {mlp_acc * 100:.2f}%.")
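
For reference, the accuracy reported here is the fraction of test samples whose predicted label matches the true label. A toy computation on made-up labels (not taken from the dataset) illustrates this:

```python
import numpy as np

# Made-up labels for illustration: 4 of the 5 predictions are correct.
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 1])

# Mean of the element-wise comparison gives the fraction of correct predictions.
acc = np.mean(y_true == y_pred)
print(f"Accuracy is {acc * 100:.2f}%.")  # prints "Accuracy is 80.00%."
```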