___
<h1> Machine Learning </h1>
<h2> M. Sc. in Electrical and Computer Engineering </h2>
<h3> Instituto Superior de Engenharia / Universidade do Algarve </h3>

[MEEC](https://ise.ualg.pt/en/curso/1477) / [ISE](https://ise.ualg.pt) / [UAlg](https://www.ualg.pt)

Pedro J. S. Cardoso (pcardoso@ualg.pt)
___

# Neural Networks in Sklearn

In [None]:
from sklearn.datasets import load_iris, load_digits
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

import matplotlib.pyplot as plt

## Iris dataset
In this section, we'll use the Iris dataset to build our first example.

First, load the data and split it into training and test sets:

In [None]:
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, random_state=1)

Prepare a multi-layer perceptron classifier (`MLPClassifier`), train it, and compute its score (mean accuracy) on the test data.

(https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html)

In [None]:
clf = MLPClassifier(
    # verbose=True,  # uncomment to see loss function evolution
    random_state=1
).fit(X_train, y_train)
clf.score(X_test, y_test)

OK, but can this be improved? The parameters used were:

In [None]:
clf.get_params()
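
The full dictionary returned by `get_params()` is long. As a minimal sketch, the cell below prints only the parameters that are changed later in this notebook; the selection of names is ours, not part of the original analysis. In scikit-learn the defaults are one hidden layer of 100 units, ReLU activation, the Adam solver, `max_iter=200` and `tol=1e-4`.

```python
from sklearn.neural_network import MLPClassifier

# Parameters are available before fitting, so no data is needed here.
params = MLPClassifier(random_state=1).get_params()

# Hand-picked subset (an assumption of this sketch): the knobs tuned below.
for name in ("hidden_layer_sizes", "activation", "solver", "max_iter", "tol"):
    print(f"{name:20s} {params[name]!r}")
```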

What if the maximum number of iterations (epochs) is increased?

In [None]:
clf = MLPClassifier(
    max_iter=1000,
    random_state=1,
    # verbose=True,  # uncomment to see loss function evolution
).fit(X_train, y_train)

clf.score(X_test, y_test)
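
To check whether the iteration limit was actually the bottleneck, one option is to compare the fitted attribute `n_iter_` with `max_iter` and to look for a `ConvergenceWarning`. The sketch below does this for the default limit (200) and for 1000; the `results` dictionary and the loop are illustrative additions, and the printed values depend on the installed scikit-learn version.

```python
import warnings

from sklearn.datasets import load_iris
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=1
)

results = {}
for max_iter in (200, 1000):  # 200 is the scikit-learn default
    # Record warnings instead of printing them, so they can be inspected.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        clf = MLPClassifier(max_iter=max_iter, random_state=1).fit(X_train, y_train)
    hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)
    results[max_iter] = (clf.n_iter_, hit_limit, clf.score(X_test, y_test))
    print(f"max_iter={max_iter}: n_iter_={clf.n_iter_}, "
          f"hit limit={hit_limit}, test score={results[max_iter][2]:.3f}")
```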

That is good! Maybe there are other alternatives, such as using more layers?

In [None]:
clf = MLPClassifier(
    hidden_layer_sizes=(100, 100),
    random_state=1,
    # verbose=True
).fit(X_train, y_train)

clf.score(X_test, y_test)
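
Several architectures can be compared in a single loop. This is only a sketch: the candidate layer sizes and `max_iter=1000` are arbitrary choices made here for illustration, and a single train/test split on such a small dataset gives a noisy comparison.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=1
)

scores = {}
# Candidate architectures (illustrative choices, one tuple entry per hidden layer).
for layers in [(100,), (100, 100), (50, 50, 50)]:
    clf = MLPClassifier(hidden_layer_sizes=layers, max_iter=1000, random_state=1)
    clf.fit(X_train, y_train)
    scores[layers] = clf.score(X_test, y_test)
    # n_layers_ counts the input and output layers as well as the hidden ones.
    print(f"{layers}: n_layers_={clf.n_layers_}, test score={scores[layers]:.3f}")
```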

The probabilities associated with each test instance are:

In [None]:
clf.predict_proba(X_test)

In [None]:
plt.hist(clf.predict_proba(X_test))
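
Two properties of `predict_proba` are worth verifying: each row holds one probability per class and sums to 1, and the class with the highest probability is the one returned by `predict`. The sketch below checks both on the Iris split used above; the classifier settings are an assumption made to keep the example self-contained.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, random_state=1
)
clf = MLPClassifier(max_iter=1000, random_state=1).fit(X_train, y_train)

proba = clf.predict_proba(X_test)  # one row per test instance, one column per class
row_sums = proba.sum(axis=1)

# Map the most probable column back to its class label.
pred_from_proba = clf.classes_[np.argmax(proba, axis=1)]
agree = np.array_equal(pred_from_proba, clf.predict(X_test))

print(proba.shape)
print(np.allclose(row_sums, 1.0), agree)
```

Because `proba` has one column per class, passing it to `plt.hist` draws one histogram per class.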

## Digits dataset

Let us do a similar analysis using the digits dataset.

In [None]:
digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(digits.data, digits.target, random_state=1)
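
Before training, it may help to look at what the network receives. In the scikit-learn digits dataset, each sample is an 8x8 grayscale image flattened into 64 features, with pixel intensities between 0 and 16 and ten target classes. A short sketch to confirm this:

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

print(digits.data.shape)    # flattened images: (n_samples, 64)
print(digits.images.shape)  # the same pixels as 8x8 arrays

# The feature matrix is just the image stack with each image flattened.
same_pixels = np.array_equal(
    digits.data, digits.images.reshape(len(digits.images), -1)
)
classes = np.unique(digits.target)
print(same_pixels, digits.data.min(), digits.data.max(), classes)
```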

With the default parameters, the results are the following:

In [None]:
clf = MLPClassifier(random_state=1).fit(X_train, y_train)

clf.score(X_test, y_test)

Not bad! Can it be improved?

In [None]:
clf = MLPClassifier(
    max_iter=1000,
    hidden_layer_sizes=(100, 100),
    random_state=1
).fit(X_train, y_train)

clf.score(X_test, y_test)

It was not worse, but there is not much room for improvement. Another try?

In [None]:
clf = MLPClassifier(
    max_iter=1000,
    tol=1e-10,
    hidden_layer_sizes=(1000,),  # trailing comma makes this a one-element tuple
    random_state=1
).fit(X_train, y_train)

clf.score(X_test, y_test)

And again! Since the training did not converge, let us increase the maximum number of iterations (this time also switching the activation function to `tanh`):

In [None]:
clf = MLPClassifier(
    max_iter=10000,
    tol=1e-10,
    hidden_layer_sizes=(1000,),
    random_state=1,
    activation='tanh',
).fit(X_train, y_train)

clf.score(X_test, y_test)
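
The effect of `tol` on stopping can be inspected through the fitted attributes `n_iter_` and `loss_curve_`. The sketch below is a smaller, faster variant of the experiments above, not a reproduction of them: it keeps the default hidden layer and uses `max_iter=300`, both assumptions chosen to keep the run short, and the `summary` dictionary is an illustrative helper.

```python
import warnings

from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=1
)

summary = {}
for tol in (1e-4, 1e-10):  # default tolerance vs. the much stricter one used above
    clf = MLPClassifier(max_iter=300, tol=tol, random_state=1)
    with warnings.catch_warnings():
        # Silence the warning here; the limit is checked explicitly below.
        warnings.simplefilter("ignore", ConvergenceWarning)
        clf.fit(X_train, y_train)
    summary[tol] = {
        "n_iter": clf.n_iter_,
        "reached_max_iter": clf.n_iter_ == clf.max_iter,
        "curve_len": len(clf.loss_curve_),  # one loss value per epoch
        "final_loss": clf.loss_curve_[-1],
        "test_score": clf.score(X_test, y_test),
    }
    print(tol, summary[tol])
```

If `reached_max_iter` is true, training stopped because of the iteration limit rather than the tolerance criterion, and raising `max_iter` is the natural next step.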