# The Basics of Neural Networks

In this module we look at neural networks in scikit-learn. Scikit-learn is not the most widely used library for neural networks and deep learning, but it does ship with some built-in functionality for them. We focus on how neural networks work conceptually and demonstrate basic usage of the scikit-learn implementation.

<b>Functions and attributes in this lecture: </b>
- `sklearn.neural_network` - Submodule for neural networks
 - `MLPClassifier` - The Multi-Layer Perceptron implementation for classification

In [1]:
# Non-sklearn packages
import numpy as np
import pandas as pd

# Sklearn packages
from sklearn.datasets import fetch_covtype
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Importing the dataset
X, y = fetch_covtype(return_X_y=True, as_frame=True)

# Printing the description for the dataset
print(fetch_covtype()["DESCR"])

.. _covtype_dataset:

Forest covertypes
-----------------

The samples in this dataset correspond to 30×30m patches of forest in the US,
collected for the task of predicting each patch's cover type,
i.e. the dominant species of tree.
There are seven covertypes, making this a multiclass classification problem.
Each sample has 54 features, described on the
`dataset's homepage <https://archive.ics.uci.edu/ml/datasets/Covertype>`__.
Some of the features are boolean indicators,
while others are discrete or continuous measurements.

**Data Set Characteristics:**

    Classes                        7
    Samples total             581012
    Dimensionality                54
    Features                     int

:func:`sklearn.datasets.fetch_covtype` will load the covertype dataset;
it returns a dictionary-like 'Bunch' object
with the feature matrix in the ``data`` member
and the target values in ``target``. If optional argument 'as_frame' is
set to 'True', it will return ``data`` and ``target`

## Basic Usage of the MLPClassifier

Let us first see how the MLPClassifier works. It follows the familiar scikit-learn estimator interface (`fit`, `predict`, `predict_proba`, `score`), so it should not be difficult to get started with.

In [2]:
# Split the data into training set and testing set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

In [3]:
# Training the MLPClassifier
mlp_classifier = MLPClassifier(hidden_layer_sizes=(5,), activation='relu', random_state=42, max_iter=300).fit(X_train, y_train)

In [4]:
# Predict the class probabilities for a new observation
mlp_classifier.predict_proba(X_test[:1])

array([[0.36299292, 0.48981176, 0.06108747, 0.00470559, 0.01609391,
        0.02990747, 0.03540088]])

In [5]:
# Predict the class label, i.e. the class with the highest predicted probability
mlp_classifier.predict(X_test[:1])

array([2])
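
The relationship between `predict_proba` and `predict` can be checked directly. The sketch below is a minimal illustration on a small synthetic dataset from `make_classification`, an assumption made so that it runs quickly. The names `X_demo`, `y_demo`, and `demo_clf` are hypothetical and not part of the lecture. For each row it picks the label in `classes_` whose probability is highest and compares the result to `predict`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Small synthetic 3-class problem (a stand-in for the covertype data)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=5,
    n_classes=3, random_state=42,
)

demo_clf = MLPClassifier(hidden_layer_sizes=(5,), random_state=42, max_iter=500)
demo_clf.fit(X_demo, y_demo)

# One row per observation, one column per class
proba = demo_clf.predict_proba(X_demo[:5])

# Column i of predict_proba corresponds to the label demo_clf.classes_[i]
labels_from_proba = demo_clf.classes_[np.argmax(proba, axis=1)]

print(labels_from_proba)
print(demo_clf.predict(X_demo[:5]))
```

This also explains the output above: the largest probability sits in the second column, and the second entry of `classes_` for the covertype data is the label 2.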

In [6]:
# Mean accuracy of the model
mlp_classifier.score(X_test, y_test)

0.48777472957326296
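
The score above is fairly low. One possible contributor, not verified here, is that the features were passed in unscaled; the scikit-learn documentation notes that multi-layer perceptrons are sensitive to feature scaling. The sketch below shows one way to add standardization in front of the classifier with a pipeline. It uses a small synthetic dataset so that it runs quickly, and the names `Xd_train`, `scaled_mlp`, and so on are hypothetical. It makes no claim about how much the covertype score would change.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small synthetic 3-class problem (a stand-in for the covertype data)
X_demo, y_demo = make_classification(
    n_samples=500, n_features=10, n_informative=5,
    n_classes=3, random_state=42,
)
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, test_size=0.33, random_state=42
)

# The scaler is fitted on the training data only and then reused on the test data
scaled_mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(5,), random_state=42, max_iter=300),
)
scaled_mlp.fit(Xd_train, yd_train)

scaled_accuracy = scaled_mlp.score(Xd_test, yd_test)
print(scaled_accuracy)
```

The same pipeline object could be fitted on `X_train` and `y_train` from the cells above, at the cost of a longer training time.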

## Checking out some of the parameters

In [7]:
# Using two hidden layers with 13 and 6 units instead of one layer with 5 units
mlp_classifier_two_hidden_layers = MLPClassifier(hidden_layer_sizes=(13, 6), activation='relu', random_state=42, max_iter=300).fit(X_train, y_train)

In [8]:
# Checking out the accuracy again
mlp_classifier_two_hidden_layers.score(X_test, y_test)

0.7278260506743718
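
To see what `hidden_layer_sizes=(13, 6)` means for the network itself, one can inspect the fitted weight matrices in `coefs_` and the bias vectors in `intercepts_`. The sketch below is a minimal illustration on a synthetic dataset with 10 features and 3 classes (an assumption made for speed; the names `X_demo`, `y_demo`, and `net` are hypothetical).

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic data: 10 input features, 3 classes
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=5,
    n_classes=3, random_state=42,
)

net = MLPClassifier(hidden_layer_sizes=(13, 6), random_state=42, max_iter=500)
net.fit(X_demo, y_demo)

# One weight matrix per connection between consecutive layers
weight_shapes = [w.shape for w in net.coefs_]
# One bias vector per non-input layer
bias_shapes = [b.shape for b in net.intercepts_]

print(weight_shapes)  # [(10, 13), (13, 6), (6, 3)]
print(bias_shapes)    # [(13,), (6,), (3,)]
print(net.n_layers_)  # 4 (input layer, two hidden layers, output layer)
```

Each entry of `hidden_layer_sizes` adds one hidden layer with that many units, so the parameter count grows with both the number of entries and their sizes.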

You can treat the MLPClassifier like any other estimator in scikit-learn. Be aware, however, that a hyperparameter search over `hidden_layer_sizes` is very time-consuming, since every candidate value requires training a new network. GPUs can speed up neural networks considerably, but the scikit-learn implementation does not support them. I suggest the library `Keras` if you are interested in learning more about neural networks.
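
As a rough sketch of what such a search could look like, the example below runs `GridSearchCV` over two candidate values of `hidden_layer_sizes`. It uses a small synthetic dataset and only 3 cross-validation folds so that it finishes quickly. The grid, the dataset, and the names `X_small`, `y_small`, and `search` are illustrative assumptions, not recommendations for the covertype data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier

# Small synthetic 3-class problem to keep the search cheap
X_small, y_small = make_classification(
    n_samples=200, n_features=10, n_informative=5,
    n_classes=3, random_state=42,
)

# Candidate architectures: each one is trained once per fold, plus a final refit
param_grid = {"hidden_layer_sizes": [(5,), (13, 6)]}

search = GridSearchCV(
    MLPClassifier(random_state=42, max_iter=300),
    param_grid,
    cv=3,
)
search.fit(X_small, y_small)

print(search.best_params_)
print(search.best_score_)
```

With 2 candidates and 3 folds this already trains 7 networks (6 during cross-validation and 1 final refit), which illustrates why the cost grows quickly on a dataset the size of covertype.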