In [3]:
from sklearn.neural_network import MLPClassifier
import numpy as np

In [13]:
X = np.array([[0., 0.], [1., 1.]])
y = [0, 1]

clf = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5, 2), random_state=1)
clf.fit(X, y)

MLPClassifier(alpha=1e-05, hidden_layer_sizes=(5, 2), random_state=1,
              solver='lbfgs')

In [14]:
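# Weight matrices, one per pair of adjacent layers:
# input -> hidden 1, hidden 1 -> hidden 2, hidden 2 -> output
[coef.shape for coef in clf.coefs_]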

[(2, 5), (5, 2), (2, 1)]

In [15]:
# Bias vectors, one per layer after the input layer
clf.intercepts_

[array([-0.14962269,  0.75950271, -0.5472481 ,  6.92417703, -0.87510813]),
 array([-0.47635084, -0.76834882]),
 array([8.53354251])]

**Regressor**

Class **MLPRegressor** implements an MLP that trains using backpropagation with no activation function in the output layer (equivalently, the identity function as the activation function).

It uses the squared error as the loss function, and the output is a set of continuous values.

It also supports multi-output regression, in which a sample can have more than one target.
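
A minimal sketch of multi-output regression with MLPRegressor. The toy data, the variable names and the hyperparameters are assumptions made up for illustration; they are not taken from this notebook.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy data (invented for illustration): 200 samples, 3 features, 2 targets
rng = np.random.RandomState(0)
X_reg = rng.uniform(-1, 1, size=(200, 3))
Y_reg = np.column_stack([X_reg[:, 0] + X_reg[:, 1], X_reg[:, 2] ** 2])

reg = MLPRegressor(hidden_layer_sizes=(20,), solver='lbfgs',
                   max_iter=500, random_state=1)
reg.fit(X_reg, Y_reg)

pred = reg.predict(X_reg[:5])
print(pred.shape)            # one row per sample, one column per target
print(reg.out_activation_)   # name of the output-layer activation
```

The fitted attribute out_activation_ reports which output activation the estimator chose, which is a quick way to check the "identity" behaviour described above.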

**Classifier**

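Class **MLPClassifier** implements an MLP algorithm that trains using backpropagation.

It supports multi-class classification (with a softmax output layer) and multi-label classification.

Both classes train on two arrays:

* X - shape (n_samples, n_features), the training samples
* y - shape (n_samples,), the target values (class labels for the classifier, real numbers for the regressor)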
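
A short sketch of multi-class classification, assuming the bundled iris dataset as a stand-in (it is not used elsewhere in this notebook). It shows the expected shapes of X and y and that the predicted class probabilities for each sample sum to one.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neural_network import MLPClassifier

# Stand-in dataset (assumption): 150 samples, 4 features, 3 classes
X_iris, y_iris = load_iris(return_X_y=True)
print(X_iris.shape, y_iris.shape)   # (n_samples, n_features) and (n_samples,)

clf_mc = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(5,),
                       max_iter=1000, random_state=1)
clf_mc.fit(X_iris, y_iris)

# One probability per class for each of the first three samples
proba = clf_mc.predict_proba(X_iris[:3])
print(proba.shape)
print(proba.sum(axis=1))
print(clf_mc.out_activation_)
```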

**Regularization**

Both use the parameter **alpha** for L2 regularization, which helps avoid overfitting by penalizing weights with large magnitudes. The scikit-learn example linked here shows how the decision function changes as alpha varies:

https://scikit-learn.org/stable/auto_examples/neural_networks/plot_mlp_alpha.html#sphx-glr-auto-examples-neural-networks-plot-mlp-alpha-py
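
As a rough illustration of what alpha does, the sketch below fits the same network with three values of alpha and reports the sum of squared weights for each. The make_moons data, the alpha values and the network size are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Toy two-class data (assumption, not from this notebook)
X_m, y_m = make_moons(n_samples=200, noise=0.3, random_state=0)

weight_norms = {}
for alpha in [1e-5, 1e-1, 10.0]:
    model = MLPClassifier(solver='lbfgs', alpha=alpha, hidden_layer_sizes=(10,),
                          max_iter=1000, random_state=1)
    model.fit(X_m, y_m)
    # Total squared magnitude of all weight matrices (biases excluded)
    weight_norms[alpha] = sum(np.sum(w ** 2) for w in model.coefs_)

for alpha, norm in weight_norms.items():
    print(f"alpha={alpha:g}  sum of squared weights={norm:.2f}")
```

Larger values of alpha would be expected to pull the weights toward zero, though the exact numbers depend on the data and the random seed.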

**Optimization**

Training uses SGD, Adam or L-BFGS, selected with the **solver** parameter.
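
A small sketch of switching between the three solvers on the same data. The synthetic dataset and the settings are assumptions for illustration, and the comparison is not a benchmark.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic data (assumption): 300 samples, 10 features, 2 classes
X_s, y_s = make_classification(n_samples=300, n_features=10, random_state=0)

solver_scores = {}
for solver in ['lbfgs', 'sgd', 'adam']:
    model = MLPClassifier(solver=solver, hidden_layer_sizes=(10,),
                          max_iter=500, random_state=1)
    model.fit(X_s, y_s)
    solver_scores[solver] = model.score(X_s, y_s)   # training accuracy
    print(solver, model.n_iter_, round(solver_scores[solver], 3))
```

scikit-learn may emit a ConvergenceWarning if a solver reaches max_iter before converging; raising max_iter is the usual response.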

**Scaling**

* Scale the data, because an MLP is sensitive to feature scaling: standardise each feature to have mean 0 and variance 1, or rescale each attribute to the range [0, 1] or [-1, 1]. The **StandardScaler** can do the standardisation. Apply the same scaling to the test set.

* Tune the regularization parameter alpha. Use **GridSearchCV** to find a good value, usually in the range 10.0 ** -np.arange(1, 7).

* L-BFGS converges quickly and gives better solutions on small datasets. For larger datasets use Adam. SGD with momentum or Nesterov's momentum can perform better than both if the learning rate is correctly tuned.
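
The scaling and alpha-tuning tips can be combined in one sketch: a pipeline that standardises the features and then searches over alpha with GridSearchCV. The breast cancer dataset, the 3-fold split and the network size are assumptions picked for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (assumption): 569 samples, 30 features, 2 classes
X_bc, y_bc = load_breast_cancer(return_X_y=True)

# Scaling lives inside the pipeline, so each CV fold is scaled
# using statistics from its own training portion only
pipe = make_pipeline(
    StandardScaler(),
    MLPClassifier(solver='lbfgs', hidden_layer_sizes=(10,),
                  max_iter=500, random_state=1),
)

# make_pipeline names each step after its lowercased class name
param_grid = {'mlpclassifier__alpha': 10.0 ** -np.arange(1, 7)}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(X_bc, y_bc)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Keeping the scaler in the pipeline, rather than scaling the whole dataset up front, avoids leaking information from the validation folds into the scaling statistics.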



In [19]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler

StandardScaler()

# Basic End-to-End scikit-learn workflow

In [20]:
import pandas as pd
import numpy as np

# Import dataset and save to a dataframe
#data_df = pd.read_csv()

# Group data into features and labels
#X = data_df.drop("target",axis=1)
#y = data_df["target"]

In [21]:
# Split data into training and test sets
#from sklearn.model_selection import train_test_split
#X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1, shuffle=True)

Figure out which model to use

https://scikit-learn.org/stable/tutorial/machine_learning_map/index.html

or 

https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier

https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPRegressor.html#sklearn.neural_network.MLPRegressor

In [23]:
# Instantiate an instance of the chosen model
#clf = ...

# Fit model to data
#clf.fit(X_train, y_train)

In [24]:
# Evaluate predictions
clf.score(X_test, y_test)

1.0
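
For reference, here is one way the commented-out steps above might look when filled in. This is a sketch under stated assumptions: the bundled wine dataset stands in for the unspecified CSV, the model choice and hyperparameters are illustrative, and the names X_w, y_w and model are hypothetical.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (assumption): the notebook's own CSV is not specified
data_df = load_wine(as_frame=True).frame

# Group data into features and labels
X_w = data_df.drop("target", axis=1)
y_w = data_df["target"]

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X_w, y_w, test_size=0.2, random_state=1, shuffle=True)

# Instantiate the chosen model, with scaling as a first step
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(10,),
                  max_iter=1000, random_state=1),
)

# Fit model to the training data
model.fit(X_train, y_train)

# Evaluate predictions: mean accuracy on the held-out test set
test_accuracy = model.score(X_test, y_test)
print(test_accuracy)
```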