---------------------------------
#### Single layer perceptron - using sklearn
-----------------------

- `sklearn.linear_model.Perceptron`

The Perceptron is another simple classification algorithm suitable for large scale learning. By default:

- It does not require a learning rate.
- It is not regularized (penalized).
- It updates its model only on mistakes.

The last characteristic implies that the Perceptron is slightly faster to train than SGD with the hinge loss, and that the resulting models are sparser.
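The "updates only on mistakes" behaviour can be illustrated with a minimal NumPy sketch of the classic binary perceptron rule. This is not scikit-learn's implementation; the function name `perceptron_fit`, the toy data and the -1/+1 label encoding are assumptions made for illustration.

```python
import numpy as np

def perceptron_fit(X, y, eta=1.0, epochs=10):
    """Hypothetical helper: binary perceptron, labels must be -1 or +1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            # The model changes only when a sample is misclassified
            # (or lies exactly on the decision boundary).
            if yi * (xi @ w + b) <= 0:
                w += eta * yi * xi
                b += eta * yi
                mistakes += 1
        if mistakes == 0:  # a full pass without mistakes: nothing left to update
            break
    return w, b

# Toy, linearly separable data (made up for this example)
X_toy = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y_toy = np.array([1, 1, -1, -1])

w, b = perceptron_fit(X_toy, y_toy)
pred = np.where(X_toy @ w + b > 0, 1, -1)
print(w, b, pred)
```

Correctly classified samples cost only a dot product and a comparison, which is why a mistake-driven update is cheap.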

#### Parameters
- `penalty` {`l2`, `l1`, `elasticnet`}, `default=None` - The penalty (aka `regularization` term) to be used.

- `alpha` float, default=0.0001 - Constant that multiplies the regularization term `if regularization` is used.

- `l1_ratio` float, default=0.15 - The `Elastic Net` mixing parameter, with 0 <= l1_ratio <= 1.
    - `l1_ratio=0` corresponds to the L2 penalty,
    - `l1_ratio=1` to the L1 penalty.
    - Only used if `penalty='elasticnet'`.

- `fit_intercept` bool, `default=True` - Whether the intercept should be estimated or not. `If False`, the data is assumed to be already centered.

- `max_iter` int, default=1000 - The maximum number of passes over the training data (aka epochs). 

- `tol` float, default=1e-3 - The `stopping criterion`. 
    - If it is not `None`, the iterations will stop when (`loss > previous_loss - tol`).

- `shuffle` bool, default=True - Whether or not the training data should be shuffled after each epoch.

- `verbose` int, default=0 - The verbosity level

- `eta0` float, `default=1` - Constant by which the updates are multiplied.

- `early_stopping` bool, `default=False` - Whether to use early stopping to terminate training when the validation score is not improving.
    - If set to True, it will automatically set aside a `stratified fraction` of training data as validation and terminate training when validation score is not improving by at least tol for `n_iter_no_change` consecutive epochs.

- `validation_fraction` float, default=0.1 - The proportion of training data to set aside as validation set for `early stopping`. Must be between 0 and 1. 
    - `Only used if early_stopping is True`.

- `n_iter_no_change` int, default=5 - Number of iterations with no improvement to wait before early stopping.

- `class_weight` dict, {class_label: weight} or “balanced”, `default=None` - Preset for the class_weight fit parameter. Weights associated with classes. 
    - If not given, all classes are supposed to have weight one.
    - The “balanced” mode uses the values of y to automatically adjust weights inversely proportional to class frequencies in the input data as n_samples / (n_classes * np.bincount(y))

- `warm_start` bool, `default=False` - When set to True, reuse the solution of the previous call to fit as initialization; otherwise, just erase the previous solution.
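The "balanced" formula for `class_weight` can be checked by hand. The sketch below uses a made-up, imbalanced label vector (`y_demo` is an assumption) and compares the manual computation with scikit-learn's `compute_class_weight` utility.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced labels: four samples of class 0, two each of classes 1 and 2
y_demo = np.array([0, 0, 0, 0, 1, 1, 2, 2])
classes = np.unique(y_demo)

# n_samples / (n_classes * np.bincount(y))
manual = len(y_demo) / (len(classes) * np.bincount(y_demo))

# The same weights computed by scikit-learn
auto = compute_class_weight(class_weight="balanced", classes=classes, y=y_demo)

print(manual)
print(auto)
```

The rarer classes receive the larger weights, so their mistakes count for more during training.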

In [1]:
# Load required libraries
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

from sklearn.linear_model import Perceptron

from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
import numpy as np

In [2]:
# Load The Iris Data
iris = datasets.load_iris()

# Create our X and y data
X = iris.data
y = iris.target

In [3]:
X.shape

(150, 4)

In [4]:
# Split the data into 70% training data and 30% test data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

In [5]:
X_train.shape, X_test.shape

((105, 4), (45, 4))
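The split above is random, so the accuracies reported later can change from run to run. A reproducible, class-balanced variant might look like the sketch below; `random_state=42`, `stratify=y` and the variable names are assumptions for illustration, not settings used in this notebook.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris_demo = datasets.load_iris()

# random_state fixes the shuffle; stratify keeps the class proportions in both parts
X_tr, X_te, y_tr, y_te = train_test_split(
    iris_demo.data, iris_demo.target,
    test_size=0.3, random_state=42, stratify=iris_demo.target,
)

print(X_tr.shape, X_te.shape)
print(np.bincount(y_te))  # number of test samples per class
```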

In [7]:
# Fit the scaler on the training data; it standardizes each feature to mean=0 and unit variance
sc = StandardScaler()

sc.fit(X_train)

StandardScaler()

In [8]:
sc.mean_,sc.var_

(array([5.78666667, 3.04761905, 3.72380952, 1.2       ]),
 array([0.62839365, 0.19239909, 2.97000454, 0.56285714]))

In [9]:
# Apply the scaler to the X training data
X_train_std = sc.transform(X_train)

# Apply the SAME scaler to the X test data
X_test_std = sc.transform(X_test)
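What `transform` does with the stored `mean_` and `var_` can be verified by hand. The sketch below fits a scaler on the full Iris feature matrix (an assumption, chosen to keep the example self-contained) and reproduces the transformation with NumPy.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

X_demo = load_iris().data
scaler = StandardScaler().fit(X_demo)

# Standardization: subtract the per-feature mean, divide by the per-feature standard deviation
manual = (X_demo - scaler.mean_) / np.sqrt(scaler.var_)
scaled = scaler.transform(X_demo)

print(np.allclose(manual, scaled))
print(scaled.mean(axis=0).round(6), scaled.std(axis=0).round(6))
```

Reusing the training statistics on the test data, as done above, avoids leaking information from the test set into the preprocessing step.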

#### Create a perceptron object with the parameters:
- at most 90 iterations (epochs) over the data,
- a learning rate (`eta0`) of 0.001,
- and a stopping tolerance (`tol`) of 0.1.

- `Fit` the linear model with `Stochastic Gradient Descent`.

In [10]:
ppn = Perceptron(max_iter = 90, 
                 eta0     = .001,  # learning rate
                 tol      = 0.1,   # stopping tolerance (default is 0.001)
                 random_state = 0)

# Train the perceptron
ppn.fit(X_train_std, y_train)

Perceptron(eta0=0.001, max_iter=90, tol=0.1)
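The scikit-learn documentation describes `Perceptron` as sharing its implementation with `SGDClassifier`, configured with `loss="perceptron"`, a constant learning rate and no penalty. The sketch below checks this on standardized Iris data; fitting on the full dataset and the `*_demo` names are assumptions made to keep the example short.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron, SGDClassifier
from sklearn.preprocessing import StandardScaler

X_demo = StandardScaler().fit_transform(load_iris().data)
y_demo = load_iris().target

ppn_demo = Perceptron(max_iter=90, eta0=0.001, tol=0.1, random_state=0)

# Same settings expressed through the general SGD classifier
sgd_demo = SGDClassifier(loss="perceptron", learning_rate="constant", penalty=None,
                         max_iter=90, eta0=0.001, tol=0.1, random_state=0)

ppn_demo.fit(X_demo, y_demo)
sgd_demo.fit(X_demo, y_demo)

same_weights = np.allclose(ppn_demo.coef_, sgd_demo.coef_)
same_bias = np.allclose(ppn_demo.intercept_, sgd_demo.intercept_)
print(same_weights, same_bias)
```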

In [11]:
print('Classes   : \n', ppn.classes_)
print('Weights   : \n', ppn.coef_)
print('Intercept : \n', ppn.intercept_)
print('nbr Iter  : \n', ppn.n_iter_)

Classes   : 
 [0 1 2]
Weights   : 
 [[-1.68198780e-05  2.38837148e-03 -1.43683035e-03 -1.59949231e-03]
 [-1.65675799e-03 -2.85518954e-03  3.20523693e-03 -1.86607436e-03]
 [-4.79366524e-04 -1.40045418e-03  3.50641868e-03  3.33227564e-03]]
Intercept : 
 [-0.001 -0.001 -0.006]
nbr Iter  : 
 6
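With three classes, `coef_` holds one weight row and `intercept_` one bias per class (one-vs-rest). As a sketch of how these attributes produce a prediction, the code below recomputes the class scores by hand and picks the highest-scoring class; the data preparation and the name `clf` are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
from sklearn.preprocessing import StandardScaler

X_demo = StandardScaler().fit_transform(load_iris().data)
y_demo = load_iris().target

clf = Perceptron(random_state=0).fit(X_demo, y_demo)

# One linear score per class: X @ W^T + b, shape (n_samples, n_classes)
scores = X_demo @ clf.coef_.T + clf.intercept_

# The predicted label is the class with the largest score
manual_pred = clf.classes_[np.argmax(scores, axis=1)]

print(clf.coef_.shape, clf.intercept_.shape)
print(np.array_equal(manual_pred, clf.predict(X_demo)))
```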


In [12]:
# Apply the trained perceptron to the test features to predict the test labels
y_pred = ppn.predict(X_test_std)

# View the accuracy of the model, which is: 1 - (observations predicted wrong / total observations)
print('Accuracy: %.2f' % accuracy_score(y_test, y_pred))

Accuracy: 0.87
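The accuracy formula in the comment above can be worked through on a small example. The label vectors below are made up for illustration and are not taken from the Iris run.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical true and predicted labels: 2 of the 6 predictions are wrong
y_true_demo = np.array([0, 1, 2, 2, 1, 0])
y_pred_demo = np.array([0, 2, 2, 2, 1, 1])

n_wrong = np.sum(y_true_demo != y_pred_demo)
manual_acc = 1 - n_wrong / len(y_true_demo)
sk_acc = accuracy_score(y_true_demo, y_pred_demo)

print('Accuracy: %.2f' % manual_acc)
print('Accuracy: %.2f' % sk_acc)
```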


#### Another configuration

In [13]:
ppn = Perceptron(max_iter = 2500, 
                 tol      = 0.001, 
                 eta0     = .01, 
                 random_state=0)

# Train the perceptron
ppn.fit(X_train_std, y_train)

# Apply the trained perceptron to the test features to predict the test labels
y_pred = ppn.predict(X_test_std)

# View the accuracy of the model, which is: 1 - (observations predicted wrong / total observations)
print('Accuracy: %.2f' % accuracy_score(y_test, y_pred))

Accuracy: 0.87


In [14]:
print('Classes   : \n', ppn.classes_)
print('Weights   : \n', ppn.coef_)
print('Intercept : \n', ppn.intercept_)
print('Iterations: \n', ppn.n_iter_)

Classes   : 
 [0 1 2]
Weights   : 
 [[-0.0001682   0.02388371 -0.0143683  -0.01599492]
 [-0.01656758 -0.0285519   0.03205237 -0.01866074]
 [-0.00479367 -0.01400454  0.03506419  0.03332276]]
Intercept : 
 [-0.01 -0.01 -0.06]
Iterations: 
 6
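The weights and intercepts printed here are ten times those of the first configuration, matching the tenfold increase in `eta0`, while the accuracy is unchanged. A plausible explanation is that, without regularization and starting from zero weights, `eta0` only rescales the solution without moving the decision boundary. The sketch below tests this under assumed settings: `tol=None` so that both runs perform the same number of epochs, and power-of-two learning rates so that the scaling is exact in floating point.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
from sklearn.preprocessing import StandardScaler

X_demo = StandardScaler().fit_transform(load_iris().data)
y_demo = load_iris().target

def fit_with(eta0):
    # Hypothetical helper: fixed number of epochs, no loss-based stopping
    return Perceptron(eta0=eta0, max_iter=20, tol=None, random_state=0).fit(X_demo, y_demo)

small = fit_with(0.125)
large = fit_with(1.0)

# Weights and intercepts scale with eta0 (here by a factor of 8) ...
ratio_ok = (np.allclose(large.coef_, 8 * small.coef_)
            and np.allclose(large.intercept_, 8 * small.intercept_))

# ... but the predictions stay the same
same_pred = np.array_equal(small.predict(X_demo), large.predict(X_demo))
print(ratio_ok, same_pred)
```

When `tol` is set, the stopping rule compares loss values that also scale with `eta0`, so the number of epochs can differ between learning rates in that case.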


#### Use early stopping

In [15]:
ppn = Perceptron(max_iter    = 2500, 
                 tol         = 0.001, 
                 eta0        = .001, 
                 random_state= 0,
                 early_stopping     = True,
                 validation_fraction= 0.1,
                 n_iter_no_change   = 5,
                )

# Train the perceptron
ppn.fit(X_train_std, y_train)

# Apply the trained perceptron to the test features to predict the test labels
y_pred = ppn.predict(X_test_std)

# View the accuracy of the model, which is: 1 - (observations predicted wrong / total observations)
print('Accuracy: %.2f' % accuracy_score(y_test, y_pred))

Accuracy: 0.84


In [16]:
ppn.max_iter, ppn.n_iter_

(2500, 6)

In [17]:
ppn.coef_

array([[-0.00181655,  0.0026055 , -0.00274379, -0.00226595],
       [-0.00115216, -0.00217125,  0.00343734, -0.0014662 ],
       [-0.00123626, -0.00094449,  0.00373852,  0.00319898]])