## A "Hello World" Example of Machine Learning

We start by loading the Iris dataset from scikit-learn.

In the feature matrix `iris.data`, the first column is the sepal length, the second the sepal width, the third the petal length, and the fourth the petal width of the flower samples. The classes (species) are already encoded as integer labels, where 0=Iris-Setosa, 1=Iris-Versicolor, 2=Iris-Virginica.

Here, we use only two features: petal length and petal width (the third and fourth columns).

In [None]:
from sklearn import datasets
import numpy as np

iris = datasets.load_iris()
type(iris)

In [None]:
iris.data

In [None]:
X = iris.data[:, [2, 3]]

In [None]:
X

In [None]:
y = iris.target
type(y)

In [None]:
y

In [None]:
print('Class labels:', np.unique(y))
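
To make the column and label encoding described above concrete, the following sketch prints the names that scikit-learn stores with the dataset. It relies on the `feature_names` and `target_names` attributes of the object returned by `load_iris()`; the variables `petal_columns` and `label_to_name` are illustrative names, not part of the original notebook.

```python
from sklearn import datasets

iris = datasets.load_iris()

# Column names of iris.data, in column order
feature_names = list(iris.feature_names)
print(feature_names)

# Columns 2 and 3 are the two petal measurements used in this example
petal_columns = [feature_names[i] for i in (2, 3)]
print(petal_columns)

# Map each integer class label to its species name
label_to_name = {int(i): str(name) for i, name in enumerate(iris.target_names)}
print(label_to_name)
```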

Splitting data into 70% training and 30% test data:

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

In [None]:
print('Labels counts in y:', np.bincount(y))
print('Labels counts in y_train:', np.bincount(y_train))
print('Labels counts in y_test:', np.bincount(y_test))
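
The effect of `stratify=y` can be seen by comparing it with a plain random split. This is a minimal sketch that reuses the same `test_size` and `random_state`; the `_strat` and `_plain` variable names are hypothetical and only serve the comparison.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target

# Stratified split: every class keeps its one-third share in both subsets
_, _, y_train_strat, y_test_strat = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

# Plain random split: the class shares are left to chance
_, _, y_train_plain, y_test_plain = train_test_split(
    X, y, test_size=0.3, random_state=1)

print('Stratified test counts:  ', np.bincount(y_test_strat))
print('Unstratified test counts:', np.bincount(y_test_plain, minlength=3))
```

With stratification, the 45 test samples contain exactly 15 flowers of each species; without it, the counts depend on the random shuffle.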

Standardizing the features:

In [None]:
from sklearn.preprocessing import StandardScaler

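# Center each feature at zero mean and scale it to unit standard deviation
sc = StandardScaler()
# Estimate the mean and standard deviation from the training data only
sc.fit(X_train)
# Apply the same training statistics to both subsets
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)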
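
What the scaler computes can be sketched by hand: each feature value x becomes (x - mean) / std, with the mean and standard deviation taken from the training set. The snippet below is a self-contained check under that assumption and compares the manual result with `StandardScaler`; `mu`, `sigma`, and the `_manual` arrays are illustrative names.

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

sc = StandardScaler().fit(X_train)

# Per-feature mean and population standard deviation of the training set
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)

# Manual standardization; the test set reuses the training statistics
X_train_manual = (X_train - mu) / sigma
X_test_manual = (X_test - mu) / sigma

print(np.allclose(X_train_manual, sc.transform(X_train)))
print(np.allclose(X_test_manual, sc.transform(X_test)))
```

Reusing the training statistics for the test set keeps the two subsets on the same scale and avoids leaking information from the test data into preprocessing.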

Training a perceptron model on the standardized training data:

In [None]:
from sklearn.linear_model import Perceptron

ppn = Perceptron(max_iter=40, eta0=0.1, random_state=1)
ppn.fit(X_train_std, y_train)
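
To get a feel for what the fitted model contains, the sketch below inspects its learned parameters. It assumes the usual behavior of scikit-learn's linear classifiers on a three-class problem: one weight vector and one bias per class, with the prediction being the class whose linear score is largest. The names `scores` and `manual_pred` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

ppn = Perceptron(max_iter=40, eta0=0.1, random_state=1)
ppn.fit(X_train_std, y_train)

# One weight vector and one bias per class
print(ppn.coef_.shape, ppn.intercept_.shape)

# Linear score of every test sample for every class
scores = X_test_std @ ppn.coef_.T + ppn.intercept_

# The predicted label is the class with the largest score
manual_pred = ppn.classes_[np.argmax(scores, axis=1)]
print(np.array_equal(manual_pred, ppn.predict(X_test_std)))
```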

Predicting the test set labels and counting the misclassified samples:

In [None]:
y_pred = ppn.predict(X_test_std)
print('Misclassified samples: ' + str((y_test != y_pred).sum()))

Computing the classification accuracy on the test set:

In [None]:
from sklearn.metrics import accuracy_score

print('Accuracy: ' + str(accuracy_score(y_test, y_pred)))
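
The misclassification count and the accuracy are two views of the same number: accuracy is 1 minus the error count divided by the test set size. The following self-contained sketch checks this against `accuracy_score` and the classifier's own `score` method; `n_errors` and `manual_accuracy` are illustrative names.

```python
from sklearn import datasets
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X, y = iris.data[:, [2, 3]], iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1, stratify=y)

sc = StandardScaler().fit(X_train)
X_train_std = sc.transform(X_train)
X_test_std = sc.transform(X_test)

ppn = Perceptron(max_iter=40, eta0=0.1, random_state=1)
ppn.fit(X_train_std, y_train)
y_pred = ppn.predict(X_test_std)

# Accuracy derived from the number of misclassified test samples
n_errors = int((y_test != y_pred).sum())
manual_accuracy = 1 - n_errors / len(y_test)

print('Manual accuracy:', manual_accuracy)
print('accuracy_score: ', accuracy_score(y_test, y_pred))
print('ppn.score:      ', ppn.score(X_test_std, y_test))
```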

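Predicting the class labels of new samples. The perceptron was trained on standardized features, so new measurements must pass through the same scaler first:

In [None]:
# Assuming X_new holds raw [petal length, petal width] values in cm,
# apply the scaler fitted on the training data before predicting
X_new = [[1.1, 0.2], [0.4, 1.9]]
y_new = ppn.predict(sc.transform(X_new))
y_new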