# Logistic regression

We can use logistic regression for supervised machine learning on the Iris data set.

Logistic regression is a classification method. It gets its name from the logistic function, which it uses to estimate the probability that a data point belongs to each of a fixed set of classes.
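
To make the role of the logistic function concrete, here is a minimal sketch of it in NumPy. The helper name `logistic` and the sample inputs are illustrative assumptions, not part of scikit-learn; the point is only that any real-valued score is squashed into the interval (0, 1), so it can be read as a probability in the two-class case.

```python
import numpy as np


def logistic(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Hypothetical scores: strongly negative, neutral, strongly positive
z = np.array([-5.0, 0.0, 5.0])
probs = logistic(z)

# A score of 0 maps to 0.5; large negative or positive scores
# approach 0 or 1 respectively.
print(probs)
```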

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn import datasets

In [None]:
# Load the Iris data set
iris = datasets.load_iris()

In [None]:
iris

In [None]:
iris.data

In [None]:
iris.feature_names

In [None]:
iris.target

In [None]:
iris.target_names

In [None]:
X = iris.data[:, :2]  # keep only the first two features (sepal length and sepal width)
Y = iris.target

Here we instantiate the model object.

In [None]:
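# C is the inverse of the regularization strength; a large value such as
# C=1e5 would leave the model almost unregularized.
# logreg = LogisticRegression(C=1e5)
logreg = LogisticRegression()  # default settings (C=1.0)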

Again, we call the object's fit method to train the model, that is, to find the parameter values that best fit the training data.

In this case, we pass in both the features and the target values, because this is a supervised algorithm.

In [None]:
# Fit the classifier to the features X and the class labels Y.
logreg.fit(X, Y)
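
As a rough illustration of what fitting produces, the sketch below repeats the steps above in one self-contained cell and then inspects the learned parameters and the class probabilities for a single flower. The measurements in `sample` are made-up values chosen for illustration, not a row from the data set.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris.data[:, :2]  # sepal length and sepal width only
Y = iris.target

logreg = LogisticRegression()
logreg.fit(X, Y)

# The fitted parameters: one row of coefficients per class,
# one column per feature, plus one intercept per class.
print(logreg.coef_.shape)       # (3, 2)
print(logreg.intercept_.shape)  # (3,)

# Hypothetical flower: sepal length 5.0 cm, sepal width 3.5 cm
sample = [[5.0, 3.5]]

# predict_proba returns one probability per class; each row sums to 1.
proba = logreg.predict_proba(sample)
print(proba)

# predict returns the class with the highest probability.
print(iris.target_names[logreg.predict(sample)])
```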

Now we do a bit of coding gymnastics so that we can plot the data together with the decision boundaries that the model has found.

In [None]:
# Build a mesh covering [x_min, x_max] x [y_min, y_max]. To plot the
# decision boundary, we will later assign a color to each point in this mesh.

x_min = X[:, 0].min() - 1
x_max = X[:, 0].max() + 1
y_min = X[:, 1].min() - 1
y_max = X[:, 1].max() + 1

h = .02  # step size in the mesh

xarr = np.arange(x_min, x_max, h)
yarr = np.arange(y_min, y_max, h)

xx, yy = np.meshgrid(xarr, yarr)

We use our model to predict the category at every point on the meshgrid.

In [None]:
Z = logreg.predict(np.c_[xx.ravel(), yy.ravel()])
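
The `np.c_[xx.ravel(), yy.ravel()]` expression can look opaque, so here is a small sketch of what it does on a tiny grid. The arrays `xarr` and `yarr` below hold toy values chosen for readability, not the Iris ranges: `meshgrid` builds two 2-D coordinate arrays, `ravel` flattens each one, and `np.c_` pairs them up column-wise into one (x, y) row per grid point, which is the shape `predict` expects.

```python
import numpy as np

# Toy axes: 3 x-values and 2 y-values
xarr = np.array([0.0, 1.0, 2.0])
yarr = np.array([10.0, 20.0])

# xx and yy both have shape (len(yarr), len(xarr))
xx, yy = np.meshgrid(xarr, yarr)
print(xx.shape)  # (2, 3)

# Flatten both grids and stack them as columns: one (x, y) pair per row
points = np.c_[xx.ravel(), yy.ravel()]
print(points.shape)  # (6, 2)
print(points)

# Any per-point result can be reshaped back onto the grid afterwards,
# which is what Z.reshape(xx.shape) does before plotting.
restored = points[:, 0].reshape(xx.shape)
```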

In [None]:
# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.figure(1, figsize=(8, 6))
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)

# Also plot the training points
plt.scatter(X[:, 0], X[:, 1], c=Y, edgecolors='k', cmap=plt.cm.Paired)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')

plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.xticks(())  # hide the tick marks on both axes
plt.yticks(())

plt.show()