# Logistic Regression

Let’s use the iris dataset, containing the sepal and petal length and width of 150 iris flowers of three different species: Iris-Setosa, Iris-Versicolor, and Iris-Virginica.

**Let’s try to build a classifier to detect the Iris-Virginica type based only on the petal width feature.**

In [None]:
import numpy as np
import matplotlib.pyplot as plt


In [None]:
from sklearn import datasets

# Load the dataset
iris = datasets.load_iris()
X = iris["data"][:, 3:]  # petal width
# np.int was removed from recent NumPy releases; the built-in int is equivalent here
y = (iris["target"] == 2).astype(int)  # 1 if Iris-Virginica, else 0

In [None]:
from sklearn.linear_model import LogisticRegression

# Train a logistic regression model
log_reg = LogisticRegression()
log_reg.fit(X, y)

Let’s look at the model’s estimated probabilities for flowers with petal widths varying from 0 to 3 cm:

In [None]:
X_new = np.linspace(0, 3, 1000).reshape(-1, 1)
y_proba = log_reg.predict_proba(X_new)
plt.plot(X_new, y_proba[:, 1], "g-", label="Iris-Virginica")
plt.plot(X_new, y_proba[:, 0], "b--", label="Not Iris-Virginica")
plt.legend()
plt.xlabel("Petal width (cm)")
plt.ylabel("Probability")
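
The two curves cross where the model assigns a probability of 0.5 to each class; that crossing point is the decision boundary. As a minimal sketch (it refits the same single-feature model so that it runs on its own, and the name `decision_boundary` is introduced here for illustration), the boundary can be read off the fitted coefficient and intercept:

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris["data"][:, 3:]                 # petal width
y = (iris["target"] == 2).astype(int)   # 1 if Iris-Virginica, else 0
log_reg = LogisticRegression().fit(X, y)

# The probability equals 0.5 exactly where the linear score w * x + b is zero
w = log_reg.coef_[0, 0]
b = log_reg.intercept_[0]
decision_boundary = -b / w

print("Decision boundary at petal width = {:.2f} cm".format(decision_boundary))
```

Flowers with a petal width above this value are predicted as Iris-Virginica, and flowers below it are predicted as not Iris-Virginica.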

In [None]:
# Predict the class and the class probabilities for a petal width of 1.7 cm
print("Class prediction = {}".format(log_reg.predict([[1.7]])))
print("Probability prediction for all classes = {}".format(log_reg.predict_proba([[1.7]])))
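
The probability reported for the positive class is the logistic (sigmoid) function applied to the model's linear score. The sketch below checks this by hand for the same query; the helper `sigmoid` and the variable names are illustrative and not part of Scikit-Learn, and the model is refitted so the block runs on its own.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris["data"][:, 3:]                 # petal width
y = (iris["target"] == 2).astype(int)   # 1 if Iris-Virginica, else 0
log_reg = LogisticRegression().fit(X, y)

def sigmoid(t):
    """Logistic function: maps any real score to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-t))

x_query = np.array([[1.7]])
score = log_reg.decision_function(x_query)            # linear score w * x + b
manual_proba = sigmoid(score)                         # probability computed by hand
library_proba = log_reg.predict_proba(x_query)[:, 1]  # probability from Scikit-Learn
manual_class = (manual_proba >= 0.5).astype(int)      # threshold at 0.5

print(manual_proba, library_proba, manual_class)
```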

The hyperparameter controlling the regularization strength of a Scikit-Learn LogisticRegression model is not $\alpha$ (as in other linear models), but its inverse: **C**.

The higher the value of `C`, the less the model is regularized.
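
One way to see this effect is to fit the same model with several values of `C` and compare the size of the learned coefficient. The values below are arbitrary choices for illustration, and `coef_by_C` is a name introduced only for this sketch.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris["data"][:, 3:]                 # petal width
y = (iris["target"] == 2).astype(int)   # 1 if Iris-Virginica, else 0

# C values chosen only for illustration
coef_by_C = {}
for C in (0.01, 1.0, 100.0):
    model = LogisticRegression(C=C).fit(X, y)
    coef_by_C[C] = abs(model.coef_[0, 0])
    print("C = {:>6}: |coefficient| = {:.3f}".format(C, coef_by_C[C]))
```

A small `C` (strong regularization) shrinks the coefficient toward zero, which flattens the probability curve; a large `C` lets the coefficient grow and makes the transition between the two classes sharper.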

# Softmax Regression

Let’s use Softmax Regression to classify the iris flowers into all three classes.

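In older releases, Scikit-Learn’s `LogisticRegression` used *one-versus-all* by default when trained on more than two classes, and you set the `multi_class` hyperparameter to "`multinomial`" to switch it to **Softmax Regression** instead. Since version 0.22 the default is "`auto`", which already selects the multinomial formulation for multiclass targets with the "`lbfgs`" solver, and more recent releases deprecate the `multi_class` hyperparameter.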

You must also specify a *solver* that supports Softmax Regression, such as the "`lbfgs`" solver. As in the binary case, the model applies $\ell_2$ regularization by default, which you can control using the hyperparameter `C`.
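
For reference, the softmax function turns one score per class into probabilities that sum to 1. The following is a minimal NumPy sketch of that function, not Scikit-Learn's internal implementation; the function name `softmax` and the example scores are made up for illustration.

```python
import numpy as np

def softmax(scores):
    """Convert a 2-D array of per-class scores (one row per sample) into probabilities."""
    # Subtracting the row maximum does not change the result but avoids overflow in exp
    shifted = scores - np.max(scores, axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

# Hypothetical scores for one sample and three classes
scores = np.array([[1.0, 2.0, 3.0]])
probs = softmax(scores)
print(probs)
```

The class with the highest score always receives the highest probability, so predicting the most probable class is the same as predicting the class with the largest score.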

In [None]:
X = iris["data"][:, (2, 3)] # petal length, petal width
y = iris["target"]

In [None]:
softmax_reg = LogisticRegression(multi_class="multinomial", solver="lbfgs", C=10)
softmax_reg.fit(X, y)

So the next time you find an iris with 5 cm long and 2 cm wide petals, you can ask your model to tell you what type of iris it is.

In [None]:
softmax_reg.predict([[5, 2]])

In [None]:
softmax_reg.predict_proba([[5, 2]])

The model answers Iris-Virginica (class 2) with 94.2% probability (or Iris-Versicolor with 5.7% probability).
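
The predicted class is simply the one with the highest estimated probability. The sketch below makes that link explicit and maps the class index back to a species name. It assumes Scikit-Learn 0.22 or later, where the "`lbfgs`" solver uses the multinomial formulation for multiclass targets without setting `multi_class`; the variable names are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression

iris = datasets.load_iris()
X = iris["data"][:, (2, 3)]  # petal length, petal width
y = iris["target"]

# Assumption: Scikit-Learn >= 0.22, so lbfgs fits a multinomial (softmax) model by default
softmax_reg = LogisticRegression(solver="lbfgs", C=10).fit(X, y)

proba = softmax_reg.predict_proba([[5, 2]])[0]             # one probability per class
predicted_class = softmax_reg.classes_[np.argmax(proba)]   # most probable class index
predicted_name = iris["target_names"][predicted_class]     # species name for that index

for name, p in zip(iris["target_names"], proba):
    print("{:<12} {:.3f}".format(name, p))
print("Predicted:", predicted_name)
```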