# Logistic Regression

Despite having “regression” in its name, logistic regression is actually a widely used binary classifier (i.e., the target vector can take only two values). In a logistic regression, a linear model (e.g., $z = \beta_0 + \beta_1x$) is passed through a logistic (also called sigmoid) function,
$$f(z)={\frac {1}{1+e^{-z}}}$$
such that
$$ P(y_i = 1|X)={\frac {1}{1+e^{-(\beta_0 + \beta_1x)}}}$$

where $P(y_i = 1|X)$ is the probability that the $i^{th}$ observation’s target value, $y_i$, is class 1, $X$ is the training data, $\beta_0$ and $\beta_1$ are the parameters to be learned, and $e$ is Euler’s number.

In other words, instead of predicting exactly 0 or 1, logistic regression generates a probability, a value strictly between 0 and 1. It predicts whether something is true or false rather than predicting a continuous quantity. Although the output is a probability, the model is used for classification: e.g., whether a mouse is obese or not. Obesity can be predicted from multiple features (weight, age, etc.). Logistic regression can work with continuous or discrete features.
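To make the formula concrete, here is a minimal sketch of the logistic function applied to a one-feature linear model. The parameter values `beta_0` and `beta_1`, the helper names, and the 0.5 cutoff are assumptions chosen for illustration only; they are not learned from any data in this notebook.

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps a real number z to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical parameters, picked by hand for illustration (not fitted)
beta_0, beta_1 = -4.0, 2.0

def predict_probability(x):
    """P(y = 1 | x) for the linear model beta_0 + beta_1 * x."""
    return sigmoid(beta_0 + beta_1 * x)

def predict_class(x, threshold=0.5):
    """Turn the probability into a class label using an assumed cutoff."""
    return 1 if predict_probability(x) > threshold else 0

# Probability and class label for a few inputs
for x in (0.0, 2.0, 4.0):
    print(x, predict_probability(x), predict_class(x))
```

The linear part can be any real number, but the sigmoid squeezes it into a probability, and the cutoff turns that probability into a class label.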

### Import Libraries

In [1]:
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

### Load Iris Flower Dataset

In [2]:
# Load data with only two classes
iris = datasets.load_iris()
X = iris.data[:100,:] # Values [[5.1 3.5 1.4 0.2] [4.9 3.  1.4 0.2] [4.7 3.2 1.3 0.2]...
y = iris.target[:100] # Target labels: 0 or 1

In [21]:
X.shape, y.shape

((100, 4), (100,))

### Standardize Features

In [3]:
scaler = StandardScaler()
X_std = scaler.fit_transform(X)
print(X_std[0:3])

[[-0.5810659   0.84183714 -1.01297765 -1.04211089]
 [-0.89430898 -0.2078351  -1.01297765 -1.04211089]
 [-1.20755205  0.21203379 -1.08231219 -1.04211089]]
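As a rough illustration of what standardization does, the sketch below rescales each column by hand: subtract the column mean, then divide by the column's population standard deviation (`ddof=0`, which is what `StandardScaler` uses). The small `X_demo` matrix is made up for this example and is not the iris data.

```python
import numpy as np

# Small made-up feature matrix (hypothetical values, not the iris data)
X_demo = np.array([[5.1, 3.5],
                   [4.9, 3.0],
                   [6.4, 3.2],
                   [7.0, 2.8]])

# Column-wise mean and population standard deviation (ddof=0 is NumPy's default)
mean = X_demo.mean(axis=0)
std = X_demo.std(axis=0)

# Standardize: each column ends up with mean 0 and standard deviation 1
X_demo_std = (X_demo - mean) / std

print(X_demo_std.mean(axis=0))  # approximately 0 for each column
print(X_demo_std.std(axis=0))   # approximately 1 for each column
```

After this step all features are on a comparable scale, which matters because scikit-learn's `LogisticRegression` applies regularization by default.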


### Create Logistic Regression

In [6]:
# Create logistic regression object
clf = LogisticRegression(random_state=0)

### Train Logistic Regression

In [7]:
# Train model
model = clf.fit(X_std, y)



### Create Previously Unseen Observation

In [8]:
# Create new observation
new_observation = [[.5, .5, .5, .5]]
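Because the model is trained on `X_std`, the values in `new_observation` are interpreted on the standardized scale. If a new observation arrives as raw measurements, it would first need to go through the same fitted scaler. The sketch below shows one way to do that; the `raw_observation` values are hypothetical and chosen only for illustration.

```python
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

# Same two-class subset and scaler as in the notebook
iris = datasets.load_iris()
X = iris.data[:100, :]

scaler = StandardScaler()
X_std = scaler.fit_transform(X)

# Hypothetical raw measurements in the dataset's original units (illustrative only)
raw_observation = [[5.0, 3.4, 1.5, 0.2]]

# Reuse the already-fitted scaler; do not refit it on the new data
raw_observation_std = scaler.transform(raw_observation)
print(raw_observation_std)
```

Calling `transform` rather than `fit_transform` applies the means and standard deviations learned from the training data to the new observation.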

### Predict Class Of Observation

In [9]:
# Predict class
model.predict(new_observation)

array([1])

### View Predicted Probabilities

In [10]:
# View predicted probabilities
model.predict_proba(new_observation)

array([[0.18944274, 0.81055726]])
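The two columns returned by `predict_proba` follow the order of `model.classes_`, so the second value is the probability of class 1. As a sanity check, the sketch below rebuilds that probability from the fitted `intercept_` and `coef_` with the sigmoid formula from the introduction. It refits the same model so the block is self-contained, and it assumes the default binary behaviour of `LogisticRegression`.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Rebuild the notebook's model
iris = datasets.load_iris()
X, y = iris.data[:100, :], iris.target[:100]
X_std = StandardScaler().fit_transform(X)
model = LogisticRegression(random_state=0).fit(X_std, y)

new_observation = np.array([[.5, .5, .5, .5]])

# Linear part: intercept plus the weighted sum of the features
z = model.intercept_ + new_observation @ model.coef_.T

# Sigmoid turns the linear score into P(y = 1)
p_class_1 = 1.0 / (1.0 + np.exp(-z))

# Column order matches model.classes_: [P(class 0), P(class 1)]
manual = np.hstack([1.0 - p_class_1, p_class_1])

print(model.classes_)
print(manual)
print(model.predict_proba(new_observation))
```

The predicted class from `model.predict` corresponds to the column with the larger probability.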