## 1 Training a Binary Classifier
### Problem
You need to train a simple classifier model.
### Solution
Train a logistic regression in scikit-learn using LogisticRegression:

In [1]:
# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
# Load data with only two classes
iris = datasets.load_iris()
features = iris.data[:100,:]
target = iris.target[:100]
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)

In [2]:
# Create logistic regression object
logistic_regression = LogisticRegression(random_state=0)
# Train model
model = logistic_regression.fit(features_standardized, target)

In [3]:
# Create new observation
new_observation = [[.5, .5, .5, .5]]

In [4]:
# Predict class
model.predict(new_observation)

array([1])

In [5]:
# View predicted probabilities
model.predict_proba(new_observation)

array([[0.17738424, 0.82261576]])

`Our observation had a 17.7% chance of being class 0 and an 82.3% chance of being class 1.`
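
As a small illustration of how the predicted class relates to these probabilities, the sketch below takes the probability row printed above and picks the most probable class. It uses plain Python rather than scikit-learn, and the variable names are only illustrative. For a binary logistic regression, this should match what `predict` returns.

```python
# Probabilities copied from the predict_proba output above (class 0, class 1)
probabilities = [0.17738424, 0.82261576]

# Both values describe the same observation, so they sum to 1
total = sum(probabilities)

# The predicted class is the index of the largest probability
predicted_class = max(range(len(probabilities)), key=lambda i: probabilities[i])

print(predicted_class)
```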

## 2 Training a Multiclass Classifier
### Problem
Given more than two classes, you need to train a classifier model.
### Solution
Train a logistic regression in scikit-learn with LogisticRegression using one-vs-rest
or multinomial methods:

In [6]:
# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)
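# Create one-vs-rest logistic regression object
# (multi_class is deprecated in scikit-learn 1.5 and later, where
# OneVsRestClassifier(LogisticRegression()) is the suggested replacement)
logistic_regression = LogisticRegression(random_state=0, multi_class="ovr")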
# Train model
model = logistic_regression.fit(features_standardized, target)

In [7]:
# Predict class
model.predict(new_observation)

array([2])

In [8]:
# View predicted probabilities
model.predict_proba(new_observation)

array([[0.0387617 , 0.40669108, 0.55454723]])
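
To make the difference between the two methods concrete, here is a minimal sketch in plain Python. The per-class scores are made-up numbers, not values taken from the model above, and the function names are hypothetical. It assumes the usual formulation: one-vs-rest passes each class score through a sigmoid and rescales the results to sum to 1, while the multinomial method applies a softmax across all scores at once.

```python
import math

# Hypothetical linear scores (one per class) for a single observation
scores = [-1.2, 0.3, 0.9]

def one_vs_rest_probabilities(scores):
    # Each class gets its own binary "this class vs. the rest" probability
    binary = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    # Rescale so the probabilities sum to 1
    total = sum(binary)
    return [p / total for p in binary]

def multinomial_probabilities(scores):
    # Softmax: exponentiate (shifted by the max for stability), then normalize
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

ovr = one_vs_rest_probabilities(scores)
multinomial = multinomial_probabilities(scores)
print(ovr)
print(multinomial)
```

Both approaches rank the classes the same way for these scores, but the probabilities differ because the normalization is different.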

## 3 Reducing Variance Through Regularization
### Problem
You need to reduce the variance of your logistic regression model.
### Solution
Tune the hyperparameter C, the inverse of the regularization strength:

In [9]:
# Load libraries
from sklearn.linear_model import LogisticRegressionCV
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)
# Create cross-validated logistic regression object
logistic_regression = LogisticRegressionCV(penalty='l2', Cs=10, random_state=0, n_jobs=-1)
# Train model
model = logistic_regression.fit(features_standardized, target)

In [10]:
# Predict class
model.predict(new_observation)

array([1])

In [11]:
# View predicted probabilities
model.predict_proba(new_observation)

array([[5.96244929e-04, 9.70140320e-01, 2.92634349e-02]])
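
For intuition about what `Cs=10` searches over: according to the scikit-learn documentation, an integer `Cs` produces that many candidate values on a logarithmic scale between 1e-4 and 1e4. Because C is the inverse of the regularization strength, smaller values mean a stronger penalty. The sketch below rebuilds such a grid in plain Python; the helper name is hypothetical and not part of scikit-learn.

```python
def candidate_c_values(n_values=10, low_exponent=-4.0, high_exponent=4.0):
    # Evenly spaced exponents; 10**exponent then gives log-spaced C values
    step = (high_exponent - low_exponent) / (n_values - 1)
    return [10 ** (low_exponent + i * step) for i in range(n_values)]

c_grid = candidate_c_values()
for c in c_grid:
    print(f"{c:.4g}")
```

After fitting, the value selected by cross-validation can be inspected through the model's `C_` attribute.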

## 4 Training a Classifier on Very Large Data
### Problem
You need to train a simple classifier model on a very large set of data.
### Solution
Train a logistic regression in scikit-learn with LogisticRegression using the
stochastic average gradient (SAG) solver:

In [12]:
# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)
# Create logistic regression object
logistic_regression = LogisticRegression(random_state=0, solver="sag")
# Train model
model = logistic_regression.fit(features_standardized, target)
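
To give a rough idea of what the SAG solver does, here is a simplified sketch in plain Python. It is not scikit-learn's implementation: it omits regularization, uses a fixed step size, and runs on a tiny made-up data set, and all function names are hypothetical. The core idea it shows is that each step recomputes the gradient for one randomly chosen observation only, while stepping along the average of the most recent gradients stored for all observations.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_sag(features, target, step_size=0.1, n_iterations=2000, seed=0):
    """Simplified SAG for logistic regression (weights plus a trailing bias)."""
    rng = random.Random(seed)
    n_samples = len(features)
    n_params = len(features[0]) + 1  # one weight per feature, plus the bias
    weights = [0.0] * n_params
    # Most recent gradient computed for every observation
    stored = [[0.0] * n_params for _ in range(n_samples)]
    # Running sum of all stored gradients
    gradient_sum = [0.0] * n_params

    for _ in range(n_iterations):
        i = rng.randrange(n_samples)
        row = list(features[i]) + [1.0]  # the constant 1.0 feeds the bias
        error = sigmoid(sum(w * x for w, x in zip(weights, row))) - target[i]
        new_gradient = [error * x for x in row]
        # Swap this observation's old gradient for the new one in the sum
        for j in range(n_params):
            gradient_sum[j] += new_gradient[j] - stored[i][j]
        stored[i] = new_gradient
        # Step along the average of the stored gradients
        for j in range(n_params):
            weights[j] -= step_size * gradient_sum[j] / n_samples
    return weights

def predict(weights, row):
    z = sum(w * x for w, x in zip(weights, list(row) + [1.0]))
    return 1 if sigmoid(z) >= 0.5 else 0

# Tiny made-up data set with one already centered feature
features = [[-2.0], [-1.5], [-1.0], [1.0], [1.5], [2.0]]
target = [0, 0, 0, 1, 1, 1]

weights = fit_sag(features, target)
predictions = [predict(weights, row) for row in features]
print(predictions)
```

Because each update touches a single observation, the cost per step does not grow with the size of the data set, which is what makes this family of solvers attractive for very large data.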

## 5 Handling Imbalanced Classes
### Problem
You need to train a simple classifier model on data with highly imbalanced classes.
### Solution
Train a logistic regression in scikit-learn using LogisticRegression with class_weight="balanced":

In [13]:
# Load libraries
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler
# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target
# Make class highly imbalanced by removing first 40 observations
features = features[40:,:]
target = target[40:]
# Create target vector indicating if class 0, otherwise 1
target = np.where((target == 0), 0, 1)
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)
# Create logistic regression object with balanced class weights
logistic_regression = LogisticRegression(random_state=0, class_weight="balanced")
# Train model
model = logistic_regression.fit(features_standardized, target)

`Like many other learning algorithms in scikit-learn, LogisticRegression comes with
a built-in method of handling imbalanced classes. If we have highly imbalanced
classes and have not addressed the imbalance during preprocessing, we can use
the class_weight parameter to weight the classes so that each class carries
comparable total weight during training. Specifically, the "balanced" argument
automatically weights classes in inverse proportion to their frequency.`
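
The inverse-frequency weighting can be written as w_j = n / (k * n_j), where n is the number of observations, k is the number of classes, and n_j is the number of observations in class j. This matches the formula given in the scikit-learn documentation for `class_weight="balanced"`. The sketch below computes these weights in plain Python for the same class balance as the recipe above; the helper name is hypothetical.

```python
from collections import Counter

def balanced_class_weights(target):
    # w_j = n / (k * n_j): n observations, k classes, n_j observations in class j
    counts = Counter(target)
    n_observations = len(target)
    n_classes = len(counts)
    return {label: n_observations / (n_classes * count)
            for label, count in sorted(counts.items())}

# Same class balance as the recipe: 10 observations of class 0, 100 of class 1
target = [0] * 10 + [1] * 100

weights = balanced_class_weights(target)
print(weights)
```

Here the rare class 0 receives a weight of 5.5 and the common class 1 a weight of 0.55, so each class contributes the same total weight (55) to the fit.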