# Training a Binary Classifier

You need to train a simple classifier model.
Train a logistic regression in scikit-learn using LogisticRegression.

In [1]:
# Load Libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

In [2]:
# Load data with only two classes
iris = datasets.load_iris()
features = iris.data[:100,:]
target = iris.target[:100]
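
As a quick sanity check, the slice above can be inspected to confirm that it really contains only two classes. This is a minimal sketch; the variable names `classes` and `counts` are illustrative and not part of the original notebook.

```python
import numpy as np
from sklearn import datasets

# Same slicing as the cell above: keep the first 100 rows of iris
iris = datasets.load_iris()
target = iris.target[:100]

# Count how many observations fall into each remaining class
classes, counts = np.unique(target, return_counts=True)
print(classes, counts)
```

The iris rows are ordered by class, 50 per class, which is why taking the first 100 rows leaves exactly classes 0 and 1.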

In [4]:
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)
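
To see what standardization does, the sketch below checks that each standardized column has mean close to 0 and standard deviation close to 1. The names `column_means` and `column_stds` are assumptions introduced for illustration.

```python
import numpy as np
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

# Rebuild the standardized features so the example is self-contained
features = datasets.load_iris().data[:100, :]
features_standardized = StandardScaler().fit_transform(features)

# Per-column mean and (population) standard deviation after scaling
column_means = features_standardized.mean(axis=0)
column_stds = features_standardized.std(axis=0)
print(column_means.round(6), column_stds.round(6))
```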

In [5]:
# create logistic regression object
logistic_regression = LogisticRegression(random_state=0)


In [6]:
# train model
model = logistic_regression.fit(features_standardized, target)

# Formula
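
A standard way to write the binary logistic regression model is given here. It is offered as an assumption about what this heading refers to, and the symbols ($\beta_0$ for the intercept, $\beta_1, \dots, \beta_k$ for the coefficients, $x_{ij}$ for feature $j$ of observation $i$) are conventional notation rather than taken from the notebook.

```latex
P(y_i = 1 \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik})}}
```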

Once the model is trained, we can use it to predict the class of a new observation.

In [7]:
# Create new observation
new_observation = [[.5,.5,.5,.5]]

In [8]:
# Predict class
model.predict(new_observation)

array([1])

In this example, our observation was predicted to be class 1. Additionally, we can see the probability that an observation is a member of each class.

In [9]:
# View predicted probabilities
model.predict_proba(new_observation)

array([[0.17738424, 0.82261576]])

Our observation had a 17.7% chance of being class 0 and an 82.3% chance of being class 1.
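
For a binary problem, the class 1 probability is the logistic (sigmoid) function applied to the model's linear score. The sketch below recomputes that probability by hand from `coef_` and `intercept_` and compares it with `predict_proba`. The names `score`, `manual_probability`, and `library_probability` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Rebuild the two-class model from the cells above
iris = datasets.load_iris()
features_standardized = StandardScaler().fit_transform(iris.data[:100, :])
target = iris.target[:100]
model = LogisticRegression(random_state=0).fit(features_standardized, target)

new_observation = np.array([[.5, .5, .5, .5]])

# Linear score: intercept plus the dot product of coefficients and features
score = new_observation @ model.coef_.T + model.intercept_

# Sigmoid of the score gives the probability of class 1
manual_probability = 1.0 / (1.0 + np.exp(-score.ravel()))
library_probability = model.predict_proba(new_observation)[:, 1]
print(manual_probability, library_probability)
```

The predicted class is simply whichever side of 0.5 this probability falls on.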

# Training a Multiclass Classifier

Given more than two classes, you need to train a classifier model.
Train a logistic regression in scikit-learn with LogisticRegression using one-vs-rest or multinomial methods.

In [22]:
# Load Libraries

from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

In [23]:
# Load data

iris = datasets.load_iris()
features = iris.data
target = iris.target

In [24]:
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)

In [25]:
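# Create one-vs-rest (OvR) logistic regression object
# Note: the multi_class parameter is deprecated in recent scikit-learn
# releases (1.5+), so this call may warn or fail on newer versions
logistic_regression = LogisticRegression(random_state=0, multi_class='ovr')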

In [26]:
# Train model 
model = logistic_regression.fit(features_standardized, target)

In [27]:
# Create new observation
new_observation = [[.5,.5,.5,.5]]

In [28]:
# Predict class
model.predict(new_observation)

array([2])

In [29]:
# View predicted probabilities
model.predict_proba(new_observation)

array([[0.0387617 , 0.40669108, 0.55454723]])

In [30]:
# create 'multinomial' logistic regression object

logistic_regression = LogisticRegression(random_state=0, multi_class='multinomial')

In [31]:
# Train model 
model = logistic_regression.fit(features_standardized, target)

In [32]:
# Create new observation
new_observation = [[.5,.5,.5,.5]]

In [33]:
# Predict class
model.predict(new_observation)

array([1])

In [34]:
# View predicted probabilities
model.predict_proba(new_observation)

array([[0.01982185, 0.74491886, 0.23525928]])
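
The two strategies can also be compared without the `multi_class` argument. The sketch below is one possible approach, assuming a reasonably recent scikit-learn (0.22 or later) in which `LogisticRegression` with the default solver fits a multinomial model on multiclass data. It wraps the estimator in `OneVsRestClassifier` for the one-vs-rest case. The names `ovr_model` and `multinomial_model` are illustrative, and exact probabilities may differ from the outputs shown above.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
features_standardized = StandardScaler().fit_transform(iris.data)
target = iris.target

# One-vs-rest: one binary logistic regression per class
ovr_model = OneVsRestClassifier(LogisticRegression(random_state=0))
ovr_model.fit(features_standardized, target)

# Multinomial (softmax): a single model over all three classes
multinomial_model = LogisticRegression(random_state=0)
multinomial_model.fit(features_standardized, target)

new_observation = [[.5, .5, .5, .5]]
ovr_probabilities = ovr_model.predict_proba(new_observation)
multinomial_probabilities = multinomial_model.predict_proba(new_observation)
print(ovr_probabilities)
print(multinomial_probabilities)
```

In both cases the probabilities in each row sum to 1, but they are produced differently: one-vs-rest normalizes three independent binary probabilities, while the multinomial model applies a softmax over the class scores.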

# Reducing Variance Through Regularization (for Logistic Regression)

You need to reduce the variance of your logistic regression model.

Tune the regularization strength hyperparameter, C, using cross-validation (CV).

In [35]:
# Load Libraries
from sklearn.linear_model import LogisticRegressionCV
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

In [36]:
# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target

In [37]:
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)

In [38]:
# Create cross-validated logistic regression object
logistic_regression = LogisticRegressionCV(
    penalty='l2', Cs=10, random_state=0, n_jobs=-1)

In [39]:
# train model
model = logistic_regression.fit(features_standardized, target)
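
After fitting, the candidate values of C and the selected value can be inspected through the `Cs_` and `C_` attributes. The sketch below is a minimal illustration; `max_iter=1000` and `cv=5` are assumptions added here to keep the fit quiet and explicit, and `candidate_Cs` and `best_C` are illustrative names.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegressionCV
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
features_standardized = StandardScaler().fit_transform(iris.data)
target = iris.target

# Cs=10 asks for a grid of 10 candidate values of C on a log scale
model = LogisticRegressionCV(Cs=10, cv=5, random_state=0, max_iter=1000)
model.fit(features_standardized, target)

# Grid that was searched, and the value(s) chosen by cross-validation
candidate_Cs = model.Cs_
best_C = np.atleast_1d(model.C_)
print(candidate_Cs)
print(best_C)
```

Smaller values of C mean stronger regularization, so a small selected C suggests the model benefits from being constrained more heavily.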

# Training a Classifier on Very Large Data

You need to train a simple classifier model on a very large set of data.
Train a logistic regression in scikit-learn with LogisticRegression using the stochastic average gradient (SAG) solver.

In [40]:
# Load Libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

In [41]:
# Load data
iris = datasets.load_iris()
features = iris.data
target = iris.target

In [42]:
# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)

In [43]:
# Create logistic regression object 
logistic_regression = LogisticRegression(random_state=0, solver='sag')

In [44]:
# train model
model = logistic_regression.fit(features_standardized, target)
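
SAG is an iterative solver whose convergence is sensitive to feature scale, which is why the features are standardized first. The sketch below refits the model with a larger iteration budget and reports how many passes were used. `max_iter=1000` is an assumption chosen for illustration, not a value from the original notebook, and `training_accuracy` and `iterations_used` are illustrative names.

```python
import numpy as np
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
features_standardized = StandardScaler().fit_transform(iris.data)
target = iris.target

# SAG solver with a generous iteration budget (assumed value)
model = LogisticRegression(random_state=0, solver='sag', max_iter=1000)
model.fit(features_standardized, target)

# Accuracy on the training data and number of iterations actually run
training_accuracy = model.score(features_standardized, target)
iterations_used = model.n_iter_
print(training_accuracy, iterations_used)
```

On a dataset as small as iris the solver choice makes little practical difference. The benefit of SAG is expected on data with many observations.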