<h1 style="font-size:42px; text-align:center; margin-bottom:30px;"><span style="color:SteelBlue">Module 3:</span> Classification Algorithms</h1>
<hr>

Welcome to <span style="color:royalblue">Module 3: Classification Algorithms</span>!

In this module, we'll dive into a few more key concepts for machine learning. In particular, we want to introduce you to 4 algorithms that we'll be using in this project:
1. $L_1$-regularized logistic regression
2. $L_2$-regularized logistic regression
3. Random forests
4. Boosted trees

Just as in the previous project, we'll provide a gentle introduction to the **intuition and practical benefits** of each algorithm.

<br><hr id="toc">

### In this module...

In this module we'll walk through more key machine learning concepts, plus 4 effective algorithms for classification tasks.

1. [Binary classification](#binary)
2. [Toy example: noisy conditional](#conditional)
3. [Logistic Regression](#logistic)
4. [Regularized logistic algorithms](#regularized-logistic) - $L_1$-regularized and $L_2$-regularized
5. [Tree ensemble algorithms](#tree-ensembles) - Random Forests and Boosted Trees

**Tip:** Each section builds on the previous ones.

<br><hr>

### First, let's import libraries that we'll need

In [1]:
# print_function for compatibility with Python 3
from __future__ import print_function
# NumPy and Pandas
import numpy as np
import pandas as pd

# Matplotlib, and remember to display plots in the notebook
import matplotlib.pyplot as plt
%matplotlib inline
# Seaborn for easier visualization
import seaborn as sns

<span id="binary"></span>
# 1. Binary classification

Classification with 2 classes is so common that it gets its own name: **binary classification.** 


Just to be clear, let's take another look at the **target variable** for this problem.  First, let's look at it in the raw dataset (before we created the analytical base table).

In [None]:
# Print unique classes for 'status' and the first 5 observations for 'status' in the raw dataset


However, when we constructed our analytical base table, we converted the target variable from <code style="color:crimson">'Left' / 'Employed'</code> into <code style="color:crimson">1 / 0</code>.

In [None]:
# Print unique classes for 'status' and the first 5 observations for 'status' in the analytical base table


Which is the **positive** class? How about the **negative** class?
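
To make the encoding concrete, here is a minimal sketch using a small, made-up <code style="color:steelblue">status</code> column (a hypothetical stand-in, not the project's real data). It assumes the mapping described above, where <code style="color:crimson">'Left'</code> becomes <code style="color:crimson">1</code> and <code style="color:crimson">'Employed'</code> becomes <code style="color:crimson">0</code>. By the usual convention, the class encoded as 1 is treated as the positive class.

```python
import pandas as pd

# Hypothetical stand-in for the raw 'status' column (not the project's real data)
raw_status = pd.Series(['Left', 'Employed', 'Employed', 'Left', 'Employed'],
                       name='status')

# Encode 'Left' as 1 (positive class) and 'Employed' as 0 (negative class)
encoded_status = (raw_status == 'Left').astype(int)

# Unique classes and first 5 observations, before and after encoding
print(raw_status.unique())
print(raw_status.head())
print(encoded_status.unique())
print(encoded_status.head())
```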

<p style="text-align:center; margin: 40px 0 40px 0; font-weight:bold;">
[Back to Contents](#toc)
</p>

<span id="conditional"></span>
# 2. Toy example: noisy conditional

We're going to use another toy example, just as we did in Project 1. 

This time, we're going to build models for a **noisy conditional**.


Let's create that dataset:

In [None]:
# Input feature

# Noise


# Target variable


We need to **reshape** <code style="color:steelblue">x</code> before moving on.
* That's because Scikit-Learn algorithms expect input features with 2 axes. However, right now, <code style="color:steelblue">x</code> only has one.

To make sure it has 2 axes, reshape it to be (100, 1) and name the reshaped object capital <code style="color:steelblue">X</code>.

In [None]:
# Reshape x into X
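
As a rough illustration of the steps above, here is one way the toy dataset and the reshape could look. The feature range, noise level, threshold, and seed are all assumptions chosen for this sketch; the only detail taken from the text is that there are 100 observations.

```python
import numpy as np

# Assumed settings for illustration: range, noise level, threshold, and seed
rng = np.random.RandomState(555)

# Input feature: 100 evenly spaced values
x = np.linspace(0, 1, 100)

# Noise: small random shifts that blur the cutoff
noise = rng.uniform(-0.2, 0.2, 100)

# Target variable: 1 if the noisy feature exceeds 0.5, else 0
y = ((x + noise) > 0.5).astype(int)

print(x.shape)  # (100,)

# Reshape x into X so it has 2 axes: (observations, features)
X = x.reshape(100, 1)
print(X.shape)  # (100, 1)
```

Writing <code style="color:steelblue">x.reshape(-1, 1)</code> gives the same result without hard-coding the number of observations.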


Next, plot a **scatterplot** of the synthetic dataset.

In [None]:
# Plot scatterplot of synthetic dataset


<p style="text-align:center; margin: 40px 0 40px 0; font-weight:bold;">
[Back to Contents](#toc)
</p>

<span id="logistic"></span>
# 3. Logistic regression

First, we'll discuss **logistic regression**, which is the classification analog of linear regression.

Let's actually fit a linear regression model first.

In [None]:
# Import LinearRegression and LogisticRegression


Fit a linear model, make predictions, and plot them.

In [None]:
# Linear model


# Plot dataset and predictions
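
Here is a minimal sketch of fitting the linear model, rebuilt on an assumed version of the toy dataset (the range, noise level, and seed are illustrative choices, not values from this module). The point to notice is that linear regression returns unbounded real numbers rather than class probabilities, so its predictions are not guaranteed to stay between 0 and 1.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Assumed toy dataset (illustrative settings)
rng = np.random.RandomState(555)
x = np.linspace(0, 1, 100)
y = ((x + rng.uniform(-0.2, 0.2, 100)) > 0.5).astype(int)
X = x.reshape(100, 1)

# Linear model
lm = LinearRegression()
lm.fit(X, y)

# Predictions are continuous values, not 0/1 labels or probabilities
lm_pred = lm.predict(X)
print(lm_pred.min(), lm_pred.max())
```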


Next, let's see how **logistic regression** differs.

Let's fit a logistic regression model.

In [None]:
# Logistic regression


Next, let's call the <code style="color:steelblue">.predict()</code> function.

In [None]:
# predict()


Call <code style="color:steelblue">.predict_proba()</code> on the first 10 observations and display the results.

In [None]:
# predict_proba()


Get the class probabilities for the first observation.

In [None]:
# Class probabilities for first observation


Get the probability of **just the positive class** for the first observation.

In [None]:
# Positive class probability for first observation


Use a simple list comprehension to extract a **list of only the predictions for the positive class**.

In [None]:
# Just get the second value for each prediction
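
The calls above can be sketched end to end as follows, again on an assumed version of the toy dataset. <code style="color:steelblue">.predict()</code> returns hard class labels, while <code style="color:steelblue">.predict_proba()</code> returns one row per observation with one column per class, ordered as in <code style="color:steelblue">clf.classes_</code> (here, 0 then 1).

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Assumed toy dataset (illustrative settings)
rng = np.random.RandomState(555)
x = np.linspace(0, 1, 100)
y = ((x + rng.uniform(-0.2, 0.2, 100)) > 0.5).astype(int)
X = x.reshape(100, 1)

# Logistic regression
clf = LogisticRegression()
clf.fit(X, y)

# predict(): hard class labels (0 or 1)
labels = clf.predict(X)

# predict_proba(): class probabilities for the first 10 observations
proba = clf.predict_proba(X[:10])
print(proba)

# Class probabilities for the first observation: [P(class 0), P(class 1)]
first = proba[0]

# Positive class probability for the first observation
first_positive = proba[0][1]

# Just get the second value (positive class) for each prediction
positive_probs = [p[1] for p in clf.predict_proba(X)]
```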


Ok, let's fit and plot the logistic regression model.

In [None]:
# Logistic regression


# Predict probabilities


# Just get the second value (positive class) for each prediction


# Plot dataset and predictions


<p style="text-align:center; margin: 40px 0 40px 0; font-weight:bold;">
[Back to Contents](#toc)
</p>

<span id="regularized-logistic"></span>
# 4. Regularized logistic regression

Logistic regression has regularized versions that are analogous to those for linear regression.

Just to save ourselves from repeating the same code, let's write a quick helper function that:
1. Fits any classification model
2. Makes predictions
3. Extracts the positive probabilities
4. Plots them

In [None]:
def fit_and_plot_classifier(clf):
    # Fit model
    
    
    # Predict and take second value of each prediction
    
    
    # Plot
    
    
    # Return fitted model and predictions
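
One way the helper's four steps could be filled in is sketched below. It assumes a toy dataset named <code style="color:steelblue">X</code> and <code style="color:steelblue">y</code> already exists (rebuilt here with illustrative settings), and the plot styling is just one reasonable choice.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

# Assumed toy dataset (illustrative settings)
rng = np.random.RandomState(555)
x = np.linspace(0, 1, 100)
y = ((x + rng.uniform(-0.2, 0.2, 100)) > 0.5).astype(int)
X = x.reshape(100, 1)

def fit_and_plot_classifier(clf):
    # Fit model
    clf.fit(X, y)

    # Predict and take second value (positive class) of each prediction
    pred = [p[1] for p in clf.predict_proba(X)]

    # Plot the dataset as points and the predicted probabilities as a line
    plt.scatter(X[:, 0], y)
    plt.plot(X[:, 0], pred, 'k--')

    # Return fitted model and predictions
    return clf, pred

clf, pred = fit_and_plot_classifier(LogisticRegression())
```

Because the helper only relies on <code style="color:steelblue">.fit()</code> and <code style="color:steelblue">.predict_proba()</code>, it should work with any Scikit-Learn classifier that exposes both methods.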
    

Fit and plot the same logistic regression from earlier, this time using <code style="color:steelblue">fit_and_plot_classifier()</code>.

In [None]:
# Logistic regression
clf, pred = fit_and_plot_classifier(LogisticRegression())

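Make the penalty **4 times stronger**. In Scikit-Learn, <code style="color:steelblue">C</code> is the *inverse* of regularization strength, so a stronger penalty means a smaller <code style="color:steelblue">C</code>.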

In [None]:
# More regularization


Next, make the penalty **4 times weaker**.

In [None]:
# Less regularization


To basically remove regularization, bump <code style="color:steelblue">C</code> way up.

In [None]:
# Basically no regularization
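
To see what the penalty strength does, the sketch below fits the same model with several values of <code style="color:steelblue">C</code> and compares the size of the fitted coefficient. It relies on two facts about Scikit-Learn's <code style="color:steelblue">LogisticRegression</code>: <code style="color:steelblue">C</code> defaults to 1.0, and smaller values mean stronger regularization. The dataset settings and the "way up" value of 10000 are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Assumed toy dataset (illustrative settings)
rng = np.random.RandomState(555)
x = np.linspace(0, 1, 100)
y = ((x + rng.uniform(-0.2, 0.2, 100)) > 0.5).astype(int)
X = x.reshape(100, 1)

# More regularization: 4 times stronger penalty than the default C=1.0
strong = LogisticRegression(C=0.25).fit(X, y)

# Default penalty strength
default = LogisticRegression(C=1.0).fit(X, y)

# Less regularization: 4 times weaker penalty
weak = LogisticRegression(C=4).fit(X, y)

# Basically no regularization (10000 is an arbitrary large value)
almost_none = LogisticRegression(C=10000).fit(X, y)

# A stronger penalty shrinks the coefficient toward zero
for name, model in [('C=0.25', strong), ('C=1', default),
                    ('C=4', weak), ('C=10000', almost_none)]:
    print(name, abs(model.coef_[0][0]))
```

A smaller coefficient gives a flatter probability curve, while a larger one gives a sharper transition between the two classes.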


Set the **penalty type** to use $L_1$ regularization.

In [None]:
# L1 regularization


Initialize $L_1$-regularized and $L_2$-regularized logistic regression **separately** and **explicitly**.

In [None]:
# L1-regularized logistic regression


# L2-regularized logistic regression


Finally, use $L_1$-regularization with a 4 times weaker penalty.

In [None]:
# L1 regularization with weaker penalty
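
Here is a hedged sketch of initializing the two penalty types explicitly. One practical caveat: in recent Scikit-Learn releases the default solver (<code style="color:steelblue">lbfgs</code>) does not support $L_1$ penalties, so this sketch passes <code style="color:steelblue">solver='liblinear'</code> for the $L_1$ models. The <code style="color:steelblue">random_state</code> value and dataset settings are arbitrary choices for illustration, and very recent versions may emit a deprecation warning for the <code style="color:steelblue">penalty</code> argument.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Assumed toy dataset (illustrative settings)
rng = np.random.RandomState(555)
x = np.linspace(0, 1, 100)
y = ((x + rng.uniform(-0.2, 0.2, 100)) > 0.5).astype(int)
X = x.reshape(100, 1)

# L1-regularized logistic regression (needs a solver that supports L1)
l1 = LogisticRegression(penalty='l1', solver='liblinear', random_state=123)

# L2-regularized logistic regression
l2 = LogisticRegression(penalty='l2', random_state=123)

# L1 regularization with a 4 times weaker penalty (C=4 instead of C=1)
l1_weak = LogisticRegression(penalty='l1', C=4, solver='liblinear',
                             random_state=123)

# Fit each model and compare the fitted coefficient
for name, model in [('L1', l1), ('L2', l2), ('L1, C=4', l1_weak)]:
    model.fit(X, y)
    print(name, model.coef_[0][0])
```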


<p style="text-align:center; margin: 40px 0 40px 0; font-weight:bold;">
[Back to Contents](#toc)
</p>

<span id="tree-ensembles"></span>
# 5. Tree ensemble algorithms

The same tree ensembles we used for regression can be applied to classification. 

First, import the random forest classifier.

In [None]:
# Import RandomForestClassifier


Apply it to this toy problem.

In [None]:
# Random forest classifier


Next, import the boosted tree classifier.

In [None]:
# Import GradientBoostingClassifier


And finally, apply it to this toy problem.

In [None]:
# Boosted tree classifier
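
As a rough sketch, both tree ensembles can be fit to the toy problem in the same way as the logistic models, since they also expose <code style="color:steelblue">.fit()</code> and <code style="color:steelblue">.predict_proba()</code>. The number of trees, the <code style="color:steelblue">random_state</code>, and the dataset settings below are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# Assumed toy dataset (illustrative settings)
rng = np.random.RandomState(555)
x = np.linspace(0, 1, 100)
y = ((x + rng.uniform(-0.2, 0.2, 100)) > 0.5).astype(int)
X = x.reshape(100, 1)

# Random forest classifier
rf = RandomForestClassifier(n_estimators=100, random_state=123)
rf.fit(X, y)
rf_pred = [p[1] for p in rf.predict_proba(X)]

# Boosted tree classifier
gb = GradientBoostingClassifier(n_estimators=100, random_state=123)
gb.fit(X, y)
gb_pred = [p[1] for p in gb.predict_proba(X)]

# Positive class probabilities for the first 5 observations
print(rf_pred[:5])
print(gb_pred[:5])
```

Unlike logistic regression, these models produce step-like probability curves, so expect them to follow the noise in the training data more closely.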


<p style="text-align:center; margin: 40px 0 40px 0; font-weight:bold;">
[Back to Contents](#toc)
</p>

### Next Steps

Alright, that was a nice tour through some key theory and concepts, but let's get ready to dive back into the project!

As a reminder, here are a few things you did in this module:
* You learned some key terminology for binary classification, such as "positive" vs. "negative" classes.
* You saw how logistic regression can also be regularized.
* You played around with different settings for penalty strength.
* And you recruited 4 algorithms: $L_1$-Regularized Logistic, $L_2$-Regularized Logistic, Random Forests, and Boosted Trees.

Now that we've recruited our 4 candidate algorithms, it's time to see which one performs the best! In the next module, <span style="color:royalblue">Module 4: Model Training</span>, we'll plug these algorithms into the powerful modeling process you learned in Project 2.

<p style="text-align:center; margin: 40px 0 40px 0; font-weight:bold;">
[Back to Contents](#toc)
</p>