1\. Binary classification
-------------------------

00:00 - 00:04

You're now ready to learn about binary classification, so let's dive in.

2\. When to use binary classification?
--------------------------------------

00:04 - 00:19

You will use binary classification when you want to solve problems where you predict whether an observation belongs to one of two possible classes. A simple binary classification problem could be learning the boundaries to separate blue from red circles as shown in the image.

3\. Our dataset
---------------

00:19 - 00:32

The dataset for this problem is very simple. The coordinates are pairs of values corresponding to the X and Y coordinates of each circle in the graph. The labels are 1 for red circles and 0 for the blue circles.
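As a rough illustration of that layout, here is a minimal sketch that builds a stand-in dataset with NumPy. The sampling scheme, the radius of 0.5, and the names `coordinates` and `labels` are assumptions for illustration only, not the course's actual data.

```python
import numpy as np

# Hypothetical stand-in for the circles dataset (assumption: points
# closer than 0.5 to the origin are red, everything else is blue).
rng = np.random.default_rng(seed=0)

# One (x, y) pair per circle, shape (n_samples, 2)
coordinates = rng.uniform(-1, 1, size=(1000, 2))

# Distance of each point from the origin
radius = np.hypot(coordinates[:, 0], coordinates[:, 1])

# 1 = red circle (center), 0 = blue circle (outside)
labels = (radius < 0.5).astype(int)

print(coordinates.shape, labels.shape)
print(coordinates[:3])
print(labels[:3])
```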

4\. Pairplots
-------------

00:32 - 01:00

We can use seaborn's pairplot function to explore a small dataset and judge whether the classes in our problem will be easy to separate. We get an intuition for this when the classes separate well enough along several variables. In this case, for the circles dataset, there is a very clear boundary: the red circles concentrate at the center while the blue ones lie outside. It should be easy for our network to find a way to separate them based only on the x and y coordinates.

5\. The NN architecture
-----------------------

01:00 - 01:05

This is the neural network we will build to classify red and blue dots in our graph.

6\. The NN architecture 2
-------------------------

01:05 - 01:13

We have two neurons as an input layer, one for the x coordinate and another for the y coordinate of each of the red and blue circles in the graph.

7\. The NN architecture 3
-------------------------

01:13 - 01:22

Then we have one hidden layer with four neurons. Four is a good enough number to learn the separation of classes in this dataset. This was found by experimentation.

8\. The NN architecture 4
-------------------------

01:22 - 01:36

We finally end up with a single output neuron which makes use of the sigmoid activation function. It's important to note that, regardless of the activation functions used for the previous layers, we do need the sigmoid activation function for this last output node.

9\. The sigmoid function
------------------------

01:36 - 01:44

The sigmoid activation function takes the weighted sum that the output neuron computes from the second-to-last layer and squashes it to a floating-point number between 0 and 1.

10\. The sigmoid function
-------------------------

01:44 - 01:57

You can consider the output of the sigmoid function as the probability that a pair of coordinates belongs to the class labeled 1, which here is the red circles. So we can set a threshold and say everything below 0.5 will be a blue circle and everything above it a red one.
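The squashing and thresholding can be sketched in a few lines of NumPy. The helper name `sigmoid` and the sample inputs are illustrative. Assigning an output of exactly 0.5 to the blue class is an arbitrary choice made here, since the rule above leaves that case open.

```python
import numpy as np

def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative pre-activation values of the output neuron
z = np.array([-4.0, -0.5, 0.0, 0.5, 4.0])

# Interpreted as the probability of the class labeled 1 (red)
probabilities = sigmoid(z)

# Above 0.5 -> red (1); otherwise blue (0). The tie at exactly 0.5
# goes to blue here, which is an arbitrary convention.
predicted = (probabilities > 0.5).astype(int)

print(probabilities)
print(predicted)
```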

11\. Let's build it
-------------------

01:57 - 02:06

So let's build our model in Keras. We start by importing the Sequential model and the Dense layer. We then instantiate a Sequential model.

12\. Let's build it
-------------------

02:06 - 02:21

We add a hidden layer of 4 neurons and define an input shape, which consists of 2 neurons. We use tanh as the activation function for this hidden layer. Activation functions are covered later in the course, so don't worry about this choice for now.

13\. Let's build it
-------------------

02:21 - 02:34

We finally add an output layer which contains a single neuron. We make use of the sigmoid activation function so that we achieve the behavior we expect from this network, that is, obtaining a value between 0 and 1. Our model is now ready to be trained.
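The steps described across these slides can be sketched as follows. The import path `tensorflow.keras` matches the one used in the exercises further down. The slides themselves are not reproduced here, so treat the exact code as an approximation of what they show.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Instantiate an empty sequential model
model = Sequential()

# Hidden layer: 4 neurons, fed by 2 inputs (the x and y coordinates)
model.add(Dense(4, input_shape=(2,), activation='tanh'))

# Output layer: a single neuron whose sigmoid output lies between 0 and 1
model.add(Dense(1, activation='sigmoid'))

# Print the layers and parameter counts
model.summary()
```

The summary should list 17 trainable parameters: 2 × 4 weights plus 4 biases in the hidden layer, and 4 weights plus 1 bias in the output layer.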

14\. Compiling, training, predicting
------------------------------------

02:34 - 03:01

Just as before, we need to compile our model before training. We will use stochastic gradient descent as the optimizer and binary cross-entropy as the loss function. Binary cross-entropy is the loss we use when the output neuron uses sigmoid as its activation function. We train our model for 20 epochs, passing our coordinates and labels as arguments. Then we obtain predictions by calling predict on the coordinates. Each prediction is a sigmoid output between 0 and 1, which we can threshold at 0.5 to get a label.
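Put together, compiling, training and predicting might look like the sketch below. The synthetic `coordinates` and `labels` arrays are assumptions standing in for the course data, and the final thresholding step is added here to turn the sigmoid outputs into 0/1 labels.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Synthetic stand-in data (assumption): label 1 inside radius 0.5, else 0
rng = np.random.default_rng(seed=0)
coordinates = rng.uniform(-1, 1, size=(500, 2))
labels = (np.hypot(coordinates[:, 0], coordinates[:, 1]) < 0.5).astype(int)

# Same architecture as described above: 2 inputs -> 4 tanh -> 1 sigmoid
model = Sequential()
model.add(Dense(4, input_shape=(2,), activation='tanh'))
model.add(Dense(1, activation='sigmoid'))

# Stochastic gradient descent with binary cross-entropy loss
model.compile(optimizer='sgd', loss='binary_crossentropy', metrics=['accuracy'])

# Train for 20 epochs on the coordinates and their labels
model.fit(coordinates, labels, epochs=20, verbose=0)

# predict returns one sigmoid output per observation, shape (n_samples, 1)
probabilities = model.predict(coordinates, verbose=0)

# Threshold at 0.5 to turn the outputs into class labels
predicted_labels = (probabilities > 0.5).astype(int).ravel()
print(predicted_labels[:10])
```

With plain SGD and only 20 epochs, the learned boundary on this synthetic data may be rough. More epochs or a different optimizer are natural things to try.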

15\. Results
------------

03:01 - 03:07

These are the boundaries that the model learned to classify our circles. It looks like our model did pretty well!

16\. Let's practice!
--------------------

03:07 - 03:11

It's time to have some fun now with this new architecture you've learned.

Exploring dollar bills
======================

You will practice building classification models in Keras with the **Banknote Authentication** dataset. 

Your goal is to distinguish between real and fake dollar bills. To do this, the dataset comes with 4 features: `variance`, `skewness`, `kurtosis` and `entropy`. These features are calculated by applying mathematical operations to the dollar bill images. The labels are found in the DataFrame's `class` column.

![](https://assets.datacamp.com/production/repositories/4335/datasets/6ce6fd4fdc548ecd6aaa27b033073c5bfc0995da/dollar_bills.png)

A pandas DataFrame named `banknotes` is ready to use. Let's do some data exploration!

Instructions
------------

-   Import `seaborn` as `sns`.
-   Use `seaborn`'s `pairplot()` on `banknotes` and set `hue` to be the name of the column containing the labels.
-   Generate descriptive statistics for the banknotes authentication data.
-   Count the number of observations per label with `.value_counts()`.

In [None]:
# Import seaborn, plus matplotlib's pyplot, which plt.show() below needs
import matplotlib.pyplot as plt
import seaborn as sns

# Use pairplot and set the hue to be our class column
sns.pairplot(banknotes, hue='class') 

# Show the plot
plt.show()

# Describe the data
print('Dataset stats: \n', banknotes.describe())

# Count the number of observations per class
print('Observations per class: \n', banknotes['class'].value_counts())


A binary classification model
=============================

Now that you know what the **Banknote Authentication** dataset looks like, we'll build a simple model to distinguish between real and fake bills. 

You will perform binary classification by using a single neuron as an output. The input layer will have 4 neurons since we have 4 features in our dataset. The model's output will be a value constrained between 0 and 1. 

We will interpret this output number as the probability of our input variables coming from a fake dollar bill, with 1 meaning we are certain it's a fake bill.

![](https://assets.datacamp.com/production/repositories/4335/datasets/db1c482fd8cb154572c3ce79fe9a406c25ed1a9b/model_chapter2_binary_classification.JPG)

Instructions
------------

-   Import the `Sequential` model and `Dense` layer from `tensorflow.keras`.
-   Create a sequential model.
-   Add a 1-neuron output layer with `sigmoid` activation, using the `input_shape` parameter to declare the 4-neuron input layer.
-   Compile your model using `sgd` as an optimizer.

In [None]:
# Import the sequential model and dense layer
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Create a sequential model
model = Sequential()

# Add a dense layer
model.add(Dense(1, input_shape=(4,), activation='sigmoid'))

# Compile your model
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])

# Display a summary of your model
model.summary()