# Classification between Two Classes
In this exercise, we will learn how to perform a binary classification.

## Introduction
You are given data from two classes. Within each class, the data follows a distribution built from one or more Gaussian distributions with class-dependent parameters. Your task is to build a model that can distinguish between the two classes.

## Imports and Seeding
First we will do the necessary imports:
* `numpy` for general data handling and array manipulation
* `tensorflow` to build and train the classification model
* `matplotlib.pyplot` for plotting
* `sklearn.utils.shuffle` to randomly shuffle our training dataset
* `cycler.cycler` helps with plotting multiple distributions

In [1]:
import numpy as np
from matplotlib import pyplot as plt
import tensorflow as tf
from sklearn.utils import shuffle
from cycler import cycler

Then we set a random seed for the `np.random` module. This makes our code reproducible as the random operations will yield the same results in every run through the notebook.

In [2]:
np.random.seed(42)
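
To see what the seed does, here is a minimal sketch: reseeding with the same value restarts the same pseudo-random sequence, so the draws repeat. The names `first` and `second` are illustrative only. The last line reseeds once more so that the cells below behave as if this check had not been run.

```python
import numpy as np

np.random.seed(42)
first = np.random.uniform(0, 1, size=3)

np.random.seed(42)  # reseeding restarts the same pseudo-random sequence
second = np.random.uniform(0, 1, size=3)

print(np.array_equal(first, second))  # True

np.random.seed(42)  # restore the state expected by the following cells
```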

## Data Creation
First we will create the data.

To make things a little more interesting, we have written a small piece of code that creates `N_DIMS`-dimensional data following distributions consisting of one or more (`N_PEAK`) Gaussian functions. Increasing `N_PEAK` will in general make the distributions, and thus the task, more complex.

In [3]:
N_DIMS = 1
N_PEAK = 1
SCALE = 0.1

centers_class_1 = np.random.uniform(0, 1, size=(N_PEAK, N_DIMS))
centers_class_2 = np.random.uniform(0, 1, size=(N_PEAK, N_DIMS))

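def make_samples(centers, n_samples=1_000):
    """Draw about `n_samples` points, split evenly across the Gaussian peaks."""
    output = []
    for center in centers:
        # One Gaussian blob of width SCALE around each center.
        output.append(np.random.normal(
            loc=center,
            scale=SCALE,
            size=(n_samples // N_PEAK, N_DIMS)
        ))
    return np.concatenate(output)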

class_1 = make_samples(centers_class_1, 100_000)
class_2 = make_samples(centers_class_2, 100_000)

## Data Visualization
Visualize the data. With one dimension (`N_DIMS=1`), a single histogram is sufficient. With several dimensions (`N_DIMS>1`), you may want to plot the 1-dimensional projections onto each of the axes.

In [14]:
"""
TODO: Visualize the data of the two classes.
You may want to plot all 1-dimensional projections of the `N_DIMS` dimensions of our data.
"""
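
One possible approach, as a sketch rather than the reference solution: draw one histogram per dimension and overlay both classes with shared bin edges. The helper name `plot_projections` and the stand-in arrays `demo_1` and `demo_2` are assumptions made for illustration; in the notebook you would pass `class_1` and `class_2` instead.

```python
import numpy as np
from matplotlib import pyplot as plt

def plot_projections(class_1, class_2, bins=50):
    """Plot one histogram per dimension, overlaying both classes."""
    n_dims = class_1.shape[1]
    fig, axes = plt.subplots(1, n_dims, figsize=(4 * n_dims, 3), squeeze=False)
    for dim, ax in enumerate(axes[0]):
        # Shared bin edges so both histograms are directly comparable.
        both = np.concatenate((class_1[:, dim], class_2[:, dim]))
        edges = np.linspace(both.min(), both.max(), bins + 1)
        ax.hist(class_1[:, dim], bins=edges, alpha=0.5, density=True, label="class 1")
        ax.hist(class_2[:, dim], bins=edges, alpha=0.5, density=True, label="class 2")
        ax.set_xlabel(f"x[{dim}]")
        ax.set_ylabel("density")
        ax.legend()
    fig.tight_layout()
    return fig, axes

# Stand-in data so the sketch runs on its own (hypothetical centers 0.3 and 0.6).
rng = np.random.default_rng(0)
demo_1 = rng.normal(0.3, 0.1, size=(1_000, 2))
demo_2 = rng.normal(0.6, 0.1, size=(1_000, 2))
fig, axes = plot_projections(demo_1, demo_2)
```

Passing `squeeze=False` keeps `axes` two-dimensional, so the same loop works for `N_DIMS=1` and for larger values.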

## Data Preparation
Next we prepare the training data.
We build one dataset from both classes.

The `x` values of the training data are the samples drawn from the distributions.

For the `y` values we use ones for `class_1` and zeros for `class_2`.

In [5]:
x = np.concatenate((class_1, class_2))
y = np.concatenate((np.ones(len(class_1)), np.zeros(len(class_2))))[..., None]
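
The trailing `[..., None]` appends an axis, so the labels get the shape `(n_samples, 1)`: one target value per sample, matching a model with a single output. A small sketch with toy arrays (the names `demo_1`, `demo_2`, `x_demo`, `y_flat`, and `y_demo` are illustrative only):

```python
import numpy as np

# Tiny stand-ins for `class_1` and `class_2` (3 and 2 samples, 1 dimension).
demo_1 = np.array([[0.1], [0.2], [0.3]])
demo_2 = np.array([[0.8], [0.9]])

x_demo = np.concatenate((demo_1, demo_2))
y_flat = np.concatenate((np.ones(len(demo_1)), np.zeros(len(demo_2))))
y_demo = y_flat[..., None]  # same as y_flat[:, np.newaxis]

print(x_demo.shape)  # (5, 1)
print(y_flat.shape)  # (5,)
print(y_demo.shape)  # (5, 1)
```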

Next we shuffle our dataset. This prevents the network from seeing only one class in a given batch, or even in many consecutive batches, during training.

In [6]:
x, y = shuffle(x, y, random_state=0)
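
A quick check on toy arrays that `shuffle` applies the same permutation to both arrays, so every sample keeps its label. The arrays `x_demo` and `y_demo` and the labeling rule are made up for this illustration.

```python
import numpy as np
from sklearn.utils import shuffle

x_demo = np.arange(6, dtype=float).reshape(-1, 1)  # samples 0..5
y_demo = (x_demo >= 3).astype(float)               # label 1 for samples 3..5

x_shuf, y_shuf = shuffle(x_demo, y_demo, random_state=0)

# The order changes, but every sample still carries its original label.
print(np.array_equal(y_shuf, (x_shuf >= 3).astype(float)))  # True
```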

## Model Creation
Next we will create the model.
- What is a suitable size?
- How many inputs and outputs does the model need?
- What are suitable activations?
    - Hint: Think about the activation of the last layer.

In [15]:
"""
TODO: Create the model
"""
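
As a starting point, here is a minimal sketch of a small fully connected classifier. The helper name `build_model`, the two hidden layers, and the 16 units per layer are assumptions chosen for illustration, not the required answer. The input size follows the number of dimensions, and the single sigmoid output can be read as the estimated probability of `class_1`.

```python
import tensorflow as tf

def build_model(n_dims, hidden_units=16):
    """Small fully connected binary classifier (illustrative sizes)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(n_dims,)),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        # One sigmoid unit: the output lies in (0, 1) and can be read as
        # the estimated probability that the input belongs to `class_1`.
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

model = build_model(n_dims=1)
```

In the notebook you would call `build_model(N_DIMS)` so that the input layer matches the data.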

Now compile the model:
- Which loss function should be used? ([Documentation](https://www.tensorflow.org/api_docs/python/tf/keras/losses))
- Which optimizer should be used?

In [16]:
"""
TODO: Compile the model
"""
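
For 0/1 labels and a sigmoid output, binary cross-entropy is a common choice (in Keras, `tf.keras.losses.BinaryCrossentropy`). The NumPy sketch below spells out the formula to show how it rewards confident correct predictions and penalizes confident wrong ones. The function name `binary_cross_entropy` and the example probabilities are illustrative.

```python
import numpy as np

def binary_cross_entropy(y_true, p_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all samples."""
    p = np.clip(p_pred, eps, 1 - eps)  # avoid log(0)
    return float(np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))))

y_true = np.array([1.0, 1.0, 0.0, 0.0])

confident_right = binary_cross_entropy(y_true, np.array([0.9, 0.9, 0.1, 0.1]))
undecided = binary_cross_entropy(y_true, np.array([0.5, 0.5, 0.5, 0.5]))
confident_wrong = binary_cross_entropy(y_true, np.array([0.1, 0.1, 0.9, 0.9]))

print(confident_right < undecided < confident_wrong)  # True
```

An undecided prediction of 0.5 for every sample gives a loss of ln 2 ≈ 0.693, which is a useful baseline when reading the training log.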

Next we inspect our model. How many parameters does it have?

In [17]:
"""
TODO: Use model.summary() to look at number of parameters
"""
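
To cross-check the number reported by `model.summary()`, the parameter count of a fully connected network can be computed by hand: each dense layer has `n_in * n_out` weights plus `n_out` biases. The sketch below does this for a hypothetical network with layer sizes `[1, 16, 16, 1]`; the helper name `dense_param_count` is an assumption.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully connected network.

    `layer_sizes` lists the number of units per layer, starting with the
    number of inputs, e.g. [1, 16, 16, 1].
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total

print(dense_param_count([1, 16, 16, 1]))  # 321
```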

## Model Training
Now train the model:
* What is a suitable number of epochs?
* What is a suitable size for the batches?

In [18]:
"""
TODO: Train the model
"""
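
A minimal training sketch under stated assumptions: the stand-in data, the small network, the Adam optimizer, 5 epochs, and a batch size of 128 are illustrative starting points, not tuned recommendations. Keras takes the validation split from the last samples of the arrays it is given, which is one more reason to shuffle beforehand.

```python
import numpy as np
import tensorflow as tf

# Stand-in data: two 1-D Gaussians; in the notebook use the shuffled `x`, `y`.
rng = np.random.default_rng(0)
x_demo = np.concatenate((rng.normal(0.3, 0.1, (2_000, 1)),
                         rng.normal(0.7, 0.1, (2_000, 1)))).astype("float32")
y_demo = np.concatenate((np.ones((2_000, 1)),
                         np.zeros((2_000, 1)))).astype("float32")
order = rng.permutation(len(x_demo))
x_demo, y_demo = x_demo[order], y_demo[order]

demo_model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
demo_model.compile(optimizer="adam", loss="binary_crossentropy",
                   metrics=["accuracy"])

history = demo_model.fit(
    x_demo, y_demo,
    epochs=5,
    batch_size=128,
    validation_split=0.2,  # hold out the last 20 % to monitor generalization
    verbose=0,
)
print(history.history["loss"])  # one value per epoch
```

Comparing `history.history["loss"]` with `history.history["val_loss"]` over the epochs helps judge whether more epochs are still useful.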

## Model Evaluation
Visualize the model prediction. Describe your observation.

In [19]:
"""
TODO: Prepare data for the model evaluation/prediction
"""

In [20]:
"""
TODO: Perform the model evaluation/prediction
"""

In [21]:
"""
TODO: Visualize the model evaluation/prediction
"""
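
When interpreting the network output, a reference curve can help. For the simplest setting only (`N_PEAK = 1`, the same `SCALE` for both classes, and equally many samples per class), the probability of `class_1` given `x` follows from Bayes' theorem and reduces to a sigmoid of the difference of squared distances to the two centers. The sketch below evaluates it on a grid; the function name `ideal_posterior` and the centers 0.3 and 0.7 are hypothetical.

```python
import numpy as np

def ideal_posterior(x, center_1, center_2, scale):
    """p(class_1 | x) for two equally likely Gaussians of equal width."""
    d1 = np.sum((x - center_1) ** 2, axis=-1)  # squared distance to center 1
    d2 = np.sum((x - center_2) ** 2, axis=-1)  # squared distance to center 2
    return 1.0 / (1.0 + np.exp(-(d2 - d1) / (2 * scale ** 2)))

# Hypothetical 1-D centers; in the notebook use `centers_class_1[0]`,
# `centers_class_2[0]` and `SCALE`.
center_1, center_2, scale = np.array([0.3]), np.array([0.7]), 0.1
x_grid = np.linspace(0, 1, 101)[:, None]  # shape (101, 1), like the model input
p_grid = ideal_posterior(x_grid, center_1, center_2, scale)

print(p_grid[50])  # approximately 0.5 at the midpoint between the centers
```

Plotting `p_grid` against `x_grid` next to the model prediction on the same grid shows how closely the trained network approaches this curve.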

## Further Tasks
Now we will make our exercise more difficult:
* Make the functions more complex (`N_PEAK`) and train the classifier again. Describe your observations.
* Raise the number of dimensions (`N_DIMS`) to 2 (and 10) and train the classifier again. Describe your observations.

## Summary
This concludes our tutorial on the Classification between Two Classes.

In this tutorial you have learned:
* How to visualize n-dimensional data distributions from two classes
* How to prepare the data for a classification
* How to create a neural network for a classification
* Which loss to use for a classification
* How to interpret the output of a classification network
* The strengths and limits of a simple network and a simple optimizer with respect to:
    * Number of dimensions
    * Complexity of functions