# Lorentzian vs. Gaussian
Use `tensorflow.keras` to build a simple binary
classifier that can distinguish between
one-dimensional Lorentzians $L(x)$ and Gaussians
$N(x)$ where:
\begin{equation}
\begin{aligned}
    L(x) &= \frac{A}{(x-\mu)^2\sigma^2+1}\\
    N(x) &= A e^{-(x-\mu)^2/(2\sigma^2)}
\end{aligned}
\end{equation}
Here $A$ is the amplitude, $\mu$ is the location of the peak, and $\sigma$ sets the width. You are free to choose any
values you like for these, but I suggest you start simple.
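
As a quick check of the two definitions, the sketch below evaluates both curves with NumPy. It assumes the Gaussian has a decaying (negative) exponent and uses the Lorentzian exactly as written above; the function names `lorentzian` and `gaussian` and the default parameter values are illustrative choices, not part of the assignment.

```python
import numpy as np

def lorentzian(x, A=1.0, mu=0.0, sigma=1.0):
    # L(x) = A / ((x - mu)^2 sigma^2 + 1)
    return A / ((x - mu)**2 * sigma**2 + 1)

def gaussian(x, A=1.0, mu=0.0, sigma=1.0):
    # N(x) = A exp(-(x - mu)^2 / (2 sigma^2))
    return A * np.exp(-(x - mu)**2 / (2 * sigma**2))

# compare the peak and the tails for the simplest parameter choice
x = np.array([0.0, 2.5, 5.0])
print('L(x) =', lorentzian(x))
print('N(x) =', gaussian(x))
```

Both curves share the same peak height at $x=\mu$, but the Lorentzian decays algebraically while the Gaussian decays exponentially, so the tails are where a classifier can most easily tell them apart.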

In [2]:
# %load ./include/header.py
import numpy as np
import matplotlib.pyplot as plt
import sys
sys.path.append('./include')
import ml4s
%matplotlib inline
%config InlineBackend.figure_format = 'svg'
plt.style.use('./include/notebook.mplstyle')
np.set_printoptions(linewidth=120)
ml4s._set_css_style('./include/bootstrap.css')
colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import datetime
from tqdm import trange,tqdm

## Part a
Write a function that can generate your dataset. This should include a large number of random
functions $L$ and $N$ on the line $-5 \leq x \leq 5$ where $A$, $\sigma$, $\mu$ are uniformly distributed random
numbers as seen in the figure. The input values will be either $N(x)$ or $L(x)$ while the targets
are 1 for Gaussian and 0 for Lorentzian.

In [None]:
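def dataset(size, num_points=1000, seed=None):
    '''Generate `size` random curves sampled on -5 <= x <= 5.

    Returns (inputs, target): inputs has shape (size, num_points) and
    target is 1 for a Gaussian and 0 for a Lorentzian.'''
    rng = np.random.default_rng(seed)
    x = np.linspace(-5, 5, num_points)

    # uniformly distributed parameters, one set per curve
    # (the ranges are a free choice; A and σ are kept positive)
    A = rng.uniform(0.5, 5.0, size=(size, 1))
    σ = rng.uniform(0.5, 2.0, size=(size, 1))
    μ = rng.uniform(-2.0, 2.0, size=(size, 1))

    # 1 -> Gaussian, 0 -> Lorentzian, each with probability 1/2
    target = rng.integers(0, 2, size=size)

    # evaluate both families for every parameter set via broadcasting
    lorentzian = A / ((x - μ)**2 * σ**2 + 1)
    gaussian = A * np.exp(-(x - μ)**2 / (2 * σ**2))

    # keep the curve that matches each target
    y = np.where(target[:, None] == 1, gaussian, lorentzian)
    return y, target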

## Part b
Using the binary cross entropy as your cost function, train your network to distinguish between
Gaussians and Lorentzians. What level of accuracy can you achieve with 10 functions? 100? 1000?
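
For reference, the binary cross entropy for targets $y_i \in \{0,1\}$ and predicted probabilities $p_i$ is $-\frac{1}{n}\sum_i \left[y_i \ln p_i + (1-y_i)\ln(1-p_i)\right]$. The sketch below computes it directly with NumPy; the helper name `binary_cross_entropy`, the clipping constant `eps`, and the example numbers are assumptions made for illustration and are not what Keras does internally line for line.

```python
import numpy as np

def binary_cross_entropy(targets, probabilities, eps=1e-7):
    # clip so that log(0) never appears for fully confident predictions
    p = np.clip(probabilities, eps, 1 - eps)
    return -np.mean(targets * np.log(p) + (1 - targets) * np.log(1 - p))

targets = np.array([1, 0, 1, 0])            # 1 = Gaussian, 0 = Lorentzian
confident = np.array([0.9, 0.1, 0.8, 0.3])  # mostly correct predictions
undecided = np.full(4, 0.5)                 # a network that has learned nothing

print('confident:', binary_cross_entropy(targets, confident))
print('undecided:', binary_cross_entropy(targets, undecided))
```

A network that always outputs 0.5 sits at a loss of $\ln 2$, which is a useful baseline when reading the training history.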

In [None]:
from sklearn.model_selection import train_test_split

# dataset(size) from Part a is assumed to return (inputs, targets);
# generate the curves and hold out 10% of them for testing
num_functions = 1000  # repeat with 10 and 100 to compare accuracies
inputs, labels = dataset(num_functions)
x_train, x_test, y_train, y_test = train_test_split(inputs, labels, test_size=0.1)

model = keras.Sequential(
[
    keras.Input(shape=(inputs.shape[1],)),
    layers.Dense(256,activation='relu'),
    layers.Dense(128,activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(256,activation='relu'),
    layers.Dense(4,activation='relu'),
    layers.Dense(1, activation='sigmoid', bias_initializer='zeros')
])

In [None]:
model.compile(loss=keras.losses.binary_crossentropy, optimizer='adam', metrics=['accuracy'])
batch_size = 100
epochs = 30
training_history = {}
training_history['test'] = model.fit(x_train,y_train, batch_size=batch_size, epochs=epochs,validation_data=(x_test,y_test), verbose=1)

In [None]:
# evaluate model
score = model.evaluate(x_test, y_test, verbose=2)

# threshold the sigmoid output at 0.5 to label every curve in the full dataset
predictions = (model.predict(inputs, verbose=0)[:,0] >= 0.5).astype(int)

mistakes = np.where(labels != predictions)[0]
num_mistakes = len(mistakes)

print(f'Num. Mistakes  = {num_mistakes}')

## Part c
How does the performance of your deep neural network compare to logistic
regression? You can either write your own code to do this, or use some libraries (like
scikit-learn).
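
If you take the write-your-own route, one possible starting point is plain gradient descent on the mean binary cross entropy. The sketch below is only that: the function names `fit_logistic_regression` and `predict`, the learning rate, and the number of steps are all assumptions, and the toy data is a stand-in used to show the calling convention rather than the Lorentzian/Gaussian dataset itself.

```python
import numpy as np

def fit_logistic_regression(X, y, learning_rate=0.1, num_steps=500):
    '''Fit weights w and bias b by gradient descent on the mean binary cross entropy.'''
    num_samples, num_features = X.shape
    w = np.zeros(num_features)
    b = 0.0
    for _ in range(num_steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability of class 1
        grad = p - y                             # derivative of the loss w.r.t. each logit
        w -= learning_rate * (X.T @ grad) / num_samples
        b -= learning_rate * grad.mean()
    return w, b

def predict(X, w, b):
    # class 1 whenever the logit is non-negative, i.e. the probability is >= 0.5
    return (X @ w + b >= 0).astype(int)

# toy, linearly separable stand-in data (hypothetical, for illustration only)
X_toy = np.array([[-2.0], [-1.0], [1.0], [2.0]])
y_toy = np.array([0, 0, 1, 1])

w, b = fit_logistic_regression(X_toy, y_toy)
print('predictions:', predict(X_toy, w, b))
```

To compare with the network, pass the same training curves and targets to `fit_logistic_regression` and score `predict` on the held-out curves. With raw curve values as features, the learning rate will likely need to be reduced or the inputs rescaled for the descent to remain stable.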