[Seaborn](https://seaborn.pydata.org/) is a Python data visualization library. It provides a high-level interface for drawing attractive and informative statistical graphics.

[`scipy.optimize.minimize`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html) minimizes a scalar function of one or more variables.

In [None]:
!pip install latexify-py
import math
import latexify
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import scipy.optimize as so
seed = 0
np.random.seed(seed)

## Load the data and take a sneak peek

In [None]:
filename = "Lab_2_data.csv"
# Load the running speed dataset and drop the "Unnamed: 0" column. Feature "age" is in [yrs] and "pace" in [min/km]
data = pd.read_csv(filename).drop("Unnamed: 0", axis=1)
data.head()

## 1D Distribution Plots

A histogram aims to approximate the underlying probability density function that generated the data by binning and counting observations.

In [None]:
# Histogram
# Assumption: the "pace" column is plotted here; "age" works the same way
sns.histplot(data["pace"])
plt.show()
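
To make "binning and counting" concrete, here is a minimal sketch using NumPy only. The data are synthetic stand-ins (not the lab dataset), and the choice of 20 bins is an arbitrary assumption. With `density=True`, the counts are rescaled so that the bar areas sum to one, which is what lets a histogram approximate a density.

```python
import numpy as np

# Synthetic stand-in data (assumption: not the lab dataset)
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=1.0, size=1_000)

# Bin the observations and count them; density=True rescales the counts
# so that the total area of the bars equals 1
density, edges = np.histogram(sample, bins=20, density=True)
widths = np.diff(edges)

total_area = float(np.sum(density * widths))
print("number of bins:", len(density))
print("total area under the histogram:", round(total_area, 6))
```

Changing `bins` trades resolution against noise: many narrow bins follow the data closely but look jagged, while a few wide bins look smooth but hide detail.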

Kernel density estimation (KDE) takes a different approach to the same problem. Rather than using discrete bins, a KDE plot smooths the observations with a Gaussian kernel, producing a continuous density estimate.

In [None]:
# Histogram and Density
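# Assumption: the "pace" column is plotted here; kde=True overlays the density estimate
sns.histplot(data["pace"], kde=True)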
plt.show()
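
The smoothing step can be sketched outside of seaborn with `scipy.stats.gaussian_kde`, which places a Gaussian kernel on every observation and averages them. The data below are synthetic, and the bandwidth is left at SciPy's default (Scott's rule), which is an assumption and may differ from the settings you pick in seaborn.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in data (assumption: not the lab dataset)
rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=1.0, size=1_000)

# Fit a Gaussian KDE; the bandwidth defaults to Scott's rule
kde = gaussian_kde(sample)

# Evaluate the continuous density estimate on a grid wider than the data
grid = np.linspace(sample.min() - 3, sample.max() + 3, 500)
density = kde(grid)

# A Riemann sum over the grid should be close to 1 for a valid density
area = float(np.sum(density) * (grid[1] - grid[0]))
print("approximate area under the KDE:", round(area, 3))
```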

In [None]:
# Histogram and Density - specify # bins
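# Assumption: the "pace" column with 20 bins; adjust `bins` to see the effect
sns.histplot(data["pace"], kde=True, bins=20)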
plt.show()

## 2D Joint Plots

In [None]:
# Joint distribution plot
sns.jointplot(data=data, x="age", y="pace")
plt.show()

In [None]:
# Joint distribution plot with estimated density
sns.jointplot(data=data, x="age", y="pace", kind="kde")
plt.show()

In [None]:
# Joint distribution plot with regression line
sns.jointplot(data=data, x="age", y="pace", kind="reg")
plt.show()

In [None]:
# Joint distribution plot with hexagons
sns.jointplot(data=data, x="age", y="pace", kind="hex")
plt.show()

Please peruse [Visualizing Distributions of Data](https://seaborn.pydata.org/tutorial/distributions.html#), and try your hand at the different options shown there.

## Maximum Likelihood Example - Laplace Distribution

In [None]:
# Generate data
n = 10001
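# Assumption: both samples are centred at 0 with unit variance.
# A Laplace(mu, b) variable has variance 2*b**2, so b = 1/sqrt(2) gives variance 1.
normalData = pd.DataFrame({"NormalData": np.random.normal(loc=0, scale=1, size=n)})
laplaceData = pd.DataFrame({"LaplaceData": np.random.laplace(loc=0, scale=1/np.sqrt(2), size=n)})

# concatenation
df = pd.concat([normalData, laplaceData], axis=1)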
df.tail()

In [None]:
# Plot
sns.kdeplot(df['NormalData'], fill=True, color='Blue', label='Normal')
sns.kdeplot(df['LaplaceData'], fill=True, color='Orange', label='Laplace')
plt.xlabel('Data')
plt.legend()
plt.show()

In [None]:
#Zoom in
sns.kdeplot(df['NormalData'], fill=True, color='Blue', label='Normal')
sns.kdeplot(df['LaplaceData'], fill=True, color='Orange', label='Laplace')
plt.xlabel('Data')
plt.xlim(2,6)
plt.ylim(0,0.1)
plt.legend()
plt.show()
# You see more extreme values with Laplace even with the same variance. The Laplace
# distribution is useful for modeling data where more extreme values are expected.
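
The claim about extreme values can be checked numerically. The sketch below draws synthetic Normal and Laplace samples with matched variance (a Laplace variable with scale `b` has variance `2*b**2`, so `b = sigma/sqrt(2)`) and counts how often each lands more than three standard deviations from the centre. The sample size and the threshold of three are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
sigma = 1.0

# Match the variances: Var[Laplace(mu, b)] = 2 * b**2
normal = rng.normal(0.0, sigma, n)
laplace = rng.laplace(0.0, sigma / np.sqrt(2), n)

# Fraction of observations further than 3 standard deviations from 0
normal_tail = float(np.mean(np.abs(normal) > 3 * sigma))
laplace_tail = float(np.mean(np.abs(laplace) > 3 * sigma))

print("sample variances:", round(float(normal.var()), 3), round(float(laplace.var()), 3))
print("tail fraction, Normal :", normal_tail)
print("tail fraction, Laplace:", laplace_tail)
```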

The likelihood of the data is the product of the probability density function evaluated at each data point, so we first need the probability density function of the [Laplace distribution](https://en.wikipedia.org/wiki/Laplace_distribution).
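
For reference, the derivation can be written out as follows, with location $\mu$, scale $b > 0$, and observations $y_1, \dots, y_n$ assumed independent:

$$f(y \mid \mu, b) = \frac{1}{2b} \exp\left(-\frac{|y - \mu|}{b}\right)$$

$$\mathcal{L}(\mu, b) = \prod_{i=1}^{n} f(y_i \mid \mu, b) = \frac{1}{(2b)^n} \exp\left(-\frac{1}{b} \sum_{i=1}^{n} |y_i - \mu|\right)$$

$$-\log \mathcal{L}(\mu, b) = n \log(2b) + \frac{1}{b} \sum_{i=1}^{n} |y_i - \mu|$$

For a fixed $b$, only the sum of absolute deviations depends on $\mu$, so minimizing the negative log-likelihood over $\mu$ is the same as minimizing $\sum_i |y_i - \mu|$.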

We work with the negative log-likelihood because SciPy provides a minimization function: minimizing the negative log-likelihood is equivalent to maximizing the log-likelihood. Recall that we take the log because a product of many small density values underflows to zero in floating-point arithmetic.
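
The underflow problem is easy to reproduce. This sketch uses synthetic Laplace data with assumed parameters `mu = 0` and `b = 1`: the raw product of 10,001 density values collapses to zero in double precision, while the sum of log-densities stays finite.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.laplace(loc=0.0, scale=1.0, size=10_001)

mu, b = 0.0, 1.0  # assumed parameter values for the illustration

# Laplace density and log-density at every observation
pdf = np.exp(-np.abs(y - mu) / b) / (2 * b)
log_pdf = -np.log(2 * b) - np.abs(y - mu) / b

likelihood = float(np.prod(pdf))         # product of many values <= 0.5
log_likelihood = float(np.sum(log_pdf))  # same quantity on the log scale

print("likelihood:", likelihood)
print("log-likelihood:", log_likelihood)
```

Every density value here is at most `1/(2*b) = 0.5`, so the product is at most `0.5**10001`, far below the smallest positive double.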

In [None]:
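# Laplace Negative Log Likelihood
def laplaceNegLogLikelihood(mu, b, y):
    # -log L = n*log(2b) + (1/b) * sum(|y_i - mu|)
    neg_log_lik = len(y) * np.log(2 * b) + np.sum(np.abs(y - mu)) / b
    return neg_log_lik

# Let's latexify the Laplace Negative Log Likelihood function
# (recent latexify-py releases expose the decorator as `latexify.function`;
# older releases called it `latexify.with_latex`)
@latexify.function
def L(mu, b, y, n):
    return -(-n*log(2*b) - (1/b)*sum(abs(y-mu)))
L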

Develop a maximum likelihood function that takes a set of data and produces the maximum likelihood estimate for the mean of those data:

In [None]:
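## Laplace Maximum Likelihood Estimate for mu
def maximumLikelihood(y):
    # Assumptions: start at mu = 1 and hold the scale at b = 1 (it does not change the
    # minimizing mu); Powell is derivative-free, which suits the non-smooth |y - mu| term.
    RES = so.minimize(laplaceNegLogLikelihood, x0=1, args=(1, y), method="Powell", tol=1e-8)
    return RES.x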

# Test the function
maximumLikelihood(df.LaplaceData.values)

The estimate is very close to zero, the location that was set when generating the original data.

In [None]:
# Minimizing the sum of absolute differences should be the same as median:
round(df.LaplaceData.median(),4)
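
As a quick sanity check of why the median appears here, the sketch below evaluates the sum of absolute deviations at the sample median, at the sample mean, and at a few nearby points. The data are synthetic and the offsets are arbitrary assumptions; the helper `sum_abs_dev` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.laplace(loc=0.0, scale=1.0, size=10_001)

def sum_abs_dev(m, y):
    """Sum of absolute deviations of y from a candidate centre m."""
    return float(np.sum(np.abs(y - m)))

median = float(np.median(y))
cost_at_median = sum_abs_dev(median, y)
cost_at_mean = sum_abs_dev(float(np.mean(y)), y)

# Moving away from the median in either direction should not lower the cost
offsets = (-0.1, -0.01, 0.01, 0.1)
cost_nearby = [sum_abs_dev(median + d, y) for d in offsets]

print("cost at median:", round(cost_at_median, 3))
print("cost at mean  :", round(cost_at_mean, 3))
print("costs nearby  :", [round(c, 3) for c in cost_nearby])
```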

In [None]:
# Laplace Negative Log Likelihood for regression
def laplaceRegNegLogLikelihood(beta, X, y, b=1):
    mu = X @ beta  # compute the mean that we would have if we assumed that the mean is X times beta
    return laplaceNegLogLikelihood(mu, b, y)

Now, to obtain the maximum likelihood estimates of the regression coefficients, we write a function that takes in the X's and the Y's and optimizes `laplaceRegNegLogLikelihood` to find the maximum likelihood estimate of the vector beta.

In [None]:
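# Function to maximize regression log likelihood
def maximumRegLikelihood(X, y, negloglik=laplaceRegNegLogLikelihood):
    nrows, ncols = X.shape
    betas = np.zeros(ncols)  # assumed starting point: all coefficients at zero
    RES = so.minimize(negloglik, betas, args=(X, y), method="Powell", tol=1e-8)
    print(RES.x)
    return RES.x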
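
The regression idea can be tried end to end on synthetic data where the true coefficients are known. Everything in this sketch is an assumption made for illustration: the coefficients `[2.0, 0.5]`, the noise scale, the fixed `b = 1`, the choice of the derivative-free Powell method, and the least-squares fit used as the starting point. The function name `neg_log_lik` is hypothetical and stands in for the lab's own function.

```python
import numpy as np
import scipy.optimize as so

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0.0, 10.0, n)
true_beta = np.array([2.0, 0.5])          # assumed intercept and slope
X = np.column_stack([np.ones(n), x])      # design matrix with an intercept column
y = X @ true_beta + rng.laplace(0.0, 0.3, n)

def neg_log_lik(beta, X, y, b=1.0):
    """Laplace negative log-likelihood with mean X @ beta and fixed scale b."""
    resid = y - X @ beta
    return len(y) * np.log(2 * b) + np.sum(np.abs(resid)) / b

# Start from the ordinary least-squares fit, then refine under the Laplace model
beta0, *_ = np.linalg.lstsq(X, y, rcond=None)
res = so.minimize(neg_log_lik, beta0, args=(X, y), method="Powell")
beta_hat = res.x

print("least-squares start:", beta0)
print("Laplace ML estimate:", beta_hat)
```

Because `b` is held fixed, this is the same as least absolute deviations regression: the fitted line minimizes the sum of absolute residuals rather than the sum of squares.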

Let's test this out:

In [None]:
run_data = pd.read_csv(filename).drop("Unnamed: 0", axis=1)
run_data.head()

In [None]:
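x_train = run_data.age.values
X_train = np.c_[np.ones(len(x_train)), x_train]  # intercept column plus age
y_train = run_data.pace.values

betas = maximumRegLikelihood(X_train, y_train)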

In [None]:
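x_new = np.linspace(x_train.min(), x_train.max(), 100)  # assumed grid over the observed ages
X_new = np.c_[np.ones(len(x_new)), x_new]
y_predicted = X_new @ betas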

fig, ax = plt.subplots()
fig.set_size_inches(12, 7)
plt.scatter(x_train, y_train, c='blue', label='Training Data')
plt.plot(x_new, y_predicted, c='orange', label='Maximum Likelihood Regression')
plt.legend()
plt.xlabel("Age [Years]")
plt.ylabel("Pace [Minutes/Kilometer]")
plt.show()

Let's compare the actual distributions of the data against some ideal distributions:

In [None]:
data = run_data['age']

fig, ax = plt.subplots()
fig.set_size_inches(12, 7)

sns.kdeplot(data, fill=True, color='Orange', label='Actual Distribution')
ylim = ax.get_ylim()
plt.vlines(data.median(), ylim[0], ylim[1], color='green', label='Actual Distribution: Median Location')
plt.vlines(data.mean(), ylim[0], ylim[1], color='red', label='Actual Distribution: Mean Location')

# Laplace scale b = std/sqrt(2) so that its variance (2*b**2) matches the data
sns.kdeplot(np.random.laplace(data.mean(), data.std() / np.sqrt(2), 100000), fill=True, color='Green', label='Ideal Laplace Distribution')
sns.kdeplot(np.random.normal(data.mean(), data.std(), 100000), fill=True, color='Blue', label='Ideal Normal Distribution')

plt.xlim(0,100)
plt.ylim(ylim)
plt.legend()
plt.show()

In [None]:
data = run_data['pace']

fig, ax = plt.subplots()
fig.set_size_inches(12, 7)

sns.kdeplot(data, fill=True, color='Orange', label='Actual Distribution')
# Laplace scale b = std/sqrt(2) so that its variance (2*b**2) matches the data
sns.kdeplot(np.random.laplace(data.mean(), data.std() / np.sqrt(2), 100000), fill=True, color='Green', label='Ideal Laplace Distribution')
ylim = ax.get_ylim()
sns.kdeplot(np.random.normal(data.mean(), data.std(), 100000), fill=True, color='Blue', label='Ideal Normal Distribution')

plt.vlines(data.median(), ylim[0], ylim[1], color='green', label='Actual Distribution: Median Location')
plt.vlines(data.mean(), ylim[0], ylim[1], color='red', label='Actual Distribution: Mean Location')
plt.xlim(0,10)
plt.ylim(ylim)
plt.legend()
plt.show()

Which ideal distribution is a better choice to represent the actual data?
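
One way to make this comparison quantitative, beyond inspecting the plots, is to fit both candidate distributions by maximum likelihood and compare the resulting log-likelihoods. The sketch below does this with `scipy.stats` on a synthetic sample (an assumption: it is drawn from a Laplace distribution, not from the running data), so it shows the procedure rather than the answer for `age` or `pace`.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in sample (assumption: Laplace-distributed, not the lab data)
rng = np.random.default_rng(0)
sample = rng.laplace(loc=5.0, scale=0.7, size=5_000)

# Maximum likelihood fits: (loc, scale) for each family
norm_params = stats.norm.fit(sample)
laplace_params = stats.laplace.fit(sample)

# Total log-likelihood of the sample under each fitted distribution
norm_ll = float(np.sum(stats.norm.logpdf(sample, *norm_params)))
laplace_ll = float(np.sum(stats.laplace.logpdf(sample, *laplace_params)))

print("Normal  fit:", norm_params, "log-likelihood:", round(norm_ll, 1))
print("Laplace fit:", laplace_params, "log-likelihood:", round(laplace_ll, 1))
```

The family with the higher log-likelihood describes the sample better. Both families have two parameters here, so the comparison needs no penalty for model size.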