# Homework 3.1: Normal approximations (35 pts)

<hr />

**a)** Imagine I have a univariate continuous distribution with PDF $f(y)$ that has a maximum at $y^*$. Assume that the first and second derivatives of $f(y)$ are defined and continuous near $y^*$. Show by expanding the log PDF of this distribution in a Taylor series about $y^*$ that the distribution is locally Normal near the maximum.

In performing the Taylor series, how is the scale parameter $\sigma$ of the Normal approximation of the distribution related to the log PDF of the distribution it is approximating?

**b)** Another way you can approximate a distribution as Normal is to use its mean and variance as the parameters of the approximate Normal. We will call this technique "equating moments." Can you do this if the distribution you are approximating has heavy tails, say like a Cauchy distribution? Why or why not?

**c)** Make plots of the PDF and CDF of the following distributions with their Normal approximations as derived from the Taylor series and by equating moments. Do you have any comments about the approximations?

<!-- - Student-t with *µ = 0*, *σ* = 1, and *ν* = 4
- Cauchy with with *µ = 0*, *σ* = 1 -->
- Beta with *α* = *β* = 10
- Gamma with *α* = 5 and *β* = 2


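a) We have a PDF $f(y)$ with a maximum at $y^*$, and its first and second derivatives are defined and continuous near $y^*$. Expanding the log PDF in a Taylor series about $y^*$ to second order gives
\begin{align}
\ln f(y) \approx \ln f(y^*) + (y-y^*)\left.\frac{\mathrm{d}\ln f}{\mathrm{d}y}\right|_{y^*} + \frac{(y-y^*)^2}{2}\left.\frac{\mathrm{d}^2\ln f}{\mathrm{d}y^2}\right|_{y^*}.
\end{align}
Since $y^*$ is a maximum, $f'(y^*) = 0$, so the first derivative of the log PDF, $f'(y^*)/f(y^*)$, vanishes. Thus
\begin{align}
\ln f(y) \approx \ln f(y^*) - \frac{(y-y^*)^2}{2\sigma^2}, \qquad \sigma^2 = -\left(\left.\frac{\mathrm{d}^2\ln f}{\mathrm{d}y^2}\right|_{y^*}\right)^{-1} = -\frac{f(y^*)}{f''(y^*)}.
\end{align}
Exponentiating, $f(y) \approx f(y^*)\,\mathrm{e}^{-(y-y^*)^2/2\sigma^2}$, which has the form of a Normal PDF with location parameter $y^*$ and scale parameter $\sigma$. The constant $\ln f(y^*)$ only sets the height of the peak, so the distribution is locally Normal near its maximum.

The scale parameter comes from the second-order term of the Taylor expansion of $\ln f(y)$: $\sigma$ is the inverse square root of the negative second derivative of the log PDF evaluated at the maximum. It is real and positive because the curvature of the log PDF is negative at a maximum (assuming $f''(y^*) \neq 0$).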
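The relation between the scale parameter and the curvature of the log PDF can be checked numerically. The following is a minimal sketch, not part of the original solution: the helper name `taylor_normal`, the search bracket, and the finite-difference step `eps` are all illustrative assumptions. It locates the mode with `scipy.optimize.minimize_scalar` and estimates the second derivative of the log PDF by central differences.

```python
import numpy as np
import scipy.optimize
import scipy.stats as stats


def taylor_normal(logpdf, bracket, eps=1e-4):
    """Return (y_star, sigma) of the Normal approximation at the mode.

    `logpdf` is a callable returning ln f(y); `bracket` is an interval
    (lo, hi) assumed to contain a single maximum of f. Hypothetical helper.
    """
    # Locate the mode y* by minimizing -ln f(y) on the bracket
    res = scipy.optimize.minimize_scalar(
        lambda y: -logpdf(y), bounds=bracket, method="bounded"
    )
    y_star = res.x

    # Central finite difference for the second derivative of ln f at y*
    d2 = (logpdf(y_star + eps) - 2 * logpdf(y_star) + logpdf(y_star - eps)) / eps**2

    # sigma^2 = -1 / (d^2 ln f / dy^2) evaluated at y*
    return y_star, np.sqrt(-1 / d2)


# Sanity check on a distribution that is exactly Normal
y_star, sigma = taylor_normal(lambda y: stats.norm.logpdf(y, 1.5, 0.3), (0, 3))
print(y_star, sigma)  # should be close to 1.5 and 0.3
```

For a distribution that is already Normal the expansion is exact, so the recovered location and scale should match the inputs up to numerical error.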

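b) Not for a Cauchy distribution. Equating moments requires that the mean and variance exist, and for the Cauchy distribution neither does. Its tails decay like $1/y^2$, so the integral defining the mean, $\int y\,f(y)\,\mathrm{d}y$, does not converge, and the second moment, $\int y^2 f(y)\,\mathrm{d}y$, diverges. There is therefore no mean or variance to use as parameters of the approximate Normal. Heavy tails alone are not the obstacle: a heavy-tailed distribution whose first two moments are finite (for example, a Student-t distribution with $\nu > 2$) can be moment-matched, although the Normal approximation will still underrepresent the tails.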
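The divergence can be seen numerically. This is an illustrative sketch, not part of the original solution: the helper `truncated_second_moment` and the truncation limits are assumptions chosen for the demonstration. It integrates $y^2 f(y)$ for a standard Cauchy over $[-L, L]$ and shows that the result keeps growing with $L$.

```python
import numpy as np
import scipy.integrate
import scipy.stats as stats

# SciPy reports no finite mean or variance for the Cauchy distribution
print(stats.cauchy.mean(), stats.cauchy.var())


def truncated_second_moment(L):
    """Integral of y^2 f(y) over [-L, L] for a standard Cauchy (hypothetical helper)."""
    integrand = lambda y: y**2 * stats.cauchy.pdf(y)
    # The integrand is even, so integrate over [0, L] and double
    value, _ = scipy.integrate.quad(integrand, 0, L, limit=200)
    return 2 * value


moments = [truncated_second_moment(L) for L in (10, 100, 1000)]
print(moments)  # grows roughly linearly with L instead of converging

# For comparison, a Student-t with nu = 4 has a finite variance
print(stats.t.var(4))
```

The truncated integral has the closed form $(2/\pi)(L - \arctan L)$, which grows without bound as $L \to \infty$.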


c) For a Beta distribution with $α = β = 10$, equating moments gives the mean and variance
\begin{align}
μ &= \frac{α}{α + β} = 0.5, \\[1em]
σ^2 &= \frac{α β}{(α+β)^2(α+β+1)} \approx 0.0119,
\end{align}
so the standard deviation is $σ = \sqrt{0.0119} \approx 0.109$.

For a Gamma distribution with $α = 5$ and $β = 2$, equating moments gives the mean and variance
\begin{align}
μ &= \frac{α}{β} = 2.5, \\[1em]
σ^2 &= \frac{α}{β^2} = 1.25,
\end{align}
so the standard deviation is $σ = \sqrt{1.25} \approx 1.118$.
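The moment formulas for both distributions can be cross-checked against `scipy.stats`. This is a short sketch; the variable names are illustrative, and it assumes $β$ for the Gamma distribution is a rate parameter, so SciPy's `scale` is $1/β$.

```python
import numpy as np
import scipy.stats as stats

# Beta(alpha=10, beta=10)
a, b = 10, 10
beta_mean = a / (a + b)
beta_var = a * b / ((a + b) ** 2 * (a + b + 1))
beta_std = np.sqrt(beta_var)

# Gamma(alpha=5, beta=2), with beta a rate, so SciPy's scale is 1/beta
alpha, rate = 5, 2
gamma_mean = alpha / rate
gamma_var = alpha / rate**2
gamma_std = np.sqrt(gamma_var)

print(beta_mean, beta_var, beta_std)
print(gamma_mean, gamma_var, gamma_std)

# Cross-check the hand-computed values against SciPy
assert np.isclose(beta_mean, stats.beta.mean(a, b))
assert np.isclose(beta_std, stats.beta.std(a, b))
assert np.isclose(gamma_mean, stats.gamma.mean(alpha, scale=1 / rate))
assert np.isclose(gamma_std, stats.gamma.std(alpha, scale=1 / rate))
```

Note that `stats.norm` takes the standard deviation, not the variance, as its scale argument.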


In [32]:
# Colab setup ------------------
import os, sys, subprocess
if "google.colab" in sys.modules:
    cmd = "pip install --upgrade iqplot colorcet datashader bebi103 watermark"
    process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
    data_path = "https://s3.amazonaws.com/bebi103.caltech.edu/data/"
else:
    data_path = "../data/"
# ------------------------------

import pandas as pd
import numpy as np
import iqplot
import bokeh.io
import bokeh.layouts
import bokeh.models
import bokeh.plotting
import scipy.stats as stats
from scipy.interpolate import UnivariateSpline
from scipy.signal import find_peaks

bokeh.io.output_notebook()

This cell defines the evaluation grids and a general plotting function that draws the true distribution and its Normal approximations on the same figure, each with its own color and legend label. Each panel can then be produced with a single call to the function.

In [56]:
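t1 = np.linspace(0, 1)
t2 = np.linspace(0, 10)


def taylor_params(x, pdf):
    """Mode and scale of the Taylor-series Normal approximation.

    Uses finite differences of the log PDF on the evenly spaced grid x,
    assuming the maximum of pdf lies in the interior of the grid.
    """
    i = np.argmax(pdf)
    dx = x[1] - x[0]
    lo, mid, hi = np.log(pdf[i - 1 : i + 2])
    d1 = (hi - lo) / (2 * dx)
    d2 = (hi - 2 * mid + lo) / dx**2
    # One Newton step refines the mode; sigma^2 = -1 / (d^2 ln f / dy^2)
    return x[i] - d1 / d2, np.sqrt(-1 / d2)


def plot(linspace, function1, function2, function3, title, yaxis):
    """Plot a distribution together with its two Normal approximations.

    function1: true PDF or CDF evaluated on linspace
    function2: Normal PDF or CDF obtained by equating moments
    function3: true PDF on linspace, used to build the Taylor approximation
    """
    p = bokeh.plotting.figure(
        height=400,
        width=600,
        x_axis_label="x",
        y_axis_label=yaxis,
        title=title,
    )
    z1 = p.line(x=linspace, y=function1, color="orange")
    z2 = p.line(x=linspace, y=function2)

    # Taylor approximation: a PDF if this panel shows the PDF, otherwise a CDF
    mode, sigma = taylor_params(linspace, function3)
    is_pdf_panel = np.allclose(function1, function3)
    normal = stats.norm.pdf if is_pdf_panel else stats.norm.cdf
    z3 = p.line(x=linspace, y=normal(linspace, mode, sigma), color="red")

    # Legend placed to the right of the plot area
    legend = bokeh.models.Legend(
        items=[
            ("True distribution", [z1]),
            ("Equating moments", [z2]),
            ("Taylor approximation", [z3]),
        ],
        location=(0, 265),
    )
    p.add_layout(legend, "right")
    return p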


In [58]:
# Taylor-series Normal approximation of Beta(10, 10), computed numerically
alpha, beta = 10, 10

# The mode y* maximizes the PDF; for a Beta with alpha, beta > 1 it has a closed form
mode = (alpha - 1) / (alpha + beta - 2)

# Second derivative of the log PDF at the mode by central finite differences
eps = 1e-4
log_pdf = lambda y: stats.beta.logpdf(y, alpha, beta)
d2 = (log_pdf(mode + eps) - 2 * log_pdf(mode) + log_pdf(mode - eps)) / eps**2

# sigma^2 = -1 / (d^2 ln f / dy^2) evaluated at y*
scaling = np.sqrt(-1 / d2)
print(mode)
print(d2)
print(scaling)


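The next cell plots all four panels using the plotting function defined above.

For both distributions, the Taylor approximation is a Normal centered at the mode $y^*$ with scale parameter $\sigma = \left(-\left.\mathrm{d}^2\ln f/\mathrm{d}y^2\right|_{y^*}\right)^{-1/2}$.

For the Beta distribution with $α = β = 10$, the mode is $y^* = 0.5$ and the scale parameter is $\sigma = 1/\sqrt{72} \approx 0.118$. For the Gamma distribution with $α = 5$ and $β = 2$, the mode is $y^* = 2$ and the scale parameter is $\sigma = 1$.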
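For these two families the mode and the curvature of the log PDF have closed forms, so the Taylor parameters can be computed directly. The helpers `taylor_beta` and `taylor_gamma` below are hypothetical names introduced for illustration. They assume shape parameters greater than 1, so that an interior mode exists, and a rate parameterization for the Gamma distribution.

```python
import numpy as np


def taylor_beta(a, b):
    """Mode and Taylor scale for Beta(a, b), assuming a > 1 and b > 1."""
    # ln f(y) = (a - 1) ln y + (b - 1) ln(1 - y) + const
    mode = (a - 1) / (a + b - 2)
    d2 = -(a - 1) / mode**2 - (b - 1) / (1 - mode) ** 2
    return mode, np.sqrt(-1 / d2)


def taylor_gamma(alpha, rate):
    """Mode and Taylor scale for Gamma(alpha, rate), assuming alpha > 1."""
    # ln f(y) = (alpha - 1) ln y - rate * y + const
    mode = (alpha - 1) / rate
    d2 = -(alpha - 1) / mode**2
    return mode, np.sqrt(-1 / d2)


beta_mode, beta_sigma = taylor_beta(10, 10)
gamma_mode, gamma_sigma = taylor_gamma(5, 2)
print(beta_mode, beta_sigma)    # 0.5 and 1/sqrt(72), about 0.118
print(gamma_mode, gamma_sigma)  # 2.0 and 1.0
```

These values differ from the moment-matched parameters whenever the distribution is skewed, since the mode and mean then no longer coincide.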

In [57]:
t1 = np.linspace(0, 1)

# Mean and standard deviation for the Normal approximations by equating moments
mu_beta, sd_beta = stats.beta.mean(10, 10), stats.beta.std(10, 10)
mu_gamma, sd_gamma = stats.gamma.mean(5, scale=1/2), stats.gamma.std(5, scale=1/2)

# True PDFs on their grids
beta_pdf = stats.beta.pdf(t1, 10, 10)
gamma_pdf = stats.gamma.pdf(t2, 5, scale=1/2)

# Using the plot function to plot everything requested
grid = bokeh.layouts.gridplot([
    [plot(t1, beta_pdf, stats.norm.pdf(t1, mu_beta, sd_beta), beta_pdf, "Beta PDF", "f(x)"),
     plot(t1, stats.beta.cdf(t1, 10, 10), stats.norm.cdf(t1, mu_beta, sd_beta), beta_pdf, "Beta CDF", "F(x)")],
    [plot(t2, gamma_pdf, stats.norm.pdf(t2, mu_gamma, sd_gamma), gamma_pdf, "Gamma PDF", "f(x)"),
     plot(t2, stats.gamma.cdf(t2, 5, scale=1/2), stats.norm.cdf(t2, mu_gamma, sd_gamma), gamma_pdf, "Gamma CDF", "F(x)")]])

bokeh.io.show(grid)

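c) The Normal approximations of the Gamma distribution, in both the PDF and the CDF, do a significantly worse job than the corresponding approximations of the Beta distribution. The Beta distribution with $α = β$ is symmetric about 0.5, so its mode and mean coincide and it closely resembles a Normal distribution. The Gamma distribution with $α = 5$ and $β = 2$ is right-skewed: its mode (2) lies below its mean (2.5), so the Taylor approximation matches the peak but misses the right tail, while the moment-matched Normal is shifted off the peak and assigns some probability to negative values, where the Gamma PDF is zero.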
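One way to put numbers on this comparison is the largest absolute gap between the true CDF and each Normal CDF. The sketch below is an added illustration, not part of the original analysis: the helper `max_cdf_error`, the grids, and the choice of this error measure are assumptions. The Taylor parameters are the closed-form values for these two distributions (mode 0.5 and scale $1/\sqrt{72}$ for the Beta, mode 2 and scale 1 for the Gamma).

```python
import numpy as np
import scipy.stats as stats


def max_cdf_error(dist, loc, scale, x):
    """Largest absolute gap between dist's CDF and a Normal CDF on the grid x."""
    return np.max(np.abs(dist.cdf(x) - stats.norm.cdf(x, loc, scale)))


beta_dist = stats.beta(10, 10)
gamma_dist = stats.gamma(5, scale=1 / 2)
x_beta = np.linspace(0, 1, 1001)
x_gamma = np.linspace(0, 10, 1001)

errors = {
    "Beta, equating moments": max_cdf_error(
        beta_dist, beta_dist.mean(), beta_dist.std(), x_beta
    ),
    "Beta, Taylor": max_cdf_error(beta_dist, 0.5, 1 / np.sqrt(72), x_beta),
    "Gamma, equating moments": max_cdf_error(
        gamma_dist, gamma_dist.mean(), gamma_dist.std(), x_gamma
    ),
    "Gamma, Taylor": max_cdf_error(gamma_dist, 2.0, 1.0, x_gamma),
}
for label, err in errors.items():
    print(f"{label}: {err:.3f}")
```

A larger value means the approximate CDF strays further from the true one somewhere on the grid, so the Gamma entries should come out larger than the Beta entries for both methods.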

In [28]:
%load_ext watermark
%watermark -v -p pandas,jupyterlab

Python implementation: CPython
Python version       : 3.7.15
IPython version      : 7.9.0

pandas    : 1.3.5
jupyterlab: not installed

