# Gamma Regression Implementation

## Preliminaries

### Imports

In [1]:
import numpy as np
import numpy.random as random

import matplotlib.pyplot as plt

from scipy.special import factorial
from sklearn.linear_model import LinearRegression
import scipy.stats as stats
import scipy.optimize as optimize
import scipy.special as special

import pandas as pd

import sys

%matplotlib inline

### Random Seed

In [2]:
seed=567
np.random.seed(seed)

## Gamma Regression

When the response is a positive real number, $Y\in\mathbb{R}^+$ (i.e. $y>0$), it is natural to assume that $Y|X$ follows a Gamma distribution. For simplicity we will assume that the shape parameter $\alpha$ is a fixed, **known** parameter.

$$
    p(y|x) = \frac{\beta^\alpha}{\Gamma(\alpha)}y^{\alpha-1}e^{-\beta y} = \frac{y^{\alpha-1}}{\Gamma(\alpha)}e^{ -y\beta +  \alpha\log \beta }
$$
where $\alpha>0$ is the shape and $\beta>0$ is the rate.

We pair this with a non-canonical log link, so that the natural parameter $\eta=-\beta$ is

$$
    \eta(x) = -e^{-b_0 - w_0 x}
$$
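
To make the link concrete, the rate and conditional mean it implies can be written out. This is a short sketch that assumes the shape/rate parametrization of the density above, reads the natural parameter off the exponent as $\eta=-\beta$, and uses the standard Gamma mean $\alpha/\beta$:

$$
    \beta(x) = -\eta(x) = e^{-b_0 - w_0 x}, \qquad \mathbb{E}[Y|x] = \frac{\alpha}{\beta(x)} = \alpha\, e^{b_0 + w_0 x}
$$

so $\log \mathbb{E}[Y|x] = \log\alpha + b_0 + w_0 x$ is linear in $x$, which is the sense in which this is a log link.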



### Generate random sample data

<div class="alert alert-block alert-info"> Problem 0 </div>
Generate some random data to test the model.

In [3]:
alpha=5.0

The true parameters used to generate the data are
$$
    \theta_0=(b_0,w_0)=(0,1)
$$

In [4]:
b0=0
W0=1

theta0=np.array([b0,W0])
theta0.shape

(2,)

We generate $N=30$ samples, with $X\sim \mathcal{U}(0,4)$.

In [5]:
N=30
X=np.random.uniform(0,4,(N,1))
X.shape

(30, 1)

In [6]:
X1=np.c_[np.ones(len(X)),X]

b=np.exp(np.dot(X1,theta0))
Y=stats.gamma.rvs(a=alpha,scale=b)
Y.shape

(30,)

<div class="alert alert-block alert-info"> Problem 0.1 </div>
Make a scatter plot of `X` vs `Y`
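
One possible way to draw this plot is sketched below. The block regenerates the sample with the same seed and parameters as the cells above so that it runs on its own; the axis labels and legend text are illustrative choices, not part of the assignment.

```python
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

# Recreate the sample from the cells above so the block is self-contained.
np.random.seed(567)
alpha = 5.0
theta0 = np.array([0.0, 1.0])
N = 30
X = np.random.uniform(0, 4, (N, 1))
X1 = np.c_[np.ones(len(X)), X]
Y = stats.gamma.rvs(a=alpha, scale=np.exp(X1 @ theta0))

# Scatter plot of the inputs against the responses.
fig, ax = plt.subplots()
ax.scatter(X[:, 0], Y, label="data")
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.legend()
```

Because the scale grows like $e^{x}$, the spread of `Y` should widen visibly toward larger `X`.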

## Fit linear model

<div class="alert alert-block alert-info"> Problem 1.0 </div>
To demonstrate what goes wrong with a linear least squares model, fit the data to a linear regression.
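
A minimal sketch of the least squares fit follows. It swaps in `np.linalg.lstsq` on the design matrix $[1, x]$ instead of the `LinearRegression` class imported above; both solve the same ordinary least squares problem. The names `theta_lin` and `y_lin` are hypothetical, and the data is regenerated so the block runs on its own.

```python
import numpy as np
import scipy.stats as stats

# Recreate the sample from the cells above.
np.random.seed(567)
alpha = 5.0
theta0 = np.array([0.0, 1.0])
X = np.random.uniform(0, 4, (30, 1))
X1 = np.c_[np.ones(len(X)), X]
Y = stats.gamma.rvs(a=alpha, scale=np.exp(X1 @ theta0))

# Ordinary least squares: theta_lin = (intercept, slope).
theta_lin, *_ = np.linalg.lstsq(X1, Y, rcond=None)

# Evaluate the fitted line on an evenly spaced grid of test points.
x_test = np.linspace(0, 4, 201).reshape(-1, 1)
X1_test = np.c_[np.ones(len(x_test)), x_test]
y_lin = X1_test @ theta_lin
print(theta_lin)
```

A straight line can neither follow the exponential growth of the mean nor guarantee positive predictions, which is what the comparison with the Gamma model is meant to expose.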



<div class="alert alert-block alert-info"> Problem 1.1 </div>
Make a line plot of the linear fit over the test points `x_test`,
superimposed on the $X,Y$ input data.

In [7]:
x_test=np.linspace(0,4,201).reshape(-1,1)

## Gamma Regression

There are specialized methods for fitting generalized linear models (iteratively reweighted least squares), but here we simply use the standard `scipy.optimize.minimize` routine to fit a Gamma regression model to the data.

### Link Function

<div class="alert alert-block alert-info"> Problem 2.1 </div>
Using the results you worked out on written homework 5
write the function
$$
    \hat{y}(x_1;\alpha,\theta)
$$
where $\theta=(b,w)$

and $x_1=(1,x)$ is a 2-dimensional vector whose first component is one (for a batch of inputs, a matrix whose first column is all ones).
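
As a sketch of what such a function might look like, the block below assumes the prediction is the conditional mean of a Gamma distribution with shape $\alpha$ and rate $e^{-b - w x}$, namely $\alpha\, e^{b + w x}$. The result from the written homework may be stated differently, and the name `y_hat` is hypothetical.

```python
import numpy as np

def y_hat(X1, alpha, theta):
    """Assumed conditional mean alpha * exp(X1 @ theta); X1 has a leading column of ones."""
    return alpha * np.exp(X1 @ theta)

# Two illustrative inputs, x = 0 and x = 2, with the column of ones prepended.
x1_demo = np.array([[1.0, 0.0], [1.0, 2.0]])
print(y_hat(x1_demo, 5.0, np.array([0.0, 1.0])))
```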

<div class="alert alert-block alert-info"> Problem 2.2 </div>
Using the *true parameters*
$$
    \theta_0=(b_0,w_0)
$$
Superimpose a plot of $\hat{y}(x,\alpha,\theta_0)$ on the plots of the linear fit
and the problem data $X,Y$.


### Loss and Loss Gradient

<div class="alert alert-block alert-info"> Problem 3.1 </div>
Write the maximum likelihood loss for the Gamma regression problem with log link

$$
E^{\textrm{log}}_{\textrm{Gam}}(\theta;X_1,Y,\alpha)
$$

where $X_1$ is an $N\times 2$ matrix and $Y$ is an $N$-vector of input data.

In [8]:
def GammaError(theta,X1,Y,alpha):
    pass
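
For orientation, here is one way the loss could be written, assuming it is the summed negative log-likelihood of the density above with rate $\beta_i = e^{-\theta\cdot x_{1,i}}$ and with the $\theta$-independent terms kept. The helper name `gamma_nll` and the demo arrays are hypothetical; the written homework may use a different normalization, such as a mean instead of a sum.

```python
import numpy as np
import scipy.stats as stats
from scipy.special import gammaln

def gamma_nll(theta, X1, Y, alpha):
    """Summed negative log-likelihood under the log link (hypothetical helper)."""
    z = X1 @ theta  # z_i = b + w * x_i, so the rate is beta_i = exp(-z_i)
    return np.sum(alpha * z + Y * np.exp(-z)
                  - (alpha - 1.0) * np.log(Y) + gammaln(alpha))

# Cross-check against scipy's Gamma log-density on a tiny made-up sample.
X1_demo = np.array([[1.0, 0.5], [1.0, 2.0], [1.0, 3.5]])
Y_demo = np.array([4.0, 30.0, 120.0])
theta_demo = np.array([-1.0, 1.5])
reference = -stats.gamma.logpdf(
    Y_demo, a=5.0, scale=np.exp(X1_demo @ theta_demo)).sum()
print(gamma_nll(theta_demo, X1_demo, Y_demo, 5.0), reference)
```

The two printed numbers should agree, since `scale` in `scipy.stats.gamma` is the reciprocal of the rate.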

<div class="alert alert-block alert-info"> Problem 3.2 </div>
Write the gradient of the maximum likelihood loss for the Gamma regression problem with log link

$$
\frac{\partial}{\partial \theta_d} E^{\textrm{log}}_{\textrm{Gam}}(\theta;X_1,Y,\alpha)
$$


In [9]:
def GammaErrorGradient(theta,X1,Y,alpha): 
    pass
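
A possible gradient is sketched below, under the same assumption that the loss is the summed negative log-likelihood; terms that do not depend on $\theta$ are dropped because they do not affect the gradient. The names `gamma_nll` and `gamma_nll_grad` and the demo arrays are hypothetical.

```python
import numpy as np

def gamma_nll(theta, X1, Y, alpha):
    # Negative log-likelihood up to terms that do not depend on theta.
    z = X1 @ theta
    return np.sum(alpha * z + Y * np.exp(-z))

def gamma_nll_grad(theta, X1, Y, alpha):
    """Gradient of gamma_nll with respect to theta, one entry per column of X1."""
    z = X1 @ theta
    return X1.T @ (alpha - Y * np.exp(-z))

# Compare with central finite differences on a tiny made-up sample.
X1_demo = np.array([[1.0, 0.5], [1.0, 2.0], [1.0, 3.5]])
Y_demo = np.array([4.0, 30.0, 120.0])
theta_demo = np.array([-1.0, 1.5])
h = 1e-6
numeric = np.array([
    (gamma_nll(theta_demo + h * e, X1_demo, Y_demo, 5.0)
     - gamma_nll(theta_demo - h * e, X1_demo, Y_demo, 5.0)) / (2 * h)
    for e in np.eye(2)
])
print(gamma_nll_grad(theta_demo, X1_demo, Y_demo, 5.0), numeric)
```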

<div class="alert alert-block alert-info"> Problem 3.3 </div>
Use the function `scipy.optimize.check_grad` to verify that the gradient function is
implemented correctly at the point
$$
    \theta_{\textrm{test}} = (-1,1.5)
$$


In [10]:
theta_test=np.array([-1,1.5])
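
The check could look like the sketch below. `scipy.optimize.check_grad` returns the 2-norm of the difference between the supplied gradient and a finite-difference approximation, so a value that is small relative to the gradient itself suggests the two agree. The loss and gradient helpers are the hypothetical `gamma_nll` and `gamma_nll_grad`, restated here (without $\theta$-independent terms) so the block runs on its own.

```python
import numpy as np
import scipy.optimize as optimize
import scipy.stats as stats

def gamma_nll(theta, X1, Y, alpha):
    z = X1 @ theta
    return np.sum(alpha * z + Y * np.exp(-z))

def gamma_nll_grad(theta, X1, Y, alpha):
    z = X1 @ theta
    return X1.T @ (alpha - Y * np.exp(-z))

# Recreate the sample from the cells above.
np.random.seed(567)
alpha = 5.0
X = np.random.uniform(0, 4, (30, 1))
X1 = np.c_[np.ones(len(X)), X]
Y = stats.gamma.rvs(a=alpha, scale=np.exp(X1 @ np.array([0.0, 1.0])))

theta_test = np.array([-1, 1.5])
# Extra positional arguments are forwarded to both the loss and the gradient.
err = optimize.check_grad(gamma_nll, gamma_nll_grad, theta_test, X1, Y, alpha)
print(err, np.linalg.norm(gamma_nll_grad(theta_test, X1, Y, alpha)))
```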

### Optimization

<div class="alert alert-block alert-info"> Problem 4.1 </div>
Fill in the `fit` and `predict` methods in the class below.

Inside the `fit` method:
1. Initialize the guess for $\theta$ at random.
2. Use `scipy.optimize.minimize` with method `BFGS` to find the optimal parameters.

In [11]:
class GammaRegression:
    def __init__(self,alpha):
        self.alpha=alpha
    def fit(self,X,Y):
        pass
    def predict(self,X):
        pass
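
One hedged way to fill in the skeleton is shown below. It is a sketch, not the reference solution: the class name `GammaRegressionSketch`, the attribute `theta_`, the small-variance normal initialization, and the helpers `gamma_nll` and `gamma_nll_grad` are all assumptions, and the loss is taken to be the negative log-likelihood without $\theta$-independent terms.

```python
import numpy as np
import scipy.optimize as optimize
import scipy.stats as stats

def gamma_nll(theta, X1, Y, alpha):
    z = X1 @ theta
    return np.sum(alpha * z + Y * np.exp(-z))

def gamma_nll_grad(theta, X1, Y, alpha):
    z = X1 @ theta
    return X1.T @ (alpha - Y * np.exp(-z))

class GammaRegressionSketch:
    """Hypothetical fill-in of the class skeleton above."""
    def __init__(self, alpha):
        self.alpha = alpha

    def fit(self, X, Y):
        X1 = np.c_[np.ones(len(X)), X]  # prepend the column of ones
        theta_init = np.random.normal(scale=0.1, size=X1.shape[1])  # random start
        result = optimize.minimize(gamma_nll, theta_init,
                                   args=(X1, Y, self.alpha),
                                   jac=gamma_nll_grad, method="BFGS")
        self.theta_ = result.x  # fitted (b, w)
        return self

    def predict(self, X):
        X1 = np.c_[np.ones(len(X)), X]
        return self.alpha * np.exp(X1 @ self.theta_)  # assumed conditional mean

# Recreate the sample from the cells above and fit.
np.random.seed(567)
alpha = 5.0
theta0 = np.array([0.0, 1.0])
X = np.random.uniform(0, 4, (30, 1))
X1 = np.c_[np.ones(len(X)), X]
Y = stats.gamma.rvs(a=alpha, scale=np.exp(X1 @ theta0))

model = GammaRegressionSketch(alpha).fit(X, Y)
x_test = np.linspace(0, 4, 201).reshape(-1, 1)
y_pred = model.predict(x_test)
print(model.theta_)
```

Returning `self` from `fit` allows the chained call used in the last lines; the loss is convex in $\theta$ under this link, so the random starting point should not change the optimum that BFGS reaches.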

<div class="alert alert-block alert-info"> Problem 4.2 </div>
Fit the model to the `X`, `Y` data.

What are the fitted parameters?

The Gamma model fits the data much better than the linear model.

<div class="alert alert-block alert-info"> Problem 4.3 </div>
Superimpose plots of
1. fitted model predictions for `x_test`
2. $\hat{y}(x_{\textrm{test}},\alpha,\theta_0)$ for the true model
3. The linear model fit 
4. The problem data $X,Y$


## Dependence on $\alpha$

Keeping $X$ and $Y$ fixed, we now vary the value of $\alpha$ used for fitting.

<div class="alert alert-block alert-info"> Problem 5.1 </div>
How do the fitted parameters $\hat{b}$ and $\hat{w}$ depend on the parameter $\alpha$ used for fitting?

<div class="alert alert-block alert-info"> Problem 5.2 </div>
How do the predicted levels of $Y$ from the fitted model depend on the parameter $\alpha$ used for fitting?

<div class="alert alert-block alert-info"> Problem 5.3 </div>
Can you explain why this happens based on the expression for $E^{\textrm{log}}_{\textrm{Gam}}$ that
you worked out in the written homework?
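
A small numerical experiment can help explore these questions. The sketch below refits the same data for several values of $\alpha$, assuming the loss is the negative log-likelihood $\sum_i \left(\alpha\,\theta\cdot x_{1,i} + y_i e^{-\theta\cdot x_{1,i}}\right)$ up to $\theta$-independent terms. The grid of $\alpha$ values, the zero starting point, and the helper names are hypothetical choices.

```python
import numpy as np
import scipy.optimize as optimize
import scipy.stats as stats

def gamma_nll(theta, X1, Y, alpha):
    z = X1 @ theta
    return np.sum(alpha * z + Y * np.exp(-z))

def gamma_nll_grad(theta, X1, Y, alpha):
    z = X1 @ theta
    return X1.T @ (alpha - Y * np.exp(-z))

# Recreate the sample from the cells above (generated with alpha = 5).
np.random.seed(567)
X = np.random.uniform(0, 4, (30, 1))
X1 = np.c_[np.ones(len(X)), X]
Y = stats.gamma.rvs(a=5.0, scale=np.exp(X1 @ np.array([0.0, 1.0])))

# Refit the same data for several values of alpha.
fits = {}
for alpha_fit in [1.0, 2.0, 5.0, 10.0]:
    res = optimize.minimize(gamma_nll, np.zeros(2), args=(X1, Y, alpha_fit),
                            jac=gamma_nll_grad, method="BFGS")
    fits[alpha_fit] = res.x
    b_hat, w_hat = res.x
    # Columns: alpha, fitted b, fitted w, and b + log(alpha).
    print(alpha_fit, b_hat, w_hat, b_hat + np.log(alpha_fit))
```

Under this form of the loss, setting the gradient to zero gives $\sum_i \left(1 - \tfrac{e^{-b}}{\alpha}\, y_i e^{-w x_i}\right) x_{1,i} = 0$, so $\alpha$ enters only through the combination $e^{-b}/\alpha$. One would therefore expect the last two printed columns to stay constant across rows, and the predicted mean $\alpha\, e^{\hat{b} + \hat{w} x}$ to be unaffected by the choice of $\alpha$.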