# "Interpretation of Partial Regression Coefficients"

- author: "<a href='https://www.linkedin.com/in/aneeshdata/'>Aneesh R</a>"
- toc: false
- comments: true
- categories: [Linear Algebra Tutorial]
- badges: false





In [2]:
# TO WORK ALONG

# import pandas as pd
# import numpy as np
# import statsmodels.api as sm 
# import tabulate as tb

# Copy the code of udregression from the link 
# provided below and run it in your environment.

Let $X_1$ be an explanatory variable.  
Suppose $X_1 \sim \mathrm{Exp}(\lambda = 1)$.  
Let's use numpy to draw 15 random observations from this distribution and assign them to the variable X1.


In [3]:
X1 = np.random.exponential(size=15)

Let $X_2$, like $X_1$, be another explanatory variable that follows an exponential distribution with mean 1.  
Let's use numpy to create a random vector X2 of size 15 from this distribution.


In [4]:
X2 = np.random.exponential(size=15)

The error term is assumed to follow the standard normal distribution, so let's use numpy to generate a random error vector of size 15 from the standard normal distribution.

In [5]:
error=np.random.normal(size=15) # by default numpy's normal function 
                                # outputs standard normal variates

Let Y be a dependent variable generated from X1, X2 and error.  
Suppose Y is generated as follows:  
$Y=1.11+123X_1+45X_2+error$  
The linear model we are considering is therefore $Y=\beta_0+\beta_1X_1+\beta_2X_2+error$ with $\beta_0=1.11,\beta_1=123,\beta_2=45$.  
Let's interpret $\beta_1$ and $\beta_2$.

In [6]:
Y=1.11+123*X1+45*X2+error
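
Because the draws above are unseeded, every run of the notebook produces different numbers. The following is a minimal sketch of the same data-generating steps with a seeded generator; the seed value `0` and the use of `np.random.default_rng` are assumptions made for reproducibility and are not part of the original cells.

```python
import numpy as np

# Assumption: a fixed seed (0) so the draws can be reproduced;
# the original cells use the unseeded global generator.
rng = np.random.default_rng(0)

n = 15
X1 = rng.exponential(scale=1.0, size=n)  # Exp(lambda = 1), mean 1
X2 = rng.exponential(scale=1.0, size=n)  # drawn independently of X1
error = rng.standard_normal(n)           # standard normal noise

# True coefficients from the text: beta_0 = 1.11, beta_1 = 123, beta_2 = 45
Y = 1.11 + 123 * X1 + 45 * X2 + error
```

With a seed in place, the residuals and slopes computed later stay the same from run to run, which makes the comparison with $\beta_1$ and $\beta_2$ easier to follow.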

**To interpret $\beta_1$.**  
First, let's regress Y on $X_2$.


In [7]:
reg = udregression(X2, Y).fit_model()  # udregression is a custom class I wrote;
                                       # its source code is available at the
                                       # link below.

> [Link to udregression's source code](https://varadan13.github.io/import_this/linear%20algebra%20tutorial/2021/04/16/dummy-var.html)
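
For readers who prefer not to copy the linked class, here is a hedged stand-in written with plain NumPy. The name `simple_ols` is hypothetical and this is not the author's `udregression`; it only assumes what the cells in this post rely on, namely a simple regression with an intercept whose slope sits at index 1 of the parameters (as in `reg.params[1]`) and whose fitted values are available (as in `reg.fittedvalues`).

```python
import numpy as np

def simple_ols(x, y):
    """Hypothetical stand-in for udregression: least squares of y on [1, x]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(x), x])       # intercept column + regressor
    params, *_ = np.linalg.lstsq(design, y, rcond=None)  # params[0] = intercept, params[1] = slope
    fitted = design @ params
    return params, fitted

# Usage on a noiseless line y = 2 + 3x
x = np.array([0.0, 1.0, 2.0, 3.0])
params, fitted = simple_ols(x, 2 + 3 * x)
print(params)
```

The function returns a plain tuple rather than a fitted-model object, so the residuals would be computed as `y - fitted`.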

Let's now find the residuals of this regression

In [8]:
residual= Y -reg.fittedvalues
residual

array([  -9.91016593,  -94.00822722,   -5.7247282 ,   65.59993044,
         42.23952437,   14.23110086,   36.70187822,  -82.68784607,
        -67.94269339, -105.34239585,  -17.82715355,    4.53419721,
       -106.44550192,   38.80625987,  287.77582115])

Let's now regress the residuals on X1 and inspect the estimated slope.

In [9]:
reg =udregression(X1,residual).fit_model() 
slope = reg.params[1]
slope

122.63790390315614

We can see that the slope is close to $\beta_1 = 123$, and with a large number of observations it gets closer still, because $X_1$ and $X_2$ were drawn independently. Notice that we recovered $\beta_1$ as the slope coefficient when regressing the residuals of the regression of Y on $X_2$ against $X_1$.
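
The large-sample claim can be checked with a short sketch. It repeats the two-step procedure with 100,000 observations instead of 15; the seed, the sample size, and the use of `np.polyfit` in place of `udregression` are assumptions made for illustration.

```python
import numpy as np

def residuals_of(x, y):
    """Residuals from the least-squares line of y on x (with intercept)."""
    slope, intercept = np.polyfit(x, y, deg=1)  # highest power first
    return y - (intercept + slope * x)

rng = np.random.default_rng(1)  # assumed seed
n = 100_000                     # assumed "large" sample size
X1 = rng.exponential(size=n)
X2 = rng.exponential(size=n)    # independent of X1
error = rng.standard_normal(n)
Y = 1.11 + 123 * X1 + 45 * X2 + error

residual = residuals_of(X2, Y)               # step 1: remove what X2 explains
slope = np.polyfit(X1, residual, deg=1)[0]   # step 2: regress the residuals on X1
print(slope)
```

The printed slope should land much nearer to 123 than the 15-observation estimate does.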

Here the residuals measure the variation in Y that has not been captured by $X_2$.

Regressing the residuals on $X_1$ attempts to explain the variation in Y that $X_2$ has left unexplained.

In single-variable regression the slope parameter has the following interpretation: it measures the change in the dependent variable Y due to a unit change in the independent variable.

Therefore, $\beta_1$ measures the change in the residual due to a unit change in $X_1$. Applying the definition of the residual, we can rewrite this as: $\beta_1$ measures the change in Y from a unit change in $X_1$ after removing the influence of $X_2$ on Y.
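
One way to see how this two-step slope relates to the coefficient from a full multiple regression is to compute both. In the sketch that follows (seed and variable names are assumptions), the two-step slope equals the multiple-regression estimate of $\beta_1$ multiplied by $(1 - r^2)$, where $r$ is the sample correlation between $X_1$ and $X_2$. This is an algebraic property of least squares with intercepts, so the two agree closely only when $r$ is near zero.

```python
import numpy as np

rng = np.random.default_rng(2)  # assumed seed
n = 15
X1 = rng.exponential(size=n)
X2 = rng.exponential(size=n)
Y = 1.11 + 123 * X1 + 45 * X2 + rng.standard_normal(n)

# Multiple regression of Y on [1, X1, X2]
design = np.column_stack([np.ones(n), X1, X2])
b0, b1, b2 = np.linalg.lstsq(design, Y, rcond=None)[0]

# Two-step procedure from the text: residuals of Y on X2, then regress on X1
s, c = np.polyfit(X2, Y, deg=1)
residual = Y - (c + s * X2)
two_step_slope = np.polyfit(X1, residual, deg=1)[0]

# Sample correlation between the two explanatory variables
r = np.corrcoef(X1, X2)[0, 1]
print(b1, two_step_slope, b1 * (1 - r**2))
```

The last two printed numbers should coincide up to floating-point error, while the first differs from them by the factor $(1 - r^2)$.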

Let's define $\beta_2$ using $\beta_1$'s definition as a template.  

$\beta_2$ measures the change in Y from a unit change in $X_2$ after removing the influence of $X_1$ on Y.

Let's check if it is indeed true.


**To interpret $\beta_2$**

Regress Y on $X_1$:

In [10]:
reg =udregression(X1,Y).fit_model()

Find the residuals:

In [11]:
residual = Y - reg.fittedvalues

Regress the residuals on $X_2$ and find the slope:

In [12]:
reg =udregression(X2,residual).fit_model()
slope=reg.params[1]
slope

45.28046614638129

We can see that the slope is indeed close to $\beta_2 = 45$.

These interpretations, however, hold only as long as the underlying assumptions are approximately valid.

For example, let's make $X_2$ an exact multiple of $X_1$:

In [13]:
X2=0.99*X1
Y=1.11+123*X1+45*X2+error # X2 is an exact multiple of X1, so the two are perfectly correlated

In [14]:
reg =udregression(X1,Y).fit_model()
residue = Y - reg.fittedvalues
reg =udregression(X2,residue).fit_model()
slope=reg.params[1]
slope

1.6479439340910673e-14

We can see that the slope above is essentially zero rather than $\beta_2$, so the interpretation of $\beta_2$ developed here cannot be used when $X_1$ and $X_2$ are this strongly correlated.
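
A quick check of why the slope collapses: since $X_2$ is an exact multiple of $X_1$, their sample correlation is 1, and the residuals from regressing Y on $X_1$ are orthogonal to $X_1$ and therefore to $X_2$ as well. The sketch here reproduces this with NumPy only; the seed and the use of `np.polyfit` are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)  # assumed seed
n = 15
X1 = rng.exponential(size=n)
X2 = 0.99 * X1                  # exact multiple of X1
Y = 1.11 + 123 * X1 + 45 * X2 + rng.standard_normal(n)

r = np.corrcoef(X1, X2)[0, 1]   # sample correlation between X1 and X2

# Residuals of Y on X1, then regress them on X2
s, c = np.polyfit(X1, Y, deg=1)
residue = Y - (c + s * X1)
slope = np.polyfit(X2, residue, deg=1)[0]
print(r, slope)
```

Nothing is left for $X_2$ to explain once $X_1$ has been fitted, so the slope is zero up to floating-point error.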

**Conclusion**

- $\beta_1$ measures the change in Y from a unit change in $X_1$ after removing the influence of $X_2$ on Y.
- $\beta_2$ measures the change in Y from a unit change in $X_2$ after removing the influence of $X_1$ on Y.
