# Econometrics 322 Lab \#4

Collaboration Policy

    1. Study groups are allowed but I expect students to understand and complete their own 
    assignments and to hand in one assignment per student.
    2. If you worked in a group, please put the names of your study group members at the top 
    of your assignment.
    3. Just like all other classes at Rutgers, the student Honor Code is taken seriously.
    
    The submitted assignments must be your own work.

## <font color = blue> Assignment </font>

Use the water consumption data to estimate a simple regression model.  The water consumption data was introduced at the beginning of the semester and is available on Sakai.  The unknown parameters of a demand function have to be estimated.  Estimate a simple OLS model of real per capita water consumption as a function of the real price per gallon.  No other variables are to be used, since the purpose of this lab is just to have you become familiar with the commands.

# <font color = red> Documentation </font>

## <font color = blue> Abstract </font>

*In this lab, I learned that per capita water consumption is statistically related to the real price per gallon of water. According to the estimated OLS model, a 1 dollar increase in the real price per gallon is associated with a decrease in per capita consumption of 8425.809311 gallons per person per year. This is in line with theory: higher prices entail a lower quantity bought. The fit is not perfect; the residuals are nonzero because factors not included in the model may also affect the demand for water.*

## <font color = blue> Data Dictionary </font>

| Variable | Values   | Source | Mnemonic |
|----------|----------|--------|---------|
| Aggregate Consumption | Millions of gallons, annual | Int'l Bottled Water | agg_consumption |
| Aggregate Revenue | Millions of dollars, annual/nominal | IBID. | agg_revenue |
| Per Capita Consumption | Gallons per person, annual | Calculated: agg_consumption/pop | per_capita_cons |
| Nominal Price per Gallon | Nominal dollars | Calculated: agg_revenue/agg_consumption | price |
| Real Disposable Income per Capita | Real dollars, base = 2005, annual | Economic R. of Pres. 2010, Tbl. B-31 | real_dis_income |
| Food CPI | Index (Total Food & Beverages) | Economic R. of Pres. 2010, Tbl. B-60 | food_cpi |
| Population | Millions | Economic R. of Pres. 2010, Tbl. B-34 | pop |
| Real Price per Gallon | Real dollars, annual | Calculated: price/food_cpi | real_price  |
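
The three "Calculated" rows in the table are simple ratios. As a quick check of the units, here is a minimal sketch using made-up numbers (not values from the water data). The function name `derive_variables` is hypothetical, and the food CPI is divided in as-is, exactly as the table states.

```python
def derive_variables(agg_consumption, agg_revenue, pop, food_cpi):
    """Apply the 'Calculated' formulas from the data dictionary to one year."""
    per_capita_cons = agg_consumption / pop   # millions of gallons / millions of people
    price = agg_revenue / agg_consumption     # millions of dollars / millions of gallons
    real_price = price / food_cpi             # nominal price deflated by the food CPI
    return {
        "per_capita_cons": per_capita_cons,
        "price": price,
        "real_price": real_price,
    }


# Made-up inputs, for illustration only
example = derive_variables(
    agg_consumption=8000.0, agg_revenue=10000.0, pop=320.0, food_cpi=2.0
)
print(example)
```

Because both consumption and population are measured in millions, the "millions" cancel and per capita consumption comes out in gallons per person.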

# <font color = red> Tasks </font>

## <font color = blue> Load the Pandas and Statsmodels packages and give them aliases.  I recommend 'pd' and 'sm'.  You will also need the Statsmodels formula API for formulas.  Please see Lesson \#4 for examples.</font>

In [None]:
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf 
from statsmodels.iolib.summary2 import summary_col
import matplotlib.pyplot as plt
import seaborn as sns

## <font color = blue> Import the water consumption data.  Set the row index to the years. </font>

In [None]:
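df = pd.read_csv("water.csv")
# set_index returns a new DataFrame, so the result must be assigned back;
# "obs" is the column that holds the years
df = df.set_index("obs")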
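
One thing to watch for in this step: `set_index` returns a new DataFrame and leaves the original untouched unless the result is assigned back. The sketch below shows the difference on a tiny made-up frame (the name `toy` and its values are hypothetical, not the water data).

```python
import pandas as pd

# Hypothetical three-row frame; "obs" stands in for the year column
toy = pd.DataFrame({"obs": [2001, 2002, 2003], "real_price": [1.2, 1.1, 1.0]})

toy.set_index("obs")            # result is discarded; `toy` itself is unchanged
unchanged = list(toy.index)     # still the default 0, 1, 2

toy = toy.set_index("obs")      # assigning the result keeps the new index
indexed = list(toy.index)       # now the years

print(unchanged, indexed)
```

Passing `index_col="obs"` to `pd.read_csv` is another way to get the same result in a single step.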

## <font color = blue> Print the first five (5) records. </font>

In [None]:
df.head()

## Graph

In [None]:
ax = df.plot( x = 'real_price', y = 'per_capita_cons', legend = False, kind = 'scatter' )
ax.set( xlabel = 'Real Price', ylabel = 'Per Capita Consumption', title = 'Real Price vs. Per Capita Consumption' )

## <font color = blue> Estimate an OLS model using per capita consumption as the dependent variable and real price as the independent variable.  Display the summary report.  See Lesson \#4 for an example.</font>

In [None]:
formula="per_capita_cons ~ real_price"
mod = smf.ols( formula, data = df )
reg01 = mod.fit()
reg01.summary()
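
Written out, the formula string above corresponds to the model below, where $y_t$ is `per_capita_cons` and $x_t$ is `real_price` in year $t$ (the symbols are my own shorthand, not notation taken from the lesson):

$$y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$$

The formula API adds the intercept $\beta_0$ automatically, so `reg01.params` should hold two entries, `Intercept` and `real_price`.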

## <font color = blue> Retrieve and display the estimated parameters.  See Lesson \#4 for an example.</font>

### Params

In [None]:
reg01.params

### Sum of Squared Residuals

In [None]:
sse = reg01.ssr
sse

### Standard Error of the Regression

In [None]:
se_reg = np.sqrt( sse/( reg01.nobs - 2 ) )
round( se_reg, 2 )

### Sum of Squared Deviations of Real Price (SXX)

In [None]:
sxx = ( ( df.real_price - df.real_price.mean() ) **2 ).sum()
print( sxx )

### Standard Error of the Beta 1 Estimate

In [None]:
se_beta_1 = se_reg/np.sqrt( sxx )
se_beta_1
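
The chain of calculations above (sum of squared residuals, standard error of the regression, SXX, then the standard error of the slope) can be checked by hand. The sketch below repeats each step on a small made-up data set using only the standard library; `x` plays the role of `real_price` and `y` the role of `per_capita_cons`, and none of the numbers come from the water data.

```python
import math

# Made-up toy data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [9.8, 8.1, 6.2, 3.9, 2.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross products around the means
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# OLS estimates for a simple regression with an intercept
beta_1 = sxy / sxx
beta_0 = y_bar - beta_1 * x_bar

# Residuals and their sum of squares
resid = [yi - (beta_0 + beta_1 * xi) for xi, yi in zip(x, y)]
ssr = sum(e ** 2 for e in resid)

# Two parameters are estimated, so the degrees of freedom are n - 2
se_reg = math.sqrt(ssr / (n - 2))
se_beta_1 = se_reg / math.sqrt(sxx)

print(round(beta_1, 4), round(se_reg, 4), round(se_beta_1, 4))
```

On the actual model, the hand-computed `se_beta_1` should agree with `reg01.bse["real_price"]`, assuming the default (nonrobust) covariance that `fit()` uses.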

### Sum of Residuals

In [None]:
round( reg01.resid.sum(), 4 )
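
The residuals should sum to zero (up to rounding) because the model includes an intercept. As a short sketch of why, write $y_t$ for per capita consumption, $x_t$ for real price, and $\hat{\varepsilon}_t$ for the residual; the first-order condition for the intercept in the least-squares problem is then

$$\frac{\partial}{\partial \hat{\beta}_0} \sum_t \left( y_t - \hat{\beta}_0 - \hat{\beta}_1 x_t \right)^2 = -2 \sum_t \hat{\varepsilon}_t = 0$$

so any value noticeably different from zero here would point to a problem in the calculation rather than a feature of the data.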