# Interpreting Coefficients

Each regression coefficient is the "slope" for one feature. It describes **the relation between that x and the y**: the expected change in y for a one-unit increase in that x, with the other features held fixed.
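To make the "slope" reading concrete, here is a minimal sketch on synthetic data (not the housing data). The names `features`, `target`, `beta`, and the chosen true slopes are illustrative assumptions. It fits a least-squares model with NumPy and checks that raising one feature by one unit, with the other left alone, moves the prediction by exactly that feature's coefficient.

```python
import numpy as np

# Synthetic, noise-free data (an assumption for illustration):
# intercept 5, slope 2 on x1, slope -3 on x2
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
target = 5.0 + 2.0 * x1 - 3.0 * x2

# Design matrix with a column of ones for the intercept
features = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(features, target, rcond=None)

# Take one row, raise x1 by one unit, and leave x2 unchanged
row = np.array([1.0, 0.5, -1.2])
bumped = row + np.array([0.0, 1.0, 0.0])
change = bumped @ beta - row @ beta

print(beta)    # approximately [5, 2, -3]
print(change)  # equals the x1 coefficient, approximately 2
```

Because the model is linear, the result does not depend on which starting row is chosen.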

In [1]:
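import pandas as pd
import statsmodels.api as sm

# Load the housing data
df = pd.read_csv('../data/housing.csv')

# Features are every column except the target, PRICE
X = df.drop('PRICE', axis=1)
y = df['PRICE']

# statsmodels does not add an intercept on its own, so add a constant column
X = sm.add_constant(X)

# Fit ordinary least squares and display the summary table
sm.OLS(y, X).fit().summary()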

| Dep. Variable: | PRICE | R-squared: | 0.734 |
|---|---|---|---|
| Model: | OLS | Adj. R-squared: | 0.728 |
| Method: | Least Squares | F-statistic: | 113.5 |
| Date: | Tue, 29 Nov 2022 | Prob (F-statistic): | 2.23e-133 |
| Time: | 20:43:04 | Log-Likelihood: | -1504.9 |
| No. Observations: | 506 | AIC: | 3036.0 |
| Df Residuals: | 493 | BIC: | 3091.0 |
| Df Model: | 12 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | 41.6173 | 4.936 | 8.431 | 0.000 | 31.919 | 51.316 |
| CRIM | -0.1214 | 0.033 | -3.678 | 0.000 | -0.186 | -0.057 |
| ZN | 0.0470 | 0.014 | 3.384 | 0.001 | 0.020 | 0.074 |
| INDUS | 0.0135 | 0.062 | 0.217 | 0.829 | -0.109 | 0.136 |
| CHAS | 2.8400 | 0.870 | 3.264 | 0.001 | 1.131 | 4.549 |
| NOX | -18.7580 | 3.851 | -4.870 | 0.000 | -26.325 | -11.191 |
| RM | 3.6581 | 0.420 | 8.705 | 0.000 | 2.832 | 4.484 |
| AGE | 0.0036 | 0.013 | 0.271 | 0.787 | -0.023 | 0.030 |
| DIS | -1.4908 | 0.202 | -7.394 | 0.000 | -1.887 | -1.095 |

| Omnibus: | 171.096 | Durbin-Watson: | 1.077 |
|---|---|---|---|
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 709.937 |
| Skew: | 1.477 | Prob(JB): | 6.9e-155 |
| Kurtosis: | 7.995 | Cond. No. | 11700.0 |
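One way to read this table is to scan the `P>|t|` column for features whose coefficient cannot be distinguished from zero. The sketch below copies the p-values shown above into a plain dictionary and flags those above a cutoff. The 0.05 cutoff and the names `ALPHA` and `weak` are assumptions for illustration, not something this notebook prescribes. On a fitted statsmodels result, the same numbers are available through the `pvalues` attribute.

```python
# p-values copied from the P>|t| column of the summary above
p_values = {
    "CRIM": 0.000,
    "ZN": 0.001,
    "INDUS": 0.829,
    "CHAS": 0.001,
    "NOX": 0.000,
    "RM": 0.000,
    "AGE": 0.787,
    "DIS": 0.000,
}

ALPHA = 0.05  # assumed cutoff; a common convention, not a rule

# Features whose coefficient is not clearly different from zero
weak = [name for name, p in p_values.items() if p > ALPHA]
print(weak)  # ['INDUS', 'AGE']
```

A p-value screen like this is only a starting point; whether a feature belongs in the model also depends on what the feature means.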


In [2]:
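# Drop INDUS and AGE: their p-values (0.829 and 0.787) are large, so neither shows a detectable relation with PRICE once the other features are in the model.

X = X.drop(columns=['INDUS','AGE'])

sm.OLS(y,X).fit().summary()

# Notice that some of the coefficients have changed and some of the standard errors have gotten smaller.

# This is normal.

# We'll see why in the next section, but it has to do with **correlated features** (AKA multicollinearity). Dropping features that add nothing to the model tends to give tighter estimates for the features that remain.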

| Dep. Variable: | PRICE | R-squared: | 0.734 |
|---|---|---|---|
| Model: | OLS | Adj. R-squared: | 0.729 |
| Method: | Least Squares | F-statistic: | 136.8 |
| Date: | Tue, 29 Nov 2022 | Prob (F-statistic): | 1.73e-135 |
| Time: | 20:43:04 | Log-Likelihood: | -1505.0 |
| No. Observations: | 506 | AIC: | 3032.0 |
| Df Residuals: | 495 | BIC: | 3078.0 |
| Df Model: | 10 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| const | 41.4517 | 4.903 | 8.454 | 0.000 | 31.818 | 51.086 |
| CRIM | -0.1217 | 0.033 | -3.696 | 0.000 | -0.186 | -0.057 |
| ZN | 0.0462 | 0.014 | 3.378 | 0.001 | 0.019 | 0.073 |
| CHAS | 2.8719 | 0.863 | 3.329 | 0.001 | 1.177 | 4.567 |
| NOX | -18.2624 | 3.565 | -5.122 | 0.000 | -25.267 | -11.258 |
| RM | 3.6730 | 0.409 | 8.978 | 0.000 | 2.869 | 4.477 |
| DIS | -1.5160 | 0.188 | -8.078 | 0.000 | -1.885 | -1.147 |
| RAD | 0.2839 | 0.064 | 4.440 | 0.000 | 0.158 | 0.410 |
| TAX | -0.0123 | 0.003 | -3.608 | 0.000 | -0.019 | -0.006 |

| Omnibus: | 172.594 | Durbin-Watson: | 1.074 |
|---|---|---|---|
| Prob(Omnibus): | 0.0 | Jarque-Bera (JB): | 725.971 |
| Skew: | 1.486 | Prob(JB): | 2.28e-158 |
| Kurtosis: | 8.06 | Cond. No. | 11300.0 |
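The shift between the two fits is easier to see side by side. The sketch below copies the `coef` and `std err` values printed in both summaries, for the features that appear in both, and prints the difference. The dictionary names `full`, `reduced`, and `changes` are illustrative.

```python
# (coef, std err) pairs copied from the two summary tables above
full = {
    "CRIM": (-0.1214, 0.033),
    "ZN": (0.0470, 0.014),
    "CHAS": (2.8400, 0.870),
    "NOX": (-18.7580, 3.851),
    "RM": (3.6581, 0.420),
    "DIS": (-1.4908, 0.202),
}
reduced = {
    "CRIM": (-0.1217, 0.033),
    "ZN": (0.0462, 0.014),
    "CHAS": (2.8719, 0.863),
    "NOX": (-18.2624, 3.565),
    "RM": (3.6730, 0.409),
    "DIS": (-1.5160, 0.188),
}

# Change in coefficient and in standard error after dropping INDUS and AGE
changes = {}
for name, (coef_full, se_full) in full.items():
    coef_red, se_red = reduced[name]
    changes[name] = (round(coef_red - coef_full, 4), round(se_red - se_full, 3))

for name, (d_coef, d_se) in changes.items():
    print(f"{name:5s} coef change {d_coef:+.4f}   std err change {d_se:+.3f}")
```

In these numbers, no standard error grows, and the largest drops are on NOX and DIS.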




## Interpretation

To interpret coefficients, you need to understand the actual meaning of the numbers in the data.

Here is the description of the dataset:

### *housing.csv* - Features and Meanings

    - CRIM     per capita crime rate by town
    - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
    - INDUS    proportion of non-retail business acres per town
    - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
    - NOX      nitric oxides concentration (parts per 10 million)
    - RM       average number of rooms per dwelling
    - AGE      proportion of owner-occupied units built prior to 1940
    - DIS      weighted distances to five Boston employment centres
    - RAD      index of accessibility to radial highways
    - TAX      full-value property-tax rate per 10,000 dollars
    - PTRATIO  pupil-teacher ratio by town
    - LSTAT    percent lower status of the population
    - PRICE    median value of owner-occupied homes in $1000s
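With the units in hand, each coefficient can be turned into a statement in dollars. The sketch below uses the coefficients from the reduced model and the fact that PRICE is recorded in thousands of dollars. The helper name `price_change_dollars` and the 0.1-unit step for NOX are assumptions chosen for illustration.

```python
# Coefficients copied from the reduced model's summary
coefs = {
    "CRIM": -0.1217,
    "ZN": 0.0462,
    "CHAS": 2.8719,
    "NOX": -18.2624,
    "RM": 3.6730,
    "DIS": -1.5160,
    "RAD": 0.2839,
    "TAX": -0.0123,
}

def price_change_dollars(feature, delta=1.0):
    """Expected change in median price, in dollars, when `feature`
    rises by `delta` units and the other features stay fixed."""
    # PRICE is in thousands of dollars, so multiply by 1000
    return coefs[feature] * delta * 1000

# One more room per dwelling, on average
print(round(price_change_dollars("RM")))        # 3673
# Tract bounds the Charles River (CHAS goes from 0 to 1)
print(round(price_change_dollars("CHAS")))      # 2872
# NOX rises by 0.1 (an illustrative step size)
print(round(price_change_dollars("NOX", 0.1)))  # -1826
```

Read this way, one additional room is associated with a median price about $3,673 higher, holding the other features constant. These are associations in the fitted model, not causal effects.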
