# statsmodels

- Statistical models
- Statistical tests
- Plotting functions
- Complements the `scipy.stats` module
- Built on NumPy and SciPy
- Integrates with pandas for data handling
- Plotting functions are based on Matplotlib

See: http://www.statsmodels.org/stable/index.html
        
Install:
```pip install -U statsmodels```
or
```conda install -c conda-forge statsmodels```
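
After either command, a quick way to confirm that the package is importable is to print its version. This is only a minimal check; the variable name `version` is chosen here for illustration.

```python
import statsmodels

# Read the installed version string to confirm the import works
version = statsmodels.__version__
print(version)
```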

## Simple example using ordinary least squares

In [2]:

import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf
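
# Load the Guerry dataset from the R package HistData
# (get_rdataset downloads the data, so network access is needed)
dat = sm.datasets.get_rdataset("Guerry", "HistData").data

# Fit the regression model; np.log(Pop1831) is evaluated inside the formula,
# so the natural log of that regressor is used without adding a new column
results = smf.ols('Lottery ~ Literacy + np.log(Pop1831)', data=dat).fit()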
print(results.summary())


                            OLS Regression Results                            
Dep. Variable:                Lottery   R-squared:                       0.348
Model:                            OLS   Adj. R-squared:                  0.333
Method:                 Least Squares   F-statistic:                     22.20
Date:                Mon, 08 Jan 2018   Prob (F-statistic):           1.90e-08
Time:                        03:25:42   Log-Likelihood:                -379.82
No. Observations:                  86   AIC:                             765.6
Df Residuals:                      83   BIC:                             773.0
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
                      coef    std err          t      P>|t|      [0.025      0.975]
-----------------------------------------------------------------------------------
Intercept         246.4341     35.233     
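
The summary table is meant for reading, but the same quantities can also be pulled from the fitted results object one at a time. The sketch below shows this under stated assumptions: it uses a small synthetic DataFrame with hypothetical columns `x1`, `x2` and `y` (not the Guerry data) so that it runs without a download, and the coefficients used to generate `y` are arbitrary.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical stand-in data (NOT the Guerry dataset), built so the example runs offline
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x1": rng.uniform(1, 10, size=200),
    "x2": rng.uniform(100, 1000, size=200),
})
df["y"] = 2.0 + 0.5 * df["x1"] + 3.0 * np.log(df["x2"]) + rng.normal(0, 1, size=200)

# Same formula pattern as above: np.log(...) is applied inside the formula
fit = smf.ols("y ~ x1 + np.log(x2)", data=df).fit()

print(fit.params)      # estimated coefficients, indexed by term name
print(fit.rsquared)    # R-squared of the fit
print(fit.pvalues)     # p-value for each coefficient
print(fit.conf_int())  # confidence intervals (95% by default)
```

Because `params`, `pvalues` and `conf_int()` are pandas objects indexed by term name, a single coefficient can be read with, for example, `fit.params["x1"]`.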

## Example using NumPy arrays instead of formulas

In [3]:
# OLS on artificial data, passing NumPy arrays instead of a formula

import numpy as np
import statsmodels.api as sm

# Generate artificial data (2 regressors + constant)
nobs = 100
X = np.random.random((nobs, 2))
X = sm.add_constant(X)
beta = [1, 0.1, 0.5]
# The noise is uniform on [0, 1), so its mean of 0.5 is absorbed into the fitted constant
e = np.random.random(nobs)
y = np.dot(X, beta) + e

# Fit regression model
results = sm.OLS(y, X).fit()

# Inspect the results
print(results.summary())


                            OLS Regression Results                            
Dep. Variable:                      y   R-squared:                       0.210
Model:                            OLS   Adj. R-squared:                  0.194
Method:                 Least Squares   F-statistic:                     12.90
Date:                Mon, 08 Jan 2018   Prob (F-statistic):           1.08e-05
Time:                        03:28:19   Log-Likelihood:                -16.977
No. Observations:                 100   AIC:                             39.95
Df Residuals:                      97   BIC:                             47.77
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          1.5945      0.076     21.119      0.0