# `statsmodels` Package
---

In this notebook, I work through an example from the paper _Statsmodels: Econometric and Statistical Modeling with Python_ to get a rough understanding of how the library works.

In [3]:
import statsmodels.api as sm # the suggested convention for importing 'statsmodels'
import numpy as np # we also need numpy when using this library

### <u>Load the data</u>:

In [4]:
longley = sm.datasets.longley # the Longley (1967) dataset on the US macroeconomy

# Link on Google-Scholar: https://scholar.google.com/scholar?hl=de&as_sdt=0%2C5&q=An+Appraisal+of+Least+Squares+Programs+for+the+Electronic+Computer+from+the+Point+of+View+of+the+User&btnG=

In [6]:
data = longley.load() # load only the data (the dataset module also carries other things, such as the copyright notice)

### <u>Demo of an OLS Regression</u>:

In [7]:
data.exog = sm.add_constant(data.exog) # add a constant column (the intercept) to the exogenous variables (here, we assume that the data is already cleaned!)

In [11]:
longley_model = sm.OLS(data.endog, data.exog) # instantiate the model

In [12]:
longley_res = longley_model.fit() # fit the model --> the 'fit' method returns a "RegressionResults" object

# "longley_res" has several attributes and methods of interest, like - for example - the "params" attribute, which holds
# the beta coefficients of the linear regression

In [15]:
longley_res.params

array([-3.48225863e+06,  1.50618723e+01, -3.58191793e-02, -2.02022980e+00,
       -1.03322687e+00, -5.11041057e-02,  1.82915146e+03])

## Nice to know
---

### <u>List of attributes and methods available on `RegressionResults` objects</u>:

According to the paper, all of the attributes and methods are **well-documented in the docstrings and in the statsmodels online documentation**.

In [None]:
['HC0_se', 'HC1_se', 'HC2_se', 'HC3_se', 'aic', 'bic', 'bse', 
 'centered_tss', 'conf_int', 'cov_params', 'df_model', 'df_resid', 
 'ess', 'f_pvalue', 'f_test', 'fittedvalues', 'fvalue', 'initialize', 
 'llf', 'model', 'mse_model', 'mse_resid', 'mse_total', 'nobs', 
 'norm_resid', 'normalized_cov_params', 'params', 'pvalues',
 'resid', 'rsquared', 'rsquared_adj', 'scale', 'ssr',
 'summary', 't', 't_test', 'uncentered_tss', 'wresid']

In [13]:
params = np.dot(np.linalg.pinv(data.exog), data.endog) # By default, the least squares models use the pseudoinverse to
# compute the parameters that solve the objective function
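The pseudoinverse solution is $\hat{\beta} = X^{+} y$, where $X^{+}$ is the Moore-Penrose pseudoinverse of the design matrix. As a sanity check, the sketch below compares it with NumPy's own least-squares solver on synthetic data. The data and variable names are my own assumptions; the example only uses NumPy and does not touch the Longley dataset.

```python
import numpy as np

# Synthetic design matrix with an intercept column (assumption, for illustration only)
rng = np.random.default_rng(42)
X = np.column_stack([np.ones(30), rng.normal(size=(30, 2))])
y = X @ np.array([0.5, -1.0, 3.0]) + rng.normal(scale=0.05, size=30)

# Least squares via the pseudoinverse: beta = pinv(X) @ y
beta_pinv = np.linalg.pinv(X) @ y

# Reference solution from NumPy's least-squares routine
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

# Both approaches should agree up to floating-point error
print(np.allclose(beta_pinv, beta_lstsq))
```

On the notebook's own data, the analogous check would be to compare `params` from the cell above with `longley_res.params`.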