In [13]:
from IPython.display import HTML
HTML('''<script>
code_show=true;
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} $( document ).ready(code_toggle);
</script>  <form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')

    
<h1 style="text-align: center; color: purple;" markdown="1">Econ 320 Python: Lab8 Presentation of Results </h1>
<h2 style="text-align: center; color: purple;" markdown="1">Handout # 8 </h2>

 


# Presentation of results: Stargazer Package for tables

After watching the motivating video about how important and cool stargazer is, here are a few things you need to know when using it.

**Displaying stargazer tables in a notebook**

When using the stargazer package in a Jupyter notebook, wrap the rendered table in `HTML()` from `IPython.display`. That way your results display as a formatted table in the notebook and in the exported HTML document.

**Do not get confused by the raw output**

The stargazer package can produce tables in different formats, such as LaTeX or HTML.
You choose the format through the render method you call: `render_latex()` for LaTeX or `render_html()` for HTML.

Both methods return a plain string. If you run `render_html()` on its own, the output is HTML code that looks like *crazy* text if you do not know HTML. Do not panic: pass that string to `HTML()` and it will display as a formatted table. For a quick plain-text look at a single model, you can also print the `summary()` of a fitted statsmodels model.
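The difference between raw HTML code and a rendered table can be seen in a minimal sketch. The short `raw_html` string below is a made-up stand-in for what a stargazer table renders to; only `IPython.display.HTML` is a real API here.

```python
from IPython.display import HTML

# Made-up stand-in for the HTML string that a stargazer table renders to
raw_html = (
    "<table>"
    "<tr><th>Variable</th><th>(1)</th></tr>"
    "<tr><td>b_s</td><td>-0.825</td></tr>"
    "</table>"
)

# Printing the string shows the markup itself (the "crazy" text)
print(raw_html)

# Wrapping the string in HTML() creates an object that a notebook
# displays as a formatted table when it is the last line of a cell
rendered = HTML(raw_html)
rendered
```

Outside a notebook the last line has no visible effect; the formatted table only appears in an environment that can display HTML.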



#### The package setup

In [1]:
import wooldridge as woo
import numpy as np 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from stargazer.stargazer import Stargazer
import statsmodels.formula.api as smf

## Summary statistics 

Below are different summary statistics tables to describe your data.

`IPython` can display any pandas table in a nice HTML format.

Using `pandas.DataFrame.describe()`, we can then produce a summary statistics table.

In [2]:

# load the meap93 data and keep four variables in an object named meap93set
meap93set = woo.dataWoo('meap93')[['salary', 'benefits', 'enroll', 'droprate']]
print(meap93set.describe())

             salary      benefits        enroll    droprate
count    408.000000    408.000000    408.000000  408.000000
mean   31774.507353   6463.428922   2663.806373    5.066422
std     5038.303826   1456.337659   2696.820560    5.485072
min    19764.000000      0.000000    212.000000    0.000000
25%    28185.500000   5536.500000   1037.500000    1.900000
50%    31266.000000   6304.500000   1840.500000    3.700000
75%    34499.750000   7228.000000   3084.750000    6.500000
max    52812.000000  11618.000000  16793.000000   61.900002
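A more compact version of this table can be built by transposing the `describe()` output and keeping only some statistics. This is a minimal sketch that assumes a small made-up data frame (`demo`) in place of `meap93set`, so the numbers are illustrative only.

```python
import pandas as pd

# Made-up mini data set standing in for meap93set (illustration only)
demo = pd.DataFrame({
    "salary": [28000, 31000, 35000, 40000],
    "benefits": [5500, 6300, 7200, 8000],
})

# Transpose so each variable is a row, keep selected statistics,
# and round to two decimals for readability
summary = demo.describe().T[["count", "mean", "std", "min", "max"]].round(2)
print(summary)
```

With one row per variable, the table stays readable when the data set has many columns.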


If you would like a LaTeX representation of your summary statistics table, you can use the `pandas` method `DataFrame.to_latex`.

In [3]:
meap93set.describe().to_latex()


'\\begin{tabular}{lrrrr}\n\\toprule\n & salary & benefits & enroll & droprate \\\\\n\\midrule\ncount & 408.000000 & 408.000000 & 408.000000 & 408.000000 \\\\\nmean & 31774.507353 & 6463.428922 & 2663.806373 & 5.066422 \\\\\nstd & 5038.303826 & 1456.337659 & 2696.820560 & 5.485072 \\\\\nmin & 19764.000000 & 0.000000 & 212.000000 & 0.000000 \\\\\n25% & 28185.500000 & 5536.500000 & 1037.500000 & 1.900000 \\\\\n50% & 31266.000000 & 6304.500000 & 1840.500000 & 3.700000 \\\\\n75% & 34499.750000 & 7228.000000 & 3084.750000 & 6.500000 \\\\\nmax & 52812.000000 & 11618.000000 & 16793.000000 & 61.900002 \\\\\n\\bottomrule\n\\end{tabular}\n'

## Regression Output 

A model of the trade-off between salary and pensions for teachers:
$$\log(salary) = \beta_0 + \beta_1 (benefits/salary) + \text{other factors} + u$$
The example below loads the data, generates the new variable `b_s = benefits/salary`, and then runs three regressions with different sets of `other factors`. The **stargazer** package is then used in different ways to report the results in a publishable format. See the different examples below.
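Because the dependent variable is in logs, $\beta_1$ can be read as an approximate proportional effect. A short worked step, holding the other factors and $u$ fixed (the approximation is meant for small changes):

$$\Delta \log(salary) = \beta_1 \cdot \Delta(benefits/salary) \quad \Rightarrow \quad \%\Delta salary \approx 100 \cdot \beta_1 \cdot \Delta(benefits/salary)$$

For example, if the benefits-to-salary ratio rises by 0.01, salary changes by approximately $100 \cdot \beta_1 \cdot 0.01 = \beta_1$ percent.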

Try the cells below as they are.

In [4]:
data = woo.dataWoo('meap93')
# generate the benefits-to-salary ratio
data['b_s'] = data['benefits'] / data['salary']

In [5]:
model1 = smf.ols(formula='np.log(salary) ~ b_s', data=data).fit()
model2 = smf.ols(formula='np.log(salary) ~ b_s + np.log(enroll) + np.log(staff)', data=data).fit()
model3 = smf.ols(formula='np.log(salary) ~ b_s + np.log(enroll) + np.log(staff) + droprate + gradrate', data=data).fit()


In [6]:
from IPython.display import HTML

st = Stargazer([model1, model2, model3])
HTML(st.render_html())




*Dependent variable: np.log(salary)*

| | (1) | (2) | (3) |
|:---|:---:|:---:|:---:|
| Intercept | 10.523*** | 10.844*** | 10.738*** |
| | (0.042) | (0.252) | (0.258) |
| b_s | -0.825*** | -0.605*** | -0.589*** |
| | (0.200) | (0.165) | (0.165) |
| droprate | | | -0.000 |


In [7]:
st.render_latex()

'\\begin{table}[!htbp] \\centering\n\\begin{tabular}{@{\\extracolsep{5pt}}lccc}\n\\\\[-1.8ex]\\hline\n\\hline \\\\[-1.8ex]\n& \\multicolumn{3}{c}{\\textit{Dependent variable: np.log(salary)}} \\\n\\cr \\cline{2-4}\n\\\\[-1.8ex] & (1) & (2) & (3) \\\\\n\\hline \\\\[-1.8ex]\n Intercept & 10.523$^{***}$ & 10.844$^{***}$ & 10.738$^{***}$ \\\\\n& (0.042) & (0.252) & (0.258) \\\\\n b_s & -0.825$^{***}$ & -0.605$^{***}$ & -0.589$^{***}$ \\\\\n& (0.200) & (0.165) & (0.165) \\\\\n droprate & & & -0.000$^{}$ \\\\\n& & & (0.002) \\\\\n gradrate & & & 0.001$^{}$ \\\\\n& & & (0.001) \\\\\n np.log(enroll) & & 0.087$^{***}$ & 0.088$^{***}$ \\\\\n& & (0.007) & (0.007) \\\\\n np.log(staff) & & -0.222$^{***}$ & -0.218$^{***}$ \\\\\n& & (0.050) & (0.050) \\\\\n\\hline \\\\[-1.8ex]\n Observations & 408 & 408 & 408 \\\\\n $R^2$ & 0.040 & 0.353 & 0.361 \\\\\n Adjusted $R^2$ & 0.038 & 0.348 & 0.353 \\\\\n Residual Std. Error & 0.151 (df=406) & 0.125 (df=404) & 0.124 (df=402) \\\\\n F Statistic & 17.050$^{***

In [8]:
st.title("These are awesome results")
HTML(st.render_html())


*Dependent variable: np.log(salary)*

| | (1) | (2) | (3) |
|:---|:---:|:---:|:---:|
| Intercept | 10.523*** | 10.844*** | 10.738*** |
| | (0.042) | (0.252) | (0.258) |
| b_s | -0.825*** | -0.605*** | -0.589*** |
| | (0.200) | (0.165) | (0.165) |
| droprate | | | -0.000 |


In [9]:
st.custom_columns(['Model 1', 'Model 2', 'Model 3'], [1, 1, 1])
HTML(st.render_html())

*Dependent variable: np.log(salary)*

| | Model 1 | Model 2 | Model 3 |
|:---|:---:|:---:|:---:|
| | (1) | (2) | (3) |
| Intercept | 10.523*** | 10.844*** | 10.738*** |
| | (0.042) | (0.252) | (0.258) |
| b_s | -0.825*** | -0.605*** | -0.589*** |
| | (0.200) | (0.165) | (0.165) |


In [10]:
st.rename_covariates({"b_s": "Benefits Salary Ratio"})
HTML(st.render_html())

*Dependent variable: np.log(salary)*

| | Model 1 | Model 2 | Model 3 |
|:---|:---:|:---:|:---:|
| | (1) | (2) | (3) |
| Intercept | 10.523*** | 10.844*** | 10.738*** |
| | (0.042) | (0.252) | (0.258) |
| Benefits Salary Ratio | -0.825*** | -0.605*** | -0.589*** |
| | (0.200) | (0.165) | (0.165) |


In [11]:
st.custom_note_label("Some note I would like to add\n")
HTML(st.render_html())

*Dependent variable: np.log(salary)*

| | Model 1 | Model 2 | Model 3 |
|:---|:---:|:---:|:---:|
| | (1) | (2) | (3) |
| Intercept | 10.523*** | 10.844*** | 10.738*** |
| | (0.042) | (0.252) | (0.258) |
| Benefits Salary Ratio | -0.825*** | -0.605*** | -0.589*** |
| | (0.200) | (0.165) | (0.165) |


Other options of the `Stargazer` object to explore include:
* `show_header`: display or hide model header data
* `show_model_numbers`: display or hide model numbers
* `custom_columns`: custom model names and model groupings
* `significance_levels`: change statistical significance thresholds
* `significant_digits`: change number of significant digits
* `show_confidence_intervals`: display confidence intervals instead of standard errors
* `dependent_variable_name`: rename dependent variable
* `rename_covariates`: rename covariates
* `covariate_order`: reorder covariates
* `reset_covariate_order`: reset covariate order to original ordering
* `show_degrees_of_freedom`: display or hide degrees of freedom
* `custom_note_label`: label notes section at bottom of table
* `add_custom_notes`: add custom notes to section at bottom of the table
* `append_notes`: display or hide statistical significance thresholds

# In class exercise 

1. Create a new notebook document

2. Use the data set `attitude`. The csv file is in the module.  
3. Using the `stargazer` package (Remember you need to have this package installed)<br>
   3.1.  Create a summary statistics table of this dataset<br>
   3.2 Run the following two regressions <br>
   $$ Rating = \beta_0 + \beta_1 \cdot Complaints + \beta_2 \cdot Privileges + \beta_3 \cdot Learning + \beta_4 \cdot Raises + \beta_5 \cdot Critical $$
      
      <br>
      
   $$ 
   Rating = \beta_0 + \beta_1 \cdot Complaints + \beta_2 \cdot Privileges + \beta_3 \cdot Learning 
   $$ 
      
      and save the results in two objects named `linear_1` and `linear_2`.

   3.3 Make a table with these results combined, and change the names in the table to make them look better. Give names to the models and give a label to the dependent variable in the table, as seen in the class example.

4. Export your document to HTML and see how it looks.


In [12]:
import pandas as pd 
import statsmodels.formula.api as smf
from stargazer.stargazer import Stargazer

# Load the attitude data; attitude.csv must be in the working directory
df = pd.read_csv("attitude.csv")
df.describe().transpose()

FileNotFoundError: [Errno 2] No such file or directory: 'attitude.csv'
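The error above means that `attitude.csv` is not in the folder the notebook is running from. One way to check before calling `pd.read_csv` is sketched below; `find_data_file` and the `data` folder name are hypothetical, introduced only for this illustration.

```python
from pathlib import Path


def find_data_file(filename, search_dirs=(".", "data")):
    """Return the first existing path to `filename` in `search_dirs`, or None.

    Hypothetical helper: the default folders are assumptions, adjust them
    to wherever the course files were saved.
    """
    for folder in search_dirs:
        candidate = Path(folder) / filename
        if candidate.is_file():
            return candidate
    return None


csv_path = find_data_file("attitude.csv")
if csv_path is None:
    # Show where Python is looking, which is the usual source of the error
    print(f"attitude.csv not found; working directory is {Path.cwd()}")
else:
    # The file exists, so pd.read_csv(csv_path) would succeed from here
    print(f"Found the data at {csv_path}")
```

Moving the CSV file next to the notebook, or passing its full path to `pd.read_csv`, resolves the error.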

In [None]:
# Fit two linear regression models
linear_1 = smf.ols('rating ~ complaints + privileges + learning + raises + critical', data=df).fit()
linear_2 = smf.ols('rating ~ complaints + privileges + learning', data=df).fit()

# Create a table with the results of both models
from IPython.display import HTML

Attitude_RTable = Stargazer([linear_1, linear_2])
Attitude_RTable.title("Attitude Regressions ;)")
Attitude_RTable.custom_columns(['Linear 1: Rating = Complaints + Privileges + Learning + Raises + Critical', 'Linear 2: Rating = Complaints + Privileges + Learning'], [1, 1])
Attitude_RTable.custom_note_label("If you are reading this, you are cool!")
HTML(Attitude_RTable.render_html())

*Dependent variable: rating*

| | Linear 1: Rating = Complaints + Privileges + Learning + Raises + Critical | Linear 2: Rating = Complaints + Privileges + Learning |
|:---|:---:|:---:|
| | (1) | (2) |
| Intercept | 11.011 | 11.258 |
| | (11.704) | (7.318) |
| complaints | 0.692*** | 0.682*** |
| | (0.149) | (0.129) |


In [None]:
# !jupyter nbconvert --to html nameoffile.ipynb

&nbsp;
<hr />
<p style="font-family:palatino; text-align: center;font-size: 15px">ECON320 Python Programming Laboratory</p>
<p style="font-family:palatino; text-align: center;font-size: 15px">Professor <em> Paloma Lopez de mesa Moyano</em></p>
<p style="font-family:palatino; text-align: center;font-size: 15px"><span style="color: #6666FF;"><em>paloma.moyano@emory.edu</em></span></p>

<p style="font-family:palatino; text-align: center;font-size: 15px">Department of Economics</p>
<p style="font-family:palatino; text-align: center; color: #012169;font-size: 15px">Emory University</p>

&nbsp;