# Chapter 02

In [1]:
import wooldridge as wr
import pandas as pd
import numpy as np
import statsmodels.api as sm
import statsmodels.formula.api as smf

pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

### C1

In [2]:
df_401 = wr.data('401k')
df_401.head()

Unnamed: 0,prate,mrate,totpart,totelg,age,totemp,sole,ltotemp
0,26.1,0.21,1653.0,6322.0,8,8709.0,0,9.072112
1,100.0,1.42,262.0,262.0,6,315.0,1,5.752573
2,97.599998,0.91,166.0,170.0,10,275.0,1,5.616771
3,100.0,0.42,257.0,257.0,7,500.0,0,6.214608
4,82.5,0.53,591.0,716.0,28,933.0,1,6.838405


In [3]:
#dataset desc
wr.data('401k', description=True)

name of dataset: 401k
no of variables: 8
no of observations: 1534

+----------+---------------------------------+
| variable | label                           |
+----------+---------------------------------+
| prate    | participation rate, percent     |
| mrate    | 401k plan match rate            |
| totpart  | total 401k participants         |
| totelg   | total eligible for 401k plan    |
| age      | age of 401k plan                |
| totemp   | total number of firm employees  |
| sole     | = 1 if 401k is firm's sole plan |
| ltotemp  | log of totemp                   |
+----------+---------------------------------+

L.E. Papke (1995), “Participation in and Contributions to 401(k)
Pension Plans:Evidence from Plan Data,” Journal of Human Resources 30,
311-325. Professor Papke kindly provided these data. She gathered them
from the Internal Revenue Service’s Form 5500 tapes.
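
As a quick sanity check on the variable definitions above, the derived columns can be recomputed by hand. The sketch below hard-codes the five rows printed by `df_401.head()` rather than loading the dataset. It assumes (the description does not say so) that `prate` is `100 * totpart / totelg` rounded to one decimal place; the helper name `check_row` is illustrative.

```python
import math

# First five rows from df_401.head(): (prate, totpart, totelg, totemp, ltotemp)
rows = [
    (26.1, 1653.0, 6322.0, 8709.0, 9.072112),
    (100.0, 262.0, 262.0, 315.0, 5.752573),
    (97.599998, 166.0, 170.0, 275.0, 5.616771),
    (100.0, 257.0, 257.0, 500.0, 6.214608),
    (82.5, 591.0, 716.0, 933.0, 6.838405),
]

def check_row(prate, totpart, totelg, totemp, ltotemp):
    """Return the absolute gaps between stored and recomputed values."""
    prate_gap = abs(prate - 100 * totpart / totelg)  # assumed definition of prate
    log_gap = abs(ltotemp - math.log(totemp))        # ltotemp = natural log of totemp
    return prate_gap, log_gap

gaps = [check_row(*row) for row in rows]
for prate_gap, log_gap in gaps:
    print(f"prate gap: {prate_gap:.3f}   ltotemp gap: {log_gap:.6f}")
```

Small gaps in both columns suggest the stored values match these definitions up to rounding, at least for the rows shown.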


In [4]:
# C1 i

## mean participation rate
display(df_401['prate'].mean())

## mean plan match rate
display(df_401['mrate'].mean())

87.3629074562948

0.7315123849943027

In [5]:
# C1 ii
model = smf.ols('prate ~ mrate',
                data = df_401).fit()

print(model.summary())
# report the sample size and R-squared directly from the fitted model
print(f"N = {int(model.nobs)}, R-squared = {model.rsquared:.3f}")

                            OLS Regression Results                            
Dep. Variable:                  prate   R-squared:                       0.075
Model:                            OLS   Adj. R-squared:                  0.074
Method:                 Least Squares   F-statistic:                     123.7
Date:                Mon, 08 May 2023   Prob (F-statistic):           1.10e-27
Time:                        20:54:52   Log-Likelihood:                -6437.0
No. Observations:                1534   AIC:                         1.288e+04
Df Residuals:                    1532   BIC:                         1.289e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     83.0755      0.563    147.484      0.0

#### C1 iii

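The intercept is the predicted participation rate when the match rate is zero: about 83% of eligible workers. The mrate coefficient is the predicted change in prate, in percentage points, for a one-unit increase in the match rate. The intercept and the prediction in C1 iv imply a slope of about 5.86, so each additional dollar of employer match per dollar contributed by the worker raises predicted participation by roughly 5.9 percentage points.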


In [6]:
## C1 iv
model.predict(exog=dict(mrate=3.5))

0    103.589233
dtype: float64

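A higher match rate predicts higher participation, but the predicted rate at mrate = 3.5 is about 103.6%, which is impossible because prate cannot exceed 100. A simple linear model can produce out-of-range predictions like this, especially at values of mrate far above the sample mean of about 0.73.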
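
The slope is cut off in the summary output above, but it can be backed out from the reported intercept and the prediction at `mrate = 3.5`. The following is a small arithmetic sketch using only numbers printed in this notebook; the variable names are illustrative.

```python
# Numbers reported earlier in this notebook
intercept = 83.0755                # from the OLS summary
pred_at_3_5 = 103.589233           # model.predict at mrate = 3.5
mean_prate = 87.3629074562948      # sample mean of prate
mean_mrate = 0.7315123849943027    # sample mean of mrate

# A fitted line satisfies pred = intercept + slope * mrate, so solve for the slope
slope = (pred_at_3_5 - intercept) / 3.5
print(f"implied slope: {slope:.3f}")

# An OLS line with an intercept passes through the sample means,
# so this should reproduce the intercept up to rounding
implied_intercept = mean_prate - slope * mean_mrate
print(f"intercept implied by the means: {implied_intercept:.4f}")

# Match rate at which the fitted line reaches the 100% ceiling
mrate_at_100 = (100 - intercept) / slope
print(f"prediction reaches 100 at mrate = {mrate_at_100:.2f}")
```

Any match rate above the last printed value yields a predicted participation rate above 100%, which is why the prediction at 3.5 is not sensible.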

#### C1 v

mrate explains about 7.5% of the variation in prate. This is a low R-squared, so many other factors, captured in the error term, help explain the variation in prate.
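
In a regression with a single regressor, R-squared and the F-statistic carry the same information, so the reported R-squared can be cross-checked against the F-statistic and the residual degrees of freedom from the summary. A minimal sketch with the printed values:

```python
import math

f_stat = 123.7     # F-statistic from the summary
df_resid = 1532    # Df Residuals from the summary

# With one regressor, F = (R2 / (1 - R2)) * df_resid, so R2 = F / (F + df_resid)
r_squared = f_stat / (f_stat + df_resid)

# In simple regression, R2 is the squared sample correlation of prate and mrate
abs_corr = math.sqrt(r_squared)

print(f"R-squared: {r_squared:.3f}")
print(f"|corr(prate, mrate)|: {abs_corr:.3f}")
```

The implied correlation is modest, which is consistent with the low R-squared.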

### C2

In [7]:
df_ceo = wr.data('ceosal2')
df_ceo.head()

Unnamed: 0,salary,age,college,grad,comten,ceoten,sales,profits,mktval,lsalary,lsales,lmktval,comtensq,ceotensq,profmarg
0,1161,49,1,1,9,2,6200.0,966,23200.0,7.057037,8.732305,10.051908,81,4,15.580646
1,600,43,1,1,10,10,283.0,48,1100.0,6.39693,5.645447,7.003066,100,100,16.96113
2,379,51,1,1,9,3,169.0,40,1100.0,5.937536,5.129899,7.003066,81,9,23.668638
3,651,55,1,0,22,22,1100.0,-54,1000.0,6.478509,7.003066,6.907755,484,484,-4.909091
4,497,44,1,1,8,6,351.0,28,387.0,6.20859,5.860786,5.958425,64,36,7.977208


In [8]:
#dataset desc
wr.data('ceosal2', description=True)

name of dataset: ceosal2
no of variables: 15
no of observations: 177

+----------+--------------------------------+
| variable | label                          |
+----------+--------------------------------+
| salary   | 1990 compensation, $1000s      |
| age      | in years                       |
| college  | =1 if attended college         |
| grad     | =1 if attended graduate school |
| comten   | years with company             |
| ceoten   | years as ceo with company      |
| sales    | 1990 firm sales, millions      |
| profits  | 1990 profits, millions         |
| mktval   | market value, end 1990, mills. |
| lsalary  | log(salary)                    |
| lsales   | log(sales)                     |
| lmktval  | log(mktval)                    |
| comtensq | comten^2                       |
| ceotensq | ceoten^2                       |
| profmarg | profits as % of sales          |
+----------+--------------------------------+

See CEOSAL1.RAW


In [9]:
## C2 i

display(df_ceo['salary'].mean())
display(df_ceo['ceoten'].mean())

865.8644067796611

7.954802259887006

In [10]:
## C2 ii

display(df_ceo[df_ceo['ceoten'] == 0].shape[0])  # number of CEOs in their first year (ceoten == 0)
display(df_ceo['ceoten'].max())  # longest tenure as CEO, in years

5

37

In [11]:
## C2 iii

model = smf.ols('lsalary ~ ceoten',
                data = df_ceo).fit()
model.summary()

Dep. Variable:,lsalary,R-squared:,0.013
Model:,OLS,Adj. R-squared:,0.008
Method:,Least Squares,F-statistic:,2.334
Date:,"Mon, 08 May 2023",Prob (F-statistic):,0.128
Time:,20:54:53,Log-Likelihood:,-160.84
No. Observations:,177,AIC:,325.7
Df Residuals:,175,BIC:,332.0
Df Model:,1,,
Covariance Type:,nonrobust,,

,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,6.5055,0.068,95.682,0.000,6.371,6.640
ceoten,0.0097,0.006,1.528,0.128,-0.003,0.022

Omnibus:,3.858,Durbin-Watson:,2.084
Prob(Omnibus):,0.145,Jarque-Bera (JB):,3.907
Skew:,-0.189,Prob(JB):,0.142
Kurtosis:,3.622,Cond. No.,16.1


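One additional year as CEO is associated with an increase of about 1% in predicted salary: the ceoten coefficient is 0.0097, and in a log-level model 100 times the coefficient approximates the percent change (roughly 0.97%).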
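
The percent reading of a log-level coefficient is an approximation. The exact percent change is 100 * (exp(b) - 1), and for a coefficient this small the two are nearly identical. A short sketch using the reported coefficient (variable names are illustrative):

```python
import math

b_ceoten = 0.0097   # ceoten coefficient from the summary table

approx_pct = 100 * b_ceoten                  # usual log-level approximation
exact_pct = 100 * (math.exp(b_ceoten) - 1)   # exact percent change in salary

print(f"approximate: {approx_pct:.3f}%")
print(f"exact:       {exact_pct:.3f}%")
```

The gap between the two grows with the size of the coefficient, so the exact formula matters more for larger estimates.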