In [1]:
/****************************************************************************
* File name: empirical_exercise10.do
* Author(s): Sze, J.
* Date: 5/1/2019
* Description: 
* Answers to empirical exercise 10 for Labor Economics
*
* Inputs: 
* "..\input_data\Small CPS, 2018"
* Outputs:
* 
***************************************************************************/

### Empirical Exercise 10
Your task is to estimate black–white wage gaps in 2018. Use the “Small CPS, 2018” Stata
file, keeping only blacks and whites. Use the earnings weighting variable weight to weight all
your estimates.
To refresh your memory, tabulate values of the race, metro, size, and occupation variables.

In [2]:
use "..\input_data\Small CPS, 2018.dta", clear

(1000 Obs./Month Extract of the CPS Merged-Outgoing-Rotation-Group Files, 2018)


In [3]:
describe


Contains data from ..\input_data\Small CPS, 2018.dta
  obs:        11,000                          1000 Obs./Month Extract of the CPS Merged-Outgoing-Rotation-Group Files, 2018
 vars:            22                          17 Jan 2019 14:16
 size:       484,000                          
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
              storage   display    value
variable name   type    format     label      variable label
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
idcode          int     %9.0g               * ID Code
weight          double  %7.1f              

### A. Raw Wage and log-Wage Gaps. 

- Use summarize or mean to compute weighted means of rwage and the log of rwage (lrwage) by race. 
- Compute the black–white wage gap using rwage. 
- Compute another black–white wage gap as the difference in logs. 
- Compare the two measures.

In [4]:
sum rwage


    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
       rwage |     11,000     1063.44    933.0915   19.92612   4382.547


In [5]:
sum rwage [aweight = weight]


    Variable |     Obs      Weight        Mean   Std. Dev.       Min        Max
-------------+-----------------------------------------------------------------
       rwage |  10,984   113203560    1070.459   942.3015   19.92612   4382.547


In [6]:
gen lrwage = log(rwage)

In [7]:
tabstat lrwage [aweight = weight], by(race)


Summary for variables: lrwage
     by categories of: race (race)

            race |      mean
-----------------+----------
           White |  6.682839
           Black |  6.454452
          Indian |  6.384202
Asian/PacIslande |  6.843558
           Other |  6.471716
-----------------+----------
           Total |  6.660128
----------------------------


In [8]:
tabstat rwage [aweight = weight], by(race)


Summary for variables: rwage
     by categories of: race (race)

            race |      mean
-----------------+----------
           White |  1096.852
           Black |     838.1
          Indian |  727.9286
Asian/PacIslande |  1275.584
           Other |  851.4346
-----------------+----------
           Total |  1070.459
----------------------------


In [9]:
display "black–white wage gap is " 838.1 - 1096.852

black–white wage gap is -258.752


In [10]:
display "black–white wage gap in logs " 6.454452-6.682839

black–white wage gap in logs -.228387


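In real dollars, blacks earn on average about $259 less than whites, which is roughly 24% of the white mean. The difference in mean log wages is -0.228, a gap of about 23 log points. The two measures are of similar magnitude, but the log gap compares geometric rather than arithmetic means, so it is less sensitive to very high earners.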
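
A log difference only approximates a percentage difference, and the approximation loosens as the gap grows. As a quick check, the sketch below converts the log gap into an exact percentage with `exp(gap) - 1` and expresses the dollar gap relative to the white mean. The numbers are the weighted means reported by `tabstat` above, typed in by hand as in the earlier `display` cells.

```stata
* Exact percentage gap implied by the difference in mean logs
display "implied percent gap from logs: " 100*(exp(6.454452 - 6.682839) - 1)

* Dollar gap as a share of the white mean
display "percent gap in mean wages: " 100*(838.1 - 1096.852)/1096.852
```

The first figure should come out a little smaller in magnitude than the log gap itself, since `exp(x) - 1` is larger than `x` for any nonzero `x`.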

### B. log-Wage Regression. 
- Create a dummy variable black that equals one if the worker is black. 
- Regress lrwage on black. 
- How does the estimated effect of black compare to the two measures from above?

In [11]:
tab race, nolabel


       race |      Freq.     Percent        Cum.
------------+-----------------------------------
          1 |      9,040       82.18       82.18
          2 |      1,003        9.12       91.30
          3 |        116        1.05       92.35
          4 |        671        6.10       98.45
          5 |        170        1.55      100.00
------------+-----------------------------------
      Total |     11,000      100.00


In [12]:
tab race


             race |      Freq.     Percent        Cum.
------------------+-----------------------------------
            White |      9,040       82.18       82.18
            Black |      1,003        9.12       91.30
           Indian |        116        1.05       92.35
Asian/PacIslander |        671        6.10       98.45
            Other |        170        1.55      100.00
------------------+-----------------------------------
            Total |     11,000      100.00


In [13]:
gen black = (race == 2)

In [14]:
regress lrwage black


      Source |       SS           df       MS      Number of obs   =    11,000
-------------+----------------------------------   F(1, 10998)     =     70.14
       Model |  46.5531627         1  46.5531627   Prob > F        =    0.0000
    Residual |  7299.11529    10,998  .663676604   R-squared       =    0.0063
-------------+----------------------------------   Adj R-squared   =    0.0062
       Total |  7345.66845    10,999  .667848755   Root MSE        =    .81466

------------------------------------------------------------------------------
      lrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       black |  -.2259881   .0269829    -8.38   0.000    -.2788795   -.1730967
       _cons |   6.673571   .0081479   819.06   0.000     6.657599    6.689542
------------------------------------------------------------------------------


In [15]:
regress rwage black


      Source |       SS           df       MS      Number of obs   =    11,000
-------------+----------------------------------   F(1, 10998)     =     61.52
       Model |  53271829.9         1  53271829.9   Prob > F        =    0.0000
    Residual |  9.5231e+09    10,998  865895.221   R-squared       =    0.0056
-------------+----------------------------------   Adj R-squared   =    0.0055
       Total |  9.5764e+09    10,999   870659.83   Root MSE        =    930.53

------------------------------------------------------------------------------
       rwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       black |  -241.7463   30.82079    -7.84   0.000    -302.1606    -181.332
       _cons |   1085.483   9.306745   116.63   0.000      1067.24    1103.726
------------------------------------------------------------------------------


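The regression coefficients (-0.226 in logs, -$242 in dollars) are close to, but not identical to, the weighted gaps from part A (-0.228 and -$259). The discrepancy arises because these regressions are unweighted and compare blacks with all non-black workers, not only whites.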
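
To reproduce the part A gaps by regression, the estimates need the earnings weights and a sample restricted to blacks and whites. A minimal sketch, assuming the race codes 1 (White) and 2 (Black) shown in the `tab race, nolabel` output:

```stata
* Weighted regressions restricted to whites (race == 1) and blacks (race == 2)
regress lrwage black [aw = weight] if inlist(race, 1, 2)
regress rwage  black [aw = weight] if inlist(race, 1, 2)
```

With a single dummy regressor, the weighted least-squares coefficient on `black` is the difference between the two weighted group means, so these coefficients should match the `tabstat` differences computed in part A.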

### C. log-Wage Regressions With More Variables. 
1. Regress lrwage on black, hours, grade, and a quadratic in potential experience. 
    - How does the residual black–white wage gap in this case compare to the raw black-white wage gap from the log-wage regression? 
2. Add the following variables to the log-wage regression: married, veteran, public, union, and indicators for metro, size, and state. 
    - Does including these additional variables increase or decrease the residual wage gap? 
3. Add a set of variables that indicate occupation. 
    - Does including the occupation dummy variables increase or decrease the residual wage gap?

In [16]:
gen experience = age - grade - 6
replace experience = 0 if experience <0



(50 real changes made)


In [17]:
gen experience2 = experience*experience

In [18]:
regress lrwage i.black hours grade experience experience2 [aw = weight]

(sum of wgt is 113,203,559.8744)

      Source |       SS           df       MS      Number of obs   =    10,984
-------------+----------------------------------   F(5, 10978)     =   1949.54
       Model |  3412.58389         5  682.516777   Prob > F        =    0.0000
    Residual |  3843.29684    10,978  .350090804   R-squared       =    0.4703
-------------+----------------------------------   Adj R-squared   =    0.4701
       Total |  7255.88073    10,983   .66064652   Root MSE        =    .59168

------------------------------------------------------------------------------
      lrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     1.black |  -.1835969   .0176127   -10.42   0.000    -.2181209   -.1490729
       hours |   .0298225   .0004789    62.27   0.000     .0288837    .0307612
       grade |   .1120886    .002212    50.67   0.000     .1077527    .1164245
  experience |   

When we take into account hours worked, grade, and experience, the black-white wage gap shrinks from -0.226 in the raw log-wage regression to -0.184, a residual gap of about 18 log points.
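
The coefficient on `black` is measured in log points. The sketch below shows one way to turn it into an exact percentage gap with a standard error, using `nlcom` after the regression. It also writes the quadratic in factor-variable notation (`c.experience##c.experience`), which is an alternative to generating `experience2` by hand and should give the same fit. The label `pct_gap` is an illustrative name, not part of the original do-file.

```stata
* Same specification, with the experience quadratic in factor-variable notation
regress lrwage i.black hours grade c.experience##c.experience [aw = weight]

* Exact percentage gap implied by the coefficient on black, with a
* delta-method standard error and confidence interval
nlcom (pct_gap: 100*(exp(_b[1.black]) - 1))
```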

In [19]:
regress lrwage i.black hours grade experience experience2 i.married i.veteran i.public i.union i.metro size i.state [aw = weight]

(sum of wgt is 111,762,887.9189)

      Source |       SS           df       MS      Number of obs   =    10,819
-------------+----------------------------------   F(63, 10755)    =    167.20
       Model |  3487.49244        63  55.3570228   Prob > F        =    0.0000
    Residual |  3560.76669    10,755   .33108012   R-squared       =    0.4948
-------------+----------------------------------   Adj R-squared   =    0.4918
       Total |  7048.25913    10,818  .651530702   Root MSE        =     .5754

----------------------------------------------------------------------------------------------
                      lrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-----------------------------+----------------------------------------------------------------
                     1.black |  -.1838928   .0181714   -10.12   0.000    -.2195121   -.1482736
                       hours |   .0292757   .0004731    61.88   0.000     .0283483    .0302031
                

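Adding married, veteran, public, union, and the controls for metro, size, and state leaves the residual wage gap essentially unchanged: the coefficient on black moves from -0.1836 to -0.1839, a negligible increase in magnitude. Note that the estimation sample also falls from 10,984 to 10,819 observations, so the two regressions are not estimated on exactly the same workers.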
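
The exercise asks for indicators for size, while the specification above includes `size` as a single continuous regressor (no `i.` prefix). A sketch of the indicator version follows, assuming `size` is coded as nonnegative integer categories, which `i.` requires. The `///` line continuation works in a do-file.

```stata
* Same controls as above, but with size entered as a set of indicators
regress lrwage i.black hours grade experience experience2 i.married i.veteran ///
    i.public i.union i.metro i.size i.state [aw = weight]
```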

In [20]:
regress lrwage i.black hours grade experience experience2 i.married i.veteran i.public i.union i.metro size i.state i.occupation [aw = weight]

(sum of wgt is 111,762,887.9189)

      Source |       SS           df       MS      Number of obs   =    10,819
-------------+----------------------------------   F(120, 10698)   =    124.88
       Model |  4112.40238       120  34.2700198   Prob > F        =    0.0000
    Residual |  2935.85675    10,698  .274430431   R-squared       =    0.5835
-------------+----------------------------------   Adj R-squared   =    0.5788
       Total |  7048.25913    10,818  .651530702   Root MSE        =    .52386

--------------------------------------------------------------------------------------------------------------------------
                                                  lrwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
---------------------------------------------------------+----------------------------------------------------------------
                                                 1.black |  -.1184882   .0168086    -7.05   0.000    -.1514362   -.0855402

                                             new mexico  |  -.0978616    .096977    -1.01   0.313    -.2879546    .0922314
                                                arizona  |  -.1383607   .0804879    -1.72   0.086    -.2961319    .0194106
                                                   utah  |   -.118446   .0893348    -1.33   0.185    -.2935587    .0566667
                                                 nevada  |  -.0352325   .0875659    -0.40   0.687     -.206878     .136413
                                             washington  |   .0594401   .0807972     0.74   0.462    -.0989374    .2178177
                                                 oregon  |  -.0955479   .0842601    -1.13   0.257    -.2607134    .0696176
                                             california  |  -.0068522   .0756629    -0.09   0.928    -.1551656    .1414612
                                                 alaska  |   .0647297   .1336089     0.48   0.628    -.1971686    .3266279
                

                                                         |
                                                   _cons |   4.613629   .0877659    52.57   0.000     4.441592    4.785667
--------------------------------------------------------------------------------------------------------------------------


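Adding the occupation dummy variables decreases the residual wage gap: the coefficient on black falls in magnitude from -0.184 to -0.118, so differences in occupation account for a sizable part of the remaining gap.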
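
Because the number of observations changes across the specifications in part C, some of the movement in the coefficient on black could reflect the changing sample and not the added controls. One way to check, sketched below, is to estimate the richest model first, flag its estimation sample, and re-estimate the smaller models on those same observations. The names `insample` and `m0` to `m3` are illustrative and do not appear in the original do-file.

```stata
* Estimate the richest model first and flag its estimation sample
regress lrwage i.black hours grade experience experience2 i.married i.veteran ///
    i.public i.union i.metro size i.state i.occupation [aw = weight]
generate byte insample = e(sample)
estimates store m3

* Re-estimate the smaller models on the same observations
regress lrwage i.black [aw = weight] if insample
estimates store m0
regress lrwage i.black hours grade experience experience2 [aw = weight] if insample
estimates store m1
regress lrwage i.black hours grade experience experience2 i.married i.veteran ///
    i.public i.union i.metro size i.state [aw = weight] if insample
estimates store m2

* Side-by-side coefficients on black with standard errors
estimates table m0 m1 m2 m3, keep(1.black) b(%9.4f) se(%9.4f) stats(N r2)
```

Holding the sample fixed means any change in the coefficient across columns comes from the added regressors alone.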