# A Theory of Extramarital Affairs

According to the original paper by Fair (1978), http://people.stern.nyu.edu/wgreene/Lugano2013/Fair-ExtramaritalAffairs.pdf, the variables are coded as follows:
- $v_1$ = marriage rating (1 = very poor, 2 = poor, 3 = fair, 4 = good, 5 = very good)
- $v_2$ = age (17.5 = under 20, 22 = 20-24, 27 = 25-29, 32 = 30-34, 37 = 35-39, 42 = 40+)
- $v_3$ = years married (0.5 = <1, 2.5 = 1-4, 6 = 5-7, 9 = 8-10, 13 = >10 & eldest child <12, 16.5 = >10 & eldest child 12-17, 23 = >10 & eldest child 18+)
- $v_4$ = number of children (0 = none, 1 = 1, 2 = 2, 3 = 3, 4 = 4, 5.5 = 5+)
- $v_5$ = religiosity (1 = not, 2 = mildly, 3 = fairly, 4 = strongly)
- $v_6$ = level of education (9 = grade school, 12 = some high school, 14 = some college, 16 = college grad, 17 = some graduate school, 20 = advanced degree)
- $v_7$ = occupation (1 = student, 2 = farming/agriculture/semiskilled/unskilled, 3 = white-collar, 4 = teacher/counselor/social worker/nurse or artist/writer/technician/skilled worker, 5 = managerial/administrative/business, 6 = professional w/ advanced degree)
- $v_8$ = husband's occupation (same as $v_7$)
- $y_{rb}$ = proportion of time spent in extramarital affairs

### Importing and cleaning the data

In [12]:
clear

import delimited id constant rating age years children religiosity education notused1 occupation husbandocc yrb notused2 notused3 using "TableF17-2.csv"
// import delimited TableF17-2.csv

drop notused1 notused2 notused3

// A = 1 if any time was spent in extramarital affairs, 0 otherwise
gen A = yrb > 0

summarize



(14 vars, 6,366 obs)




    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
          id |      6,366    8932.883    5203.925          3      19020
    constant |      6,366           1           0          1          1
      rating |      6,366    4.109645    .9614296          1          5
         age |      6,366    29.08286    6.847882       17.5         42
       years |      6,366    9.009425     7.28012         .5         23
-------------+---------------------------------------------------------
    children |      6,366    1.396874    1.433471          0        5.5
 religiosity |      6,366     2.42617    .8783688          1          4
   education |      6,366    14.20986    2.178003          9         20
  occupation |      6,366    3.424128    .9423987          1          6
  husbandocc |      6,366    3.850141    1.346435          1          6
-------------+-----------------------
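
Before modelling, it may be worth confirming that the indicator `A` was built as intended. The following is a minimal sketch, assuming `yrb` and `A` are in memory exactly as created in the cell above; it is not part of the original notebook. The check matters because Stata treats a missing value as larger than any number, so `yrb > 0` would evaluate to 1 for any observation with missing `yrb`.

```stata
* Hypothetical sanity check, not run in the original notebook.
* Count observations that the comparison yrb > 0 would silently code as 1.
count if missing(yrb)

* Confirm that A matches its definition wherever yrb is observed.
assert A == (yrb > 0) if !missing(yrb)

* Show the share of respondents with A = 1.
tabulate A
```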

a) The regressors of interest are v1 to v8; however, not necessarily all of them belong in your model. 
- Use these data to build a binary choice model for A. 
- Report all computed results for the model. 
- Compute the marginal effects for the variables you choose. 
- Compare the results you obtain for a probit model to those for a logit model. 
- Are there any substantial differences in the results for the two models?
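
For reference, the two models and the marginal effects requested above can be written as follows. The notation is assumed here rather than taken from the exercise: $x$ stacks the chosen regressors, $\beta$ is the coefficient vector, and $\Phi$, $\phi$, and $\Lambda$ denote the standard normal CDF, the standard normal density, and the logistic CDF.

$$
\Pr(A = 1 \mid x) = \Phi(x'\beta) \ \text{(probit)}, \qquad \Pr(A = 1 \mid x) = \Lambda(x'\beta) = \frac{e^{x'\beta}}{1 + e^{x'\beta}} \ \text{(logit)}
$$

$$
\frac{\partial \Pr(A = 1 \mid x)}{\partial x_k} = \phi(x'\beta)\,\beta_k \ \text{(probit)}, \qquad \frac{\partial \Pr(A = 1 \mid x)}{\partial x_k} = \Lambda(x'\beta)\,\bigl[1 - \Lambda(x'\beta)\bigr]\,\beta_k \ \text{(logit)}
$$

The factor multiplying $\beta_k$ is positive in both cases, so each marginal effect of a continuous regressor has the sign of its coefficient. With the `atmeans` option, `margins` evaluates that factor at $x = \bar{x}$.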

https://warwick.ac.uk/fac/soc/economics/staff/academic/corradi/teaching-ec976/msfe-week9.pdf

In [15]:
probit A rating age years children religiosity education i.occupation i.husbandocc
// probit A v1 v2 v3 v4 v5 v6 i.v7 i.v8


Iteration 0:   log likelihood =   -4002.53  
Iteration 1:   log likelihood = -3455.3989  
Iteration 2:   log likelihood = -3454.0117  
Iteration 3:   log likelihood = -3454.0116  

Probit regression                               Number of obs     =      6,366
                                                LR chi2(16)       =    1097.04
                                                Prob > chi2       =     0.0000
Log likelihood = -3454.0116                     Pseudo R2         =     0.1370

------------------------------------------------------------------------------
           A |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      rating |    -.42506   .0183553   -23.16   0.000    -.4610357   -.3890843
         age |  -.0359131   .0060622    -5.92   0.000    -.0477948   -.0240314
       years |   .0642991    .006468     9.94   0.000     .0516221    .0769761
    children |   .0086311  

In [3]:
probit A v1 v2 v3 v4 i.v5 v6 v7 v8


Iteration 0:   log likelihood =   -4002.53  
Iteration 1:   log likelihood = -3467.9026  
Iteration 2:   log likelihood = -3466.2814  
Iteration 3:   log likelihood =  -3466.281  
Iteration 4:   log likelihood =  -3466.281  

Probit regression                               Number of obs     =      6,366
                                                LR chi2(10)       =    1072.50
                                                Prob > chi2       =     0.0000
Log likelihood =  -3466.281                     Pseudo R2         =     0.1340

------------------------------------------------------------------------------
           A |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          v1 |  -.4274984   .0183522   -23.29   0.000    -.4634681   -.3915288
          v2 |  -.0356655   .0060395    -5.91   0.000    -.0475028   -.0238283
          v3 |    .066022   .0064506    10.24   0.000     .05

In [4]:
margins, dydx(*) atmeans


Conditional marginal effects                    Number of obs     =      6,366
Model VCE    : OIM

Expression   : Pr(A), predict()
dy/dx w.r.t. : v1 v2 v3 v4 2.v5 3.v5 4.v5 v6 v7 v8
at           : v1              =    4.109645 (mean)
               v2              =    29.08286 (mean)
               v3              =    9.009425 (mean)
               v4              =    1.396874 (mean)
               1.v5            =    .1603833 (mean)
               2.v5            =    .3561106 (mean)
               3.v5            =    .3804587 (mean)
               4.v5            =    .1030474 (mean)
               v6              =    14.20986 (mean)
               v7              =    3.424128 (mean)
               v8              =    3.850141 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+---------------------------------

In [5]:
vif, uncentered


    Variable |       VIF       1/VIF  
-------------+----------------------
          v1 |     16.64    0.060082
          v2 |     75.22    0.013294
          v3 |     15.52    0.064448
          v4 |      4.89    0.204691
          v5 |
          2  |      3.06    0.326618
          3  |      3.31    0.302159
          4  |      1.69    0.591760
          v6 |     47.26    0.021158
          v7 |     17.08    0.058549
          v8 |      9.86    0.101462
-------------+----------------------
    Mean VIF |     19.45


In [6]:
logit A v1 v2 v3 v4 v5 v6 v7 v8


Iteration 0:   log likelihood =   -4002.53  
Iteration 1:   log likelihood =  -3480.896  
Iteration 2:   log likelihood = -3471.4785  
Iteration 3:   log likelihood = -3471.4714  
Iteration 4:   log likelihood = -3471.4714  

Logistic regression                             Number of obs     =      6,366
                                                LR chi2(8)        =    1062.12
                                                Prob > chi2       =     0.0000
Log likelihood = -3471.4714                     Pseudo R2         =     0.1327

------------------------------------------------------------------------------
           A |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          v1 |  -.7161071   .0314306   -22.78   0.000      -.77771   -.6545042
          v2 |  -.0604877    .010278    -5.89   0.000    -.0806322   -.0403432
          v3 |   .1100179   .0109429    10.05   0.000     .08

In [7]:
margins, dydx(*) atmeans


Conditional marginal effects                    Number of obs     =      6,366
Model VCE    : OIM

Expression   : Pr(A), predict()
dy/dx w.r.t. : v1 v2 v3 v4 v5 v6 v7 v8
at           : v1              =    4.109645 (mean)
               v2              =    29.08286 (mean)
               v3              =    9.009425 (mean)
               v4              =    1.396874 (mean)
               v5              =     2.42617 (mean)
               v6              =    14.20986 (mean)
               v7              =    3.424128 (mean)
               v8              =    3.850141 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          v1 |  -.1494827   .0065536   -22.81   0.000    -.1623274   -.1366379
          v2 |  -.0126264   .0021409    -5.90   0.000    

In [8]:
vif, uncentered


    Variable |       VIF       1/VIF  
-------------+----------------------
          v2 |     74.51    0.013421
          v6 |     47.62    0.020999
          v7 |     17.08    0.058550
          v1 |     16.87    0.059289
          v3 |     15.37    0.065044
          v8 |      9.85    0.101538
          v5 |      8.68    0.115246
          v4 |      4.89    0.204491
-------------+----------------------
    Mean VIF |     24.36
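
The probit cell above enters `v5` as a factor variable (`i.v5`) while the logit cell enters it linearly, so the two fits are not like-for-like. One possible way to compare them is sketched below, assuming `A` and `v1` to `v8` are in memory under those names; the stored-estimate names (`m_probit`, `ame_logit`, and so on) are hypothetical. The ratios of the logit to probit coefficients reported above for `v1`, `v2`, and `v3` are all roughly 1.7, which reflects the different scaling of the two distributions, so the marginal effects are the more informative comparison.

```stata
* Hypothetical sketch: fit both models on the same specification.
quietly probit A v1 v2 v3 v4 v5 v6 v7 v8
estimates store m_probit
quietly margins, dydx(*) atmeans post   // marginal effects at the means
estimates store ame_probit

quietly logit A v1 v2 v3 v4 v5 v6 v7 v8
estimates store m_logit
quietly margins, dydx(*) atmeans post
estimates store ame_logit

* Coefficients differ in scale between the two models.
estimates table m_probit m_logit, b(%9.4f) se(%9.4f) stats(N ll aic bic)

* Marginal effects are on the probability scale and directly comparable.
estimates table ame_probit ame_logit, b(%9.4f) se(%9.4f)
```

Dropping `atmeans` from both `margins` calls would report average marginal effects instead of effects evaluated at the sample means.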


b) Continuing the analysis from part a), we now consider the self-reported marriage rating, $v_1$. This is a natural candidate for an ordered choice model, because the simple five-item coding is a censored version of what would be a continuous scale on some subjective satisfaction variable.
- Analyze this variable using an ordered probit model. 
- What variables appear to explain the response to this survey question? 
- Can you obtain the marginal effects for your model? Report them as well. 
- What do they suggest about the impact of the different independent variables on the reported ratings?
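
The ordered probit model behind this question can be sketched as follows. The notation is assumed here: $v_1^*$ is the latent satisfaction, $x$ the regressors, and $\mu_1 < \mu_2 < \mu_3 < \mu_4$ the cutpoints, with $\mu_0 = -\infty$ and $\mu_5 = +\infty$.

$$
v_1^* = x'\beta + \varepsilon, \quad \varepsilon \sim N(0, 1), \qquad v_1 = j \iff \mu_{j-1} < v_1^* \le \mu_j, \quad j = 1, \dots, 5
$$

$$
\Pr(v_1 = j \mid x) = \Phi(\mu_j - x'\beta) - \Phi(\mu_{j-1} - x'\beta)
$$

$$
\frac{\partial \Pr(v_1 = j \mid x)}{\partial x_k} = \bigl[\phi(\mu_{j-1} - x'\beta) - \phi(\mu_j - x'\beta)\bigr]\,\beta_k
$$

The five marginal effects for a given regressor sum to zero, because the five probabilities sum to one. The sign of $\beta_k$ pins down the direction of the effect only for the extreme categories: if $\beta_k > 0$, then $\Pr(v_1 = 1)$ falls and $\Pr(v_1 = 5)$ rises, while the direction for the middle categories depends on where $x'\beta$ sits relative to the cutpoints.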

In [9]:
oprobit v1 v2 v3 v4 v5 v6 v7 v8


Iteration 0:   log likelihood = -7926.4872  
Iteration 1:   log likelihood = -7820.1782  
Iteration 2:   log likelihood = -7820.1602  
Iteration 3:   log likelihood = -7820.1602  

Ordered probit regression                       Number of obs     =      6,366
                                                LR chi2(7)        =     212.65
                                                Prob > chi2       =     0.0000
Log likelihood = -7820.1602                     Pseudo R2         =     0.0134

------------------------------------------------------------------------------
          v1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          v2 |  -.0051292   .0047036    -1.09   0.275    -.0143481    .0040896
          v3 |  -.0078737    .005051    -1.56   0.119    -.0177735    .0020262
          v4 |  -.0569631    .015174    -3.75   0.000    -.0867037   -.0272226
          v5 |   .1309812  

In [10]:
margins, dydx(*) atmeans


Conditional marginal effects                    Number of obs     =      6,366
Model VCE    : OIM

dy/dx w.r.t. : v2 v3 v4 v5 v6 v7 v8
1._predict   : Pr(v1==1), predict(pr outcome(1))
2._predict   : Pr(v1==2), predict(pr outcome(2))
3._predict   : Pr(v1==3), predict(pr outcome(3))
4._predict   : Pr(v1==4), predict(pr outcome(4))
5._predict   : Pr(v1==5), predict(pr outcome(5))
at           : v2              =    29.08286 (mean)
               v3              =    9.009425 (mean)
               v4              =    1.396874 (mean)
               v5              =     2.42617 (mean)
               v6              =    14.20986 (mean)
               v7              =    3.424128 (mean)
               v8              =    3.850141 (mean)

------------------------------------------------------------------------------
             |            Delta-method
             |      dy/dx   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+-------------------------------------------