# Methods in Natural Experiments
## Weighing Up the Evidence: Making Evidence-Informed Guidance Accurate, Achievable and Acceptable



## Case study: Universal Health Coverage (UHC) and Under-five Mortality in a Microcosm of sub-Saharan Africa

## Methods

- Meta-analysis
- Blinder-Oaxaca decomposition analysis
- Propensity score matching
- Population attributable risk fractions

## Objectives

- To explore whether rates of under-five mortality differ between those with and without access to UHC health service indicators
- To explore the UHC-related inequality in the magnitude and determinants of childhood mortality across sub-Saharan Africa
- To examine whether lack of access to UHC is associated with higher childhood mortality in sub-Saharan Africa
- To evaluate and compare childhood mortality attributable to non-access to UHC in four sub-Saharan African countries

# Part 1: Exploring Association between Childhood Mortality and Access to UHC

In [1]:
use uhc_ssa, clear

In [2]:
tab dead


       dead |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      6,550       94.90       94.90
          1 |        352        5.10      100.00
------------+-----------------------------------
      Total |      6,902      100.00


In [3]:
tab nUHC


       nUHC |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |      6,575       95.26       95.26
          1 |        327        4.74      100.00
------------+-----------------------------------
      Total |      6,902      100.00


In [4]:
tab country


    country |      Freq.     Percent        Cum.
------------+-----------------------------------
     Angola |      2,168       31.41       31.41
      Benin |      2,511       36.38       67.79
   Ethiopia |      1,941       28.12       95.91
SouthAfrica |        282        4.09      100.00
------------+-----------------------------------
      Total |      6,902      100.00


In [5]:
tab dead nUHC, row


+----------------+
| Key            |
|----------------|
|   frequency    |
| row percentage |
+----------------+

           |         nUHC
      dead |         0          1 |     Total
-----------+----------------------+----------
         0 |     6,275        275 |     6,550 
           |     95.80       4.20 |    100.00 
-----------+----------------------+----------
         1 |       300         52 |       352 
           |     85.23      14.77 |    100.00 
-----------+----------------------+----------
     Total |     6,575        327 |     6,902 
           |     95.26       4.74 |    100.00 


In [6]:
bysort country: tab dead nUHC, row


--------------------------------------------------------------------------------
-> country = Angola

+----------------+
| Key            |
|----------------|
|   frequency    |
| row percentage |
+----------------+

           |         nUHC
      dead |         0          1 |     Total
-----------+----------------------+----------
         0 |     1,946        120 |     2,066 
           |     94.19       5.81 |    100.00 
-----------+----------------------+----------
         1 |        90         12 |       102 
           |     88.24      11.76 |    100.00 
-----------+----------------------+----------
     Total |     2,036        132 |     2,168 
           |     93.91       6.09 |    100.00 

--------------------------------------------------------------------------------
-> country = Benin

+----------------+
| Key            |
|----------------|
|   frequency    |
| row percentage |
+----------------+

           |         nUHC
      dead |         0          1 |     Total
-

In [7]:
mhodds dead nUHC, by(country)


Maximum likelihood estimate of the odds ratio
Comparing nUHC==1 vs. nUHC==0
by country

-------------------------------------------------------------------------------
  country | Odds Ratio        chi2(1)         P>chi2       [95% Conf. Interval]
----------+--------------------------------------------------------------------
   Angola |   2.162222           6.03         0.0141         1.15050    4.06362
    Benin |   5.694746          66.37         0.0000         3.54748    9.14174
 Ethiopia |   5.194779          30.84         0.0000         2.71299    9.94686
 SouthAfr |   0.000000           0.14         0.7114               .          .
-------------------------------------------------------------------------------

    Mantel-Haenszel estimate controlling for country
    ----------------------------------------------------------------
     Odds Ratio    chi2(1)        P>chi2        [95% Conf. Interval]
    ----------------------------------------------------------------
       4.0

In [None]:
/* STEPS in conducting a binary-data meta-analysis, e.g. mortality (yes or no), hypertension (yes or no)

STEP 1 - LOAD DATA
STEP 2 - DECLARE, UPDATE & DESCRIBE meta data
STEP 3 - SUMMARIZE meta data by using a TABLE or a FOREST PLOT
STEP 4 - EXPLORE HETEROGENEITY - SUB-GROUP and META-REGRESSION analysis
STEP 5 - EXPLORE and ADDRESS SMALL-STUDY EFFECTS */

In [None]:
Binary-outcome summaries        # of successes (treated)
                                # of failures (treated)
                                # of successes (controls)
                                # of failures (controls)

In [None]:
Binary type	Description
lnoratio	log odds-ratio; the default
lnrratio	log risk-ratio (also known as log rate ratio and log relative risk)
rdiff	risk difference
lnorpeto	Peto’s log odds-ratio
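
As a rough illustration of the effect-size types listed above, the following Python sketch computes the log odds-ratio, log risk-ratio and risk difference from a single 2x2 table. The function name `effect_sizes` and the cell labels `a`, `b`, `c`, `d` are assumptions made for this example, and the counts are the Angola frequencies tabulated earlier in this notebook. It is a hand calculation, not a reproduction of what `meta esize` does internally (Peto's log odds-ratio is omitted).

```python
import math

def effect_sizes(a, b, c, d):
    """Effect sizes for one 2x2 table.

    a, b: deaths and survivors in the exposed arm (nUHC == 1)
    c, d: deaths and survivors in the comparison arm (nUHC == 0)
    """
    risk1 = a / (a + b)          # risk of death in the exposed arm
    risk0 = c / (c + d)          # risk of death in the comparison arm
    return {
        "lnoratio": math.log((a * d) / (b * c)),  # log odds-ratio
        "lnrratio": math.log(risk1 / risk0),      # log risk-ratio
        "rdiff": risk1 - risk0,                   # risk difference
    }

# Angola: 12 deaths / 120 survivors with nUHC == 1, 90 deaths / 1,946 survivors with nUHC == 0
angola = effect_sizes(12, 120, 90, 1946)
for name, value in angola.items():
    print(f"{name}: {value:.4f}")
```

The log-scale measures are undefined when a cell is zero (as in the South Africa stratum), which is why a zero-cell adjustment is needed before pooling.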

In [8]:
*Experimental arm
gen dead1   = (nUHC == 1) & (dead == 1)
gen nodead1 = (nUHC == 1) & (dead == 0)

In [9]:
*Control arm
gen dead0   = (nUHC == 0) & (dead == 1)
gen nodead0 = (nUHC == 0) & (dead == 0)

In [10]:
preserve

In [11]:
collapse (sum) dead1 nodead1 dead0 nodead0, by(country)

In [12]:
list


     +-------------------------------------------------+
     |     country   dead1   nodead1   dead0   nodead0 |
     |-------------------------------------------------|
  1. |      Angola      12       120      90      1946 |
  2. |       Benin      27        99     109      2276 |
  3. |    Ethiopia      13        54      83      1791 |
  4. | SouthAfrica       0         2      18       262 |
     +-------------------------------------------------+
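
The collapsed counts listed above are enough to cross-check the stratum-specific odds ratios by hand. The Python sketch below recomputes them and pools them with the standard Mantel-Haenszel formula; the names `counts`, `odds_ratio` and `mantel_haenszel_or` are assumptions introduced for illustration, not part of any Stata command.

```python
# (dead1, nodead1, dead0, nodead0) per country, copied from the collapsed listing
counts = {
    "Angola": (12, 120, 90, 1946),
    "Benin": (27, 99, 109, 2276),
    "Ethiopia": (13, 54, 83, 1791),
    "SouthAfrica": (0, 2, 18, 262),
}

def odds_ratio(a, b, c, d):
    """Odds ratio of death for nUHC == 1 versus nUHC == 0 in one stratum."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across strata."""
    tables = list(tables)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

stratum_or = {country: odds_ratio(*cells) for country, cells in counts.items()}
pooled_or = mantel_haenszel_or(counts.values())

for country, value in stratum_or.items():
    print(f"{country}: {value:.6f}")
print(f"Mantel-Haenszel pooled OR: {pooled_or:.4f}")
```

The South Africa stratum has no deaths in the exposed arm, so its odds ratio is zero and it contributes only to the denominator of the pooled estimate.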


In [17]:
meta esize dead1 nodead1 dead0 nodead0, studylabel(country) esize(lnrratio)


Meta-analysis setting information

 Study information
    No. of studies:  4
       Study label:  country
        Study size:  _meta_studysize
      Summary data:  dead1 nodead1 dead0 nodead0

       Effect size
              Type:  lnrratio
             Label:  Log Risk-Ratio
          Variable:  _meta_es
   Zero-cells adj.:  0.5, only0

         Precision
         Std. Err.:  _meta_se
                CI:  [_meta_cil, _meta_ciu]
          CI level:  95%

  Model and method
             Model:  Random-effects
            Method:  REML


In [14]:
meta summarize, fixed


  Effect-size label:  Risk Diff.
        Effect size:  _meta_es
          Std. Err.:  _meta_se
        Study label:  country

Meta-analysis summary                     Number of studies =      4
Fixed-effects model                       Heterogeneity:
Method: Mantel-Haenszel                             I2 (%) =   98.11
                                                        H2 =   52.85

--------------------------------------------------------------------
            Study |     Risk Diff.    [95% Conf. Interval]  % Weight
------------------+-------------------------------------------------
                1 |          0.047      -0.003       0.097     39.95
                2 |          0.169       0.096       0.241     38.57
                3 |          0.150       0.055       0.245     20.85
                4 |         -0.064      -0.093      -0.036      0.64
------------------+-------------------------------------------------
            theta |          0.114       0.075       0.1

In [15]:
meta forestplot, random
graph display



  Effect-size label:  Risk Diff.
        Effect size:  _meta_es
          Std. Err.:  _meta_se
        Study label:  country



In [18]:
#delimit ;
    meta forestplot, random nullrefline 
    columnopts(_data1, supertitle(No UHC))  
    columnopts(_data2, supertitle(UHC)) 
    columnopts(_a _c, title(Dead)) 
    columnopts(_b _d, title(Alive));
#delimit cr
graph display



  Effect-size label:  Log Risk-Ratio
        Effect size:  _meta_es
          Std. Err.:  _meta_se
        Study label:  country



# Mind the Gap: What explains the education-related inequality in non-access to Universal Health Coverage health service indicators in sub-Saharan Africa? Compositional and structural characteristics

- The Blinder-Oaxaca method decomposes the difference in an outcome variable between two groups into two components. 

- The first component is the “explained” portion of the gap, which captures differences in the distributions of the measurable characteristics (referred to as the “compositional” or “endowments” component) of these groups.  

- Using this method, we can quantify how much of the gap between the “advantaged” and the “disadvantaged” groups is attributable to differences in specific measurable characteristics. 

- The second component is the “unexplained” part, or structural component, which captures the gap due to differences in the regression coefficients and in unmeasured variables between the two groups. 

- It reflects the remainder of the gap that is not explained by differences in measurable, objective characteristics. 

- The “unexplained” portion arises from differences in how the predictor variables are associated with the outcome in the two groups. 

- This portion would persist even if the disadvantaged group were to attain the same average levels of the measured predictor variables as the advantaged group.
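
In its linear, twofold form with pooled reference coefficients, the decomposition described above can be sketched as follows. The notation is assumed here for illustration: $\bar{Y}_g$ and $\bar{X}_g$ are the mean outcome and the mean covariate vector in group $g$, $\beta_g$ are the group-specific coefficients and $\beta^{*}$ the coefficients from the pooled model. With a `logit` model the same logic is applied to predicted probabilities rather than to linear predictions, so the identity below should be read as the linear analogue only.

$$
\bar{Y}_1 - \bar{Y}_2
= \underbrace{(\bar{X}_1 - \bar{X}_2)'\beta^{*}}_{\text{explained (compositional)}}
+ \underbrace{\bar{X}_1'(\beta_1 - \beta^{*}) + \bar{X}_2'(\beta^{*} - \beta_2)}_{\text{unexplained (structural)}}
$$

Expanding the right-hand side gives $\bar{X}_1'\beta_1 - \bar{X}_2'\beta_2$, which equals the raw gap when each group model reproduces its own group mean.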


In [19]:
use uhc_ssa, clear

In [20]:
global ind "kid_male mat_age noed mat_currwork mat_dec3 kid_bord kid_u5c mat_hinsur mat_fhead media_access mat_wealth"
global com "rural  com_poverty_hl com_uemp_hl com_illit_hl com_diversity_hl"
global country "country"

In [30]:
oaxaca dead  $ind $com , by(nUHC)   logit   pooled relax

(mat_currwork mat_hinsur dropped from model 1)
(mat_currwork mat_hinsur dropped from model 2)
(mat_currwork mat_hinsur dropped from pooled model)
(mat_currwork mat_hinsur missing in model 1; assumed zero)
(mat_currwork mat_hinsur missing in model 2; assumed zero)
(mat_currwork mat_hinsur missing in pooled model; assumed zero)

Blinder-Oaxaca decomposition                    Number of obs     =      2,957
                                                  Model           =      logit
Group 1: nUHC = 0                                 N of obs 1      =       2824
Group 2: nUHC = 1                                 N of obs 2      =        133

-------------------------------------------------------------------------------
              |               Robust
         dead |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
overall       |
      group_1 |   .0453258   .0038436    11.79   0.000     .0

In [24]:
coefplot (., keep(explained:*)),  bylabel("Explained") || ///
(., keep(unexplained:*)),  bylabel("Unexplained") || , ///
drop(*:_cons) recast(bar) barwidth(0.5) citop ciopts(recast(rcap) color(black)) ///
byopts(cols(1)) xline(0, lpattern(dash)) xlabel(, grid glstyle(minor_grid) glpattern(dash))



# Does access to UHC reduce childhood mortality? A natural experiment from a microcosm of sub-Saharan Africa

- We examined the baseline characteristics of the respondents and estimated standardised differences for all variables before and after matching. 

- A standardised difference of 10% or more suggests imbalance. 

- We used propensity score methods to account for measured differences in baseline characteristics between respondents with UHC and those without UHC. 

- The propensity score approach was used to control for observed confounding factors that might influence both assignment and outcome. 

- We constructed a sample of respondents balanced on covariates and risk factors (listed above), estimating the propensity scores with a logistic regression. 

- We then matched each treated respondent to the comparison respondents with the closest propensity scores at a ratio of 1:5, using a nearest-neighbour algorithm without replacement. 

- We calculated the average treatment effect on the treated, which measures the impact of UHC on the childhood mortality rate. 

- We calculated the absolute difference in the probability of childhood mortality between those with and without UHC in the propensity score–matched cohort. 
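
The standardised difference used for the balance check above can be computed by hand for a binary covariate. The Python sketch below uses the common formula that divides the difference in proportions by the square root of the average of the two group variances; the function name `standardised_difference` is an assumption for this example, and the two proportions are the unmatched treated and control means of the first country indicator as reported in the balance table of this notebook.

```python
import math

def standardised_difference(p_treated, p_control):
    """Standardised difference (in %) for a binary covariate."""
    # Average of the two Bernoulli variances, then its square root
    pooled_sd = math.sqrt(
        (p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2
    )
    return 100 * (p_treated - p_control) / pooled_sd

# Unmatched means of the first country indicator: treated 0.40367, control 0.30966
bias_unmatched = standardised_difference(0.40367, 0.30966)
is_imbalanced = abs(bias_unmatched) >= 10  # 10% rule of thumb from the text

print(f"standardised difference: {bias_unmatched:.1f}%")
print(f"imbalanced: {is_imbalanced}")
```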



<img src="./psm1.jpg" width="500" height="300" />

![alt text](./psm2.jpg)

In [31]:
global factors "b4.country b1.kid_male b3.mat_cage b5.mat_wealth b2.mat_edu b1.rural b1.mat_currwork b3.media_access"

In [32]:
*PSM
#delimit ;
	psmatch2 CCI_non  $factors ,
	out(dead) logit ate  llr neighbor(5)
	;
#delimit cr


Logistic regression                             Number of obs     =      6,902
                                                LR chi2(17)       =     161.51
                                                Prob > chi2       =     0.0000
Log likelihood = -1235.5928                     Pseudo R2         =     0.0613

-------------------------------------------------------------------------------
      CCI_non |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
      country |
      Angola  |   1.730185   .7394853     2.34   0.019     .2808205     3.17955
       Benin  |   1.473184   .7455959     1.98   0.048     .0118428    2.934525
    Ethiopia  |   .8965639   .7482877     1.20   0.231    -.5700531    2.363181
              |
   0.kid_male |   .0241363   .1149054     0.21   0.834    -.2010742    .2493468
              |
     mat_cage |
       13-24  |  -.1173226   .1641108    -0.71   0.475    

In [33]:
pstest, both  graph


----------------------------------------------------------------------------------------
                Unmatched |       Mean               %reduct |     t-test    |  V(T)/
Variable          Matched | Treated Control    %bias  |bias| |    t    p>|t| |  V(C)
--------------------------+----------------------------------+---------------+----------
1.country              U  | .40367   .30966     19.7         |   3.58  0.000 |     .
                       M  | .40367   .41957     -3.3    83.1 |  -0.41  0.680 |     .
                          |                                  |               |
2.country              U  | .38532   .36274      4.7         |   0.83  0.407 |     .
                       M  | .38532   .37554      2.0    56.7 |   0.26  0.797 |     .
                          |                                  |               |
3.country              U  | .20489   .28502    -18.7         |  -3.15  0.002 |     .
                       M  | .20489   .20

In [34]:
graph display


In [35]:
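* Build a pair identifier for each of the five neighbours: controls keep their own _id,
* treated observations take the _id of their k-th matched control (_n1 to _n5)
gen pair1 = _id if _treated==0
replace pair1 = _n1 if _treated==1
gen pair2 = _id if _treated==0
replace pair2 = _n2 if _treated==1
gen pair3 = _id if _treated==0
replace pair3 = _n3 if _treated==1
gen pair4 = _id if _treated==0
replace pair4 = _n4 if _treated==1
gen pair5 = _id if _treated==0
replace pair5 = _n5 if _treated==1

* Count how many observations share each pair identifier
bysort pair1: egen paircount1 = count(pair1)
bysort pair2: egen paircount2 = count(pair2)
bysort pair3: egen paircount3 = count(pair3)
bysort pair4: egen paircount4 = count(pair4)
bysort pair5: egen paircount5 = count(pair5)
* Count the neighbour slots in which an observation sits in a pair of exactly two,
* then drop observations that belong to no such pair
egen byte paircount = anycount(paircount1 paircount2 paircount3 paircount4  paircount5), values(2)
drop if paircount==0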


(327 missing values generated)

(327 real changes made)

(327 missing values generated)

(327 real changes made)

(327 missing values generated)

(327 real changes made)

(327 missing values generated)

(327 real changes made)

(327 missing values generated)

(327 real changes made)







(5,940 observations deleted)


In [36]:
tab _treated


  psmatch2: |
  Treatment |
 assignment |      Freq.     Percent        Cum.
------------+-----------------------------------
  Untreated |        796       82.74       82.74
    Treated |        166       17.26      100.00
------------+-----------------------------------
      Total |        962      100.00


In [37]:
cs  dead _treated


                 | psmatch2: Treatment    |
                 | assignment             |
                 |   Exposed   Unexposed  |      Total
-----------------+------------------------+-----------
           Cases |        26          30  |         56
        Noncases |       140         766  |        906
-----------------+------------------------+-----------
           Total |       166         796  |        962
                 |                        |
            Risk |  .1566265    .0376884  |   .0582121
                 |                        |
                 |      Point estimate    |    [95% Conf. Interval]
                 |------------------------+------------------------
 Risk difference |         .1189381       |    .0620885    .1757876 
      Risk ratio |         4.155823       |    2.526122    6.836908 
 Attr. frac. ex. |         .7593738       |    .6041364    .8537351 
 Attr. frac. pop |         .3525664       |
                 +---------------------------------
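
The summary measures in the table above follow directly from the four cell counts. The Python sketch below recomputes them by hand as a cross-check; the variable names are assumptions chosen for this example.

```python
# Cell counts from the matched cohort (exposed = treated, unexposed = untreated)
exposed_cases, exposed_noncases = 26, 140
unexposed_cases, unexposed_noncases = 30, 766

n_exposed = exposed_cases + exposed_noncases
n_unexposed = unexposed_cases + unexposed_noncases

risk_exposed = exposed_cases / n_exposed
risk_unexposed = unexposed_cases / n_unexposed
risk_total = (exposed_cases + unexposed_cases) / (n_exposed + n_unexposed)

risk_difference = risk_exposed - risk_unexposed
risk_ratio = risk_exposed / risk_unexposed
# Attributable fraction among the exposed: share of exposed cases due to exposure
af_exposed = (risk_exposed - risk_unexposed) / risk_exposed
# Attributable fraction in the population: share of all cases due to exposure
af_population = (risk_total - risk_unexposed) / risk_total

print(f"risk difference: {risk_difference:.7f}")
print(f"risk ratio:      {risk_ratio:.6f}")
print(f"attr. frac. ex.: {af_exposed:.7f}")
print(f"attr. frac. pop: {af_population:.7f}")
```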

# Proportion of childhood mortality cases attributable to non-access to Universal Health Coverage: a microcosm analysis of sub-Saharan Africa 

- The population attributable fraction (PAF) was calculated as the number of excess childhood deaths among those with no access to UHC divided by the total number of childhood deaths in the population.  

- The PAF describes the proportion of childhood mortality that could hypothetically be prevented if all mothers had access to UHC health service indicators.  

- We also calculated the potential impact fraction. The avoidable burden refers to the potential reduction in the future burden of a disease or health outcome that could be attained by changing the current distribution of risk factors to an alternative distribution. 

- We used the theoretical minimum risk, which refers to the exposure distribution that would result in the lowest population-level risk, regardless of whether it is currently achievable.


<img src="./paf1.png" />

<img src="./paf2.png"/>

<img src="./paf3.png"  />

In [38]:
global factors "i.country mat_hinsur i.kid_male i.mat_cage i.mat_wealth  i.rural i.mat_currwork i.media_access"

In [39]:
logit dead  nUHC $factors, or


note: 4.country != 0 predicts failure perfectly
      4.country dropped and 7 obs not used

Iteration 0:   log likelihood = -213.16119  
Iteration 1:   log likelihood = -199.45972  
Iteration 2:   log likelihood = -190.08857  
Iteration 3:   log likelihood = -190.03601  
Iteration 4:   log likelihood = -190.03596  
Iteration 5:   log likelihood = -190.03596  

Logistic regression                             Number of obs     =        955
                                                LR chi2(16)       =      46.25
                                                Prob > chi2       =     0.0001
Log likelihood = -190.03596                     Pseudo R2         =     0.1085

-------------------------------------------------------------------------------
         dead | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
--------------+----------------------------------------------------------------
         nUHC |   5.235968   1.537676     5.64   0.000     2.944556    9.310523

## Population attributable risk (PAR)

In [40]:
regpar, at(nUHC=0)


Scenario 0: (asobserved) _all
Scenario 1: nUHC=0
Symmetric confidence intervals for the logit proportions
under Scenario 0 and Scenario 1
and for the z-transformed population attributable risk (PAR)
Total number of observations used: 955
------------------------------------------------------------------------------
             |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  Scenario_0 |  -2.775931   .1330844   -20.86   0.000    -3.036772   -2.515091
  Scenario_1 |  -3.241028   .1848111   -17.54   0.000    -3.603252   -2.878805
         PAR |   .0209912   .0049207     4.27   0.000     .0113469    .0306355
------------------------------------------------------------------------------

Asymmetric 95% CIs for the untransformed proportions
under Scenario 0 and Scenario 1
and for the untransformed population attributable risk (PAR)
                Estimate     Minimum     Maximum 
  Scenario_0

In [None]:
Asymmetric 95% CIs for the untransformed proportions
under Scenario 0 and Scenario 1
and for the untransformed population attributable risk (PAR)
                Estimate     Minimum     Maximum 
  Scenario_0   .05863874   .04579201   .07480702 
  Scenario_1   .03765061   .02651294    .0532113 
         PAR   .02098813   .01134642   .03062594 

- In the current (factual) scenario, childhood mortality was 59 per 1,000 children, 
- but in the theoretical minimum scenario, childhood mortality was 38 per 1,000 children. 
- The population attributable risk was 2.1 percentage points, 
suggesting that 21 excess childhood deaths per 1,000 children could have been avoided if all mothers had access to UHC.


In [None]:
We see that in the real world (Scenario 0), 5.9% of children are expected to die before their 5th birthday, but 
that in the dream scenario where all mothers have access to UHC (Scenario 1), 
only 3.8% of children are expected to die before their 5th birthday. 
The difference between these scenario percentages (PAR) is 2.1%, 
with confidence limits from 1.1% to 3.1%. 

The PAR can be interpreted as the proportion of all children who die before their 5th birthday because 
they live under Scenario 0 instead of Scenario 1.
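
The link between the two `regpar` tables can be checked by hand. The first table reports the scenario proportions on the logit scale and the PAR on the z-transformed (hyperbolic arctangent) scale, so the inverse logit and the hyperbolic tangent recover the untransformed values. The Python sketch below does this with the coefficients printed above; the helper `inv_logit` and the variable names are assumptions for this example.

```python
import math

def inv_logit(x):
    """Map a logit-scale value back to a proportion."""
    return 1 / (1 + math.exp(-x))

# Logit-scale estimates copied from the first regpar table
scenario_0 = inv_logit(-2.775931)   # as observed
scenario_1 = inv_logit(-3.241028)   # everyone set to nUHC = 0
par = scenario_0 - scenario_1       # population attributable risk

# Confidence limits for the PAR, back-transformed from the z scale
par_ci = (math.tanh(0.0113469), math.tanh(0.0306355))

deaths_avoided_per_1000 = round(par * 1000)

print(f"Scenario 0: {scenario_0:.5f}")
print(f"Scenario 1: {scenario_1:.5f}")
print(f"PAR: {par:.5f} (95% CI {par_ci[0]:.5f} to {par_ci[1]:.5f})")
print(f"Deaths avoided per 1,000 children: {deaths_avoided_per_1000}")
```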

In [None]:
regpar, at(nUHC=0) subpop(if rural ==1)
regpar, at(nUHC=0) subpop(if rural ==0)

In [None]:
regpar, at(nUHC=0) subpop(if country == 1)
regpar, at(nUHC=0) subpop(if country == 2)
regpar, at(nUHC=0) subpop(if country == 3)
regpar, at(nUHC=0) subpop(if country == 4)

## Potential Impact Fraction

In [41]:
* Unadjusted logit of childhood mortality on non-access to UHC
logit dead nUHC, or


Iteration 0:   log likelihood =  -213.5826  
Iteration 1:   log likelihood = -206.09556  
Iteration 2:   log likelihood = -199.85003  
Iteration 3:   log likelihood = -199.82902  
Iteration 4:   log likelihood = -199.82901  

Logistic regression                             Number of obs     =        962
                                                LR chi2(1)        =      27.51
                                                Prob > chi2       =     0.0000
Log likelihood = -199.82901                     Pseudo R2         =     0.0644

------------------------------------------------------------------------------
        dead | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        nUHC |   4.741905   1.343251     5.49   0.000     2.721642    8.261801
       _cons |   .0391645   .0072891   -17.41   0.000     .0271939    .0564045
-------------------------------------------------------------

In [42]:
punaf, at(nUHC=0) eform

*35.3% of the ‘disease burden’ of childhood mortality might be eliminated by providing access to UHC health service indicators to all children, with confidence limits from 18.4% to 48.6%. 

Scenario 0: (asobserved) _all
Scenario 1: nUHC=0
Confidence intervals for the means under Scenario 0 and Scenario 1
and for the population unattributable faction (PUF)
Total number of observations used: 962
------------------------------------------------------------------------------
             | Mean/Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  Scenario_0 |   .0582121   .0074087   -22.34   0.000     .0453606    .0747045
  Scenario_1 |   .0376884     .00675   -18.30   0.000     .0265312    .0535376
         PUF |   .6474336     .07635    -3.69   0.000     .5138252    .8157837
------------------------------------------------------------------------------

95% CI for the population attributable fraction (PAF)
                Estimate     Minimum     Maximum 
         PAF   .35256644    .1842163    .4861748 


In [None]:
- 35.3% of the ‘disease burden’ of childhood mortality might be eliminated by providing access to 
UHC health service indicators to all children, with confidence limits from 18.4% to 48.6%. 
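
The relationship between the quantities reported by `punaf` can be verified by hand: the population unattributable fraction (PUF) is the ratio of the Scenario 1 mean to the Scenario 0 mean, and the PAF is one minus the PUF, with the confidence limits swapping ends. The Python sketch below uses the printed estimates; the variable names are assumptions for this example.

```python
# Scenario means copied from the punaf output
scenario_0 = 0.0582121   # as observed
scenario_1 = 0.0376884   # everyone set to nUHC = 0

puf = scenario_1 / scenario_0   # population unattributable fraction
paf = 1 - puf                   # population attributable fraction

# The upper PUF limit gives the lower PAF limit, and vice versa
puf_ci = (0.5138252, 0.8157837)
paf_ci = (1 - puf_ci[1], 1 - puf_ci[0])

print(f"PUF: {puf:.5f}")
print(f"PAF: {paf:.5f} (95% CI {paf_ci[0]:.5f} to {paf_ci[1]:.5f})")
```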

In [None]:
punaf, at(nUHC=0) eform subpop(if rural ==1)
punaf, at(nUHC=0) eform subpop(if rural ==0)

In [None]:
punaf, at(nUHC=0) eform subpop(if country == 1)
punaf, at(nUHC=0) eform subpop(if country == 2)
punaf, at(nUHC=0) eform subpop(if country == 3)
punaf, at(nUHC=0) eform subpop(if country == 4)