<h1>A two-day workshop at the DJI, Munich 2019:<br> Meta-Analysis in Social Research<span class="tocSkip"></span></h1>

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Preliminaries" data-toc-modified-id="Preliminaries-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Preliminaries</a></span></li><li><span><a href="#Exercise" data-toc-modified-id="Exercise-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Exercise</a></span><ul class="toc-item"><li><span><a href="#The-&quot;BCG-vaccine-dataset&quot;" data-toc-modified-id="The-&quot;BCG-vaccine-dataset&quot;-2.1"><span class="toc-item-num">2.1&nbsp;&nbsp;</span>The "BCG vaccine dataset"</a></span></li><li><span><a href="#Exercise" data-toc-modified-id="Exercise-2.2"><span class="toc-item-num">2.2&nbsp;&nbsp;</span>Exercise</a></span></li><li><span><a href="#Exercise" data-toc-modified-id="Exercise-2.3"><span class="toc-item-num">2.3&nbsp;&nbsp;</span>Exercise</a></span></li><li><span><a href="#Exercise" data-toc-modified-id="Exercise-2.4"><span class="toc-item-num">2.4&nbsp;&nbsp;</span>Exercise</a></span></li><li><span><a href="#Exercise" data-toc-modified-id="Exercise-2.5"><span class="toc-item-num">2.5&nbsp;&nbsp;</span>Exercise</a></span></li><li><span><a href="#Exercise" data-toc-modified-id="Exercise-2.6"><span class="toc-item-num">2.6&nbsp;&nbsp;</span>Exercise</a></span></li><li><span><a href="#Exercise" data-toc-modified-id="Exercise-2.7"><span class="toc-item-num">2.7&nbsp;&nbsp;</span>Exercise</a></span></li></ul></li><li><span><a href="#Exercise" data-toc-modified-id="Exercise-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Exercise</a></span><ul class="toc-item"><li><span><a href="#Exercise" data-toc-modified-id="Exercise-3.1"><span class="toc-item-num">3.1&nbsp;&nbsp;</span>Exercise</a></span></li><li><span><a href="#Exercise" data-toc-modified-id="Exercise-3.2"><span class="toc-item-num">3.2&nbsp;&nbsp;</span>Exercise</a></span></li></ul></li></ul></div>

## Preliminaries

Please do not touch anything in this section, otherwise this notebook might not work properly. You have been warned! Also, if you have no clue what you are staring at, please consult our [Preface chapter](1-1_preface.ipynb).

In [5]:
source("run_me_first.R")

Loading required package: Matrix
Loading 'metafor' package (version 2.0-0). For an overview 
and introduction to the package please type: help(metafor).


## Exercise

### The "BCG vaccine dataset"

Throughout many exercises, we will be using the "BCG vaccine dataset", which can be found in the package `metafor`. All details about this dataset can be found at [http://www.metafor-project.org/doku.php/analyses:berkey1995](http://www.metafor-project.org/doku.php/analyses:berkey1995). In brief, this dataset contains results from 13 studies examining the effectiveness of the Bacillus Calmette-Guerin (BCG) vaccine against tuberculosis.

More details can be found by executing the help command in R (a new pane containing the dataset description should open at the bottom of this window):

    ?dat.bcg 

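An excerpt of `?dat.bcg` can also be found in the following paragraphs. The data frame contains the following columns:

| column | description                                                                  |
|--------|------------------------------------------------------------------------------|
| trial  | trial number                                                                 |
| author | author(s)                                                                    |
| year   | publication year                                                             |
| tpos   | number of TB positive cases in the vaccinated group                          |
| tneg   | number of TB negative cases in the vaccinated group                          |
| cpos   | number of TB positive cases in the control group                             |
| cneg   | number of TB negative cases in the control group                             |
| ablat  | absolute latitude of the study location (in degrees)                         |
| alloc  | method of treatment allocation (random, alternate, or systematic assignment) |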

The 13 studies provide data in terms of 2x2 tables in the form:

|                  | TB positive | TB negative |
|------------------|-------------|-------------|
| vaccinated group | tpos        | tneg        |
| control group    | cpos        | cneg        |

The goal of the meta-analysis was to examine the overall effectiveness of the BCG vaccine for preventing tuberculosis and to examine moderators that may potentially influence the size of the effect.

The dataset has been used in several publications to illustrate meta-analytic methods (see ‘References’).

The `escalc()` function calculates the effect sizes and their corresponding sampling variances. As effect size we use the log risk ratio (`measure = "RR"`) together with the default arguments for 2x2 tables. To specify the cells of a 2x2 table, `escalc()` assumes the following layout (`ai` to `di` are the `escalc()` arguments that need to be specified):

|                  | TB positive | TB negative |
|------------------|-------------|-------------|
| vaccinated group | ai          | bi          |
| control group    | ci          | di          |
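
To make the cell layout concrete, here is a minimal sketch in base R that computes the log risk ratio and its sampling variance by hand for the first trial (Aronson, 1948). It assumes the usual large-sample variance formula for the log risk ratio; the variable names `risk_vacc`, `risk_ctrl`, `yi_rr`, and `vi_rr` are made up for this illustration and are not part of `metafor`.

```r
# Cell counts of the first trial in dat.bcg (Aronson, 1948), typed in by
# hand so that this sketch runs without metafor.
ai <- 4    # vaccinated group, TB positive (tpos)
bi <- 119  # vaccinated group, TB negative (tneg)
ci <- 11   # control group, TB positive (cpos)
di <- 128  # control group, TB negative (cneg)

# Risk of TB in each group.
risk_vacc <- ai / (ai + bi)
risk_ctrl <- ci / (ci + di)

# Log risk ratio and its large-sample sampling variance (assumed formula).
yi_rr <- log(risk_vacc / risk_ctrl)
vi_rr <- 1 / ai - 1 / (ai + bi) + 1 / ci - 1 / (ci + di)

round(c(yi = yi_rr, vi = vi_rr), 4)
```

The rounded values should agree with the `yi` and `vi` entries that `escalc(measure = "RR", ...)` reports for this trial.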


### Exercise

Make sure that you first load the `metafor` package. Calculate the appropriate effect size for the BCG dataset using the `escalc()` function. You can add the new effect size to the existing data frame `dat.bcg` by using the expression `dat.bcg <- escalc(...)`. What exactly does `escalc()` return?

In [6]:
## Solution.
library(metafor)
dat.bcg <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg)
## escalc() returns the data frame with two columns appended:
## yi (the log risk ratio, log(RR)) and vi (its sampling variance).

### Exercise

Inspect the BCG data by typing in the name `dat.bcg`. Are there any new columns (hopefully, there are two of them...)?

In [7]:
## Solution.
dat.bcg

trial,author,year,tpos,tneg,cpos,cneg,ablat,alloc,yi,vi
1,Aronson,1948,4,119,11,128,44,random,-0.88931133,0.325584765
2,Ferguson & Simes,1949,6,300,29,274,55,random,-1.58538866,0.194581121
3,Rosenthal et al,1960,3,228,11,209,42,random,-1.34807315,0.415367965
4,Hart & Sutherland,1977,62,13536,248,12619,52,random,-1.44155119,0.020010032
5,Frimodt-Moller et al,1973,33,5036,47,5761,13,alternate,-0.21754732,0.051210172
6,Stein & Aronson,1953,180,1361,372,1079,44,alternate,-0.78611559,0.006905618
7,Vandiviere et al,1973,8,2537,10,619,19,random,-1.62089822,0.223017248
8,TPT Madras,1980,505,87886,499,87892,13,random,0.01195233,0.003961579
9,Coetzee & Berjak,1968,29,7470,45,7232,27,random,-0.46941765,0.05643421
10,Rosenthal et al,1961,17,1699,65,1600,42,systematic,-1.3713448,0.073024794


In most cases, it is better to only show the first few rows of a dataset. This can be accomplished using the `head()` function:

In [8]:
head(dat.bcg)

trial,author,year,tpos,tneg,cpos,cneg,ablat,alloc,yi,vi
1,Aronson,1948,4,119,11,128,44,random,-0.8893113,0.325584765
2,Ferguson & Simes,1949,6,300,29,274,55,random,-1.5853887,0.194581121
3,Rosenthal et al,1960,3,228,11,209,42,random,-1.3480731,0.415367965
4,Hart & Sutherland,1977,62,13536,248,12619,52,random,-1.4415512,0.020010032
5,Frimodt-Moller et al,1973,33,5036,47,5761,13,alternate,-0.2175473,0.051210172
6,Stein & Aronson,1953,180,1361,372,1079,44,alternate,-0.7861156,0.006905618


### Exercise

Let's hope there are two new columns named `yi` and `vi`. What is `yi`? How can we interpret `yi`? What effect size metric is it in? What is `vi`? 

In [9]:
## Solution.
## yi is the newly calculated effect size: the log risk ratio, log(RR). Interpretation:
## - yi < 0 (RR < 1): the TB risk is lower in the vaccinated group, i.e., the vaccine works.
## - yi = 0 (RR = 1): there is no effect at all.
## - yi > 0 (RR > 1): the TB risk is higher in the vaccinated group, i.e., the vaccine is harmful.
## vi is the corresponding sampling variance of yi.

### Exercise

You can retransform the log(RR) to RR by exponentiating (`exp()` function in R) the `yi` column. How can you do that?

In [10]:
## Solution.
exp(dat.bcg$yi)
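
Because `vi` is a variance on the log scale, a confidence interval is usually built on the log scale first and only then exponentiated. The following sketch illustrates this for the first trial, with `yi` and `vi` copied from the output above. The Wald-type 95% interval is an assumption made for this illustration, and the names `ci_log` and `ci_rr` are hypothetical.

```r
# yi and vi of the first trial (Aronson, 1948), copied from the escalc() output.
yi <- -0.88931133
vi <- 0.325584765

se <- sqrt(vi)   # standard error on the log scale
rr <- exp(yi)    # risk ratio on the original scale

# Wald-type 95% interval on the log scale (assumed here), then back-transformed.
ci_log <- yi + c(-1, 1) * qnorm(0.975) * se
ci_rr  <- exp(ci_log)

round(c(RR = rr, lower = ci_rr[1], upper = ci_rr[2]), 2)
```

The back-transformed interval is not symmetric around the risk ratio, which is one reason the analysis itself is carried out on the log scale.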

### Exercise

Next, instead of calculating the relative risks, we want you to calculate the odds ratios. How can you do that? And, again, we want you to interpret `yi`. What effect size metric is it in? What is `vi`? 

In [11]:
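## Solution.
library(metafor)
## measure = "OR" requests log odds ratios; yi and vi from the previous call are overwritten.
dat.bcg <- escalc(measure = "OR", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg)
dat.bcg
## yi is now the log odds ratio, log(OR): negative values favor the vaccine.
## vi is the corresponding sampling variance.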

trial,author,year,tpos,tneg,cpos,cneg,ablat,alloc,yi,vi
1,Aronson,1948,4,119,11,128,44,random,-0.93869414,0.357124952
2,Ferguson & Simes,1949,6,300,29,274,55,random,-1.66619073,0.208132394
3,Rosenthal et al,1960,3,228,11,209,42,random,-1.38629436,0.433413078
4,Hart & Sutherland,1977,62,13536,248,12619,52,random,-1.45644355,0.020314413
5,Frimodt-Moller et al,1973,33,5036,47,5761,13,alternate,-0.21914109,0.051951777
6,Stein & Aronson,1953,180,1361,372,1079,44,alternate,-0.95812204,0.009905266
7,Vandiviere et al,1973,8,2537,10,619,19,random,-1.63377584,0.227009675
8,TPT Madras,1980,505,87886,499,87892,13,random,0.0120206,0.004006962
9,Coetzee & Berjak,1968,29,7470,45,7232,27,random,-0.47174604,0.056977124
10,Rosenthal et al,1961,17,1699,65,1600,42,systematic,-1.40121014,0.075421726
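
As a cross-check, the log odds ratio and its sampling variance can be computed by hand from the four cells. This is a minimal base R sketch for the first trial (Aronson, 1948), assuming the usual large-sample variance formula for the log odds ratio; the names `yi_or` and `vi_or` are made up for this illustration.

```r
# Cell counts of the first trial in dat.bcg (Aronson, 1948).
ai <- 4    # vaccinated group, TB positive (tpos)
bi <- 119  # vaccinated group, TB negative (tneg)
ci <- 11   # control group, TB positive (cpos)
di <- 128  # control group, TB negative (cneg)

# Log odds ratio: odds of TB in the vaccinated group over odds in the control group.
yi_or <- log((ai / bi) / (ci / di))

# Large-sample sampling variance of the log odds ratio (assumed formula):
# the sum of the reciprocal cell counts.
vi_or <- 1 / ai + 1 / bi + 1 / ci + 1 / di

round(c(yi = yi_or, vi = vi_or), 4)
```

Since tuberculosis is rare in most of these trials, the log odds ratios are close to, but slightly further from zero than, the log risk ratios computed earlier.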


### Exercise

You can retransform the log(OR) to OR by exponentiating (`exp()` function in R) the `yi` column. How can you do that?

In [12]:
## Solution.
dat.bcg <- escalc(measure = "OR", ai = tpos, bi = tneg, ci = cpos, di = cneg, data = dat.bcg)
exp(dat.bcg$yi)

## Exercise

This exercise addresses issues with the coding and handling of effect sizes. The following excerpt is from an article on “Verbal Ability and Teacher Effectiveness” by Andrew et al. (2005). This study was used in the meta-analysis by Aloe and Becker (2009) on “Teacher Verbal Ability and School Outcomes: Where Is the Evidence?”.

![f_andrewetal2005](figure/f_andrewetal2005.png)

### Exercise

Which of these three effect sizes (r) do you think is most appropriate?

In [13]:
## Solution.
## We are interested in verbal abilities, and the other two correlation coefficients
## refer to other subscales/subsets.
"The first one, r = 0.234"

### Exercise

For a univariate meta-analysis we need effect sizes and their variances (i.e., the squared standard errors). Since this effect size is a correlation coefficient, we first need to apply Fisher’s z transformation. Then, calculate the variance and the standard error of the transformed correlation coefficient.

In [14]:
## Solution.
## Apply Fisher's z transformation to the r for the most general measures.

r <- 0.234
n <- 76
z.r <- 0.5 * log((1 + r)/(1 - r))
z.r

In [15]:
## Solution.
var.z.r <- 1 / (n - 3)
var.z.r

In [16]:
## Solution.
(se.z.r <- sqrt(var.z.r)) ## wrapping an assignment in (...) also prints its value

In [19]:
## Solution.
## You also can use escalc() to calculate the Fisher's r-to-z transformed correlation 
## coefficient.
escalc(measure="ZCOR", ri = 0.234, ni = 76)

yi,vi
0.238417,0.01369863
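
The manual steps above can also be written with base R's `atanh()`, which is Fisher's z transformation, and its inverse `tanh()`. The sketch below additionally back-transforms a Wald-type 95% confidence interval to the correlation scale. That interval is an assumption added for illustration, and the names `ci_z` and `ci_r` are hypothetical.

```r
r <- 0.234  # correlation chosen in the previous exercise
n <- 76     # sample size

# Fisher's z: atanh(r) is identical to 0.5 * log((1 + r) / (1 - r)).
z    <- atanh(r)
se_z <- sqrt(1 / (n - 3))

# Wald-type 95% interval on the z scale (assumed here) ...
ci_z <- z + c(-1, 1) * qnorm(0.975) * se_z

# ... and back-transformed to the correlation scale with tanh().
ci_r <- tanh(ci_z)

round(c(z = z, se = se_z, lower = ci_r[1], upper = ci_r[2]), 3)
```

Working on the z scale and transforming back keeps the interval bounds inside the admissible range of a correlation, between -1 and 1.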


In [18]:
## Solution.
## Here are the variance and SE for the raw correlation coefficient
(var.r <- (1-r^2)^2/(n-1))
sqrt(var.r)