## Let's talk about mediation
Mediation analysis tests whether the effect of a variable X on a variable Y is explained, at least in part, by a chain of effects: X on an intervening mediator variable M, and M on Y (X --> M --> Y).

**Total effect** is the effect of X on Y when M is not in the model.

The **indirect effect** is the estimated difference in Y resulting from a one-unit change in X through its effect on M. It is considered significant when its confidence interval (CI) does not include 0. Bootstrapping the indirect effect is preferred, as simulation work has shown that the sampling distribution of indirect effects is non-normal.
It is sometimes called the Average Causal Mediation Effect (ACME).
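
To make the bootstrapping idea concrete, here is a minimal sketch of a percentile bootstrap for the indirect effect using plain NumPy. The helper names (`ols_slopes`, `bootstrap_indirect`) and the simulated data are assumptions made for illustration; this is not how any particular library implements it.

```python
import numpy as np

def ols_slopes(predictors, outcome):
    """Return OLS slope coefficients (intercept dropped) via least squares."""
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefs[1:]

def bootstrap_indirect(x, m, y, n_boot=2000, seed=0):
    """Bootstrap distribution of the indirect effect a*b."""
    rng = np.random.default_rng(seed)
    n = len(x)
    ab = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        a = ols_slopes(x[idx], m[idx])[0]  # slope of X in M ~ X
        # slope of M in Y ~ X + M (second slope, after X)
        b = ols_slopes(np.column_stack([x[idx], m[idx]]), y[idx])[1]
        ab[i] = a * b
    return ab

# Simulated data (assumption): true a = 0.5, true b = 0.7
rng = np.random.default_rng(42)
x = rng.normal(size=200)
m = 0.5 * x + rng.normal(size=200)
y = 0.7 * m + 0.2 * x + rng.normal(size=200)

ab_dist = bootstrap_indirect(x, m, y)
ci_low, ci_high = np.percentile(ab_dist, [2.5, 97.5])
print(f"95% percentile CI for ab: [{ci_low:.3f}, {ci_high:.3f}]")
```

If the printed interval excludes 0, the indirect effect would be called significant under the rule stated above.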

**Direct effect** is the effect of X on Y after accounting for the effect of M on Y. It is sometimes called the Average Direct Effect (ADE).


A mediation model is a set of regressions, and you can have multiple unrelated mediators for a given model.


![image](medexample0.png)

### Total Effect (c)
$$Y = b_0 + b_1X + e$$
$$ c = b_1 $$

### Direct Effect (c')
$$Y = b_0 + b_2X + b_3M + e$$
$$ c' = b_2$$

### Indirect Effect (ab)
$$c = ab + c'$$
$$ ab = c - c'$$
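
The identity above can be derived by substituting the regression for the mediator into the direct-effect model. The mediator equation is not written out in this section, so the symbols $a_0$, $a$ and $e_M$ below are notation introduced here, with $b$ standing for $b_3$ (the slope of M):

$$M = a_0 + aX + e_M$$
$$Y = b_0 + c'X + bM + e = (b_0 + b a_0) + (c' + ab)X + (b e_M + e)$$
$$c = c' + ab$$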

---
## Interpreting results

If `ab` is significant, then there is a significant indirect effect.

In addition, if `c` is significant and `ab` is significant, M may be a _full_ or _partial_ mediator:
- If `c'` **is not significant**, M is a _full_ mediator.
- If `c'` **is significant**, M is a _partial_ mediator.
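
The rules above can be written as a small helper. This is only a sketch: the function name `classify_mediation`, its arguments, and the example numbers are hypothetical and do not come from any library or dataset.

```python
def classify_mediation(indirect_ci, total_pval, direct_pval, alpha=0.05):
    """Label a mediator using the interpretation rules listed above.

    indirect_ci : (low, high) confidence interval for ab
    total_pval  : p-value for the total effect c
    direct_pval : p-value for the direct effect c'
    """
    low, high = indirect_ci
    indirect_sig = not (low <= 0 <= high)  # CI excludes zero
    if not indirect_sig:
        return "no significant indirect effect"
    if total_pval >= alpha:
        return "indirect effect only"  # ab significant, c not significant
    return "partial mediator" if direct_pval < alpha else "full mediator"

# Made-up inputs, for illustration only
print(classify_mediation((0.12, 0.48), total_pval=0.001, direct_pval=0.40))
print(classify_mediation((0.12, 0.48), total_pval=0.001, direct_pval=0.01))
print(classify_mediation((-0.05, 0.30), total_pval=0.001, direct_pval=0.01))
```

The "indirect effect only" branch is kept separate so that a significant `ab` is still reported when `c` is not significant.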

Be open to interpreting indirect effects, as long as there is a good _a priori_ theoretical reason for relating the variables. The workflow below is designed to test for mediation, but if you always stop at the "No" branches in the chart you will miss interesting intervening effects.

### Suggested workflow for mediation
![img](https://data.library.virginia.edu/files/mediation_flowchart-1.png)

In [None]:
import os
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import pingouin as pg
from fc_config import *
from wesanderson import wes_palettes
from nilearn.input_data import NiftiMasker
from glm_timing import glm_timing
from mvpa_analysis import group_decode
from signal_change import collect_ev
from corr_plot import corr_plot
from scipy.stats import linregress as lm

# initialize seaborn parameters
sns.set_context('notebook')
sns.set_style('whitegrid')

In [None]:
df = pg.read_dataset('mediation') 
df.head()

In [None]:
# total path
pg.linear_regression(df.X,df.Y)

In [None]:
# a path
pg.linear_regression(df.X,df.M)

In [None]:
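# b path as a simple regression of Y on M
# (the b in `ab` is the slope of M in the Y ~ X + M model fit in the next cell)
pg.linear_regression(df.M,df.Y)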

In [None]:
# c' (direct effect)
pg.linear_regression(df[['X','M']],df.Y)

In [None]:
c = pg.linear_regression(df.X,df.Y)['coef'][1] # total effect
c1 = pg.linear_regression(df[['X','M']],df.Y)['coef'][1] #direct path

In [None]:
#c = c1 + (a*b)
# (a*b) = c - c1
ab = c - c1 # indirect effect
print(ab)
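
As a sanity check on the `c - c1` shortcut, the sketch below uses simulated data (an assumption, not the pingouin dataset) to confirm that `c - c'` equals `a * b` when `b` is the slope of M from the Y ~ X + M regression, and that the slope from a simple Y ~ M regression does not generally satisfy the identity. The helper `ols` is a hypothetical convenience wrapper around `np.linalg.lstsq`.

```python
import numpy as np

def ols(predictors, outcome):
    """OLS coefficients [intercept, slopes...] via least squares."""
    design = np.column_stack([np.ones(len(outcome)), predictors])
    return np.linalg.lstsq(design, outcome, rcond=None)[0]

# Simulated data (assumption)
rng = np.random.default_rng(0)
x = rng.normal(size=100)
m = 0.6 * x + rng.normal(size=100)
y = 0.4 * m + 0.3 * x + rng.normal(size=100)

c_total = ols(x, y)[1]                                   # Y ~ X
a_path = ols(x, m)[1]                                    # M ~ X
_, c_direct, b_path = ols(np.column_stack([x, m]), y)    # Y ~ X + M
b_simple = ols(m, y)[1]                                  # Y ~ M, without X

identity_holds = np.isclose(c_total - c_direct, a_path * b_path)
simple_b_matches = np.isclose(c_total - c_direct, a_path * b_simple)
print(identity_holds, simple_b_matches)
```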

In [None]:
pg.mediation_analysis(x='X',m='M',y='Y',data=df)

![image](fullmed.png)

In [None]:
#load in data
ev = pd.read_csv(os.path.join(data_dir,'graphing','signal_change','mvpa_ev.csv'))
rb = pd.read_csv(os.path.join(data_dir,'graphing','signal_change','run004_beta_values.csv'))

In [None]:
rmod = pd.DataFrame([])
rmod['ev'] = ev.ev
rmod['vmPFC'] = rb.early_CSp_CSm[rb.roi == 'mOFC_beta'].values
rmod['HC'] = rb.early_CSp_CSm[rb.roi == 'hippocampus_beta'].values
rmod['amyg'] = rb.early_CSp_CSm[rb.roi == 'amygdala_beta'].values
rmod['group'] = np.repeat(('control','ptsd'),24)
rmod['bgroup'] = np.repeat((0,1),24)
crmod = rmod[rmod.group == 'control']
prmod = rmod[rmod.group == 'ptsd']

In [None]:
crmod.head()

In [None]:
#corr_plot(crmod,'control')

In [None]:
c_res, c_dist = pg.mediation_analysis(x='ev',m=['vmPFC','HC'],y='amyg',
                                      data=crmod,n_boot=5000,return_dist=True)

In [None]:
c_res

In [None]:
#pg.linear_regression(crmod[['ev','vmPFC','HC']],crmod.amyg)

In [None]:
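fig, ax = plt.subplots()
# histplot replaces the deprecated distplot; the legend call makes the labels visible
sns.histplot(c_dist[:,0],color='blue',label='vmPFC',kde=True,stat='density',ax=ax)
sns.histplot(c_dist[:,1],color='red',label='HC',kde=True,stat='density',ax=ax)
ax.set_title('Bootstrapped indirect effects')
ax.legend()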

In [None]:
pg.normality(c_dist[:,1])

In [None]:
total_indirect = np.sum(c_dist,axis=1)
#np.mean(total_indirect)
fig, ax2 = plt.subplots()
sns.histplot(total_indirect,color='purple',label='total',kde=True,stat='density',ax=ax2)
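
Beyond plotting, the summed bootstrap draws can be reduced to a percentile interval. The sketch below uses a simulated stand-in array with the same assumed layout as `c_dist` (one row per bootstrap draw, one column per mediator); the numbers are not real results, and `percentile_ci` is a hypothetical helper.

```python
import numpy as np

def percentile_ci(samples, alpha=0.05):
    """Two-sided percentile interval from bootstrap samples."""
    bounds = np.percentile(samples, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(bounds[0]), float(bounds[1])

# Stand-in for an (n_boot, n_mediators) array (simulated, not real results)
rng = np.random.default_rng(1)
fake_dist = rng.normal(loc=[0.2, -0.1], scale=[0.05, 0.08], size=(5000, 2))

total = fake_dist.sum(axis=1)  # sum across mediators within each bootstrap draw
low, high = percentile_ci(total)
print(f"total indirect 95% CI: [{low:.3f}, {high:.3f}]")
```

Summing within each draw, rather than adding separately computed intervals, keeps whatever dependence exists between the per-mediator estimates.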

In [None]:
corr_plot(prmod,'PTSD')

In [None]:
p_res, p_dist = pg.mediation_analysis(x='ev',m=['vmPFC','HC'],y='amyg',
                                      data=prmod,n_boot=5000,return_dist=True)

In [None]:
p_res

In [None]:
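fig, ax = plt.subplots()
# histplot replaces the deprecated distplot; the legend call makes the labels visible
sns.histplot(p_dist[:,0],color='blue',label='vmPFC',kde=True,stat='density',ax=ax)
sns.histplot(p_dist[:,1],color='red',label='HC',kde=True,stat='density',ax=ax)
ax.set_title('Bootstrapped indirect effects')
ax.legend()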

In [None]:
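# total indirect effect: sum across mediators within each bootstrap draw,
# matching the control-group computation (the original subtracted the columns)
ptsd_total_indirect = np.sum(p_dist,axis=1)
fig, ax2 = plt.subplots()
sns.histplot(ptsd_total_indirect,color='purple',label='total',kde=True,stat='density',ax=ax2)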