<img src="../../shared/img/slides_banner.svg" width=2560></img>

# 08a - Effects and Interactions in Multiway Models

In [None]:
import sys

sys.path.append("../../")

from shared.src import quiet
from shared.src import seed
from shared.src import style

In [None]:
from pathlib import Path
import random

from IPython.display import HTML, Image
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pymc3 as pm
import seaborn as sns
import scipy.stats

In [None]:
sns.set_context("notebook", font_scale=1.7)

In [None]:
import shared.src.utils.util as shared_util

In [None]:
def retrieve_groupbys(df, by, column="X"):
    _, gbs = zip(*list(df.groupby(by=by)[column]))
    return gbs

def make_plot(mus, ax=None, **plot_kwargs):
    if ax is None:
        f, ax = plt.subplots(figsize=(12, 6))
    xs = np.arange(mus.shape[0])
    ax.plot(xs, mus[:, 0], lw=4, color="C0", **plot_kwargs)
    ax.plot(xs, mus[:, 1], lw=4, color="C1", **plot_kwargs)

from matplotlib.lines import Line2D

def make_line(color, linewidth=4):
    return Line2D([0], [0], linewidth=linewidth, color=color)

# Previously, we worked with data that varied only along one category, or factor.

For example:
- how does a participant's performance on a task vary depending on whether they are given caffeine or not?
- how do a participant's brain activity patterns vary with the type of music they are listening to?

## In these lectures, we will learn how to work with data that varies along two or more categories, which might _interact_.

For example:
- how do whether a participant drinks caffeine and whether they're over 40 interact to determine their performance on a task?
- if we show a person movies inside an fMRI machine, how do both the sound and the image being presented together determine the pattern of activity in their brain?

# Let's start by considering how we might make a detailed mechanistic model of an experiment.

That is, we want to describe _exactly_ the process by which each data point is generated.

When we run an experiment,
we typically record and control only a very small subset
of all the variables we _could_ record and control,
and which _in principle_ have an effect on each other and on the variable we are interested in measuring.

## Physics Example: Measuring weight of an object

If we measure the weight of an object more than once,
we find that what we observe varies from measurement to measurement.

That's in part because we usually ignore all of the following:
- Air pressure
- Temperature
- Gravitational pull of the moon
- Gravitational pull of the sun and other heavenly bodies
- Changes in the stiffness of the springs inside our scale

## Psychology Example: Measuring a brain signal in response to hearing a word

If we perform an experiment where we repeatedly measure a brain signal,
e.g. the electrical signal of an EEG (aka brain wave)
or the magnetic signal of an fMRI,
in response to the same stimulus,
e.g. a spoken word,
we'll see that the signals vary,
both from person to person and within the same person.

That's in part because we are ignoring all of the following factors:

- How closely they are paying attention
- Their bodily state - hungry, sleepy, overheated
- The different individuals' histories with that word
- The precise orientation of their skull relative to our probes
- The intonation of the word
- The behavior of individual brain cells
- Variability in our equipment

## In one view, it is our ignorance of these factors that leads to what we call randomness.

The laws of physics are deterministic,
excepting certain interpretations of quantum mechanics,
meaning that, in principle,
once certain values are known
(position, velocity, mass, charge, etc.)
for all of the pieces of the system,
then there is nothing left to chance.

And at the scale that most phenomena of interest to humans happen,
quantum effects are negligible.

Therefore when we say our data is random, we _must_ be cheating a little bit.

## That is, _randomness_ is just code for _things I don't know_.

# Imagine an experiment where there are exactly 12 categorical factors that influence the measured value.

That is,
once one knows the values of each of the 12 factors,
the final value of the measurement is _fully determined_.

That is, it is deterministic, rather than random.

## Assume further that each factor is binary: it is present or not.

## Lastly, let's say every factor, when present, adds a certain amount to the measurement, which we call the _size_ of the _effect_ of the factor.

In [None]:
factor_effect_sizes = list(sorted(pm.Uniform.dist(lower=-1, upper=1).random(size=12)))

factor_effect_sizes

In [None]:
# factor_effect_sizes = [-3] + factor_effect_sizes + [6]

## To make this work with pyMC, let's say that on any given trial, which effects are present is _random_.

This is an example of using `pyMC` to simulate a system,
as opposed to using `pyMC` to compute posteriors.

In [None]:
with pm.Model() as many_effects_model:
    effects_present = pm.Bernoulli("effect_present", p=0.5, shape=len(factor_effect_sizes))
    
    sum_of_effects = 0
    for ii in range(len(factor_effect_sizes)):
        sum_of_effects += factor_effect_sizes[ii] * effects_present[ii]
        
    observed_data = pm.Deterministic("X", sum_of_effects)

How to make this more realistic:
- Not all factors are binary and have equal chance - could switch to `Categorical`
- Many factors are related to each other (we'll see more on that today)
- Not all effects are discrete! (we'll talk about that next week)
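
As a cross-check, the same generative process can be sketched in plain NumPy, without `pyMC`. The names `rng`, `n_trials`, and `effect_sizes` are illustrative assumptions, not part of the notebook, and the effect sizes are drawn fresh here rather than taken from `factor_effect_sizes`.

```python
import numpy as np

rng = np.random.default_rng(0)  # hypothetical seed, for reproducibility only

n_factors, n_trials = 12, 5000
# stand-in for factor_effect_sizes: sorted draws from Uniform(-1, 1)
effect_sizes = np.sort(rng.uniform(-1, 1, size=n_factors))

# each row is a trial; each column says whether that factor is present (1) or absent (0)
effects_present = rng.integers(0, 2, size=(n_trials, n_factors))

# the measurement is the sum of the sizes of the effects that are present
X = effects_present @ effect_sizes
```

Each entry of `X` plays the role of one sampled value of the `X` variable in `many_effects_model`.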

In [None]:
with many_effects_model:
    many_effects_trace = pm.sample(draws=500, chains=10)

    many_effects_df = pm.trace_to_dataframe(many_effects_trace)

In [None]:
many_effects_df.head()

## If we know which factors are present and absent, the data looks deterministic.

The next block of code finds all the rows that are equal to a given row
using `apply` on the `row_equal_to` function defined below.

Not all rows will be duplicated,
so if the result printed by this cell has only one row,
try changing the `row_index`.

If you inspect the `effect_present` columns in the output of the cell,
you'll see that the values are the same in all the printed rows.
Furthermore, the value of the `X` column is also equal.

In [None]:
def row_equal_to(row, other_row):
    return all(row[:-1] == other_row[:-1])

row_index = 1
equal_to_row = many_effects_df.apply(row_equal_to, axis=1,
                                     other_row=many_effects_df.iloc[row_index])

many_effects_df[equal_to_row]

And so if we were to group our data by all of these columns simultaneously
and then look at the histogram of `X`, the result would just be a single point:
there would be no "distribution" of `X`.
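
A minimal sketch of that claim, using a small NumPy simulation as a stand-in for `many_effects_df` (the column names, the seed, and the choice of only four factors are assumptions made so that every combination shows up many times):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # hypothetical seed
n_factors, n_trials = 4, 2000   # fewer factors than the notebook, so combinations repeat
effect_sizes = rng.uniform(-1, 1, size=n_factors)

present = rng.integers(0, 2, size=(n_trials, n_factors))
factor_columns = [f"effect_present__{ii}" for ii in range(n_factors)]
toy_df = pd.DataFrame(present, columns=factor_columns)
# round to guard against floating-point noise in the sums
toy_df["X"] = (present * effect_sizes).sum(axis=1).round(10)

# group on every factor at once and count the distinct X values in each group
distinct_values_per_group = toy_df.groupby(factor_columns)["X"].nunique()
print(distinct_values_per_group.max())
```

If the largest count is 1, then every group contains a single repeated value of `X`, so there is nothing left to histogram.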

## Despite the deterministic nature of our data, if we ignore which factors are present, we obtain a familiar-looking distribution for the `X` values.

That is,
we pretend that we didn't measure the values of the `effect_present` variables
and then look at the distribution.

This simulates the realistic setting where we measure the outcome variable
but not all of the factors that determine it.

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
xs = np.linspace(min(many_effects_df["X"]), max(many_effects_df["X"]))
ps = scipy.stats.norm.pdf(
    xs, loc=many_effects_df["X"].mean(), scale=many_effects_df["X"].std())

sns.distplot(many_effects_df["X"], label="Observed");
ax.plot(xs, ps, lw=4, label="Closest Gaussian"); ax.legend();

## The fact that this distribution is bell-shaped is a case of the _Central Limit Theorem_.

Whenever our measurement is subject to a large number of independent, non-interacting effects
of about the same size, the distribution of measurement values we observe
when we allow those effects to vary is approximately normal.

If some of the effects are much larger than the others,
then the Central Limit Theorem holds much more loosely:
we need more and more interfering effects to end up with a bell curve.
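
One rough way to put a number on "how bell-shaped" is the excess kurtosis, which is 0 for a Gaussian. The sketch below is an illustration under assumptions, not the notebook's method: the effect sizes (twelve effects of 0.5, then one extra effect of 6), the seed, and the helper names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)  # hypothetical seed

def simulate_measurements(effect_sizes, n_trials=20000):
    """Sum the sizes of whichever binary effects happen to be present on each trial."""
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    present = rng.integers(0, 2, size=(n_trials, len(effect_sizes)))
    return present @ effect_sizes

def excess_kurtosis(x):
    """Fourth standardized moment minus 3; it is 0 for a Gaussian."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3

similar_effects = np.full(12, 0.5)                    # twelve effects of equal size
dominated_effects = np.append(similar_effects, 6.0)   # plus one much larger effect

kurtosis_similar = excess_kurtosis(simulate_measurements(similar_effects))
kurtosis_dominated = excess_kurtosis(simulate_measurements(dominated_effects))
print(kurtosis_similar, kurtosis_dominated)
```

With equal-sized effects the value should sit close to 0, while the dominated case should be strongly negative, because the large effect splits the data into two separated clumps.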

It is often the case that some effects are much larger and more important than others:
in science, we rely on this to make simple models.

To see what this looks like, execute the commented-out cell above
that adds two new factors to the data, each larger in magnitude than the others,
then re-run the cells above.
You'll see that the data is no longer distributed normally.
If you also run the cells below

## We can _estimate_ the effect sizes by grouping and taking averages.

In [None]:
effect_columns = many_effects_df.columns[:-1]; factor_index = -1
group_means = many_effects_df.groupby(effect_columns[factor_index])["X"].mean()
group_means

In [None]:
group_means[1] - group_means[0]

In [None]:
factor_effect_sizes[factor_index]

## If we group on one of the effects and then plot, we still see data that looks random.

In [None]:
column = effect_columns[-1]
gbs = retrieve_groupbys(many_effects_df, by=column)

f, ax = plt.subplots(figsize=(12, 6))
sns.violinplot(x=column, y="X", data=many_effects_df, ax=ax, width=0.5, linewidth=4);

scipy.stats.f_oneway(*gbs)

This type of plot is known as a [Violin Plot](https://seaborn.pydata.org/generated/seaborn.violinplot.html),
for the resemblance to the musical instrument.

The "lumpy" portion of the plot is a kernel density estimate, as in `distplot`.
The only difference is that the density is mirrored,
on the left and right of the box in the center.
The box in the center is a boxplot:
the median and the 25th and 75th percentile are shown.

## Our modeling tools are designed to try and manage the uncertainty that our ignorance of the unmeasured factors introduces.

## This remains true if we group on a small number of effects, relative to the total.

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.violinplot(x=effect_columns[-1], y="X", hue=effect_columns[-2], data=many_effects_df, ax=ax, width=0.5,
               linewidth=4);

# In a real experiment, we would typically only measure a small handful of the influencing factors at most: say, 2.

That is, treat the model above as the _true model_,
one that accurately describes our data-generating process.

In real life,
we usually don't know this model:
we don't know the effect sizes,
we don't even know the identities of the factors!
(and there are infinitely many, or at least an extremely large number of,
possible factors).

And furthermore, we don't typically measure everything of relevance.
We identify, based on our intuition or on previous results,
factors that we think are important.

So the data we actually observe, in a real experiment, looks more like:

In [None]:
observed_data_df = pd.DataFrame()

factor1_idx = 0
factor2_idx = -1

observed_data_df["factor1"] = many_effects_df[effect_columns[factor1_idx]]
observed_data_df["factor2"] = many_effects_df[effect_columns[factor2_idx]]

observed_data_df["measurement"] = many_effects_df["X"]

That is, we do not have access to the values of the other factors:
they are not columns in our `DataFrame`.

Because the factors are ordered by their effect size, from most negative to most positive,
setting the indices to 0 and -1 presumes we identified the factors with the largest effects.

Try setting them to different values to see what happens when we try to build models of data
where some of the most important factors are left out.

In [None]:
print(observed_data_df.head())

Always think of your data this way:
the tip of the iceberg,
or as a "slice" through what you should be observing.

From this dataframe,
which represents what we might actually observe in an experiment,
we can produce the kinds of plots we've made for real data.

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.violinplot(x="factor1", y="measurement", data=observed_data_df, linewidth=4);

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.violinplot(x="factor2", y="measurement", data=observed_data_df, linewidth=4);

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.violinplot(x="factor1", y="measurement", hue="factor2", data=observed_data_df, linewidth=4);

But, just as we cannot see the deterministic nature of the true model in `observed_data_df`,
neither can we recover the exact values of the effect sizes.

## If we specify a model, we can use its posterior to determine the likely values for the effect sizes.

In [None]:
# first part of prior: describing uncertainty in the group means

with pm.Model() as synthetic_data_model:
    mus = pm.Normal("mus", mu=0, sd=1e2, shape=(2, 2))

The new piece here is in the `shape` argument:
now, the shape argument is getting a tuple of values, `shape=(2, 2)`,
rather than a single value, e.g. `shape=3`.

Previously, when shape had only one value,
we thought of `mus` as a list.

Technically, in that case and in this one,
`mus` is something called a `Tensor`,
from the `theano` library.

In this case, you can think of it like a list of lists:

In [None]:
list_of_lists = [[0, 1], [2, 3]]
list_of_lists

Lists of lists are like `DataFrame`s:
each "inner" list is a row of the dataframe.

To access all the values in a column,
we'd want a value from each "inner" list
that corresponds to a given index.
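
A short sketch of the difference between pulling out a row and pulling out a column of a list of lists (the variable names `first_row` and `first_column` are just for illustration):

```python
list_of_lists = [[0, 1], [2, 3]]

first_row = list_of_lists[0]                       # an "inner" list: [0, 1]
first_column = [row[0] for row in list_of_lists]   # one value from each inner list: [0, 2]
```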

The interpretation of `mus` is still the same:
it's just a random variable that holds values for
each of the group means.

In [None]:
# second part of prior: describing uncertainty in the standard deviation

with synthetic_data_model:
    sd = pm.Exponential("sigma", lam=0.1)

In [None]:
# likelihood: if we knew the means and sds, what would be our remaining uncertainty?

with synthetic_data_model:
    # where does the uncertainty represented by this Normal come from?
    #  from things we're not measuring and modeling
    observations = pm.Normal("observations",
                             mu=mus[observed_data_df["factor1"], observed_data_df["factor2"]],
                             # implementation detail: we use _two_ Series to index mus now
                             #  the choice of order means the _rows_ of mus are different levels of factor1
                             #  while the _columns_ of mus are different levels of factor2
                             sd=sd, observed=observed_data_df["measurement"])

To determine which group we're in, we need to know our position in both the "outer" and the "inner" list:
we need to index into `mus` based on `factor1` _and_ on `factor2`.

That is, there is a different mean for each combination of factor levels,
so to determine the correct mean for each datapoint,
we need to know which level of each factor it was in.
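
The lookup can be sketched with a NumPy array standing in for `mus`. The values in `example_mus` and the two index arrays are made up for illustration, and the paired-up indexing shown here is NumPy's behavior, which the model cell above relies on the tensor sharing.

```python
import numpy as np

# hypothetical group means: rows are levels of factor1, columns are levels of factor2
example_mus = np.array([[0.0, 1.0],
                        [2.0, 3.0]])

factor1 = np.array([0, 0, 1, 1, 1])  # level of factor1 for each of five datapoints
factor2 = np.array([0, 1, 0, 1, 1])  # level of factor2 for the same datapoints

# indexing with two equal-length integer arrays pairs them up elementwise
mean_for_each_datapoint = example_mus[factor1, factor2]
print(mean_for_each_datapoint)  # [0. 1. 2. 3. 3.]
```

The result has one mean per datapoint, which is the shape the likelihood needs.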

Notice: this isn't a _mechanistic_ model of our data,
or at least not a complete one.

We know that the real mechanistic model of our data
is the `many_effects_model` above.

Instead, we say that some of the mechanisms in our system
we are going to approximate with a Normal likelihood.

$$
\mu[i, j] \sim \text{Normal}(0, 1\mathrm{e}2)\\
\sigma \sim \text{Exponential}(0.1)\\
d \sim \text{Normal}(\mu[i, j], \sigma)
$$

The notation here is meant to evoke the syntax we use to index into `DataFrame`s with `iloc`,
which is the same syntax we use to index into arrays, like `mus` in this case.
More on that below.

In [None]:
with synthetic_data_model:
    synthetic_trace = pm.sample()
    synthetic_posterior_samples = shared_util.samples_to_dataframe(synthetic_trace)

In [None]:
print(synthetic_posterior_samples.head())

Notice that the sampled values of `mus` also look like lists-of-lists:

In [None]:
synthetic_posterior_samples["mus"].iloc[0]

But they are _not_ actually lists:
they are `arrays`,
provided by the `numpy` library, alias `np`.

In [None]:
type(synthetic_posterior_samples["mus"].iloc[0])

They are also like `DataFrames`, in that we can use
the indexing syntax, `[...]`, to access their contents.

However, unlike `DataFrames`,
`arrays` only have one style of indexing,
which is equivalent to `iloc`.

In [None]:
print(synthetic_posterior_samples.head())

In [None]:
synthetic_posterior_samples.iloc[1, 0]   # entry in second row of first column of DataFrame

In [None]:
example_array = synthetic_posterior_samples.iloc[1, 0] 

example_array, example_array[1, 0]  # entry in second row of first column of array

In [None]:
example_array, example_array[:, 0]  # entries in all rows of first column of array (result is 1-D array)

Also unlike `DataFrames`, arrays can have more (or less) than two dimensions.
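
A few illustrative arrays with different numbers of dimensions (the variable names are hypothetical):

```python
import numpy as np

zero_d = np.array(3.0)               # a single number
one_d = np.array([0.0, 1.0, 2.0])    # like a list
two_d = np.array([[0, 1], [2, 3]])   # like a list of lists
three_d = np.zeros((4, 2, 2))        # e.g. four stacked 2x2 arrays

print([a.ndim for a in (zero_d, one_d, two_d, three_d)])  # [0, 1, 2, 3]
print(three_d[0].shape)  # (2, 2)
```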

See [this tutorial for more on numpy and arrays](https://hackernoon.com/introduction-to-numpy-1-an-absolute-beginners-guide-to-machine-learning-and-data-science-5d87f13f0d51).

## We then estimate the effects of factors from the entries of the `mus` array on each sample.

First, let's look at the mean when both factors are present and when they are absent.

In [None]:
def get_mean_both_factors_absent(row):
    mus = row["mus"]
    return mus[0, 0]  # factor1=0, factor2=0

def get_mean_both_factors_present(row):
    mus = row["mus"]
    return mus[1, 1] # factor1=1, factor2=1

In [None]:
mean_both_present = synthetic_posterior_samples.apply(get_mean_both_factors_present, axis=1)
mean_both_absent = synthetic_posterior_samples.apply(get_mean_both_factors_absent, axis=1)

f, ax = plt.subplots(figsize=(12, 6))
sns.distplot(mean_both_present - mean_both_absent); ax.set_xlim(-1.5, 1.5);

But this doesn't tell us what either factor does separately.

To do that, we need a bit more work:

First, we need to specify one combination of factor values as a baseline.

For this data, the natural choice for baseline is when
both factors are absent: `mus[0, 0]`.

We compare the mean when only one of the factors is present to this baseline.

In [None]:
def compute_delta_factor1(row):
    mus = row["mus"]
    
    baseline = mus[0, 0]
    # remember: rows differ by value of factor 1, columns by value of factor 2
    factor1_present = mus[1, 0]
    
    return factor1_present - baseline

This represents the effect of factor 1 _in the absence of factor 2_.

In [None]:
delta_factor1_posterior = synthetic_posterior_samples.apply(compute_delta_factor1, axis=1)

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.distplot(delta_factor1_posterior, label="Posterior", axlabel="Factor 1 Effect, Factor 2 = 0");
ax.vlines(factor_effect_sizes[factor1_idx], 0, 4, lw=4, label="True Value");
ax.legend(); ax.set_xlim(-1.5, 1.5);

We can do the same for factor 2.

In [None]:
def compute_delta_factor2(row):
    mus = row["mus"]
    
    baseline = mus[0, 0]
    factor2_present = mus[0, 1]
    
    return factor2_present - baseline

In [None]:
delta_factor2_posterior = synthetic_posterior_samples.apply(compute_delta_factor2, axis=1)

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.distplot(delta_factor2_posterior, label="Posterior", axlabel="Factor 2 Effect, Factor 1 = 0");
ax.vlines(factor_effect_sizes[factor2_idx], 0, 4, lw=4, label="True Value");
ax.legend(); ax.set_xlim(-1.5, 1.5);

## But this is only the effect of factor 2 _given_ factor 1 is 0.

The problem is that we don't know whether the effect of factor 2 is different
depending on whether factor 1 is present or not.

For example: what is the "effect" of drinking orange juice on your taste sensation?

Normally, the effect is that you taste something sweet and delicious.

But in the presence of toothpaste,
the effect is that you
[taste something bitter instead](https://health.howstuffworks.com/mental-health/human-nature/perception/orange-juice-toothpaste.htm).

So we have to be very specific about what we mean about "the effect" of a factor,
especially when those factors are inter-related.

Let's calculate our posterior for the effect of factor 2 when factor 1 _is_ present (`== 1`)
and compare it to the posterior we obtained previously for the effect of factor 2
when factor 1 _is not_ present (`== 0`).

In [None]:
def compute_delta_factor2_factor1_present(row):
    mus = row["mus"]
    
    baseline = mus[1, 0]
    factor2_present = mus[1, 1]
    
    return factor2_present - baseline

In [None]:
delta_factor2_factor1_present_posterior = synthetic_posterior_samples.apply(
    compute_delta_factor2_factor1_present, axis=1)

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.distplot(delta_factor2_posterior, label="Factor 2 Effect, Factor 1 = 0");
sns.distplot(delta_factor2_factor1_present_posterior, label="Factor 2 Effect, Factor 1 = 1");
ax.vlines(factor_effect_sizes[factor2_idx], 0, 4, lw=4, label="True Value");
ax.legend(); ax.set_xlim(-1.5, 1.5);

The posteriors overlap almost perfectly,
indicating that we don't expect the measurement to change differently
when factor 2 is changed _depending on_ the value of factor 1.

The plot below visualizes this directly:
it is the posterior for the _difference_ in
the values for the change in means induced by
the presence of factor 2 in the presence and absence of factor 1.

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.distplot(delta_factor2_posterior - delta_factor2_factor1_present_posterior,
             axlabel="Difference in Factor 2 Effect across Levels of Factor 1");
ax.vlines(0, 0, 4, lw=4, label="True Value");
ax.legend(); ax.set_xlim(-1.5, 1.5);

Notice that the difference appears to be basically 0.

# This is what we mean by _non-interacting effects_.

The effect of one factor is not dependent on the value of the other factor.

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.violinplot(x="factor1", y="measurement", hue="factor2", data=observed_data_df, linewidth=4);

Draw lines between the means: they'll be parallel.
Or, alternatively, the two pairs of "violins",
one pair for `factor1 = 0` and one pair for `factor1 = 1`,
are just shifted relative to one another.

The seaborn function `pointplot` directly displays the means of different groups
(by default, as circles)
and connects them with lines.

If the slopes of these lines are different,
then the effect of factor 1 is different depending on the value of factor 2.

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.pointplot(x="factor1", y="measurement", hue="factor2", data=observed_data_df, linewidth=4);

You might notice vertical lines sticking out from the means.
These are an estimate of the uncertainty in the mean.
By default, `pointplot` uses bootstrapping to estimate this uncertainty.

In order to determine from the `pointplot` whether there is an interaction,
you need to mentally compare the heights of those bars
to the differences in slopes of the lines.

Alternatively, you could bootstrap the `pointplot`:
draw a bootstrap sample from the data and recreate the `pointplot`.
If the difference in slopes persists on most bootstraps,
then it's likely not due to chance.
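
The same idea can be checked numerically instead of by eye: compute the difference in slopes on each bootstrap sample and look at how it is spread around 0. This is a sketch under assumptions; `toy_df` is a hypothetical stand-in for `observed_data_df` with made-up additive effects, and `slope_difference` is an illustrative helper, not part of the notebook.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)  # hypothetical seed

# hypothetical stand-in for observed_data_df: two additive factors plus unmeasured noise
n = 2000
toy_df = pd.DataFrame({"factor1": rng.integers(0, 2, size=n),
                       "factor2": rng.integers(0, 2, size=n)})
toy_df["measurement"] = (-0.8 * toy_df["factor1"] + 0.9 * toy_df["factor2"]
                         + rng.normal(0, 1, size=n))

def slope_difference(df):
    """Slope of the factor2=1 line minus slope of the factor2=0 line."""
    means = df.groupby(["factor1", "factor2"])["measurement"].mean().unstack()
    slopes = means.loc[1] - means.loc[0]  # change with factor1, one entry per factor2 level
    return slopes.loc[1] - slopes.loc[0]

# resample the rows with replacement and recompute the slope difference each time
bootstrap_differences = np.array([
    slope_difference(toy_df.sample(frac=1, replace=True, random_state=seed))
    for seed in range(200)
])
print(np.percentile(bootstrap_differences, [2.5, 97.5]))
```

Because the toy data is built without an interaction, the bootstrap differences should cluster near 0; a cluster well away from 0 would suggest an interaction.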

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
[sns.pointplot(x="factor1", y="measurement", hue="factor2",
              data=observed_data_df.sample(frac=1, replace=True), linewidth=4)
 for _ in range(100)]; ax.legend().remove();

We can instead use our posteriors over the means and look for the same pattern:
do the slopes of these lines look different on a large fraction of the samples?

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
[make_plot(synthetic_posterior_samples.iloc[ii]["mus"], ax=ax, alpha=0.01)
 for ii in random.sample(range(len(synthetic_posterior_samples)), 100)];
ax.set_xticks([0, 1]); ax.set_xlabel("factor1");
ax.set_ylabel("Group Average"); ax.set_xlim([-0.5, 1.5])
ax.legend([make_line("C0"), make_line("C1")], ["factor2 = 0", "factor2 = 1"]);

## When two factors _interact_, they are more than the sum of their parts.

Literally:
if there is _no interaction_,
we can predict the mean when both factors appear together
by estimating the effect of each factor separately and adding both to the baseline.

This is the same thing as saying that the two lines above have the same slope.

In [None]:
def compute_interaction_effect(row):
    mus = row["mus"]

    baseline = mus[0, 0]
    prediction_from_separate = baseline + \
        compute_delta_factor1(row) + \
        compute_delta_factor2(row)
    
    actually_observed_effect = mus[1, 1]
    
    return actually_observed_effect - prediction_from_separate

`compute_interaction_effect` is computing the same thing (up to a minus sign)
as `delta_factor2_posterior - delta_factor2_factor1_present_posterior`,
but in a different way.
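
To see that the two routes agree, here is a small check on an arbitrary 2x2 table of means. The array `example_mus` is random and purely illustrative; the variable names mirror, but are not, the notebook's functions.

```python
import numpy as np

rng = np.random.default_rng(4)  # hypothetical seed
example_mus = rng.normal(size=(2, 2))  # arbitrary 2x2 table of group means

delta_factor1 = example_mus[1, 0] - example_mus[0, 0]
delta_factor2_factor1_absent = example_mus[0, 1] - example_mus[0, 0]
delta_factor2_factor1_present = example_mus[1, 1] - example_mus[1, 0]

# route 1: observed mean with both factors minus the additive prediction
prediction_from_separate = example_mus[0, 0] + delta_factor1 + delta_factor2_factor1_absent
interaction = example_mus[1, 1] - prediction_from_separate

# route 2: difference in the factor 2 effect across levels of factor 1
difference_in_effects = delta_factor2_factor1_absent - delta_factor2_factor1_present

print(np.isclose(interaction, -difference_in_effects))  # True
```

Both routes reduce to `mus[1, 1] - mus[1, 0] - mus[0, 1] + mus[0, 0]`, with opposite signs.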

In [None]:
interaction_effects = synthetic_posterior_samples.apply(
    compute_interaction_effect, axis=1)

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.distplot(interaction_effects, label="Posterior", axlabel="Interaction Effect");
ax.vlines(0, 0, 4, lw=4, label="True Value");
ax.legend(); ax.set_xlim(-1.5, 1.5);

In [None]:
(interaction_effects > 0).mean()

# But many real-life factors do interact.

## For example: closing each eye while firing a bow and arrow at a target.

Trying to predict what happens when you close _both_ your eyes
by just adding together what happens when you close _either_ eye doesn't work.

Let's quickly connect this back to our mechanistic model:
what are some factors that determine accuracy that we aren't considering?

In [None]:
with pm.Model() as accuracy_model:
    left_eye_closed = pm.Bernoulli("left_eye_closed", p=0.5)
    right_eye_closed = pm.Bernoulli("right_eye_closed", p=0.5)
    
    accuracies = shared_util.to_pymc([[0.8, 0.73],  # notice: a list of lists
                                      [0.73, 0.1]])
    
    target_hit = pm.Bernoulli("target_hit", p=accuracies[left_eye_closed, right_eye_closed])

`shared_util.to_pymc` converts the argument to a type of `theano.Tensor` so that it can be used in a pyMC model.

In [None]:
with accuracy_model:
    accuracy_trace = pm.sample()
    accuracy_df = pm.trace_to_dataframe(accuracy_trace)

In [None]:
accuracy_df.groupby(["left_eye_closed", "right_eye_closed"]).mean()

In [None]:
f, ax = plt.subplots(figsize=(12, 6))
sns.pointplot(x="left_eye_closed", y="target_hit", hue="right_eye_closed",
            data=accuracy_df, linewidth=4);

Look at the lines between the means:
they are very much _not_ parallel.
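
The size of the interaction can also be read directly off the table of accuracies used in `accuracy_model`. The sketch below copies those four values into a NumPy array; the variable names are illustrative.

```python
import numpy as np

# rows: left eye open / closed; columns: right eye open / closed (values from accuracy_model)
accuracies = np.array([[0.8, 0.73],
                       [0.73, 0.1]])

baseline = accuracies[0, 0]
effect_left = accuracies[1, 0] - baseline    # closing only the left eye
effect_right = accuracies[0, 1] - baseline   # closing only the right eye

# what we would expect with both eyes closed if the effects simply added up
additive_prediction = baseline + effect_left + effect_right
interaction = accuracies[1, 1] - additive_prediction
print(round(additive_prediction, 2), round(interaction, 2))  # 0.66 -0.56
```

Adding the two separate effects predicts an accuracy of about 0.66 with both eyes closed, far above the 0.1 in the table.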

## Be careful interpreting effects of factors when there are possible interactions present.

Consider what this means for the question of "what is the _effect_ of closing your left eye on accuracy"?

The answer is: it depends very much on whether your right eye is open or not!

When interaction effects are present,
we need to be careful about making claims regarding the "effect" of any factors that interact.

In [None]:
f, axs = plt.subplots(figsize=(12, 12), nrows=2, sharex=True, sharey=True)
sns.pointplot(x="left_eye_closed", y="target_hit", hue="right_eye_closed",
            data=accuracy_df, linewidth=4, ax=axs[0]);
sns.pointplot(x="left_eye_closed", y="target_hit",
            data=accuracy_df, linewidth=4, ax=axs[1]);

The second plot is what we would observe if we ignored whether the right eye was open or not:
we'd decide that, with high certainty,
the "effect of closing your left eye" was to decrease accuracy from roughly 0.77 to roughly 0.42,
even though this is very far from the whole story.
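
A rough check of the marginal accuracies implied by the model, assuming (as in `accuracy_model`) that the right eye is closed on half of the trials. The names `weights` and `marginal_accuracy` are illustrative.

```python
import numpy as np

accuracies = np.array([[0.8, 0.73],    # left eye open:   right eye open, right eye closed
                       [0.73, 0.1]])   # left eye closed: right eye open, right eye closed
p_right_eye_closed = 0.5               # as in accuracy_model

weights = np.array([1 - p_right_eye_closed, p_right_eye_closed])
# average over the right eye: one entry for left eye open, one for left eye closed
marginal_accuracy = accuracies @ weights
print(marginal_accuracy)
```

The averaged values hide the fact that closing the left eye costs only 0.07 when the right eye is open but 0.63 when it is closed.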

### This is a serious problem for studies of complex systems, like living cells, human bodies and minds, and economies.

As suggested by the mechanistic model above,
we typically don't and can't measure everything that impacts a system.

The systems we describe as complex typically have many factors
that interact, and powerfully, in various ways to determine outcomes.

An example: doctors frequently recommend decreased salt intake to decrease the risk of cardiovascular disease by reducing blood pressure.

However, for about 15% of individuals,
decreasing salt intake actually _increases blood pressure_
([see Figure 1 in this article from the American Heart Association](https://www.ahajournals.org/doi/full/10.1161/hyp.0000000000000047)).
This is likely to be due to a combination of genetic and environmental factors.

These individuals are less numerous and their increases are smaller,
so they are "washed out" in the average by the folks whose blood pressure decreases.

It is common to say,
even in the presence of these facts,
that "the effect of decreasing salt intake" is to reduce blood pressure,
but this is only true _on average_.
When we take other factors into account,
the effects can be very different.