# Specifying an experiment
#### Max Collard
#### UCSF Neuroscience Orientation 2022

---

# The experiment cycle

<img src="figures/experimental-design/f1.png"/>

## Overall goals

* Start with **what we currently know** about the world.

* **Observe new data**.

* Use this data to **make inferences** that **update** how we view the world.

---

# Components of an experimental design

## Background

How do we currently think the world works?

What part of our model is **relatively clear**?

## Background, *Example*

* Discrete packets of information (*memes*) behave in many respects like an infectious agent.
* Vaccines provide a powerful tool for combating the spread of many infectious agents.
* Information presented to individuals biases the interpretation of information received later.

## Question

What aspect of our model is unclear?

Or: In what way do multiple plausible (i.e., *sufficiently high prior probability*) models make **different predictions**?

## Question, *Example*

Does providing trusted information prior to exposure to misinformation reduce the spread of misinformation?

## Hypothesis

1. What are the models we want to differentiate?

> As a special case, often we want to differentiate two models, **null** and **alternative**. In the null model, there is no difference between two groups in the population we're studying. In the alternative model, there is some difference between the two groups.

## Hypothesis

2. What is our **prior belief** about the probabilities of these models?

> As a special case, we might specify **the most likely model**.

## Hypothesis, *Example*

### Model 1 (Null model)
Prior exposure to trusted information **has no effect** on subsequent sharing of misinformation.

### Model 2 (Alternative model)
Prior exposure to trusted information **has an impact** on subsequent sharing of misinformation.

We believe *a priori* that the **alternative model** is more likely.

## Background / Question / Hypothesis

Here we specify our **prior belief**:

$$ \begin{eqnarray*}
&\mathrm{Pr}(\textrm{model 1})\\
&\mathrm{Pr}(\textrm{model 2})\\
&\mathrm{Pr}(\textrm{model 3})\\
&\cdots
\end{eqnarray*}$$

---

## Experimental approach

How can I construct an apparatus where my models make **different predictions**?

> If our models do not make different predictions, then our experiment gives us **no new information** about the likelihood of the models *a posteriori* (after we get our test results)!
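
A minimal sketch of this point, assuming just two candidate models and made-up prior and likelihood values (none of these numbers come from the study). The helper `posterior` is a hypothetical name for a direct application of Bayes' rule:

```python
def posterior(priors, likelihoods):
    """Bayes' rule over a finite list of candidate models.

    priors[i]      = Pr(model i)
    likelihoods[i] = Pr(observed results | model i)
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # Pr(observed results), summed over models
    return [j / evidence for j in joint]


# Illustrative prior beliefs (assumed values): model 1, model 2.
priors = [0.3, 0.7]

# Both models predict the observed results equally well:
# the posterior is the same as the prior, so nothing was learned.
uninformative = posterior(priors, [0.2, 0.2])

# The models make different predictions: the posterior shifts
# toward the model that predicted the results better.
informative = posterior(priors, [0.05, 0.6])

print(uninformative)
print(informative)
```

When the likelihoods are equal they cancel in the ratio, which is why the first call simply returns the prior.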

## Experimental approach, *Example*

* Recruit $N$ Facebook users in the United States ("participants").

* Generate fake health-related content claiming avocado oil cures irritable bowel syndrome.

* Expose $N/2$ study participants chosen at random to content showing evidence that avocado oil **does not** cure IBS. (Time $t_\textrm{vax}$)

* At $t_\textrm{vax} + 2$ days ($t_\textrm{exposure}$), expose all participants to the fake content.

* Measure the number of times that the fake content is shared by participants from $t_\textrm{exposure}$ to $t_\textrm{exposure} + 7$ days ($t_\textrm{endpoint}$).

* Compare the number of shares per person in the group that was pre-treated with contrary information versus the group that was not pre-treated.
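
The steps above can be sketched as a small simulation. Everything numeric here is an assumption for illustration only: the per-opportunity sharing probabilities, the cap of five sharing opportunities per participant, and the function name `simulate_trial` are all hypothetical, and real share counts need not follow this simple model.

```python
import random


def simulate_trial(n_participants, p_share_control, p_share_pretreated,
                   max_shares=5, seed=0):
    """Simulate share counts for a randomized pre-treatment experiment."""
    rng = random.Random(seed)

    # Randomly choose N/2 participants to receive the pre-treatment.
    ids = list(range(n_participants))
    rng.shuffle(ids)
    pretreated_ids = set(ids[: n_participants // 2])

    shares = {"pretreated": [], "control": []}
    for pid in range(n_participants):
        if pid in pretreated_ids:
            group, p_share = "pretreated", p_share_pretreated
        else:
            group, p_share = "control", p_share_control
        # Shares during the 7-day window, modeled (as an assumption) as
        # max_shares independent chances to share the fake content.
        count = sum(rng.random() < p_share for _ in range(max_shares))
        shares[group].append(count)
    return shares


def mean(xs):
    return sum(xs) / len(xs)


shares = simulate_trial(1000, p_share_control=0.10, p_share_pretreated=0.06)
print("control shares per person:   ", mean(shares["control"]))
print("pretreated shares per person:", mean(shares["pretreated"]))
```

Setting the two probabilities equal simulates the null model; setting them apart simulates one version of the alternative.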

## Experimental approach

Here we specify **how our data will be generated** (i.e., how we will conduct our experiment).

This in turn determines the **probability of observing our data**:

$$\begin{eqnarray*}
& \mathrm{Pr}(\textrm{pattern of results 1}) \\
& \mathrm{Pr}(\textrm{pattern of results 2}) \\
& \mathrm{Pr}(\textrm{pattern of results 3}) \\
& \cdots
\end{eqnarray*}$$

## Expected results

What are the results that you expect to find **assuming your hypothesis**?

More generally, what we're trying to get at is

$$\begin{eqnarray*}
& \mathrm{Pr}(\textrm{pattern of results 1} \mid \textrm{hypothesized model}) \\
& \mathrm{Pr}(\textrm{pattern of results 2} \mid \textrm{hypothesized model}) \\
& \mathrm{Pr}(\textrm{pattern of results 3} \mid \textrm{hypothesized model}) \\
& \cdots
\end{eqnarray*}$$

The **distribution** of how our results should look under our hypothesized model.

The **expected results** are those results that **make this the largest**—that is, the **most likely results** under the hypothesized model.
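
One way to write this down, using $r$ as an assumed stand-in symbol for a possible pattern of results (it is not part of the notation above):

$$ \textrm{expected results} = \arg\max_{r} \; \mathrm{Pr}(r \mid \textrm{hypothesized model}) $$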

## Expected results, *Example*

We predict that participants in the pre-treated group will have **fewer shares per person** than participants in the group that did not receive the pre-treatment.

## Interpretation of expected results

What would getting these results tell us about our model?

Or, how do we **update** our beliefs when we get the results we expect?

That is, what are the updated, **posterior probabilities**,

$$ \begin{eqnarray*}
\mathrm{Pr}(\textrm{model 1}) & \rightarrow &\mathrm{Pr}(\textrm{model 1} \mid \textrm{expected test results})\\
\mathrm{Pr}(\textrm{model 2}) & \rightarrow &\mathrm{Pr}(\textrm{model 2} \mid \textrm{expected test results})\\
\mathrm{Pr}(\textrm{model 3}) & \rightarrow &\mathrm{Pr}(\textrm{model 3} \mid \textrm{expected test results})\\
&\cdots
\end{eqnarray*}$$

(The model with the highest value is the **most likely model**, **given** we see what we expect to see.)
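
The arrow in each row is an application of Bayes' rule. As a sketch, with $i$ and $j$ as assumed index symbols over the candidate models, and assuming the candidate models are mutually exclusive and cover all the possibilities we are willing to consider:

$$ \mathrm{Pr}(\textrm{model } i \mid \textrm{expected test results}) = \frac{\mathrm{Pr}(\textrm{expected test results} \mid \textrm{model } i) \, \mathrm{Pr}(\textrm{model } i)}{\sum_j \mathrm{Pr}(\textrm{expected test results} \mid \textrm{model } j) \, \mathrm{Pr}(\textrm{model } j)} $$

The numerator combines the prior with how well the model predicts the results; the denominator is the overall probability of seeing those results under any of the models.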

### Example

The **positive predictive value** of the proposed experiment is

$$ \mathrm{Pr}(\textrm{hypothesized model} \mid \textrm{expected test results}) $$

**Given** that we see the data that we anticipate seeing in support of our hypothesis, how likely would our hypothesized model be?

Or, how much support does seeing these results lend to our model?
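
A rough sketch of how this number could be computed in the simplest two-model case (null versus alternative). The false positive rate `alpha`, the `power`, and the prior values below are assumed for illustration and are not taken from any real study:

```python
def positive_predictive_value(prior_alt, power, alpha):
    """Pr(alternative model | "positive" result), two-model case.

    prior_alt = Pr(alternative model) before the experiment
    power     = Pr("positive" result | alternative model)
    alpha     = Pr("positive" result | null model)
    """
    true_positive = power * prior_alt
    false_positive = alpha * (1.0 - prior_alt)
    return true_positive / (true_positive + false_positive)


# Same experiment, two different prior beliefs (assumed values).
ppv_even_prior = positive_predictive_value(prior_alt=0.5, power=0.8, alpha=0.05)
ppv_long_shot = positive_predictive_value(prior_alt=0.1, power=0.8, alpha=0.05)

print(f"PPV with a 50% prior: {ppv_even_prior:.2f}")
print(f"PPV with a 10% prior: {ppv_long_shot:.2f}")
```

The same "positive" result lends less support to a hypothesis that was unlikely to begin with.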

## Interpretation of expected results, *Example*

We predict that participants in the pre-treated group will have **fewer shares per person** than participants in the group that did not receive the pre-treatment.

This would potentially indicate that receiving information **updates users' prior beliefs about the world**, making them less likely to share new information that contradicts that updated belief.

---

# Are there other reasons why we might get these results?

## Randomness

Could we have **seen the expected results**, but under the **null model**, where there is no difference?

That is, what is

$$ \mathrm{Pr}(\textrm{expected test results} \mid \textrm{null hypothesis}) $$

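> Closely related to the $p$-value! (Strictly, the $p$-value is the probability, under the null model, of results **at least as extreme** as those observed.)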
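
One hedged way to estimate such a probability is a permutation test: if the null model holds, the group labels are arbitrary, so we can relabel participants in every possible way and ask how often the difference is at least as large as the one observed. The share counts below are invented for illustration, and `permutation_p_value` is a hypothetical helper, not a method prescribed by the study design:

```python
from itertools import combinations


def mean(xs):
    return sum(xs) / len(xs)


def permutation_p_value(control, pretreated):
    """One-sided exact permutation test on the difference in means."""
    observed = mean(control) - mean(pretreated)
    pooled = control + pretreated
    indices = range(len(pooled))

    n_extreme = 0
    n_total = 0
    # Enumerate every way of relabeling which participants are "control".
    for chosen in combinations(indices, len(control)):
        chosen_set = set(chosen)
        relabeled_control = [pooled[i] for i in chosen]
        relabeled_pretreated = [pooled[i] for i in indices
                                if i not in chosen_set]
        diff = mean(relabeled_control) - mean(relabeled_pretreated)
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total


# Made-up shares per person for five participants in each group.
control_shares = [3, 2, 4, 1, 3]
pretreated_shares = [1, 0, 2, 1, 0]

p_value = permutation_p_value(control_shares, pretreated_shares)
print(f"one-sided permutation p-value: {p_value:.4f}")
```

Exhaustive enumeration is only practical for small groups; for larger samples a random subset of relabelings is commonly used instead.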

&nbsp;

&nbsp;

&nbsp;

&nbsp;

### Is the null hypothesis the only other model that could generate the expected results?

## Poor control

Could we have **seen the expected results**, but under an **"uninteresting" model**, where the differences are caused by something unrelated to the mechanism we're looking at in our hypothesized model?

### Model 2 (Hypothesized model)
Prior exposure to trusted information **has an impact** on subsequent sharing of misinformation.

### Model 3 ("Uninteresting" model)
Some people share more things than other people in general.

## Good control / "Incisive" experiment

Design the experimental approach so that, if we get the results we expect, our **interpretation** strongly favors one particular model.

That is, our **posterior probabilities** (our **updated beliefs**) **given** the expected results strongly favor one particular model.
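
As an illustrative sketch, consider three candidate models (null, hypothesized, and "uninteresting") and compare a poorly controlled design with a well controlled one. All priors and likelihoods below are assumed values chosen to make the point, not estimates:

```python
def posterior(priors, likelihoods):
    """Bayes' rule over named candidate models."""
    joint = {m: priors[m] * likelihoods[m] for m in priors}
    evidence = sum(joint.values())
    return {m: j / evidence for m, j in joint.items()}


# Assumed prior beliefs over the three models.
priors = {"null": 0.3, "hypothesized": 0.4, "uninteresting": 0.3}

# Pr(expected results | model) under two hypothetical designs.
# Poor control: the "uninteresting" model predicts the expected results
# just as well as the hypothesized model.
poorly_controlled = {"null": 0.05, "hypothesized": 0.8, "uninteresting": 0.8}
# Good control (e.g. random assignment): the "uninteresting" model no
# longer predicts a difference between groups.
well_controlled = {"null": 0.05, "hypothesized": 0.8, "uninteresting": 0.05}

post_poor = posterior(priors, poorly_controlled)
post_well = posterior(priors, well_controlled)

print("poor control:", round(post_poor["hypothesized"], 2))
print("good control:", round(post_well["hypothesized"], 2))
```

With good controls, the same expected results concentrate the posterior on the hypothesized model instead of splitting it with the "uninteresting" one.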

---

## Other patterns of results

What are the **other** ways that your experiment could turn out?

## Other patterns of results, *Example*

Alternatively, participants in the pre-treated group may have a number of shares per person that **does not statistically differ** from that of participants in the group that did not receive the pre-treatment.

## Interpretation of other patterns of results

What would getting **these** results tell us about our model?

Or, how do we **update** our beliefs when we get results we **are not** expecting?

That is, what are the updated, **posterior probabilities**,

$$ \begin{eqnarray*}
\mathrm{Pr}(\textrm{model 1}) & \rightarrow &\mathrm{Pr}(\textrm{model 1} \mid \textrm{unexpected test results})\\
\mathrm{Pr}(\textrm{model 2}) & \rightarrow &\mathrm{Pr}(\textrm{model 2} \mid \textrm{unexpected test results})\\
\mathrm{Pr}(\textrm{model 3}) & \rightarrow &\mathrm{Pr}(\textrm{model 3} \mid \textrm{unexpected test results})\\
&\cdots
\end{eqnarray*}$$

(The model with the highest value is the **most likely model**, **given** we see a particular pattern of results.)

### Example

The **negative predictive value** of the proposed experiment is

$$ \mathrm{Pr}(\textrm{null model} \mid \textrm{null results}) $$

What is one reason why the NPV of an experiment might be low?

* Low power.

* Very strong prior. ("I don't believe it.")
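
Both reasons can be seen in a small two-model calculation. This is a sketch under assumed values for the prior, `power`, and false positive rate `alpha`; none of the numbers describe a real experiment:

```python
def negative_predictive_value(prior_alt, power, alpha):
    """Pr(null model | null result), two-model case.

    prior_alt = Pr(alternative model) before the experiment
    power     = Pr("positive" result | alternative model)
    alpha     = Pr("positive" result | null model)
    """
    true_negative = (1.0 - alpha) * (1.0 - prior_alt)
    false_negative = (1.0 - power) * prior_alt
    return true_negative / (true_negative + false_negative)


baseline = negative_predictive_value(prior_alt=0.5, power=0.8, alpha=0.05)
# Low power: a real effect is often missed, so a null result says little.
low_power = negative_predictive_value(prior_alt=0.5, power=0.2, alpha=0.05)
# Very strong prior for the alternative: "I don't believe" the null result.
strong_prior = negative_predictive_value(prior_alt=0.9, power=0.8, alpha=0.05)

print(f"baseline NPV:     {baseline:.2f}")
print(f"low power NPV:    {low_power:.2f}")
print(f"strong prior NPV: {strong_prior:.2f}")
```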

---

# A compact example

*Not perfect by any means ...*

FXR knockout leads to intractable seizures in early adulthood with no changes in gross neuronal morphology, suggesting a synaptic mechanism [Question]. One possibility is that, as has been shown for other astrocyte-derived factors [1], astrocytic FXR is needed for proper formation of a certain class of synapses; in its absence, the resulting imbalance leads to epileptogenesis. To test this, I propose using *Aldh1l1*-conditional knockout of FXR to measure its effect on mEPSC and mIPSC frequency and amplitude at P12, when functional synapses are still being formed, and P24, when more mature synapses have formed [1]. I hypothesize that knockout of astrocytic FXR results in an impairment in the development of inhibitory synapses, as measured by mIPSC characteristics. This may suggest that development of seizures in FXR-knockout mice proceeds via a developmental excitatory-inhibitory imbalance.

## Background

FXR knockout leads to intractable seizures in early adulthood with no changes in gross neuronal morphology, suggesting a synaptic mechanism.

## Question / Hypothesis

One possibility is that, as has been shown for other astrocyte-derived factors [1], astrocytic FXR is needed for proper formation of a certain class of synapses; in its absence, the resulting imbalance leads to epileptogenesis.

> The "question" is implicit in the specification of candidate models. The models being considered are
> 1. FXR signaling is not needed for development of proper synaptic function. (**Null**.)
> 2. FXR signaling is needed for development of proper synaptic function. (**Alternative**.)

## Experimental approach

To test this, I propose using *Aldh1l1*-conditional knockout of FXR to measure its effect on mEPSC and mIPSC frequency and amplitude at P12, when functional synapses are still being formed, and P24, when more mature synapses have formed [1].

## Expected results

I hypothesize that knockout of astrocytic FXR results in an impairment in the development of inhibitory synapses, as measured by mIPSC characteristics.

## Interpretation of expected results

This may suggest that development of seizures in FXR-knockout mice proceeds via a developmental excitatory-inhibitory imbalance.

---

# Summary

When we conduct experiments, we collect **data** that **updates** our beliefs about the world:

$$
\begin{eqnarray*}
& \mathrm{Pr}(\textrm{model}) & \\
& \downarrow & \\
& \mathrm{Pr}(\textrm{model} \mid \textrm{data})
\end{eqnarray*}
$$

When you design an experiment, you specify **what data you will collect** and **how that will update your beliefs**.

## Your experimental designs should include:

### Background / Question

### Hypothesis

These establish your **prior belief**.

### Experimental approach

* With specified **measured outcomes** that will be tested

* With specified **controls** that isolate the models being interrogated

This establishes
* **how your data are generated**
* that your data **are informative about your models**

### Patterns of results and interpretations

* Do **"positive" results** support the **hypothesized model**?
    * The evidence your experiment will provide, and what it means
    * i.e., the **PPV**

### Patterns of results and interpretations

* Do **"positive" results** support **other models**?
    * Could this effect be due to chance? (Under null model.)
    * Could this effect be due to confounds? (Under "uninteresting" model.)
    * Could this effect be irrelevant? (Under unlikely, very different model.)
    * Could this effect mean something completely different? (Under unlikely, very different model.)

### Patterns of results and interpretations

* Do **"negative" results** support the **null model**?
    * All experiments are noisy; does a "negative" result necessarily tell us that our hypothesis was wrong?
    * i.e., what is the **NPV**?

* Are there other patterns of results that would sway us in a different direction?

### Patterns of results and interpretations

These establish how the data you generate **update our beliefs**.

<img src="figures/experimental-design/f1.png"/>