# Multivariate Relationships

## Associations and Causality

Causal relationships have an asymmetry: one variable has an influence on the other, but not vice versa.

$$X \rightarrow Y$$

A relationship must satisfy three criteria in order to be considered a causal one:
1. an association between the variables
2. an appropriate time order
   - the cause must temporally precede the effect
   - there should be a time order between the variables under consideration
3. the elimination of alternative explanations
   - there cannot be any other explanations for why we might be observing the relationship
   - we can never prove causality because an alternative explanation may simply be unobserved
   
> The most demanding criterion is *the elimination of alternative explanations*.

## Controlling for Other Variables

A fundamental component of evaluating whether X causes Y involves ruling out alternative explanations.

We do this by studying whether the relationship between X and Y persists after removing the influence of other variables on this association. In other words, we **control** for the variables that offer an alternative explanation.

**Experimental control**: the ability to keep a variable at a fixed value, e.g. temperature or atmospheric pressure in a chemistry or physics experiment

**Statistical control**: grouping sample results by the value (or value range) of a control variable measured on the respondents, e.g. education level (less than high school degree, high school degree, some college, completed 2-year or 4-year degree, etc.) or income, and then examining the association between X and Y within each group.
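Statistical control by grouping can be sketched in a few lines of Python. The records, the `education` labels, and the helper names `pearson` and `correlation_by_group` are all invented for illustration. The point is only to compare the overall X-Y association with the association inside each level of the control variable.

```python
import math
from collections import defaultdict

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_by_group(records):
    """Correlate x and y separately within each level of the control variable."""
    groups = defaultdict(list)
    for control, x, y in records:
        groups[control].append((x, y))
    return {
        level: pearson([x for x, _ in pairs], [y for _, y in pairs])
        for level, pairs in groups.items()
    }

# Hypothetical (control, x, y) records; the numbers are made up.
records = [
    ("high school", 1, 1), ("high school", 1, 2),
    ("high school", 2, 1), ("high school", 2, 2),
    ("college", 4, 4), ("college", 4, 5),
    ("college", 5, 4), ("college", 5, 5),
]

overall = pearson([x for _, x, _ in records], [y for _, _, y in records])
within = correlation_by_group(records)

print(f"overall r = {overall:.2f}")
for level, r in within.items():
    print(f"r within {level!r} = {r:.2f}")
```

With these invented numbers the pooled correlation is 0.9, while the correlation inside each education group is 0, so the pooled association reflects only the difference between the groups.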

Be wary of **lurking variables**: variables that are not measured in a study but that influence the association of interest.

## Types of Multivariate Relationships

### Spurious Relationships
An association between $X_{1}$ and $Y$ is said to be a **spurious relationship** if both are dependent on a third variable $X_{2}$, and their association disappears when $X_{2}$ is controlled.
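A small simulation can illustrate the definition. The sketch below assumes a made-up data-generating process in which $X_{2}$ drives both $X_{1}$ and $Y$, and it controls for $X_{2}$ with the first-order partial correlation, which is one common way to control for a quantitative variable and is not a method named in these notes. The helper names `pearson` and `partial_corr`, the seed, and the sample size are arbitrary choices.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 5000

# Assumed process: x2 causes both x1 and y; x1 has no effect on y.
x2 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [z + rng.gauss(0, 1) for z in x2]
y = [z + rng.gauss(0, 1) for z in x2]

raw = pearson(x1, y)
controlled = partial_corr(x1, y, x2)

print(f"r(x1, y)                 = {raw:.3f}")
print(f"r(x1, y | x2 controlled) = {controlled:.3f}")
```

Under this setup the raw correlation should come out near 0.5 and the partial correlation near 0, up to sampling noise.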

### Chain Relationships
In **chain relationships**, the main independent variable of interest $X_{1}$ is only a distal cause of the outcome $Y$: $X_{1}$ causes an intervening variable $X_{2}$, which in turn is hypothesized to cause $Y$.
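As a rough illustration, consider a pure chain in which $X_{1}$ affects $Y$ only through $X_{2}$. In that case the $X_{1}$-$Y$ association should vanish once $X_{2}$ is held fixed, just as in a spurious relationship. What differs is the assumed causal order, not the statistical pattern. The counts below are hypothetical and all three variables are coded 0/1; `rate` is a helper name chosen for this sketch.

```python
# Hypothetical counts: counts[(x1, x2)] = (number with Y = 1, group size).
# Built so that x1 shifts x2 (80% vs 20%) and only x2 shifts Y (70% vs 30%).
counts = {
    (1, 1): (56, 80),
    (1, 0): (6, 20),
    (0, 1): (14, 20),
    (0, 0): (24, 80),
}

def rate(x1, x2=None):
    """Proportion with Y = 1 for a given x1, optionally within one level of x2."""
    cells = [
        cell for (a, b), cell in counts.items()
        if a == x1 and (x2 is None or b == x2)
    ]
    successes = sum(s for s, _ in cells)
    total = sum(n for _, n in cells)
    return successes / total

# Effect of x1 on Y, ignoring x2.
overall_diff = rate(1) - rate(0)

# Effect of x1 on Y within each level of the intervening variable x2.
diff_by_x2 = {level: rate(1, level) - rate(0, level) for level in (0, 1)}

print(f"overall difference in P(Y = 1): {overall_diff:.2f}")
for level, diff in diff_by_x2.items():
    print(f"difference within x2 = {level}: {diff:.2f}")
```

Ignoring $X_{2}$, the two $X_{1}$ groups differ by 0.24 in their proportion with $Y = 1$. Within either level of $X_{2}$ the difference is 0.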

### Multiple Causes
This occurs if a certain outcome $Y$ has multiple causes $X_{1}$ and $X_{2}$ simultaneously. These causes may be independent of each other.

For example, if $X_{1}$ = *gender* and $X_{2}$ = *race* both affect delinquency, so that delinquency rates vary by both gender and race, then $X_{1}$ and $X_{2}$ are likely to be independent causes.
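When two causes are independent of each other, controlling for one should leave the effect of the other essentially unchanged. The sketch below uses abstract 0/1 codes and invented counts, not real delinquency data, with equal group sizes so that $X_{1}$ and $X_{2}$ are unassociated by construction.

```python
# Hypothetical counts: cells[(x1, x2)] = (number with Y = 1, group size).
# Equal group sizes make x1 and x2 unassociated in this made-up sample.
cells = {
    (0, 0): (10, 100),
    (1, 0): (30, 100),
    (0, 1): (20, 100),
    (1, 1): (40, 100),
}

def proportion(selected):
    """Pooled proportion with Y = 1 over the selected (successes, size) cells."""
    return sum(s for s, _ in selected) / sum(n for _, n in selected)

def x1_effect(x2_level=None):
    """Difference in P(Y = 1) between x1 = 1 and x1 = 0, optionally at one x2 level."""
    def pick(x1):
        return [
            cell for (a, b), cell in cells.items()
            if a == x1 and (x2_level is None or b == x2_level)
        ]
    return proportion(pick(1)) - proportion(pick(0))

effect_overall = x1_effect()
effect_by_x2 = {level: x1_effect(level) for level in (0, 1)}

print(f"x1 effect ignoring x2: {effect_overall:.2f}")
for level, effect in effect_by_x2.items():
    print(f"x1 effect at x2 = {level}: {effect:.2f}")
```

Here the effect of $X_{1}$ is 0.2 whether or not $X_{2}$ is controlled, while the rates themselves still differ across levels of $X_{2}$, which is the pattern expected for two independent causes.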

### Suppressor Variables
Occasionally, two variables show no association until a third variable is controlled. That third variable is called a **suppressor variable**.
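A toy dataset makes the pattern concrete. In the sketch below the numbers are invented: within each level of the third variable $X_{2}$, $X_{1}$ and $Y$ are positively related, but the level with the higher $X_{1}$ values has the lower $Y$ values, so the two tendencies cancel in the pooled data. The helper name `pearson` is an arbitrary choice.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical (x1, y) pairs, keyed by the level of the third variable x2.
groups = {
    0: [(1, 4), (1, 6), (5, 6), (5, 8)],
    1: [(5, 2), (5, 4), (9, 4), (9, 6)],
}

# Association when x2 is ignored (all pairs pooled).
pooled = [pair for pairs in groups.values() for pair in pairs]
overall = pearson([x for x, _ in pooled], [y for _, y in pooled])

# Association when x2 is controlled (one correlation per level).
within = {
    level: pearson([x for x, _ in pairs], [y for _, y in pairs])
    for level, pairs in groups.items()
}

print(f"pooled r = {overall:.2f}")
for level, r in within.items():
    print(f"r within x2 = {level}: {r:.2f}")
```

The pooled correlation is 0, yet the correlation within each level of $X_{2}$ is about 0.71.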

### Statistical Interaction
**Statistical interaction** exists between $X_{1}$ and $X_{2}$ in their effects on $Y$ when the true effect of one predictor on $Y$ changes as the value of the other predictor changes.

To assess whether a sample shows evidence of interaction, we can compare the effect of $X_{1}$ on $Y$ at different levels of $X_{2}$. When the sample effect is different at each level of $X_{2}$, there is evidence of interaction of $X_{1}$ and $X_{2}$ in their effect on $Y$.
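The comparison described above can be sketched by measuring the effect of $X_{1}$ on $Y$ as a least-squares slope, computed separately at each level of $X_{2}$. Using a slope as the effect measure is an assumption of this sketch, and the data and the helper name `slope` are made up.

```python
def slope(pairs):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx

# Hypothetical (x1, y) pairs, keyed by the level of x2.
data = {
    0: [(0, 20), (10, 25), (20, 30)],
    1: [(0, 30), (10, 45), (20, 60)],
}

# Effect of x1 on y at each level of x2.
slopes = {level: slope(pairs) for level, pairs in data.items()}

for level, b in slopes.items():
    print(f"slope of y on x1 at x2 = {level}: {b:.2f}")
```

With these numbers the slope is 0.5 at one level of $X_{2}$ and 1.5 at the other, which is the kind of difference that suggests interaction. In a real sample the slopes would differ somewhat by chance alone, so a formal test would be needed before drawing a conclusion.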