# Definitions

### Continuous versus categorical outcomes

Measures of association between two or more variables can take many forms (see [here](https://pmc.ncbi.nlm.nih.gov/articles/PMC4560542/) for one of many available guides for deciding between them). Here we emphasize a key factor to consider when deciding which of these measures to apply to your data: the type of your outcome variable. If the outcome is **continuous** (i.e., can in principle take on any value within some range, like a length or spike rate), appropriate tests include forms of [regression](https://github.com/PennNGG/Quantitative-Neuroscience/blob/master/Measures%20of%20Association/Linear%20Regression.ipynb) and [parametric](https://github.com/PennNGG/Quantitative-Neuroscience/blob/master/Measures%20of%20Association/Parametric%20Correlation%20Coefficient.ipynb) or [nonparametric](https://github.com/PennNGG/Quantitative-Neuroscience/blob/master/Measures%20of%20Association/Nonparametric%20Correlation%20Coefficient.ipynb) measures of correlation. If the outcome is **categorical** (i.e., corresponds to distinct categories), appropriate tests include the [chi-squared test]().
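As a rough illustration of the categorical case, here is a minimal sketch of a chi-squared test of independence using only the Python standard library. The 2x2 table of counts is made up for illustration, the function name `chi_squared_statistic` is hypothetical, and no continuity correction is applied. The p-value shortcut via `math.erfc` is valid only for one degree of freedom (i.e., a 2x2 table); in practice a statistics library such as SciPy (`scipy.stats.chi2_contingency`) would typically be used instead.

```python
import math


def chi_squared_statistic(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows are two groups, columns are two outcome categories
observed_counts = [[10, 20], [20, 10]]
chi2, dof = chi_squared_statistic(observed_counts)

# For 1 degree of freedom only: P(chi-squared > c) = erfc(sqrt(c / 2))
p_value = math.erfc(math.sqrt(chi2 / 2.0))

print(f"chi-squared = {chi2:.3f}, dof = {dof}, p = {p_value:.4f}")
```

The statistic compares each observed count with the count expected if group and outcome were independent, so larger values indicate a stronger association between the two categorical variables.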

### Correlation versus regression

Correlation and regression both measure linear relationships between continuous variables, but they differ in how the data are collected: in correlation, you sample both measurement variables randomly from a population (e.g., weight and height), whereas in regression, you choose or fix the values of the independent variable(s) (e.g., amount of ice cream provided as an experimental manipulation). These are subtle but important distinctions. And, of course, just because two things are correlated does not mean that they are causally related. Indeed, there may be another (unknown) variable that is the actual causal mechanism affecting both of the values that you are measuring. One always needs to be careful not to attribute causality to an association or correlation: remember the [frog story](http://www.terrificscience.org/lessonpdfs/Frog_Experiment.pdf)!
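One way to see the distinction numerically is that the correlation coefficient treats the two variables symmetrically, whereas a regression line depends on which variable is treated as the independent one. The sketch below uses only the Python standard library. The data values and the helper names (`pearson_r`, `least_squares_line`) are made up for illustration; SciPy's `scipy.stats.pearsonr` and `scipy.stats.linregress` provide equivalent calculations.

```python
import math


def mean(values):
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def least_squares_line(x, y):
    """Slope and intercept of the least-squares line predicting y from x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical data for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Correlation is symmetric: swapping x and y gives the same value
r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)

# Regression is not symmetric: predicting y from x is a different fit
# than predicting x from y
slope_y_on_x, intercept_y_on_x = least_squares_line(x, y)
slope_x_on_y, intercept_x_on_y = least_squares_line(y, x)

print(f"r(x, y) = {r_xy:.4f}, r(y, x) = {r_yx:.4f}")
print(f"y on x: slope = {slope_y_on_x:.4f}, intercept = {intercept_y_on_x:.4f}")
print(f"x on y: slope = {slope_x_on_y:.4f}, intercept = {intercept_x_on_y:.4f}")
```

The two slopes are related to the correlation coefficient: their product equals $r^2$, so they are reciprocals of each other only when the correlation is perfect.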

Other analyses do not require a linear relationship and instead test simply for a monotonic relationship between two variables, without assuming that the data are normally distributed (e.g., a Spearman correlation).
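To make the monotonic-versus-linear distinction concrete, the sketch below computes a Spearman correlation as the Pearson correlation of the ranks, using only the Python standard library. The exponential example data and the helper names (`average_ranks`, `spearman_rho`) are assumptions chosen for illustration; SciPy's `scipy.stats.spearmanr` is the usual library routine.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of values tied with position i
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y increases monotonically, but not linearly, with x
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [math.exp(value) for value in x]

r_linear = pearson_r(x, y)
rho_monotonic = spearman_rho(x, y)

print(f"Pearson r = {r_linear:.3f}, Spearman rho = {rho_monotonic:.3f}")
```

Because the ranks of `x` and `y` are identical here, the Spearman correlation is 1 even though the Pearson correlation is noticeably below 1.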

Finally, one can also test for non-linear relationships. For example, you might hypothesize that the predicted measurements $Y_{pred}$ depend on the square of the predictor $X$: $Y_{pred}=a+bX^2$.
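A model of this form is non-linear in $X$ but still linear in the parameters $a$ and $b$, so one possible approach is to fit it with ordinary least squares after replacing $X$ with $X^2$. The sketch below assumes noiseless, made-up data generated with $a=1$ and $b=2$, and the function name `fit_squared_model` is hypothetical.

```python
def fit_squared_model(x, y):
    """Least-squares estimates of a and b in y = a + b * x**2."""
    # Transform the predictor so that the model is linear in z = x**2
    z = [value ** 2 for value in x]
    mz = sum(z) / len(z)
    my = sum(y) / len(y)
    szy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
    szz = sum((zi - mz) ** 2 for zi in z)
    b = szy / szz
    a = my - b * mz
    return a, b


# Hypothetical noiseless data generated from y = 1 + 2 * x**2
x = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [1.0 + 2.0 * value ** 2 for value in x]

a_hat, b_hat = fit_squared_model(x, y)
y_pred = [a_hat + b_hat * value ** 2 for value in x]

print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

With real, noisy data the estimates would not match the generating values exactly, and the quality of the fit would need to be assessed separately.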

# Additional Resources


- Differences between correlation and regression are discussed [here](https://www.bmj.com/about-bmj/resources-readers/publications/statistics-square-one/11-correlation-and-regression) and [here](http://www.biostathandbook.com/linearregression.html).

- A reference on [how to choose the appropriate measure of association](https://journals.sagepub.com/doi/pdf/10.1177/8756479308317006) (Khamis 2008).

# Credits

Copyright 2021 by Joshua I. Gold, University of Pennsylvania