# Effect Size

- Statistical hypothesis tests focus on determining whether observed results are likely given certain assumptions, such as the absence of a relationship between variables or no distinction between groups.
- However, these tests do not provide information about the magnitude of the effect when an association or difference is deemed statistically significant.
- This underlines the importance of having standardized methods for calculating and reporting the size of an effect in experimental results.

- Effect size methods encompass a range of statistical tools designed to quantify the magnitude of an effect in experimental findings. These tools are valuable complements to the outcomes of statistical hypothesis tests.

In this tutorial, you will gain insights into the concept of effect size and learn about various effect size measures for assessing the magnitude of results. By the end of this tutorial, you will:

- Understand the significance of calculating and reporting effect size in experimental outcomes.
- Familiarize yourself with effect size measures that quantify associations between variables, such as Pearson’s correlation coefficient.
- Explore effect size measures that gauge differences between groups, such as Cohen’s d.


---


## The Importance of Reporting Effect Size

- As practitioners become well-versed in statistical methods, the focus often shifts towards quantifying the likelihood of a result.
  - This is evident in the emphasis on calculating and presenting statistical hypothesis test results, primarily in terms of p-values and significance levels.

- However, one crucial aspect that frequently goes overlooked in result presentations is the quantification of the actual difference or relationship, commonly referred to as the "effect."
  - It's important to remember that the core objective of an experiment is to measure an effect.

- In fact, the primary outcome of a research investigation should be one or more measures of effect size, rather than just p-values.

  > "The primary product of a research inquiry is one or more measures of effect size, not P values."
  > — Things I have learned (so far), 1990.

- While statistical tests indicate how likely it is that an effect exists, they say nothing about the size of that effect.
  - It's entirely plausible for an experiment's results to be statistically significant but reveal an effect so minuscule that it holds little practical significance.

- Conversely, results can also be statistically non-significant while holding vital importance.

  > "It is possible, and unfortunately quite common, for a result to be statistically significant and trivial. It is also possible for a result to be statistically nonsignificant and important."
  > — Page 4, The Essential Guide to Effect Sizes: Statistical Power, Meta-Analysis, and the Interpretation of Research Results, 2010.

- When the effect size is not reported, readers must either estimate it themselves with ad hoc measures or go without it, and so interpret the results lacking crucial information.
  - Therefore, quantifying the size of the effect is indispensable for a comprehensive interpretation of research outcomes.
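The gap between statistical and practical significance described above can be sketched with simulated data. This is a minimal illustration, not a recommended analysis: the group means, standard deviation, and sample size are hypothetical values chosen so that a very large sample makes a tiny difference statistically detectable. It assumes NumPy and SciPy are installed.

```python
# Sketch: a statistically significant but practically trivial difference
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)

# Hypothetical groups: means differ by 0.2 on a scale with standard deviation 10
group_a = rng.normal(loc=100.2, scale=10.0, size=200_000)
group_b = rng.normal(loc=100.0, scale=10.0, size=200_000)

# Hypothesis test: is the difference in means distinguishable from zero?
t_stat, p_value = ttest_ind(group_a, group_b)

# Effect size: difference in means in units of the pooled standard deviation
# (the simple average of the two variances is valid because the groups are equal in size)
pooled_sd = np.sqrt((group_a.var(ddof=1) + group_b.var(ddof=1)) / 2)
effect = (group_a.mean() - group_b.mean()) / pooled_sd

print('p-value: %.2g' % p_value)
print('standardized difference: %.3f' % effect)
```

With samples this large the p-value falls well below 0.05, yet the standardized difference is only a few hundredths of a standard deviation, which most applications would treat as negligible.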


---


## What is Effect Size?

- Effect size, in the context of statistics, refers to the magnitude of an effect or result as it is expected to occur in a broader population.
  - It is typically estimated using sample data.
  - Effect size methods encompass a range of statistical tools used to quantify these effects.
  - The field of effect size measures is often referred to simply as "effect size" to capture its broad scope.

- Effect size statistical methods can be broadly categorized into two groups:

  - **Association:** These methods quantify the strength of the relationship between variables, such as correlation.

  - **Difference:** These methods measure the magnitude of differences between groups, like the difference between means.

- An effect size can represent the outcome of a treatment compared to other groups (e.g., treated vs. untreated), or it can describe the degree of association between related variables (e.g., the relationship between treatment dosage and health).

- The interpretation of an effect size calculation depends on the specific statistical method used, and the choice of measure is guided by the goals of the analysis. There are three common types of calculated results:

  - **Standardized Result:** The effect size is expressed on a standardized scale, making it interpretable across various applications (e.g., Cohen's d calculation).

  - **Original Units Result:** The effect size is presented in the original units of the variable, aiding in interpretation within the specific domain (e.g., the difference between two sample means).

  - **Unit-Free Result:** The effect size has no units, as with a count or a proportion (e.g., a correlation coefficient).

- Effect size can encompass the raw difference between group means (absolute effect size) or standardized measures that transform the effect into a more easily understandable scale.
  - Absolute effect size is particularly useful when the variables being studied have intrinsic meaning, like the number of hours of sleep.

- It's often advisable to report an effect size using multiple measures to cater to different types of readers of your findings.
  - Sometimes, results are best reported in original units for ease of understanding and in standardized measures for inclusion in future meta-analyses.

- Importantly, the concept of effect size does not replace the results of a statistical hypothesis test; it complements it. Ideally, both the results of the hypothesis test and the effect size calculation should be presented side by side:

  - **Hypothesis Test:** Quantifies the likelihood of observing the data given a specific assumption (the null hypothesis).

  - **Effect Size:** Quantifies the size of the effect, assuming that the effect exists.

- Effect size measures help provide a more complete and nuanced understanding of the outcomes and the practical significance of your research findings.
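The difference between an original-units result and a standardized result can be sketched as follows. The sleep durations are hypothetical, and `standardized_difference` is an illustrative helper (not a library function) that divides the difference in means by the pooled standard deviation. The same data are expressed in hours and in minutes to show that the raw difference depends on the unit while the standardized value does not.

```python
# Sketch: original-units vs. standardized effect size on hypothetical data
import numpy as np

# Hypothetical sleep durations in hours for two small groups
treated_hours = np.array([7.5, 8.0, 6.5, 7.0, 8.5, 7.5])
control_hours = np.array([6.5, 7.0, 6.0, 6.5, 7.5, 7.0])

def standardized_difference(a, b):
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

# Original-units results: the number changes with the unit of measurement
raw_hours = treated_hours.mean() - control_hours.mean()
raw_minutes = (treated_hours * 60).mean() - (control_hours * 60).mean()

# Standardized results: identical regardless of the unit
d_hours = standardized_difference(treated_hours, control_hours)
d_minutes = standardized_difference(treated_hours * 60, control_hours * 60)

print('difference: %.2f hours (%.0f minutes)' % (raw_hours, raw_minutes))
print('standardized: %.3f (hours) vs %.3f (minutes)' % (d_hours, d_minutes))
```

The raw difference (0.75 hours, or 45 minutes) is easy for a domain reader to interpret, while the standardized value is the one that can be compared across studies that measured sleep differently.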

---



# Calculating Effect Size

- Effect size can be calculated in various ways, depending on the nature of your data and research questions.
  - It could be as simple as calculating the mean of a sample or quantifying the absolute difference between two means.
  - More complex statistical methods might also be used.

- In this section, we'll explore some common methods for calculating effect size, both for associations and differences.

- It's important to note that the examples provided here are not exhaustive, as there are numerous methods available for calculating effect sizes, tailored to specific research contexts and objectives.




## Calculate Association Effect Size

- When we want to quantify the relationship between variables, we use what's known as the "r family" of effect size methods.
  - This includes the commonly used Pearson's correlation coefficient, often called Pearson's r.
  - This measure helps us understand the strength and direction of a linear relationship between two real-valued variables.

- Pearson's correlation coefficient provides a unit-free effect size, which can be interpreted as follows:

  - **-1.0:** Perfect negative relationship
  - **-0.7:** Strong negative relationship
  - **-0.5:** Moderate negative relationship
  - **-0.3:** Weak negative relationship
  - **0.0:** No relationship
  - **0.3:** Weak positive relationship
  - **0.5:** Moderate positive relationship
  - **0.7:** Strong positive relationship
  - **1.0:** Perfect positive relationship

- To calculate the Pearson's correlation coefficient in Python, you can use the `pearsonr()` function from SciPy.
  - Here's an example that calculates the effect size for two samples of Gaussian random numbers, where the second sample is the first plus independent Gaussian noise, which produces a strong positive relationship:

```python
# Calculate Pearson's correlation between two variables
from numpy.random import randn
from numpy.random import seed
from scipy.stats import pearsonr

# Seed the random number generator
seed(1)

# Prepare data
data1 = 10 * randn(10000) + 50
data2 = data1 + (10 * randn(10000) + 50)

# Calculate Pearson's correlation
corr, _ = pearsonr(data1, data2)
print('Pearson\'s correlation: %.3f' % corr)
```

- Running this code calculates and prints the Pearson's correlation coefficient, revealing the strong positive relationship between the two datasets.

  - **Output:**
  ```
  Pearson's correlation: 0.712
  ```

- Another commonly used method for calculating association effect size is the r-squared measure (r²), also known as the coefficient of determination.
  - It helps summarize the proportion of variance in one variable that is explained by another.
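As a minimal sketch, r-squared can be obtained by squaring Pearson's r. The data below follow the same construction as the earlier correlation example, but use NumPy's `default_rng` generator, so the exact values will differ from the output shown above. The comparison against `scipy.stats.linregress` is included only as a cross-check.

```python
# Sketch: r-squared as the square of Pearson's r
import numpy as np
from scipy.stats import linregress, pearsonr

rng = np.random.default_rng(1)

# y is x plus independent Gaussian noise with the same spread as x
x = 10 * rng.standard_normal(10_000) + 50
y = x + (10 * rng.standard_normal(10_000) + 50)

# Pearson's r and its square
r, _ = pearsonr(x, y)
r_squared = r ** 2

# Cross-check: for a simple linear fit, rvalue is Pearson's r
fit = linregress(x, y)

print('r: %.3f' % r)
print('r-squared: %.3f' % r_squared)
print('r-squared from linregress: %.3f' % (fit.rvalue ** 2))
```

Because the noise has the same variance as `x`, roughly half of the variance in `y` is explained by `x`, so r-squared should come out close to 0.5.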


---


## Calculate Difference Effect Size

- When we want to measure the difference between groups, we often use the "d family" of effect size methods, with Cohen's d being a widely used measure.
  - Cohen's d helps us quantify the difference between the means of two sets of data that follow a Gaussian distribution.
  - It provides a standardized score that allows us to interpret the effect size.

- Cohen's d effect sizes can be summarized as follows:
  - **Small Effect Size:** d = 0.20
  - **Medium Effect Size:** d = 0.50
  - **Large Effect Size:** d = 0.80

- Cohen's d is calculated with the following formula:

  ```markdown
  d = (μ1 - μ2) / s
  ```

  Where:
  - d: Cohen's d (effect size)
  - μ1: Mean of the first sample
  - μ2: Mean of the second sample
  - s: Pooled standard deviation of both samples

- The pooled standard deviation (s) for two independent samples can be calculated as follows:

    ```markdown
    s = sqrt(((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2))
    ```

    Where:
    - s: Pooled standard deviation
    - n1, n2: Sample sizes of the first and second samples
    - s1, s2: Sample variances (not standard deviations) of the first and second samples

- Here's a Python function for calculating Cohen's d:

```python
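# Function to calculate Cohen's d for independent samples
from math import sqrt
from numpy import mean, var

def cohend(d1, d2):
    n1, n2 = len(d1), len(d2)
    # Sample variances (ddof=1 gives the unbiased estimate)
    s1, s2 = var(d1, ddof=1), var(d2, ddof=1)
    # Pooled standard deviation
    s = sqrt(((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2))
    # Sample means
    u1, u2 = mean(d1), mean(d2)
    # Standardized difference between the means
    return (u1 - u2) / s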
```

- Here's an example of how to use this function to calculate Cohen's d for two samples of random Gaussian variables:

```python
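# Calculate Cohen's d for two Gaussian samples (uses cohend() defined above)
from numpy.random import randn, seed

# Seed the random number generator
seed(1)

# Prepare data: same standard deviation (10), means 5 apart
data1 = 10 * randn(10000) + 60
data2 = 10 * randn(10000) + 55

# Calculate Cohen's d
d = cohend(data1, data2)
print('Cohen\'s d: %.3f' % d)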
```

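- Running this code calculates and prints the Cohen's d effect size. The two samples were drawn with means 5 apart and a standard deviation of 10, so the expected value is 5 / 10 = 0.5: the means differ by half a standard deviation, which is a medium effect size.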

  - **Output:**
    ```
    Cohen's d: 0.500
    ```

- Other popular methods for quantifying difference effect size include the odds ratio and the relative risk (also called the risk ratio).
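For binary outcomes, both measures can be computed directly from a 2x2 table of counts. The sketch below uses hypothetical counts (events and non-events in a treated and a control group) and plain Python arithmetic; the variable names are illustrative only.

```python
# Sketch: relative risk and odds ratio from a hypothetical 2x2 table
treated_events, treated_no_events = 10, 90
control_events, control_no_events = 20, 80

# Risk: proportion of each group that experienced the event
risk_treated = treated_events / (treated_events + treated_no_events)
risk_control = control_events / (control_events + control_no_events)
relative_risk = risk_treated / risk_control

# Odds: events divided by non-events within each group
odds_treated = treated_events / treated_no_events
odds_control = control_events / control_no_events
odds_ratio = odds_treated / odds_control

print('relative risk: %.3f' % relative_risk)  # 0.500
print('odds ratio: %.3f' % odds_ratio)        # 0.444
```

A value of 1.0 for either measure would indicate no difference between the groups. Here the treated group has half the risk of the control group, and the odds ratio is somewhat further from 1.0 than the relative risk.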

---

## Further Reading

### Papers

- **Using Effect Size—or Why the P Value Is Not Enough, 2012**  - [Read Paper](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3444174/)

- **Things I have learned (so far), 1990** - [Read Paper](https://tech.me.holycross.edu/files/2015/03/Cohen_1990.pdf)

### Books

- **The Essential Guide to Effect Sizes: Statistical Power, Meta-Analysis, and the Interpretation of Research Results, 2010** - [Amazon](https://amzn.to/2JDcwSe)

- **Understanding The New Statistics: Effect Sizes, Confidence Intervals, and Meta-Analysis, 2011** - [Amazon](https://amzn.to/2v0wKSI)

- **Statistical Power Analysis for the Behavioral Sciences, 1988** - [Amazon](https://amzn.to/2GNcmtu)

### APIs

- **scipy.stats.pearsonr API** - [Documentation](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.pearsonr.html)

- **numpy.var API** - [Documentation](https://docs.scipy.org/doc/numpy/reference/generated/numpy.var.html)

- **numpy.mean API** - [Documentation](https://docs.scipy.org/doc/numpy/reference/generated/numpy.mean.html)

### Articles

- **Effect size on Wikipedia** - [Read on Wikipedia](https://en.wikipedia.org/wiki/Effect_size)


---


