
In [1]:
%%html
<style>

* {
    font-size: 18px;
    line-height: 1.5;
    font-family: 'Arial', sans-serif;
    align-items: center;
    justify-content: center;
    max-width: 1000px;

}

img{
    display: flex;
    margin-left: auto;
    margin-right: auto;
    width: 700px;
    height: auto;
    text-align: center;
    border-radius: 15px;
}
    
h1,
h2 {
  font-family: Impact, Charcoal, sans-serif;
  font-weight: bold;
  text-shadow: 2px 2px 4px #000000;
}

h1 {
  font-size: 52px;
  margin-bottom: 40px;
  color: #10929e;
  text-decoration: underline;
    text-align:center;
}

h2 {
  font-size: 48px;
  margin-bottom: 32px;
  color: #76b4be;
  text-transform: uppercase;
}

/* Block Quotes */
blockquote {
  font-family: Georgia, serif;
  font-size: 16px;
  color: #19085c;
  border-left: 8px solid #effffc;
  background-color: #eafcff;
  padding: 20px;
}
    
</style>

# Effect Size and Its Importance

Effect size is a statistical measure indicating the practical significance of research outcomes, helping to understand how meaningful the relationship between variables or the difference between groups truly is.

# Why Effect Size Matters

- **Statistical vs. Practical Significance**: Effect size complements p-values to provide a fuller picture of research findings, showing the real-world impact beyond mere statistical significance.
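- **Independence from Sample Size**: A p-value shrinks as the sample grows, even when the underlying effect stays the same. An effect size estimate does not systematically grow or shrink with sample size (only its precision improves), which makes it a more reliable indicator of practical significance.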
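
To make the sample-size point concrete, the sketch below holds Cohen's d fixed and shows how the test statistic and an approximate p-value change as the groups get larger. It assumes two equal-sized groups and treats the t statistic as standard normal, so the p-values are illustrative rather than exact. `approx_p_value` is a hypothetical helper name introduced here.

```python
import math

def approx_p_value(d, n_per_group):
    """Approximate two-sided p-value for a two-sample comparison with equal group sizes.

    Assumption: the t statistic is treated as standard normal, which is
    only reasonable for moderately large samples.
    """
    t = d * math.sqrt(n_per_group / 2)      # t = d / sqrt(1/n + 1/n)
    p = math.erfc(abs(t) / math.sqrt(2))    # two-sided normal tail probability
    return t, p

d = 0.2  # a "small" effect by Cohen's benchmarks
for n in (20, 200, 2000):
    t, p = approx_p_value(d, n)
    print(f"n per group = {n:4d}  d = {d}  t = {t:.2f}  approx. p = {p:.4g}")
```

The effect size is identical in all three rows, yet the result moves from clearly non-significant to highly significant purely because of the sample size.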

# Calculating Effect Size

- **Common Measures**: The most frequently used effect size measures include Cohen’s d and Pearson’s r.
  - *Cohen’s d*: Expresses the mean difference between two groups in standard deviation units.
  - *Pearson’s r*: Measures the strength of the relationship between two variables on a standardized scale.
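
For reference, the standard formulas behind these two measures are written out below. The pooled standard deviation form of Cohen's d is the one that matches the `cohens_d` function later in this notebook. The symbols (sample means, standard deviations $s_1, s_2$, and group sizes $n_1, n_2$) are notation introduced here.

```latex
d = \frac{\bar{x}_1 - \bar{x}_2}{s_p},
\qquad
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}}

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\;\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```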

# Interpreting Effect Size

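- **Cohen’s Criteria**: Effect sizes are conventionally categorized as small, medium, or large. The benchmarks differ by measure: for Cohen’s d they are 0.2, 0.5, and 0.8, while for Pearson’s r they are 0.1, 0.3, and 0.5. These are rules of thumb, and what counts as a meaningful effect depends on the field.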
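
A minimal sketch of how such benchmarks could be applied in code. The thresholds are Cohen's conventional values (0.2, 0.5, 0.8 for d and 0.1, 0.3, 0.5 for r). The function name `interpret_effect_size` and the "negligible" label for values below the small benchmark are assumptions made for illustration.

```python
def interpret_effect_size(value, measure="d"):
    """Label an effect size using Cohen's conventional benchmarks.

    measure="d" uses 0.2 / 0.5 / 0.8; measure="r" uses 0.1 / 0.3 / 0.5.
    The sign is ignored because it only encodes direction.
    """
    thresholds = {
        "d": (0.2, 0.5, 0.8),
        "r": (0.1, 0.3, 0.5),
    }
    if measure not in thresholds:
        raise ValueError("measure must be 'd' or 'r'")
    small, medium, large = thresholds[measure]
    size = abs(value)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "negligible"  # assumed label for values below the 'small' benchmark

print(interpret_effect_size(-0.43, "d"))  # small
print(interpret_effect_size(0.98, "r"))   # large
```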

# Application of Effect Size

- **Before the Study**: Conducting a power analysis using an expected effect size can inform the necessary sample size for the study.
- **After the Study**: Reporting effect sizes in research papers aids in assessing the practical significance of findings and facilitates meta-analyses.
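
As a rough illustration of the "before the study" step, the sketch below estimates the sample size per group needed to detect an expected Cohen's d in a two-sided comparison of two means. It assumes the common normal-approximation formula n = 2 * ((z_(1-alpha/2) + z_power) / d)^2, which slightly underestimates what an exact t-test calculation would give. `sample_size_per_group` is a hypothetical name, and the defaults (alpha = 0.05, power = 0.80) are conventional choices rather than requirements.

```python
import math
from statistics import NormalDist

def sample_size_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Assumption: normal approximation to the test statistic.
    """
    if d == 0:
        raise ValueError("expected effect size must be non-zero")
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the two-sided test
    z_power = z.inv_cdf(power)            # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

for d in (0.2, 0.5, 0.8):
    print(f"d = {d}: about {sample_size_per_group(d)} participants per group")
```

Smaller expected effects require much larger samples, which is why the expected effect size has to be chosen before data collection.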


# Coefficient of Determination (R²)

## What is R²?
The coefficient of determination, or R², is a statistic that measures how well a statistical model predicts an outcome. For a linear regression fitted by least squares with an intercept, it ranges from 0 to 1.

## Interpretation
- `0`: The model does not predict the outcome at all.
- `Between 0 and 1`: The model has some predictive ability, with higher values indicating better prediction.
- `1`: The model predicts the outcome perfectly.

## Calculation
There are two main methods to calculate R²:
- **Using the correlation coefficient (r)**, for simple linear regression with one predictor:
  - `R² = r²`
- **Using regression outputs**:
  - `R² = 1 - (RSS/TSS)`
    - `RSS`: Sum of squared residuals
    - `TSS`: Total sum of squares
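
The two methods can be checked against each other. The sketch below fits a least-squares line by hand and computes R² both ways; for a simple linear regression with an intercept the two results should agree up to floating-point error. The function name and the small demo data set are made up for illustration.

```python
def r_squared_two_ways(x, y):
    """Return (r**2, 1 - RSS/TSS) for a least-squares line fitted to x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)  # total sum of squares (TSS)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    # Method 1: square of Pearson's r
    r = sxy / (sxx * syy) ** 0.5

    # Method 2: fit y = intercept + slope * x, then compare RSS with TSS
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    return r ** 2, 1 - rss / syy

# Hypothetical demo data: y is roughly 2 * x
x_demo = [1, 2, 3, 4, 5]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]
from_r, from_rss = r_squared_two_ways(x_demo, y_demo)
print(f"r squared: {from_r:.4f}, 1 - RSS/TSS: {from_rss:.4f}")
```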

## Importance of R²
- Indicates how much of the variability in the dependent variable can be explained by the model.
- Gives a rough indication of how well new samples might be predicted, although R² computed on the data used to fit the model tends to be optimistic.
- Provides a scale for comparing the explanatory power of models.

## Effect Size
R² can also be viewed as an effect size:
- `Small (0.01)`, `Medium (0.09)`, `Large (0.25)` according to Jacob Cohen's benchmarks.

## Reporting in Research
In APA style:
- Use `r²` for models with one predictor and `R²` for multiple.
- Italicize `r²` and `R²` but not the number `2`.
- Exclude the leading zero and provide two digits after the decimal point.
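
These reporting rules can be sketched as a small formatting helper. Plain text cannot show italics, so the hypothetical `format_apa_r_squared` function below only handles the symbol choice, the dropped leading zero, and the two decimal places.

```python
def format_apa_r_squared(value, predictors=1):
    """Format a coefficient of determination as plain text in APA-like style."""
    symbol = "r²" if predictors == 1 else "R²"  # lowercase for a single predictor
    text = f"{value:.2f}"                       # two digits after the decimal point
    if text.startswith("0."):
        text = text[1:]                         # ".95" instead of "0.95"
    return f"{symbol} = {text}"

print(format_apa_r_squared(0.9519))               # r² = .95
print(format_apa_r_squared(0.25, predictors=3))   # R² = .25
```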

Remember, R² does not imply causation.

---


# Python Implementation

## Cohen's d

In [2]:
import numpy as np

def cohens_d(group1, group2):
    """Calculate Cohen's d for measuring effect size between two groups."""

    mean1, mean2 = np.mean(group1), np.mean(group2)
    sd1, sd2 = np.std(group1, ddof=1), np.std(group2, ddof=1)
    
    # Pooled standard deviation
    n1, n2 = len(group1), len(group2)
    pooled_sd = np.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    
    # Calculate Cohen's d
    d = (mean1 - mean2) / pooled_sd
    return d


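# No random seed is set, so the printed value changes from run to run.
group1 = np.random.normal(100, 15, 30)  # mean 100, SD 15, n = 30
group2 = np.random.normal(105, 20, 30)  # mean 105, SD 20, n = 30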

d = cohens_d(group1, group2)
print(f"Cohen's d: {d}")


Cohen's d: -0.4307847518678069


## Pearson's r

In [3]:
def pearsons_r(x, y):
    """Calculate Pearson's r correlation coefficient between two variables."""
    correlation_matrix = np.corrcoef(x, y)
    r = correlation_matrix[0, 1]
    return r


x = np.random.normal(10, 5, 100)
y = 2 * x + np.random.normal(0, 2, 100)


r = pearsons_r(x, y)
print(f"Pearson's r: {r}")


Pearson's r: 0.9756527061206507


## R²

In [4]:
def calculate_r_squared(x, y):
    """Calculate the coefficient of determination, R², using Pearson's r."""
    r = np.corrcoef(x, y)[0, 1]
    r_squared = r ** 2
    return r_squared


r_squared = calculate_r_squared(x, y)
print(f"R²: {r_squared}")


R²: 0.9518982029605488


**Cohen's d (-0.4308):** The negative sign means that group 1 has a lower sample mean than group 2. An absolute value of about 0.43 falls between Cohen's benchmarks for a small (0.2) and a medium (0.5) effect, so the practical difference between the two groups is small to moderate. Because the data are random draws with no fixed seed, the exact value changes from run to run.

**Pearson's r (0.9757):** A Pearson's r value close to 1 indicates a very strong positive linear correlation between x and y: as x increases, y tends to increase proportionally. This is expected here, since y was generated as 2x plus a small amount of noise.

**R² (0.9519):** The R² value tells us that approximately 95.19% of the variance in y can be explained by x. This high value indicates that a linear model fits the data very well, meaning x is a strong predictor of y in this context.

# Resources

- [Overview](https://www.scribbr.com/statistics/effect-size/)
- [Coefficient of determination](https://www.scribbr.com/statistics/coefficient-of-determination/)