# Lesson: Correlation Analysis in Statistical Studies

## Objectives
By the end of this lesson, participants will:
- Understand the concept of correlation and its significance.
- Learn the types of correlation coefficients and their applications.
- Perform correlation analysis in R using built-in functions.
- Interpret correlation results and visualize them.

---

## 1. What is Correlation?
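Correlation measures the strength and direction of the association between two variables. The most common measure, the Pearson coefficient, captures linear association; rank-based measures such as Spearman and Kendall capture monotonic association.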

### Key Characteristics
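- **Direction**: The sign of the coefficient indicates whether the relationship is positive or negative.
- **Strength**: The coefficient ranges from -1 to +1, and its absolute value indicates how strong the relationship is:
  - **+1**: Perfect positive correlation.
  - **-1**: Perfect negative correlation.
  - **0**: No linear correlation (a nonlinear relationship may still exist).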

### Types of Correlation
1. **Positive Correlation**: As one variable increases, the other increases.
2. **Negative Correlation**: As one variable increases, the other decreases.
3. **No Correlation**: Changes in one variable are not associated with a consistent increase or decrease in the other.
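
The three cases above can be illustrated with small constructed vectors. This is a minimal sketch using made-up data (the names `x`, `x_sym`, and the `r_*` variables are illustrative, not part of the lesson's dataset). The last case shows that a coefficient near zero does not rule out a nonlinear relationship.

```r
# Illustrative vectors, constructed so the outcome is known in advance
x <- 1:10

r_positive <- cor(x, 2 * x + 1)   # y rises exactly linearly with x
r_negative <- cor(x, -3 * x)      # y falls exactly linearly with x

# A symmetric parabola: y depends entirely on x, yet the linear
# correlation is (numerically) zero because the trend reverses at 0
x_sym <- -5:5
r_none <- cor(x_sym, x_sym^2)

print(c(positive = r_positive, negative = r_negative, none = r_none))
```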

---

## 2. Correlation Coefficients

### Pearson Correlation
- Measures the linear relationship between two continuous variables.
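- Computing the coefficient requires no distributional assumption, but the standard significance test (as in `cor.test()`) assumes the variables are approximately bivariate normal. The coefficient is also sensitive to outliers.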
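
To make the definition concrete, the Pearson coefficient can be computed by hand as the covariance of the two variables divided by the product of their standard deviations. The sketch below uses the built-in `mtcars` data and compares the result with `cor()`; the variable names `r_manual` and `r_builtin` are illustrative.

```r
# Pearson's r = cov(x, y) / (sd(x) * sd(y))
x <- mtcars$wt
y <- mtcars$mpg

r_manual  <- cov(x, y) / (sd(x) * sd(y))
r_builtin <- cor(x, y, method = "pearson")

# The two values should agree up to floating-point error
print(c(manual = r_manual, builtin = r_builtin))
all.equal(r_manual, r_builtin)
```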

### Spearman Correlation
- Measures the monotonic relationship between two variables.
- Does not assume normality; uses ranks instead of raw values.
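
One way to see what "uses ranks" means: the Spearman coefficient is the Pearson coefficient applied to the ranks of the data. A minimal sketch with `mtcars` (the names `rho_from_ranks` and `rho_builtin` are illustrative):

```r
x <- mtcars$wt
y <- mtcars$mpg

# rank() assigns average ranks to tied values, which matches what
# cor(..., method = "spearman") does internally
rho_from_ranks <- cor(rank(x), rank(y), method = "pearson")
rho_builtin    <- cor(x, y, method = "spearman")

all.equal(rho_from_ranks, rho_builtin)
```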

### Kendall Tau Correlation
- Measures the strength of association based on concordant and discordant pairs.
- Suitable for small datasets and ordinal data.
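
The idea of concordant and discordant pairs can be sketched by counting them directly. The example below uses a small made-up dataset without ties (with ties, `cor()` applies a tie-corrected version of the formula, so this simple count would no longer match). A pair of observations is concordant when `x` and `y` move in the same direction between them, and discordant when they move in opposite directions.

```r
# Illustrative data with no tied values
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)

# All unordered pairs of observation indices (one pair per column)
pair_idx <- combn(length(x), 2)

# +1 for a concordant pair, -1 for a discordant pair
agreement <- sign(x[pair_idx[1, ]] - x[pair_idx[2, ]]) *
  sign(y[pair_idx[1, ]] - y[pair_idx[2, ]])

n_concordant <- sum(agreement > 0)
n_discordant <- sum(agreement < 0)

# tau = (concordant - discordant) / total number of pairs
tau_manual  <- (n_concordant - n_discordant) / ncol(pair_idx)
tau_builtin <- cor(x, y, method = "kendall")

print(c(manual = tau_manual, builtin = tau_builtin))
```

Here 8 of the 10 pairs are concordant and 2 are discordant, giving a tau of 0.6.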

---

## 3. Performing Correlation Analysis in R

### Example Dataset
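We will use the built-in `mtcars` dataset, which records fuel consumption and ten design and performance attributes for 32 cars. The examples focus on `mpg` (miles per gallon), `wt` (weight in 1000 lbs), `hp` (gross horsepower), and `disp` (displacement in cubic inches).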

```r
# Load Dataset
data(mtcars)
head(mtcars)
```

### 3.1 Pearson Correlation
```r
# Pearson Correlation
cor(mtcars$mpg, mtcars$wt, method = "pearson")
```

### 3.2 Spearman Correlation
```r
# Spearman Correlation
cor(mtcars$mpg, mtcars$wt, method = "spearman")
```

### 3.3 Kendall Tau Correlation
```r
# Kendall Tau Correlation
cor(mtcars$mpg, mtcars$wt, method = "kendall")
```

### Correlation Matrix
A correlation matrix shows pairwise correlations between multiple variables.

```r
# Correlation Matrix
cor_matrix <- cor(mtcars[, c("mpg", "wt", "hp", "disp")], method = "pearson")
print(cor_matrix)
```
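
Because a correlation matrix is symmetric with ones on the diagonal, it can be easier to read as a list of unique variable pairs sorted by absolute correlation. The following is one possible sketch using base R only; the data frame name `cor_pairs` and its column names are illustrative choices.

```r
cor_matrix <- cor(mtcars[, c("mpg", "wt", "hp", "disp")], method = "pearson")

# Row/column positions of the cells above the diagonal (each pair once)
idx <- which(upper.tri(cor_matrix), arr.ind = TRUE)

cor_pairs <- data.frame(
  var1 = rownames(cor_matrix)[idx[, "row"]],
  var2 = colnames(cor_matrix)[idx[, "col"]],
  r    = cor_matrix[idx]
)

# Strongest relationships (by absolute value) first
cor_pairs <- cor_pairs[order(-abs(cor_pairs$r)), ]
print(cor_pairs, row.names = FALSE)
```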

---

## 4. Visualizing Correlation

### Scatter Plot
Scatter plots help visualize relationships between two variables.

```r
# Scatter Plot
plot(mtcars$wt, mtcars$mpg,
     main = "MPG vs Weight",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     pch = 19)
```
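
A scatter plot is often easier to interpret with a fitted line and the coefficient shown on the plot. The sketch below is one way to do this in base R; the label format and the names `r_value` and `r_label` are illustrative choices.

```r
r_value <- cor(mtcars$wt, mtcars$mpg, method = "pearson")
r_label <- sprintf("r = %.2f", r_value)

plot(mtcars$wt, mtcars$mpg,
     main = "MPG vs Weight",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     pch = 19)

# Least-squares line to show the linear trend
abline(lm(mpg ~ wt, data = mtcars), col = "red", lwd = 2)

# Place the coefficient in the corner, without a legend box
legend("topright", legend = r_label, bty = "n")
```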

### Correlation Heatmap
Heatmaps provide an intuitive way to visualize a correlation matrix.

```r
# Install (if needed) and load ggplot2 and reshape2
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2")
if (!requireNamespace("reshape2", quietly = TRUE)) install.packages("reshape2")
library(ggplot2)
library(reshape2)

# Prepare Data for Heatmap
cor_data <- melt(cor_matrix)

# Heatmap
ggplot(cor_data, aes(Var1, Var2, fill = value)) +
  geom_tile() +
  # Fix the scale to [-1, 1] so colors stay comparable across matrices
  scale_fill_gradient2(low = "blue", high = "red", mid = "white",
                       midpoint = 0, limits = c(-1, 1)) +
  labs(title = "Correlation Heatmap", x = "", y = "") +
  theme_minimal()
```
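
If `reshape2` is not available, the same long-format data can be produced with base R. The sketch below is an alternative, not a replacement for the approach above: `as.data.frame(as.table(...))` yields the columns `Var1`, `Var2`, and `Freq` (the last holding the correlation). It also prints each coefficient inside its tile, and the plotting step is skipped if `ggplot2` is not installed.

```r
cor_matrix <- cor(mtcars[, c("mpg", "wt", "hp", "disp")], method = "pearson")

# Base R reshape: one row per (Var1, Var2) cell, value stored in Freq
cor_long <- as.data.frame(as.table(cor_matrix))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(cor_long, aes(Var1, Var2, fill = Freq)) +
    geom_tile() +
    geom_text(aes(label = sprintf("%.2f", Freq))) +  # value in each tile
    scale_fill_gradient2(low = "blue", high = "red", mid = "white",
                         midpoint = 0, limits = c(-1, 1)) +
    labs(title = "Correlation Heatmap", x = "", y = "", fill = "r") +
    theme_minimal()
  print(p)
}
```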

---

## 5. Interpretation of Results
1. **Strength**:
   - Close to +1 or -1: Strong correlation.
   - Close to 0: Weak or no correlation.
2. **Direction**:
   - Positive value: Positive correlation.
   - Negative value: Negative correlation.
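3. **Statistical Significance**:
   - Use `cor.test()` to obtain a p-value and, for Pearson, a confidence interval. A small p-value indicates that the data are inconsistent with zero correlation; it does not by itself indicate a strong relationship.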

```r
# Correlation Test
cor_test <- cor.test(mtcars$mpg, mtcars$wt, method = "pearson")
print(cor_test)
```
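
The object returned by `cor.test()` is a list, so individual results can be extracted for reporting. A minimal sketch follows (the names `r_hat`, `p_val`, `ci`, and `spearman_test` are illustrative). For rank-based methods on data with tied values, such as `mtcars`, passing `exact = FALSE` requests the approximate p-value directly and avoids a warning about ties.

```r
cor_test <- cor.test(mtcars$mpg, mtcars$wt, method = "pearson")

r_hat <- unname(cor_test$estimate)    # the correlation coefficient
p_val <- cor_test$p.value             # p-value for H0: correlation = 0
ci    <- as.numeric(cor_test$conf.int)  # 95% confidence interval (Pearson only)

cat(sprintf("r = %.3f, 95%% CI [%.3f, %.3f], p = %.2e\n",
            r_hat, ci[1], ci[2], p_val))

# Rank-based test; exact = FALSE because the data contain ties
spearman_test <- cor.test(mtcars$mpg, mtcars$wt,
                          method = "spearman", exact = FALSE)
print(spearman_test$estimate)
```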

---

## 6. Exercise

### Task
1. Load a dataset of your choice.
2. Perform the following:
   - Calculate Pearson, Spearman, and Kendall correlations.
   - Generate a correlation matrix for at least three variables.
   - Create a scatter plot and annotate it with the correlation coefficient.
   - Create a heatmap of the correlation matrix.
3. Write a short interpretation of your findings.

---

## Summary
In this lesson, we:
- Defined correlation and explored its types.
- Learned to calculate Pearson, Spearman, and Kendall correlation coefficients.
- Visualized relationships using scatter plots and heatmaps.
- Interpreted the strength and significance of correlations.

