# Correlation and Covariance

**Covariance** and **correlation** are statistical measures that describe how two variables vary together, specifically the direction and strength of their linear relationship.

## Covariance
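Covariance measures the direction of the linear relationship between two variables. For $n$ paired observations $(x_i, y_i)$ with sample means $\bar{x}$ and $\bar{y}$, the sample covariance is:
$$Cov(X, Y) = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n-1}$$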

*   **Positive Covariance:** As $X$ increases, $Y$ tends to increase.
*   **Negative Covariance:** As $X$ increases, $Y$ tends to decrease.
*   **Zero Covariance:** No linear relationship.

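**Problem:** Covariance is expressed in the product of the units of $X$ and $Y$, so its magnitude changes whenever either variable is rescaled. This makes the raw value hard to interpret and impossible to compare across datasets.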
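As a minimal sketch, the covariance formula above can be written out directly and checked against `np.cov`, which also uses the $n-1$ denominator by default. The helper name `sample_covariance` and the `hours`/`scores` data are made up for illustration.

```python
import numpy as np

def sample_covariance(x, y):
    """Sample covariance with the n-1 denominator (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.size < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    # Sum of products of deviations from the means, divided by n-1
    return np.sum((x - x.mean()) * (y - y.mean())) / (x.size - 1)

# Hypothetical data: hours studied and exam scores
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = np.array([52.0, 55.0, 61.0, 70.0, 72.0])

cov_manual = sample_covariance(hours, scores)
cov_numpy = np.cov(hours, scores)[0, 1]  # off-diagonal entry of the 2x2 covariance matrix

print(f"Manual: {cov_manual:.2f}, NumPy: {cov_numpy:.2f}")
```

The two values should agree; the sign is positive because higher `hours` go with higher `scores` in this toy data.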

## Correlation (Pearson's Correlation Coefficient)
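Correlation is the standardized version of covariance: dividing by both standard deviations removes the units and bounds the result between -1 and +1.
$$r = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$$
Here $\sigma_X$ and $\sigma_Y$ are the standard deviations of $X$ and $Y$, computed with the same $n-1$ denominator as the covariance.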

*   **$r = +1$:** Perfect positive linear relationship.
*   **$r = -1$:** Perfect negative linear relationship.
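*   **$r = 0$:** No linear relationship, although a nonlinear one may still exist (for example, $y = x^2$ with $x$ symmetric around zero gives $r = 0$).
*   **$0 < |r| < 0.3$:** Weak correlation (this and the following cutoffs are common rules of thumb; conventions vary by field).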
*   **$0.3 \le |r| < 0.7$:** Moderate correlation.
*   **$|r| \ge 0.7$:** Strong correlation.
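The definition and the interpretation bands above could be sketched as follows. The function names `pearson_r` and `describe_strength` are hypothetical helpers, the data is invented, and the cutoffs are simply the ones listed in this section rather than a universal standard.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's r as covariance divided by the product of standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov_xy = np.cov(x, y)[0, 1]  # n-1 denominator by default
    # ddof=1 so the standard deviations use the same n-1 denominator
    return cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

def describe_strength(r):
    """Label |r| with the rule-of-thumb bands from this section."""
    magnitude = abs(r)
    if magnitude == 0:
        return "none"
    if magnitude < 0.3:
        return "weak"
    if magnitude < 0.7:
        return "moderate"
    return "strong"

# Hypothetical, nearly linear data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
r = pearson_r(x, y)
print(f"r = {r:.3f} ({describe_strength(r)})")

# A perfect but nonlinear relationship: y = x^2 on a symmetric range
x_sym = np.arange(-5, 6, dtype=float)
r_sym = pearson_r(x_sym, x_sym ** 2)
print(f"r for y = x^2: {r_sym:.3f}")
```

The second example illustrates why $r = 0$ only rules out a linear relationship: `x_sym ** 2` is fully determined by `x_sym`, yet the positive and negative halves cancel in the covariance.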

## Key Difference
*   **Covariance:** Direction of relationship (unit-dependent).
*   **Correlation:** Direction **and** strength (unit-free, standardized).
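One way to see this difference is to change the unit of one variable and recompute both measures. The sketch below uses invented height and weight values; only the unit of height changes between the two runs.

```python
import numpy as np

# Hypothetical measurements: height in metres, weight in kilograms
height_m = np.array([1.60, 1.65, 1.70, 1.75, 1.80, 1.85])
weight_kg = np.array([55.0, 61.0, 66.0, 72.0, 75.0, 83.0])

# Same data, with height expressed in centimetres instead
height_cm = height_m * 100

cov_m = np.cov(height_m, weight_kg)[0, 1]
cov_cm = np.cov(height_cm, weight_kg)[0, 1]
corr_m = np.corrcoef(height_m, weight_kg)[0, 1]
corr_cm = np.corrcoef(height_cm, weight_kg)[0, 1]

print(f"Covariance:  {cov_m:.3f} (m) vs {cov_cm:.3f} (cm)")
print(f"Correlation: {corr_m:.3f} (m) vs {corr_cm:.3f} (cm)")
```

The covariance grows by the same factor of 100 as the unit change, while the correlation stays the same, because the rescaling cancels between the numerator and the standard deviation in the denominator.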

```python
import numpy as np
import matplotlib.pyplot as plt

# Generate sample data: 50 evenly spaced x values with Gaussian noise added to y
np.random.seed(42)  # fixed seed so the results are reproducible
x = np.arange(0, 50)
y_positive = x + np.random.normal(0, 5, 50)  # Positive correlation
y_negative = -x + np.random.normal(0, 5, 50)  # Negative correlation
y_no_corr = np.random.normal(25, 10, 50)  # No correlation (independent of x)

# Calculate covariance and correlation.
# np.cov and np.corrcoef return 2x2 matrices; element [0, 1] is the x-y value.
cov_pos = np.cov(x, y_positive)[0, 1]
corr_pos = np.corrcoef(x, y_positive)[0, 1]

cov_neg = np.cov(x, y_negative)[0, 1]
corr_neg = np.corrcoef(x, y_negative)[0, 1]

cov_no = np.cov(x, y_no_corr)[0, 1]
corr_no = np.corrcoef(x, y_no_corr)[0, 1]

# Visualization: one scatter plot per case, side by side
plt.figure(figsize=(15, 5))

plt.subplot(1, 3, 1)
plt.scatter(x, y_positive, alpha=0.6)
plt.title(f"Positive Correlation\nCorr={corr_pos:.2f}, Cov={cov_pos:.2f}")
plt.xlabel("X")
plt.ylabel("Y")
plt.grid(True, alpha=0.3)

plt.subplot(1, 3, 2)
plt.scatter(x, y_negative, alpha=0.6, color='red')
plt.title(f"Negative Correlation\nCorr={corr_neg:.2f}, Cov={cov_neg:.2f}")
plt.xlabel("X")
plt.ylabel("Y")
plt.grid(True, alpha=0.3)

plt.subplot(1, 3, 3)
plt.scatter(x, y_no_corr, alpha=0.6, color='green')
plt.title(f"No Correlation\nCorr={corr_no:.2f}, Cov={cov_no:.2f}")
plt.xlabel("X")
plt.ylabel("Y")
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()
```