<h1 align="center">Covariance and Correlation</h1>

---

### **Definition**
Covariance and correlation are **statistical measures** that describe the **relationship between two variables**.  
They help determine **how changes in one variable are associated with changes in another**.

---

### **Covariance**

**Definition:**  
Covariance measures **how much two random variables change together**.  

- If both variables **increase or decrease together**, covariance is **positive**.  
- If one increases while the other decreases, covariance is **negative**.  
- Covariance **does not have a fixed range** — its value depends on the **scale** of the data.

---

### **Formula**

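\[
\text{Cov}(X, Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}
\]

where \( \bar{x} \) and \( \bar{y} \) are the sample means and \( n \) is the number of paired observations. Dividing by \( n - 1 \) gives the **sample** covariance.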
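As a minimal sketch, the formula can be computed directly in plain Python. The helper name `sample_covariance` is hypothetical and introduced only for illustration. The data are the study-hours values used elsewhere in this document.

```python
def sample_covariance(xs, ys):
    """Sample covariance with the n - 1 denominator (hypothetical helper)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum the products of paired deviations from the two means
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n - 1)


hours = [2, 4, 6, 8, 10]
scores = [40, 60, 70, 90, 85]
cov_xy = sample_covariance(hours, scores)
print(cov_xy)  # 60.0
```

Each term in the sum is positive when both values sit on the same side of their means, which is why data that rise together produce a positive total.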

---

### **Example**

| X (Hours Studied) | Y (Exam Score) |
|:------------------:|:---------------:|
| 2 | 40 |
| 4 | 60 |
| 6 | 70 |
| 8 | 90 |
| 10 | 85 |

**Interpretation:**
- When study hours increase, exam scores also tend to increase.  
- Hence, the **covariance is positive**, showing a **direct relationship**.
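
Working the table values through the sample covariance formula (with the \( n - 1 \) denominator) makes the sign explicit:

\[
\bar{x} = \frac{2 + 4 + 6 + 8 + 10}{5} = 6, \qquad \bar{y} = \frac{40 + 60 + 70 + 90 + 85}{5} = 69
\]

\[
\sum (x_i - \bar{x})(y_i - \bar{y}) = (-4)(-29) + (-2)(-9) + (0)(1) + (2)(21) + (4)(16) = 240
\]

\[
\text{Cov}(X, Y) = \frac{240}{5 - 1} = 60
\]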

---

### **Interpretation**
- **Positive Covariance:** Variables move in the **same direction**.  
- **Negative Covariance:** Variables move in **opposite directions**.  
- **Zero Covariance:** No linear relationship between variables.
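
Zero covariance rules out only a *linear* relationship. The sketch below uses made-up data in which `y` is completely determined by `x`, yet the covariance is zero because the positive and negative deviation products cancel.

```python
import numpy as np

# Illustrative data: symmetric x values and a purely quadratic y
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = x ** 2  # y depends on x, but not linearly

# Off-diagonal entry of the 2x2 covariance matrix is Cov(x, y)
cov_xy = np.cov(x, y)[0, 1]
print(cov_xy)  # zero for this symmetric data
```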

---

### **Correlation Coefficient**

**Definition:**  
Correlation is a **standardized form of covariance** that measures the **strength and direction** of the linear relationship between two variables.

---

### **Formula**

\[
r = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}
\]

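where  
- \( r \): Pearson correlation coefficient  
- \( \sigma_X, \sigma_Y \): Standard deviations of X and Y, computed with the same \( n - 1 \) denominator as the covariance so that the normalization cancels  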
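
The following sketch checks the formula numerically on the study-hours data. It assumes sample statistics throughout (`ddof=1` for the standard deviations, matching the default `n - 1` denominator of `np.cov`) and compares the result with `np.corrcoef`.

```python
import numpy as np

x = np.array([2, 4, 6, 8, 10], dtype=float)
y = np.array([40, 60, 70, 90, 85], dtype=float)

cov_xy = np.cov(x, y)[0, 1]   # sample covariance (n - 1 denominator)
s_x = np.std(x, ddof=1)       # sample standard deviation of x
s_y = np.std(y, ddof=1)       # sample standard deviation of y

r_manual = cov_xy / (s_x * s_y)
r_numpy = np.corrcoef(x, y)[0, 1]
print(r_manual, r_numpy)
```

The two values should agree up to floating-point rounding. Mixing denominators (for example `np.std` with its default `ddof=0` alongside the default `np.cov`) would give a slightly different ratio.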

---

### **Range of Correlation (r)**

| Value of r | Meaning |
|:------------:|:------------------------|
| +1 | Perfect Positive Linear Relationship |
| 0 | No Linear Relationship |
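| -1 | Perfect Negative Linear Relationship |

The coefficient always lies between -1 and +1. Values closer to either extreme indicate a stronger linear relationship, and values near 0 indicate a weaker one.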
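
A quick way to see the two extremes is to correlate a variable with exact linear functions of itself. The data and coefficients below are arbitrary illustrative choices.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# Exact increasing line: r should be +1 (up to rounding)
r_up = np.corrcoef(x, 3 * x + 2)[0, 1]

# Exact decreasing line: r should be -1 (up to rounding)
r_down = np.corrcoef(x, -0.5 * x + 10)[0, 1]

print(r_up, r_down)
```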

---

### **Types of Correlation**

1. **Positive Correlation** → Both variables tend to increase together.  
2. **Negative Correlation** → As one variable increases, the other tends to decrease.  
3. **Zero Correlation** → No linear relationship (a non-linear relationship may still exist).

---

### **Real-World Examples**

- **House Size vs Price** → Larger houses usually cost more → **Positive correlation**  
- **Hours Studied vs Marks** → More study hours = higher marks → **Positive correlation**  
- **Temperature vs Heater Usage** → Higher temperature = less heater use → **Negative correlation**

---

### **Advantages**
- Helps identify **relationships** between variables.  
- Useful for **feature selection** in machine learning.  
- Enables **prediction and trend analysis**.
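
As one possible sketch of correlation-based feature screening, the snippet below ranks features by the absolute value of their Pearson correlation with a target column. The dataset, column names, and values are entirely hypothetical and invented for illustration. This is a rough first-pass filter, not a complete feature-selection method.

```python
import pandas as pd

# Hypothetical toy dataset (values invented purely for illustration)
df = pd.DataFrame({
    "size_sqft": [800, 1000, 1200, 1500, 1800, 2100],
    "age_years": [30, 25, 20, 12, 8, 3],
    "door_number": [4, 9, 1, 7, 2, 6],
    "price": [150, 185, 210, 270, 320, 365],
})

# Pearson correlation of every feature with the target column
corr_with_target = df.corr()["price"].drop("price")

# Rank by strength of the linear relationship, ignoring its sign
ranking = corr_with_target.abs().sort_values(ascending=False)
print(ranking)
```

Here `door_number` ranks last because it has almost no linear association with `price`. A low rank does not prove a feature is useless, since correlation misses non-linear effects and interactions between features.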

---

### **Limitations**
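- Covariance has **no fixed scale**: its magnitude depends on the units of the variables, so values from different datasets cannot be compared directly.  
- Pearson correlation captures **linear relationships** only; a strong non-linear relationship can still give a value near 0.  
- Neither measure implies **causation** (i.e., an association does not show that one variable causes the other).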
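
The scale dependence in the first point can be sketched by re-expressing study time in minutes instead of hours. The unit conversion is an illustrative assumption; the data are the study-hours values from this document.

```python
import numpy as np

hours = np.array([2, 4, 6, 8, 10], dtype=float)
scores = np.array([40, 60, 70, 90, 85], dtype=float)
minutes = hours * 60  # same information, different unit

cov_hours = np.cov(hours, scores)[0, 1]
cov_minutes = np.cov(minutes, scores)[0, 1]

r_hours = np.corrcoef(hours, scores)[0, 1]
r_minutes = np.corrcoef(minutes, scores)[0, 1]

print(cov_hours, cov_minutes)  # covariance grows by the factor 60
print(r_hours, r_minutes)      # correlation is unchanged
```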

---




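```python
import numpy as np

# Example data: hours studied (x) and exam scores (y)
x = [2, 4, 6, 8, 10]
y = [40, 60, 70, 90, 85]

# Sample covariance (bias=False divides by n - 1);
# entry [0, 1] of the 2x2 covariance matrix is Cov(x, y)
cov = np.cov(x, y, bias=False)[0, 1]

# Pearson correlation: entry [0, 1] of the 2x2 correlation matrix
corr = np.corrcoef(x, y)[0, 1]

print("Covariance:", cov)
print("Correlation:", corr)
```

```
Covariance: 60.0
Correlation: 0.9428090415820634
```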
