# Pearson Correlation Analysis – Interactive Exercises

This notebook contains **guided, hands-on exercises** to help you understand and apply **Pearson correlation analysis**.

You will:
- Explore assumptions behind Pearson correlation
- Visualize linear relationships
- Compute Pearson correlation using **SciPy** and **Pandas**
- Interpret correlation coefficients correctly

👉 Try to complete each exercise **before expanding the hint**.

---
## Reminder
**Correlation does not imply causation.** Pearson correlation only measures **linear association** between variables.
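
To make the "linear association" point concrete, here is a minimal sketch on synthetic data. The arrays are invented for illustration and are not part of the lesson's dataset. A perfectly deterministic but U-shaped relationship gives a coefficient close to zero, while a straight-line relationship gives a coefficient of essentially 1.

```python
import numpy as np
from scipy.stats import pearsonr

# Synthetic, symmetric x values (an assumption made purely for illustration).
x = np.linspace(-3, 3, 61)

y_linear = 2 * x + 1      # exact straight-line relationship
y_quadratic = x ** 2      # exact, but non-linear (U-shaped) relationship

r_linear, _ = pearsonr(x, y_linear)
r_quadratic, _ = pearsonr(x, y_quadratic)

print(f"linear relationship:    r = {r_linear:.3f}")
print(f"quadratic relationship: r = {r_quadratic:.3f}")
```

A coefficient near zero therefore means "no linear association", not "no relationship".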

## Exercise 0 – Import Required Libraries

**Task:** Import all required libraries for data handling, visualization, and correlation analysis.

<details>
<summary>💡 Hint</summary>

You will need:
- pandas, numpy
- matplotlib, seaborn
- pearsonr from scipy.stats
</details>

In [None]:
# YOUR CODE HERE

## Exercise 1 – Load and Inspect the Dataset

The **mtcars** dataset contains numeric variables describing car performance.

**Task:**
1. Load the CSV file
2. Assign proper column names
3. Display the first few rows

<details>
<summary>💡 Hint</summary>

Use `pd.read_csv()` and `.head()`.

Column names are provided in the lesson transcript.
</details>
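
As a rough illustration of the load, rename, and inspect pattern, the sketch below reads a tiny in-memory CSV instead of a file on disk. The CSV text and the column names are hypothetical stand-ins, not the mtcars data.

```python
import io

import pandas as pd

# Hypothetical three-column CSV standing in for a real file on disk.
csv_text = "name,a,b\nfirst,1,2.5\nsecond,3,4.5\nthird,5,6.5\n"

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Overwrite the header with more descriptive (made-up) column names.
# The list must have exactly one name per column.
df.columns = ["label", "x", "y"]

# .head(n) returns the first n rows (5 by default).
print(df.head(2))
```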

In [None]:
# YOUR CODE HERE

## Exercise 2 – Visualize Relationships Using Pair Plots

Pearson correlation assumes **linear relationships**.

**Task:** Create a **pair plot** for the following variables:
- mpg
- hp
- qsec
- wt

<details>
<summary>💡 Hint</summary>

Select a subset of columns and pass it to `sns.pairplot()`.
</details>

In [None]:
# YOUR CODE HERE

## Exercise 3 – Calculate Pearson Correlation Using SciPy

**Task:**
1. Extract `mpg` and `hp`
2. Compute the Pearson correlation coefficient
3. Print the result formatted to 3 decimal places

<details>
<summary>💡 Hint</summary>

Use `pearsonr(variable1, variable2)`.
</details>
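
Note that `pearsonr` returns two values: the coefficient and a p-value. The sketch below, on a small made-up sample rather than the mtcars columns, unpacks both and checks the coefficient against the textbook definition (covariance divided by the product of the standard deviations).

```python
import numpy as np
from scipy.stats import pearsonr

# Small made-up sample, for illustration only.
x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 4, 5, 4, 5])

# pearsonr returns the coefficient and a two-sided p-value.
r, p_value = pearsonr(x, y)

# The same coefficient computed directly from the definition.
dx = x - x.mean()
dy = y - y.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

print(f"r = {r:.3f}")                # prints: r = 0.775
print(f"manual r = {r_manual:.3f}")  # prints: manual r = 0.775
```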

In [None]:
# YOUR CODE HERE

## Exercise 4 – Compare MPG with Other Variables

**Task:** Compute Pearson correlation between `mpg` and:
- qsec
- wt

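**Question:** Which variable has the strongest linear relationship with `mpg`? Compare the absolute values of the coefficients; the sign only indicates the direction of the relationship.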

<details>
<summary>💡 Hint</summary>

Repeat the `pearsonr()` calculation for each variable.
</details>
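
When comparing several coefficients, strength is judged by absolute value, so a strongly negative coefficient outranks a weakly positive one. A minimal sketch with invented arrays (the names `strong_negative` and `weak_positive` are hypothetical labels, not dataset columns):

```python
import numpy as np
from scipy.stats import pearsonr

# Invented data: one target and two candidate variables.
target = np.array([1, 2, 3, 4, 5, 6])
candidates = {
    "strong_negative": np.array([12, 10, 9, 7, 4, 2]),
    "weak_positive": np.array([3, 1, 4, 2, 5, 3]),
}

# Compute r for each candidate against the target.
coefficients = {}
for name, values in candidates.items():
    r, _ = pearsonr(target, values)
    coefficients[name] = r
    print(f"{name}: r = {r:.3f}")

# Rank by |r|: the sign gives direction, the magnitude gives strength.
strongest = max(coefficients, key=lambda name: abs(coefficients[name]))
print("strongest linear relationship:", strongest)
```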

In [None]:
# YOUR CODE HERE

## Exercise 5 – Pearson Correlation Matrix Using Pandas

**Task:**
1. Create a DataFrame containing mpg, hp, qsec, wt
2. Compute the correlation matrix

<details>
<summary>💡 Hint</summary>

Use the `.corr()` method.
</details>
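
`DataFrame.corr()` uses the Pearson method by default and returns every pairwise coefficient at once. The sketch below, on a made-up DataFrame, checks one entry of the matrix against `pearsonr` to show that the two approaches agree.

```python
import pandas as pd
from scipy.stats import pearsonr

# Made-up data, for illustration only.
toy = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 5, 4, 5],
    "c": [5, 3, 4, 1, 2],
})

# Pairwise Pearson coefficients. The matrix is symmetric with 1.0 on
# the diagonal, since every column correlates perfectly with itself.
corr_matrix = toy.corr()
print(corr_matrix.round(3))

# The (a, b) entry matches the coefficient that SciPy reports.
r_ab, _ = pearsonr(toy["a"], toy["b"])
print(f"pearsonr(a, b) = {r_ab:.3f}")
```

Unlike `pearsonr`, `.corr()` does not report p-values.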

In [None]:
# YOUR CODE HERE

## Exercise 6 – Visualize Correlation with a Heatmap

**Task:** Visualize the correlation matrix using a seaborn heatmap.

<details>
<summary>💡 Hint</summary>

Use `sns.heatmap()` and enable annotations.
</details>
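
One possible way to draw an annotated heatmap is sketched below on a made-up correlation matrix. The color map and the fixed color range from -1 to 1 are optional choices rather than requirements of the exercise. Fixing the range keeps the colors comparable across different matrices.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up data, for illustration only.
toy = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 5, 4, 5],
    "c": [5, 3, 4, 1, 2],
})
corr_matrix = toy.corr()

# Create a dedicated figure so the heatmap does not draw onto
# whatever axes happen to be active from an earlier plot.
fig, ax = plt.subplots(figsize=(4, 3))
sns.heatmap(
    corr_matrix,
    annot=True,       # write each coefficient inside its cell
    fmt=".2f",        # two decimal places in the annotations
    vmin=-1, vmax=1,  # pin the color scale to the full range of r
    cmap="coolwarm",
    ax=ax,
)
ax.set_title("Toy correlation matrix")
```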

In [None]:
# YOUR CODE HERE

---
# ✅ Collapsed Full Solution (Self-Check)

<details>
<summary>📘 Click to expand full solution</summary>

```python
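# Exercise 0: imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import pearsonr

%matplotlib inline
sns.set_style("whitegrid")

# Exercise 1: load the data, assign column names, inspect the first rows
address = '/workspaces/python-for-data-science-and-machine-learning-essential-training-part-1-3006708/data/mtcars.csv'
cars = pd.read_csv(address)
cars.columns = ['car_names','mpg','cyl','disp', 'hp', 'drat', 'wt', 'qsec', 'vs', 'am', 'gear', 'carb']
print(cars.head())

# Exercise 2: pair plot of the four variables of interest
x = cars[['mpg','hp','qsec','wt']]
sns.pairplot(x)

# Exercises 3 and 4: pearsonr returns (coefficient, p-value)
mpg = cars['mpg']
for name in ['hp', 'qsec', 'wt']:
    r, p_value = pearsonr(mpg, cars[name])
    print(f'mpg vs {name}: r = {r:.3f}')
# With the standard mtcars values, wt has the largest |r| (about -0.868),
# followed by hp (about -0.776) and qsec (about 0.419).

# Exercise 5: pairwise Pearson correlation matrix
corr = x.corr()
print(corr)

# Exercise 6: start a new figure so the heatmap is not drawn
# onto the last axes of the pair plot when run in the same cell
plt.figure()
sns.heatmap(corr, annot=True)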
```

</details>