# Daily Blog #53 - Tools & Techniques for Correlation and Relationship Analysis
### June 22, 2025

In this blog, we will explore *how to examine relationships between variables*. 

### Correlation Matrix & Heatmap 
When working with multiple numeric variables, you want a quick overview of which ones move together. This is where a **correlation matrix** comes in.

#### Technicals:
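* `df.corr()`: computes pairwise Pearson correlations (the default `method`) between numeric columns. In pandas 2.0 and later, pass `numeric_only=True` if the DataFrame also contains non-numeric columns; otherwise the call can fail.
* `sns.heatmap()`: visualizes the resulting correlation matrix as a color-coded grid.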

#### Learning:
* Quickly identify **which variables have strong positive or negative relationships**.
* Save time by narrowing down which pairs are worth exploring further.
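
As a minimal sketch of the two calls above, the snippet below builds a small synthetic DataFrame, computes its correlation matrix, and draws it as a heatmap. The column names, the random data, and the output filename are all made up for illustration; swap in your own DataFrame.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (hypothetical columns, purely for illustration).
rng = np.random.default_rng(0)
n = 200
hours = rng.uniform(0, 10, n)
df = pd.DataFrame({
    "hours_studied": hours,
    "exam_score": 50 + 4 * hours + rng.normal(0, 5, n),   # rises with hours
    "hours_gaming": 10 - hours + rng.normal(0, 2, n),     # falls with hours
    "shoe_size": rng.normal(42, 2, n),                    # unrelated noise
})

# Pairwise Pearson correlations between the numeric columns.
corr = df.corr(numeric_only=True)

# Color-coded grid; fixing vmin/vmax keeps the color scale centered on 0.
fig, ax = plt.subplots(figsize=(6, 5))
sns.heatmap(corr, annot=True, fmt=".2f", cmap="coolwarm",
            vmin=-1, vmax=1, square=True, ax=ax)
ax.set_title("Correlation matrix")
fig.tight_layout()
fig.savefig("correlation_heatmap.png")  # or plt.show() in an interactive session
```

With `annot=True` each cell shows its coefficient, so the strongest pairs are easy to pick out at a glance.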

---

### Bivariate Plots with Regression Lines 
A correlation coefficient alone can hide nonlinear patterns, clusters, or outliers. Plotting the raw points lets you see what’s actually going on.

#### Technicals:
* `sns.lmplot()`: draws a scatter plot with a fitted regression line (on by default; pass `fit_reg=False` to turn it off), allowing you to check **linearity and trends**.
* Adjust parameters (`height`, `aspect`, `scatter_kws`, `line_kws`) to improve clarity and highlight relationships.

#### Learning:
* Visually confirm whether a **linear model** is a good fit.
* Spot **outliers**, **clusters**, or unexpected shapes that might require further data cleaning or different models.
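
Here is a small sketch of `sns.lmplot()` using the parameters mentioned above. The data is synthetic, and the column names `x` and `y`, the styling values, and the output filename are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic, roughly linear data (hypothetical columns "x" and "y").
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 150)
df = pd.DataFrame({"x": x, "y": 2.0 * x + 1.0 + rng.normal(0, 2, 150)})

# lmplot is a figure-level function: it creates its own figure
# and returns a FacetGrid rather than a plain Axes.
grid = sns.lmplot(
    data=df, x="x", y="y",
    height=4, aspect=1.4,                            # figure size and width/height ratio
    scatter_kws={"alpha": 0.5, "s": 25},             # lighter, smaller points
    line_kws={"color": "crimson", "linewidth": 2},   # make the fit stand out
)
grid.set_axis_labels("x", "y")
grid.savefig("lmplot_example.png")  # or plt.show() in an interactive session
```

If the points curve away from the line or fan out at one end, that is a hint that a straight-line fit may not be the right model.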

---

### Statistical Tests for Correlation 
Plotting is powerful but subjective. Statistical tests give you a **formal measure** of strength and significance.

#### Technicals:
* `scipy.stats.pearsonr()`: tests for **linear correlation** and returns both the **correlation coefficient (r)** and the **p-value**.
* `scipy.stats.spearmanr()`: tests for **monotonic relationships** (not necessarily linear) using ranked data; it also returns a coefficient and a p-value.

#### Learning:
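* The **correlation coefficient** (`r` for Pearson, `rho` for Spearman) ranges from -1 to 1: its sign gives the *direction* of the relationship and its magnitude the *strength*.
* The **p-value** is the probability of seeing a correlation at least this strong if the variables were in fact uncorrelated. A small value (conventionally `p < 0.05`) suggests the relationship is **unlikely to be due to random chance** alone; it says nothing about how strong the relationship is.
* Use Spearman if the data is **not normally distributed**, contains outliers, or the relationship looks monotonic but nonlinear. Neither test captures non-monotonic patterns such as a U-shape.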
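
To see the difference between the two tests, the sketch below runs both on synthetic data that is monotonic but strongly nonlinear. The exponential relationship and the noise level are assumptions picked only to make the contrast visible.

```python
import numpy as np
from scipy import stats

# Synthetic data: y grows exponentially with x, so the relationship
# is monotonic but far from a straight line.
rng = np.random.default_rng(42)
x = rng.uniform(0, 5, 100)
y = np.exp(x) + rng.normal(0, 0.5, 100)

# Both functions return a (coefficient, p-value) pair.
r, p_pearson = stats.pearsonr(x, y)      # linear association
rho, p_spearman = stats.spearmanr(x, y)  # rank-based, monotonic association

print(f"Pearson  r   = {r:.3f} (p = {p_pearson:.2e})")
print(f"Spearman rho = {rho:.3f} (p = {p_spearman:.2e})")
```

On data like this, Spearman's coefficient should come out higher than Pearson's, because ranks ignore the curvature that pulls the points away from a straight line.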

---

### Summary
By combining:
* **Heatmaps** for a quick overview,
* **Scatter plots** for visual insight,
* **Pearson/Spearman tests** for formal confirmation,

you’ll have a **solid toolkit** for understanding relationships between pairs of numeric variables.