# Correlogram

- **Type**: **Correlation**
- **Purpose**: A correlogram is used to visualize the **correlation matrix** of variables, helping to identify relationships and patterns between multiple variables. It shows the strength and direction of the correlation between pairs of variables.

- **How It Works**:
  - Each cell in the correlogram represents the **correlation coefficient** between two variables, which ranges from **-1** (perfect negative correlation) to **+1** (perfect positive correlation).
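  - A **color gradient** encodes both the sign and the strength of each correlation. With the diverging `coolwarm` colormap used in the example below, blue marks negative values and red marks positive values; other colormaps, such as `RdBu`, reverse this.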
  - The **diagonal** represents self-correlation (which is always 1).

- **Common Use Cases**:
  - Understanding relationships in **multivariate data**.
  - Analyzing how variables like **age, income, and education** are related.
  - Identifying **highly correlated variables** to avoid multicollinearity in regression models.
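
As a rough illustration of the points above, the sketch below computes a Pearson coefficient by hand and then lists variable pairs whose absolute correlation exceeds a threshold, which is one simple way to spot candidates for multicollinearity. The helper names `pearson_r` and `high_corr_pairs`, the `0.8` threshold, and the synthetic `age`/`income`/`education` data are all assumptions made for this example, not part of any library or of the notebook's dataset.

```python
import numpy as np
import pandas as pd


def pearson_r(x, y):
    """Pearson r: covariance scaled by the product of the standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


def high_corr_pairs(frame, threshold=0.8):
    """Return (var_a, var_b, r) for each pair with |r| >= threshold, strongest first.

    Hypothetical helper: it only reads the upper triangle, because the
    correlation matrix is symmetric and the diagonal is always 1.
    """
    corr = frame.corr()
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if abs(r) >= threshold:
                pairs.append((cols[i], cols[j], float(r)))
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


# Synthetic data, for illustration only
rng = np.random.default_rng(0)
age = rng.normal(40, 10, size=200)
income = 1000 * age + rng.normal(0, 2000, size=200)  # built to track age closely
education = rng.normal(14, 2, size=200)              # generated independently
demo = pd.DataFrame({"age": age, "income": income, "education": education})

r_manual = pearson_r(demo["age"], demo["income"])
r_pandas = demo["age"].corr(demo["income"])  # pandas uses Pearson by default
flagged = high_corr_pairs(demo, threshold=0.8)
print(flagged)
```

The threshold is a judgment call: a lower value flags more pairs for review, while a higher one flags only near-duplicates. Flagged pairs are candidates for closer inspection rather than automatic removal.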

## Customization Parameters

### **Seaborn Customization**

- **`cmap`**: Specifies the colormap for the heatmap.
- **`annot`**: Adds correlation coefficients to each cell as text.
- **`mask`**: Hides the cells where the mask is `True`. Because a correlation matrix is symmetric, the upper triangle is often masked to avoid showing each pair twice.
- **`center`**: Defines the midpoint of the colormap (typically `0` for correlation matrices).
- **`linewidths`**: Sets the width of the lines between the cells.



In [8]:
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import datasets
import seaborn as sns
import numpy as np

In [9]:
iris = datasets.load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
# Integer class label (0, 1, or 2) for each sample
df["type"] = iris.target

def map_flower_type(type_value: int) -> str:
    """Map an integer class label to its species name."""
    names = {0: "setosa", 1: "versicolor", 2: "virginica"}
    return names.get(type_value, "Unknown")

df["flower"] = df["type"].apply(map_flower_type)

In [None]:
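# Pairwise Pearson correlations of the numeric columns. The text column "flower"
# is skipped; the integer label "type" is numeric, so it is included.
corr_matrix = df.corr(numeric_only=True)
# Mask the upper triangle (k=1 keeps the diagonal visible), since the matrix is symmetric
mask = np.triu(np.ones_like(corr_matrix, dtype=bool), k=1)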
sns.heatmap(
    corr_matrix,
    annot=True,
    cmap="coolwarm",
    mask=mask,
    linewidths=0.7,
    center=0,
)
plt.title("Correlation Matrix")
plt.show()