# Session 43: Multivariate Visualization

**Unit 4: Descriptive Statistics and Visualization**
**Hour: 43**
**Mode: Practical Lab**

---

### 1. Objective

This lab extends our bivariate plots by adding a third variable, which moves us into **multivariate analysis**. We will use visual properties such as `hue` (color) and `style` (marker shape) to encode more information in a single chart, so that relationships hidden in a two-variable plot become visible.

We will focus on adding a categorical third variable to our bivariate plots.

### 2. Setup

Import our standard libraries and load the clean Telco dataset.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

url = 'https://raw.githubusercontent.com/IBM/telco-customer-churn-on-icp4d/master/data/Telco-Customer-Churn.csv'
df = pd.read_csv(url)
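# Convert TotalCharges to numeric; entries that cannot be parsed become NaN
df['TotalCharges'] = pd.to_numeric(df['TotalCharges'], errors='coerce')
# Assign the filled column back instead of calling fillna(inplace=True) on a
# column selection, which recent pandas versions warn about
df['TotalCharges'] = df['TotalCharges'].fillna(df['TotalCharges'].median())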
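
The cleaning step in the setup cell can be seen in isolation on a toy column. This is only a sketch: the three values in `raw` are made up for illustration and are not taken from the Telco file.

```python
import pandas as pd

# Toy stand-in for a raw text column (values are made up for illustration)
raw = pd.Series(['29.85', ' ', '108.15'], name='TotalCharges')

# errors='coerce' turns anything that cannot be parsed into NaN
numeric = pd.to_numeric(raw, errors='coerce')

# Fill the gap with the median of the values that did parse
filled = numeric.fillna(numeric.median())

print(numeric.isna().sum())  # how many entries failed to parse
print(filled)
```

Running the same `isna().sum()` check on `df['TotalCharges']` before the fill shows how many rows the median imputation affects.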

### 3. Adding a Third Variable (`hue`)

The `hue` parameter in Seaborn is the most common way to add a third, categorical dimension to a plot. It will color the data points or bars based on the categories in the specified column.

#### 3.1. Hue on a Scatter Plot

**Business Question:** We already know there's a relationship between `tenure` and `TotalCharges`. But does this relationship differ depending on the `Contract` type?

In [None]:
plt.figure(figsize=(12, 7))
sns.scatterplot(data=df, x='tenure', y='TotalCharges', hue='Contract')
plt.title('Tenure vs. Total Charges, Colored by Contract Type')
plt.xlabel('Tenure (months)')
plt.ylabel('Total Charges')
plt.show()

**Interpretation:** Coloring by `Contract` separates the customer segments (the colors named here refer to Seaborn's default palette).
*   The `Month-to-month` customers (blue) are spread across the whole plot, but many are concentrated at low tenure.
*   The `One year` customers (orange) form a steeper line through the middle of the plot.
*   The `Two year` customers (green) form a very clear, steep line: long-tenure customers on this contract consistently accumulate high total charges.
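
The colors named in the interpretation depend on the order in which Seaborn assigns hues, and the objective also mentions `style`, which has not been shown yet. The sketch below pins the category order with `hue_order` and maps the same variable to marker shape with `style`. The `sample` frame is a small made-up stand-in that reuses the Telco column names; in the lab you would pass `df` instead.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Small made-up sample with the same column names as the Telco data
sample = pd.DataFrame({
    'tenure': [1, 5, 12, 24, 36, 48, 60, 72],
    'TotalCharges': [70, 350, 600, 1500, 2900, 3100, 5400, 6500],
    'Contract': ['Month-to-month', 'Month-to-month', 'One year', 'One year',
                 'Month-to-month', 'Two year', 'Two year', 'Two year'],
})

# Fix the category order so the color assignment does not depend on row order
contract_order = ['Month-to-month', 'One year', 'Two year']

fig, ax = plt.subplots(figsize=(8, 5))
sns.scatterplot(
    data=sample, x='tenure', y='TotalCharges',
    hue='Contract', hue_order=contract_order,      # color per contract type
    style='Contract', style_order=contract_order,  # marker shape per contract type
    ax=ax,
)
ax.set_title('Tenure vs. Total Charges by Contract Type')
ax.set_xlabel('Tenure (months)')
ax.set_ylabel('Total Charges')
plt.show()
```

Encoding the same variable twice (color and marker shape) keeps the groups distinguishable in grayscale printouts and for readers with color vision deficiencies. `style` can also take a different column to show a fourth variable, at the cost of a busier chart.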

#### 3.2. Hue on a Bar Plot

**Business Question:** We know that customers with Fiber optic have higher average monthly charges. But within each internet service type, who pays more: those with dependents or those without?

In [None]:
plt.figure(figsize=(10, 6))
sns.barplot(data=df, x='InternetService', y='MonthlyCharges', hue='Dependents')
plt.title('Avg. Monthly Charges by Internet Service and Dependents')
plt.xlabel('Internet Service')
plt.ylabel('Average Monthly Charges')
plt.show()

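**Interpretation:** Each bar shows the mean `MonthlyCharges` for one combination of internet service and dependents status, and the error bars show Seaborn's default 95% confidence interval for that mean. In both the DSL and Fiber optic categories, customers **without** dependents tend to have slightly higher average monthly charges than those with dependents.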
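
Bar heights can be hard to compare by eye when the differences are small. Since each bar is a group mean, the same numbers can be computed with a `groupby`. The sketch below uses a made-up `sample` frame with the Telco column names; the values are illustrative only, and in the lab you would run the same chain on `df`.

```python
import pandas as pd

# Made-up rows that reuse the Telco column names (illustrative values only)
sample = pd.DataFrame({
    'InternetService': ['DSL', 'DSL', 'DSL',
                        'Fiber optic', 'Fiber optic', 'Fiber optic'],
    'Dependents': ['No', 'No', 'Yes', 'No', 'Yes', 'Yes'],
    'MonthlyCharges': [60.0, 50.0, 45.0, 95.0, 85.0, 90.0],
})

# Mean charge per (x category, hue category), the quantity each bar shows
group_means = (
    sample.groupby(['InternetService', 'Dependents'])['MonthlyCharges']
    .mean()
    .unstack('Dependents')  # one column per hue level
)
print(group_means.round(2))
```

Each row of the resulting table corresponds to one group of bars on the x-axis, and each column to one hue color.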

### 4. Advanced Grouped Visualization with `catplot`

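Seaborn's `catplot` (categorical plot) is a figure-level function for building faceted plots: the `kind` parameter selects the plot type (for example `'box'`, `'bar'`, or `'violin'`), and `col` or `row` creates one subplot for each category of a variable.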
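
Because `catplot` manages its own figure, it returns a `FacetGrid` object rather than a single Axes, and labels and titles are set through that object. The following is a minimal sketch on a made-up `sample` frame (Telco column names, illustrative values); it assumes a Seaborn version that provides the `figure` attribute on the grid (0.11.2 or later).

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Made-up rows that reuse the Telco column names (illustrative values only)
sample = pd.DataFrame({
    'Churn': ['No', 'Yes', 'No', 'Yes', 'No', 'Yes'] * 2,
    'tenure': [30, 4, 45, 20, 60, 50, 25, 6, 40, 18, 66, 55],
    'Contract': ['Month-to-month', 'Month-to-month', 'One year',
                 'One year', 'Two year', 'Two year'] * 2,
})

# catplot returns a FacetGrid with one subplot per Contract value
g = sns.catplot(data=sample, x='Churn', y='tenure', col='Contract',
                kind='box', height=4, aspect=0.8)

g.set_axis_labels('Churn', 'Tenure (months)')  # shared axis labels
g.set_titles('{col_name}')                     # subplot title = category name
g.figure.suptitle('Tenure by Churn, Faceted by Contract', y=1.03)
plt.show()
```

Calling `plt.title(...)` after `catplot` would only label the last subplot, which is why the overall title goes through `g.figure.suptitle`.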

**Business Question:** We know churners have lower tenure. But is this pattern consistent across different contract types?

In [None]:
# Use 'col' to create columns of subplots for each Contract type
sns.catplot(x='Churn', y='tenure', col='Contract', data=df, kind='box')
plt.show()

**Interpretation:** This gives us a much deeper understanding. The difference in tenure between churners and non-churners is **most dramatic** for customers on One year and Two year contracts. For Month-to-month customers, the tenure for both groups is already low, but the difference is still very clear.
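
The center line of each box is the group median, so a reading like the one above can be checked numerically. The sketch below computes the median tenure per contract and churn status, plus the gap between the two. The `sample` frame is made up for illustration and its numbers are not Telco results; run the same chain on `df` to check the actual plot.

```python
import pandas as pd

# Made-up rows that reuse the Telco column names (illustrative values only)
sample = pd.DataFrame({
    'Contract': ['Month-to-month'] * 4 + ['One year'] * 4 + ['Two year'] * 4,
    'Churn': ['No', 'No', 'Yes', 'Yes'] * 3,
    'tenure': [20, 30, 4, 8, 40, 50, 20, 30, 60, 70, 50, 58],
})

# Median tenure for every (Contract, Churn) pair, one column per Churn value
tenure_summary = (
    sample.groupby(['Contract', 'Churn'])['tenure']
    .median()
    .unstack('Churn')
)

# Gap between non-churners and churners within each contract type
tenure_summary['gap'] = tenure_summary['No'] - tenure_summary['Yes']
print(tenure_summary)
```

A table like this makes it easier to state which contract type shows the largest difference without relying on visual impressions of box positions.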

### 5. Conclusion

In this lab, you learned to add a third dimension to your visualizations to uncover more nuanced insights:
1.  Use the `hue` parameter to color data points based on a categorical variable.
2.  Apply `hue` to both scatter plots and bar plots to segment your analysis.
3.  Use `catplot` to create faceted plots, which are excellent for comparing distributions across multiple categories at once.

These techniques move you from simple reporting to true exploratory data analysis.

**Next Session:** We will learn how to combine multiple plots and use annotations to tell a compelling story about our findings.