# Session 68: Capstone Project Part 2 (Visualization)

**Unit 6: Data Ethics, Privacy, and Future Trends**
**Hour: 68**
**Mode: Practical Project**

---

### 1. Objective

This session is dedicated to visualizing the key findings from our EDA. The goal is to create a series of clear and compelling charts that tell the story of who our most responsive customers are. This will form the core of our final recommendation.

### 2. Setup

We'll need our data cleaning code, plus Matplotlib and Seaborn for plotting.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Set a default style for our plots
sns.set_style("whitegrid")

# --- Start of Cleaning and Feature Engineering Code ---
url = 'https://raw.githubusercontent.com/LeoFernan/Marketing-Campaigns-Analysis/main/marketing_campaign.csv'
df = pd.read_csv(url, sep='\t')
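# Assign the result back; calling fillna(inplace=True) on a selected column is deprecated in recent pandas
df['Income'] = df['Income'].fillna(df['Income'].median())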
df['Age'] = 2024 - df['Year_Birth']
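# Dates are assumed to be day-first (dd-mm-yyyy); adjust if your copy of the file differs
df['Dt_Customer'] = pd.to_datetime(df['Dt_Customer'], dayfirst=True)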
df['Customer_Lifetime_Days'] = (pd.to_datetime('2024-01-01') - df['Dt_Customer']).dt.days
df['Children'] = df['Kidhome'] + df['Teenhome']
spend_cols = ['MntWines', 'MntFruits', 'MntMeatProducts', 'MntFishProducts', 'MntSweetProducts', 'MntGoldProds']
df['Total_Spend'] = df[spend_cols].sum(axis=1)
cols_to_drop = ['Year_Birth', 'Dt_Customer', 'Marital_Status', 'Education', 'Kidhome', 'Teenhome', 'Z_CostContact', 'Z_Revenue']
df_clean = df.drop(columns=cols_to_drop)
# --- End of Cleaning and Feature Engineering Code ---

### 3. Visualizing Our Key Findings

Let's create a visualization for each of our main findings from the previous session.

#### Viz 1: Income Distribution for Responders vs. Non-Responders

A box plot is well suited to comparing two distributions side by side: it shows each group's median, quartiles, and outliers.

In [None]:
plt.figure(figsize=(8, 6))
sns.boxplot(x='Response', y='Income', data=df_clean)
plt.title('Income Distribution by Campaign Response', fontsize=16)
plt.xticks([0, 1], ['Did Not Respond', 'Responded'])
plt.xlabel('Campaign Response')
plt.ylabel('Household Income')
plt.show()

**Story:** The income distribution for responders sits higher than for non-responders, and the median income for responders is noticeably higher. The two boxes can still overlap, so read this as an upward shift in typical income, not a clean separation between the groups.
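To back the visual impression with numbers, the quartiles that the box plot draws can be computed directly. The sketch below is a minimal illustration and not part of the original lab: `income_summary` is a hypothetical helper, and the small `toy` frame holds made-up values standing in for `df_clean`.

```python
import pandas as pd


def income_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return Q1, median, and Q3 of Income for each Response group."""
    summary = (
        frame.groupby('Response')['Income']
        .quantile([0.25, 0.5, 0.75])
        .unstack()  # one row per Response value, one column per quantile
    )
    summary.columns = ['Q1', 'Median', 'Q3']
    return summary


# Made-up values for illustration only; in the notebook, pass df_clean instead.
toy = pd.DataFrame({
    'Response': [0, 0, 0, 0, 1, 1, 1, 1],
    'Income': [30000, 40000, 50000, 60000, 50000, 60000, 70000, 80000],
})
summary = income_summary(toy)
print(summary)
```

If the interquartile ranges of the two groups overlap, the accurate wording is that the responder distribution is shifted upward, not that every responder out-earns every non-responder.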

#### Viz 2: Total Spend for Responders vs. Non-Responders

Let's use a bar plot to show the average total spend.

In [None]:
plt.figure(figsize=(8, 6))
sns.barplot(x='Response', y='Total_Spend', data=df_clean)
plt.title('Average Total Spend by Campaign Response', fontsize=16)
plt.xticks([0, 1], ['Did Not Respond', 'Responded'])
plt.xlabel('Campaign Response')
plt.ylabel('Average Total Spend (Last 2 Years)')
plt.show()

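**Story:** The difference is stark. Customers who responded to the campaign spend substantially more, on average, than those who didn't; check the exact multiple against the group means before quoting it. The black line on each bar is Seaborn's default 95% confidence interval for the mean. This suggests the campaign resonates with our highest-spending customers, although the chart shows an association and cannot show that the campaign caused the higher spend.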
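A claim about how many times more one group spends is easy to check against the same means the bar plot displays. The following is a rough sketch under stated assumptions: `spend_ratio` is a hypothetical helper, and the `toy` frame contains invented numbers used only to show the calculation.

```python
import pandas as pd


def spend_ratio(frame: pd.DataFrame) -> float:
    """Mean Total_Spend of responders (1) divided by that of non-responders (0)."""
    means = frame.groupby('Response')['Total_Spend'].mean()
    return float(means.loc[1] / means.loc[0])


# Invented numbers for illustration only; in the notebook, pass df_clean instead.
toy = pd.DataFrame({
    'Response': [0, 0, 0, 1, 1],
    'Total_Spend': [100, 200, 300, 400, 600],
})
ratio = spend_ratio(toy)
print(f"Responders spend {ratio:.2f}x as much on average")
```

Running `spend_ratio(df_clean)` in the notebook gives the figure to quote alongside the chart.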

#### Viz 3: Response Rate by Number of Children

A bar plot is also good here, but we need to show the *rate* of response, not the count.

In [None]:
plt.figure(figsize=(10, 6))
# Response is coded 0/1, so with y='Response' the barplot plots the mean per category, which is the response rate.
# The black line on each bar is the default 95% confidence interval.
sns.barplot(x='Children', y='Response', data=df_clean)
plt.title('Campaign Response Rate by Number of Children in Household', fontsize=16)
plt.xlabel('Number of Children')
plt.ylabel('Response Rate')
plt.show()

**Story:** This plot confirms our finding that customers with zero children have a much higher response rate than those with one or more.
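The bar heights above are group means of a 0/1 column, so the same rates can be tabulated next to the number of customers behind each bar. This is a minimal sketch, assuming the `Children` and `Response` columns built earlier; `response_rate_by_children` is a hypothetical helper and the `toy` data is made up.

```python
import pandas as pd


def response_rate_by_children(frame: pd.DataFrame) -> pd.DataFrame:
    """Response rate and group size for each value of Children."""
    table = frame.groupby('Children')['Response'].agg(['mean', 'count'])
    return table.rename(columns={'mean': 'Response_Rate', 'count': 'Customers'})


# Made-up data for illustration only; in the notebook, pass df_clean instead.
toy = pd.DataFrame({
    'Children': [0, 0, 0, 0, 1, 1, 1, 1, 2, 2],
    'Response': [1, 1, 0, 0, 1, 0, 0, 0, 0, 0],
})
rates = response_rate_by_children(toy)
print(rates)
```

A category with few customers produces a wide confidence interval in the plot, so the `Customers` column helps decide how much weight to give each bar.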

### 4. Conclusion

In this session, we translated our statistical findings from the previous lab into clear and compelling visualizations. We have learned to:
1.  Choose the right chart type (box plot, bar plot) to answer a specific business question.
2.  Use titles and labels to make the charts easy to understand.
3.  Interpret the story that each chart is telling.

These visualizations will be the core components of our final presentation to the marketing team.
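To reuse the charts in a slide deck, each figure can be written to an image file. The sketch below is one possible approach and not part of the original lab: `save_figure` is a hypothetical helper, the bar heights are placeholder values, and the temporary output folder is an assumption.

```python
import os
import tempfile

import matplotlib.pyplot as plt


def save_figure(fig, path: str, dpi: int = 150) -> str:
    """Write a Matplotlib figure to disk and release its memory."""
    fig.savefig(path, dpi=dpi, bbox_inches='tight')
    plt.close(fig)
    return path


# Placeholder chart with made-up bar heights, standing in for a real plot.
fig, ax = plt.subplots(figsize=(10, 6))
ax.bar(['0', '1', '2'], [0.5, 0.25, 0.0])
ax.set_title('Campaign Response Rate by Number of Children in Household')
ax.set_xlabel('Number of Children')
ax.set_ylabel('Response Rate')

out_dir = tempfile.mkdtemp()  # assumption: any writable folder works
out_path = save_figure(fig, os.path.join(out_dir, 'response_rate_by_children.png'))
print(out_path)
```

For the Seaborn charts in this session, one option is to call `save_figure(plt.gcf(), ...)` in place of `plt.show()`, since the current figure is typically cleared once it has been shown.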

**Next Session:** As a final step, we will build a simple predictive model to see if a machine learning algorithm can confirm the importance of our identified features.