# Quality Quotes AI - Data Visualizations

## Overview
This notebook builds ten visualizations to analyze user engagement patterns with Quality Quotes AI.

### Visualization Strategy
- **Interactive charts** using Plotly for stakeholder presentations
- **Statistical plots** (box plots, correlation heatmap) for detailed analysis, with Seaborn supplying the styling defaults
- **Multiple perspectives**: Temporal, categorical, behavioral

### Key Questions Addressed
1. When do users engage with the platform?
2. Which quote categories are most popular?
3. How do users interact with quotes?
4. What is the relationship between time spent and engagement?
5. What patterns exist across user segments?

In [10]:
# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import warnings
warnings.filterwarnings('ignore')

# Styling
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 10

# Color palette for Quality Quotes AI brand
BRAND_COLORS = {
    'primary': '#6C63FF',    # Purple
    'secondary': '#FF6B9D',  # Pink
    'accent': '#FEC859',     # Yellow
    'success': '#4CAF50',    # Green
    'info': '#2196F3',       # Blue
    'dark': '#2D3748',       # Dark gray
    'light': '#F7FAFC'       # Light gray
}

CATEGORY_COLORS = {
    'Motivation': '#FF6B9D',
    'Inspiration': '#6C63FF',
    'Wellness': '#4CAF50',
    'Success': '#FEC859',
    'Mindfulness': '#9C27B0',
    'Gratitude': '#FF9800',
    'Growth': '#2196F3'
}

print("Libraries imported successfully!")

Libraries imported successfully!


## 1. Load Data

In [11]:
# Load datasets
users_df = pd.read_csv('data/users.csv', parse_dates=['signup_date'])
quotes_df = pd.read_csv('data/quotes.csv', parse_dates=['timestamp'])
interactions_df = pd.read_csv('data/interactions.csv', parse_dates=['timestamp'])
sessions_df = pd.read_csv('data/sessions.csv', parse_dates=['start_time'])

print("Data loaded successfully!")
print(f"\nDataset shapes:")
print(f"  Users: {users_df.shape}")
print(f"  Quotes: {quotes_df.shape}")
print(f"  Interactions: {interactions_df.shape}")
print(f"  Sessions: {sessions_df.shape}")

Data loaded successfully!

Dataset shapes:
  Users: (5000, 7)
  Quotes: (97859, 5)
  Interactions: (199270, 5)
  Sessions: (56766, 6)


## 2. Data Preparation

Create derived features for analysis.

In [12]:
# Add temporal features to quotes
quotes_df['date'] = quotes_df['timestamp'].dt.date
quotes_df['hour'] = quotes_df['timestamp'].dt.hour
quotes_df['day_of_week'] = quotes_df['timestamp'].dt.day_name()
quotes_df['week'] = quotes_df['timestamp'].dt.isocalendar().week
quotes_df['is_weekend'] = quotes_df['timestamp'].dt.dayofweek >= 5

# Add temporal features to interactions
interactions_df['date'] = interactions_df['timestamp'].dt.date
interactions_df['hour'] = interactions_df['timestamp'].dt.hour

# Add temporal features to sessions
sessions_df['date'] = sessions_df['start_time'].dt.date
sessions_df['hour'] = sessions_df['start_time'].dt.hour
sessions_df['day_of_week'] = sessions_df['start_time'].dt.day_name()

# Merge quotes with user segments
quotes_enriched = quotes_df.merge(users_df[['user_id', 'segment']], on='user_id', how='left')

# Calculate engagement metrics per quote
engagement_per_quote = interactions_df[interactions_df['interaction_type'] != 'view'].groupby('quote_id').size().reset_index(name='engagement_count')
quotes_enriched = quotes_enriched.merge(engagement_per_quote, on='quote_id', how='left')
quotes_enriched['engagement_count'] = quotes_enriched['engagement_count'].fillna(0)
quotes_enriched['has_engagement'] = quotes_enriched['engagement_count'] > 0

print("Data preparation complete!")
print(f"\nEnriched quotes dataset: {quotes_enriched.shape}")
print(f"Quotes with engagement: {quotes_enriched['has_engagement'].sum():,} ({quotes_enriched['has_engagement'].mean()*100:.1f}%)")

Data preparation complete!

Enriched quotes dataset: (97859, 13)
Quotes with engagement: 55,598 (56.8%)
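
The enrichment step can be checked on a toy example. The sketch below uses invented quote and interaction rows (only the column names follow the cell above) and shows how the left merge plus `fillna(0)` keeps quotes that never received a non-view interaction.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the notebook, values are invented.
quotes = pd.DataFrame({
    'quote_id': [1, 2, 3],
    'category': ['Motivation', 'Growth', 'Wellness'],
})
interactions = pd.DataFrame({
    'quote_id': [1, 1, 1, 3],
    'interaction_type': ['view', 'like', 'save', 'view'],
})

# Count only non-view interactions per quote.
engagement = (
    interactions[interactions['interaction_type'] != 'view']
    .groupby('quote_id')
    .size()
    .reset_index(name='engagement_count')
)

# The left merge keeps every quote; quotes without a match get NaN, then 0.
enriched = quotes.merge(engagement, on='quote_id', how='left')
enriched['engagement_count'] = enriched['engagement_count'].fillna(0)
enriched['has_engagement'] = enriched['engagement_count'] > 0

print(enriched)
```

In this toy data, quote 3 was only viewed and quote 2 has no interactions at all, so both end up with an engagement count of 0 rather than being dropped.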


## 3. Visualization 1: Quote Generation Over Time

**Insight Focus**: Understanding temporal patterns in quote generation.

In [13]:
# Daily quote generation trend
daily_quotes = quotes_df.groupby('date').size().reset_index(name='quote_count')
daily_quotes['date'] = pd.to_datetime(daily_quotes['date'])
daily_quotes['day_name'] = daily_quotes['date'].dt.day_name()
daily_quotes['is_weekend'] = daily_quotes['date'].dt.dayofweek >= 5

# Create interactive line chart
fig = go.Figure()

fig.add_trace(go.Scatter(
    x=daily_quotes['date'],
    y=daily_quotes['quote_count'],
    mode='lines+markers',
    name='Daily Quotes',
    line=dict(color=BRAND_COLORS['primary'], width=2),
    marker=dict(size=4),
    hovertemplate='<b>%{x|%B %d, %Y}</b><br>Quotes: %{y:,}<extra></extra>'
))

# Add 7-day moving average (centered window, so the first and last 3 days have no value)
daily_quotes['ma_7'] = daily_quotes['quote_count'].rolling(window=7, center=True).mean()
fig.add_trace(go.Scatter(
    x=daily_quotes['date'],
    y=daily_quotes['ma_7'],
    mode='lines',
    name='7-Day Average',
    line=dict(color=BRAND_COLORS['secondary'], width=2, dash='dash'),
    hovertemplate='<b>%{x|%B %d, %Y}</b><br>7-Day Avg: %{y:.0f}<extra></extra>'
))

fig.update_layout(
    title='Quote Generation Frequency Over Time',
    xaxis_title='Date',
    yaxis_title='Number of Quotes Generated',
    hovermode='x unified',
    height=500,
    template='plotly_white'
)

fig.show()

print(f"\nKey Statistics:")
print(f"  Total quotes generated: {len(quotes_df):,}")
print(f"  Daily average: {daily_quotes['quote_count'].mean():.0f}")
print(f"  Peak day: {daily_quotes.loc[daily_quotes['quote_count'].idxmax(), 'date'].strftime('%B %d, %Y')} ({daily_quotes['quote_count'].max():,} quotes)")
print(f"  Weekend average: {daily_quotes[daily_quotes['is_weekend']]['quote_count'].mean():.0f}")
print(f"  Weekday average: {daily_quotes[~daily_quotes['is_weekend']]['quote_count'].mean():.0f}")


Key Statistics:
  Total quotes generated: 97,859
  Daily average: 1052
  Peak day: September 30, 2025 (11,770 quotes)
  Weekend average: 1072
  Weekday average: 1045
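
A centered rolling window needs values on both sides of each day, so the dashed average line starts late and ends early. The sketch below uses made-up daily counts to show the effect; the `min_periods=1` variant is an assumed alternative, not something the notebook uses.

```python
import pandas as pd

# Hypothetical daily counts for ten consecutive days.
counts = pd.Series([10, 12, 11, 13, 15, 14, 16, 18, 17, 19])

# Same call as in the notebook: a full 7-value window is required.
ma_strict = counts.rolling(window=7, center=True).mean()

# Assumed alternative: allow shorter windows at the edges.
ma_partial = counts.rolling(window=7, center=True, min_periods=1).mean()

print("Days without a value (strict): ", ma_strict.isna().sum())
print("Days without a value (partial):", ma_partial.isna().sum())
```

The partial variant fills the edges, but those edge values average fewer days and are therefore noisier than the rest of the line.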


## 4. Visualization 2: Activity Heatmap (Hour × Day of Week)

**Insight Focus**: Identifying peak engagement times.

In [14]:
# Create pivot table for heatmap
day_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
heatmap_data = quotes_df.groupby(['day_of_week', 'hour']).size().reset_index(name='count')
heatmap_pivot = heatmap_data.pivot(index='day_of_week', columns='hour', values='count').fillna(0)
heatmap_pivot = heatmap_pivot.reindex(day_order)

# Create interactive heatmap
fig = go.Figure(data=go.Heatmap(
    z=heatmap_pivot.values,
    x=heatmap_pivot.columns,
    y=heatmap_pivot.index,
    colorscale='Viridis',
    hovertemplate='<b>%{y}</b><br>Hour: %{x}:00<br>Quotes: %{z:.0f}<extra></extra>',
    colorbar=dict(title='Quote Count')
))

fig.update_layout(
    title='Quote Generation Heatmap: Day of Week × Hour of Day',
    xaxis_title='Hour of Day',
    yaxis_title='Day of Week',
    height=500,
    template='plotly_white'
)

fig.show()

# Find peak hours
hourly_quotes = quotes_df.groupby('hour').size().sort_values(ascending=False)
print(f"\nTop 5 Peak Hours:")
for hour, count in hourly_quotes.head().items():
    print(f"  {hour:02d}:00 - {count:,} quotes ({count/len(quotes_df)*100:.1f}%)")


Top 5 Peak Hours:
  07:00 - 7,932 quotes (8.1%)
  20:00 - 7,800 quotes (8.0%)
  19:00 - 7,706 quotes (7.9%)
  08:00 - 7,652 quotes (7.8%)
  22:00 - 5,867 quotes (6.0%)
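
If some hour never appears in the data, a pivot will simply omit that column and the heatmap axis will have a gap. A minimal sketch of one way to force a complete 7 × 24 grid, using invented timestamps (the `unstack` plus `reindex` approach is an assumption, not the notebook's method):

```python
import pandas as pd

day_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

# Hypothetical timestamps covering only a few day/hour combinations.
toy = pd.DataFrame({'timestamp': pd.to_datetime([
    '2025-09-01 07:15', '2025-09-01 07:45', '2025-09-02 20:05', '2025-09-06 08:30',
])})
toy['day_of_week'] = toy['timestamp'].dt.day_name()
toy['hour'] = toy['timestamp'].dt.hour

# Count rows per (day, hour), then reindex to every day and every hour 0-23.
grid = (
    toy.groupby(['day_of_week', 'hour']).size()
    .unstack(fill_value=0)
    .reindex(index=day_order, columns=list(range(24)), fill_value=0)
)

print(grid.shape)
```

Passing `grid.values`, `grid.columns`, and `grid.index` to a heatmap would then always yield the same axes, whichever hours happen to be present.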


## 5. Visualization 3: Quote Category Popularity

**Insight Focus**: Which categories resonate most with users?

In [15]:
# Category statistics with engagement
category_stats = quotes_enriched.groupby('category').agg({
    'quote_id': 'count',
    'engagement_count': 'sum',
    'has_engagement': 'mean'
}).reset_index()

category_stats.columns = ['category', 'total_quotes', 'total_engagements', 'engagement_rate']
category_stats = category_stats.sort_values('total_quotes', ascending=True)

# Create horizontal bar chart with dual metrics
fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=('Quote Volume by Category', 'Engagement Rate by Category'),
    specs=[[{'type': 'bar'}, {'type': 'bar'}]]
)

# Volume chart
fig.add_trace(
    go.Bar(
        y=category_stats['category'],
        x=category_stats['total_quotes'],
        orientation='h',
        marker=dict(color=[CATEGORY_COLORS[cat] for cat in category_stats['category']]),
        text=category_stats['total_quotes'],
        texttemplate='%{text:,}',
        textposition='auto',
        hovertemplate='<b>%{y}</b><br>Quotes: %{x:,}<extra></extra>'
    ),
    row=1, col=1
)

# Engagement rate chart (sort once, reuse for labels, values, and colors)
rate_sorted = category_stats.sort_values('engagement_rate', ascending=True)
fig.add_trace(
    go.Bar(
        y=rate_sorted['category'],
        x=rate_sorted['engagement_rate'] * 100,
        orientation='h',
        marker=dict(color=[CATEGORY_COLORS[cat] for cat in rate_sorted['category']]),
        text=rate_sorted['engagement_rate'] * 100,
        texttemplate='%{text:.1f}%',
        textposition='auto',
        hovertemplate='<b>%{y}</b><br>Engagement Rate: %{x:.1f}%<extra></extra>'
    ),
    row=1, col=2
)

fig.update_xaxes(title_text='Number of Quotes', row=1, col=1)
fig.update_xaxes(title_text='Engagement Rate (%)', row=1, col=2)

fig.update_layout(
    title_text='Quote Category Analysis',
    showlegend=False,
    height=500,
    template='plotly_white'
)

fig.show()

print("\nCategory Performance Summary:")
category_stats_sorted = category_stats.sort_values('total_quotes', ascending=False)
for _, row in category_stats_sorted.iterrows():
    print(f"  {row['category']:15s}: {row['total_quotes']:5,} quotes | {row['engagement_rate']*100:5.1f}% engagement | {row['total_engagements']:6,.0f} total interactions")


Category Performance Summary:
  Motivation     : 24,478 quotes |  67.0% engagement | 29,878 total interactions
  Inspiration    : 19,603 quotes |  61.2% engagement | 21,865 total interactions
  Success        : 14,693 quotes |  56.2% engagement | 15,085 total interactions
  Wellness       : 14,533 quotes |  52.3% engagement | 13,998 total interactions
  Mindfulness    : 9,896 quotes |  46.7% engagement |  8,363 total interactions
  Gratitude      : 7,822 quotes |  43.0% engagement |  6,143 total interactions
  Growth         : 6,834 quotes |  49.2% engagement |  6,079 total interactions


## 6. Visualization 4: User Interaction Funnel

**Insight Focus**: How do users progress through interaction types?

In [16]:
# Calculate interaction funnel
interaction_counts = interactions_df['interaction_type'].value_counts()
interaction_order = ['view', 'like', 'save', 'copy', 'share']
funnel_data = [interaction_counts.get(itype, 0) for itype in interaction_order]

# Calculate conversion rates
views = interaction_counts['view']
conversion_rates = [(count/views)*100 for count in funnel_data]

# Create funnel chart
fig = go.Figure()

fig.add_trace(go.Funnel(
    name='User Interaction Funnel',
    y=interaction_order,
    x=funnel_data,
    textinfo='value+percent initial',
    marker=dict(
        color=[BRAND_COLORS['primary'], BRAND_COLORS['info'], 
               BRAND_COLORS['success'], BRAND_COLORS['accent'], BRAND_COLORS['secondary']]
    ),
    hovertemplate='<b>%{label}</b><br>Count: %{value:,}<br>% of Views: %{percentInitial}<extra></extra>'
))

fig.update_layout(
    title='User Interaction Funnel',
    height=500,
    template='plotly_white'
)

fig.show()

print("\nInteraction Funnel Metrics:")
for i, interaction in enumerate(interaction_order):
    count = funnel_data[i]
    pct = conversion_rates[i]
    print(f"  {interaction.capitalize():10s}: {count:8,} ({pct:5.1f}% of views)")

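# Calculate step-to-step conversion rates
print(f"\nKey Conversion Metrics:")
print(f"  View → Like conversion: {(interaction_counts['like']/views)*100:.1f}%")
print(f"  Like → Save conversion: {(interaction_counts['save']/interaction_counts['like'])*100:.1f}%")
# Note: this sums all non-view interactions over views, so it can exceed 100% when a
# quote receives several interaction types; it is not the share of quotes engaged with.
print(f"  Overall engagement rate (non-view): {((sum(funnel_data[1:])/views)*100):.1f}%")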


Interaction Funnel Metrics:
  View      :   97,859 (100.0% of views)
  Like      :   47,635 ( 48.7% of views)
  Save      :   20,985 ( 21.4% of views)
  Copy      :   17,964 ( 18.4% of views)
  Share     :   14,827 ( 15.2% of views)

Key Conversion Metrics:
  View → Like conversion: 48.7%
  Like → Save conversion: 44.1%
  Overall engagement rate (non-view): 103.6%
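
The percentages above can be reproduced by hand from the printed counts. The sketch below copies those counts into a plain dictionary (the variable names are illustrative) and shows why the last figure exceeds 100%: it divides the sum of all non-view interactions by the number of views, so a quote that is both liked and saved counts twice.

```python
# Counts copied from the funnel output above.
counts = {'view': 97859, 'like': 47635, 'save': 20985, 'copy': 17964, 'share': 14827}

views = counts['view']
non_view_total = sum(n for name, n in counts.items() if name != 'view')

# Each interaction type as a percentage of views.
pct_of_views = {name: round(n / views * 100, 1) for name, n in counts.items()}

# Step-to-step and overall ratios.
like_to_save = round(counts['save'] / counts['like'] * 100, 1)
non_view_per_100_views = round(non_view_total / views * 100, 1)

print(pct_of_views)
print(f"Like -> Save: {like_to_save}%")
print(f"Non-view interactions per 100 views: {non_view_per_100_views}")
```

The share of quotes with at least one non-view interaction is a different quantity; the data preparation step reports it as 56.8%.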


## 7. Visualization 5: Session Duration Analysis

**Insight Focus**: How long do users spend on the platform?

In [17]:
# Merge sessions with user segments
sessions_enriched = sessions_df.merge(users_df[['user_id', 'segment']], on='user_id', how='left')

# Create distribution plot
fig = go.Figure()

segments = ['power_user', 'regular_user', 'casual_user']
segment_labels = {'power_user': 'Power Users', 'regular_user': 'Regular Users', 'casual_user': 'Casual Users'}
colors = [BRAND_COLORS['primary'], BRAND_COLORS['info'], BRAND_COLORS['secondary']]

for segment, color in zip(segments, colors):
    data = sessions_enriched[sessions_enriched['segment'] == segment]['duration_minutes']
    fig.add_trace(go.Box(
        y=data,
        name=segment_labels[segment],
        marker_color=color,
        boxmean='sd'
    ))

fig.update_layout(
    title='Session Duration by User Segment',
    yaxis_title='Session Duration (minutes)',
    xaxis_title='User Segment',
    height=500,
    template='plotly_white',
    showlegend=False
)

fig.show()

print("\nSession Duration Statistics by Segment:")
for segment in segments:
    data = sessions_enriched[sessions_enriched['segment'] == segment]['duration_minutes']
    print(f"\n{segment_labels[segment]}:")
    print(f"  Mean: {data.mean():.1f} minutes")
    print(f"  Median: {data.median():.1f} minutes")
    print(f"  Std Dev: {data.std():.1f} minutes")
    print(f"  Total sessions: {len(data):,}")


Session Duration Statistics by Segment:

Power Users:
  Mean: 14.2 minutes
  Median: 12.3 minutes
  Std Dev: 8.7 minutes
  Total sessions: 18,559

Regular Users:
  Mean: 9.5 minutes
  Median: 8.3 minutes
  Std Dev: 5.7 minutes
  Total sessions: 25,714

Casual Users:
  Mean: 5.5 minutes
  Median: 4.9 minutes
  Std Dev: 3.0 minutes
  Total sessions: 12,493


## 8. Visualization 6: Engagement vs Session Duration

**Insight Focus**: Correlation between time spent and quotes generated per session, used here as a proxy for engagement.

In [19]:
# Create scatter plot of duration vs quotes generated
fig = px.scatter(
    sessions_enriched,
    x='duration_minutes',
    y='quotes_generated',
    color='segment',
    color_discrete_map={
        'power_user': BRAND_COLORS['primary'],
        'regular_user': BRAND_COLORS['info'],
        'casual_user': BRAND_COLORS['secondary']
    },
    labels={
        'duration_minutes': 'Session Duration (minutes)',
        'quotes_generated': 'Quotes Generated',
        'segment': 'User Segment'
    },
    title='Session Duration vs Quote Generation',
    opacity=0.6,
    trendline='ols',  # requires the statsmodels package
    height=600
)

fig.update_layout(template='plotly_white')
fig.show()

# Calculate correlation
correlation = sessions_enriched['duration_minutes'].corr(sessions_enriched['quotes_generated'])
print(f"\nCorrelation between session duration and quotes generated: {correlation:.3f}")

# Average quotes per minute by segment
print(f"\nQuotes Generated per Minute by Segment:")
for segment in segments:
    data = sessions_enriched[sessions_enriched['segment'] == segment]
    quotes_per_min = (data['quotes_generated'] / data['duration_minutes']).mean()
    print(f"  {segment_labels[segment]:15s}: {quotes_per_min:.2f} quotes/minute")


Correlation between session duration and quotes generated: 0.248

Quotes Generated per Minute by Segment:
  Power Users    : 0.21 quotes/minute
  Regular Users  : 0.22 quotes/minute
  Casual Users   : 0.33 quotes/minute
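
The per-minute figures above average the per-session ratios, which gives very short sessions a large influence. The sketch below contrasts that with total quotes divided by total minutes on invented data; the second measure is an assumed alternative, and whether it explains the higher casual-user figure would need checking against the real sessions.

```python
import pandas as pd

# Hypothetical sessions: one very short session dominates the mean of ratios.
toy = pd.DataFrame({
    'duration_minutes': [1.0, 10.0, 20.0],
    'quotes_generated': [2, 2, 2],
})

# Notebook approach: average the per-session rates.
mean_of_ratios = (toy['quotes_generated'] / toy['duration_minutes']).mean()

# Assumed alternative: total quotes divided by total minutes.
ratio_of_sums = toy['quotes_generated'].sum() / toy['duration_minutes'].sum()

print(f"Mean of per-session ratios: {mean_of_ratios:.3f}")
print(f"Total quotes / total minutes: {ratio_of_sums:.3f}")
```

Both are legitimate summaries; the first describes a typical session, the second the overall throughput of the segment.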


## 9. Visualization 7: Category Popularity Over Time

**Insight Focus**: Trending categories throughout the period.

In [20]:
# Weekly category trends
quotes_df['week_start'] = quotes_df['timestamp'].dt.to_period('W').dt.start_time
weekly_category = quotes_df.groupby(['week_start', 'category']).size().reset_index(name='count')

# Create area chart
fig = px.area(
    weekly_category,
    x='week_start',
    y='count',
    color='category',
    color_discrete_map=CATEGORY_COLORS,
    labels={
        'week_start': 'Week',
        'count': 'Number of Quotes',
        'category': 'Category'
    },
    title='Category Popularity Over Time (Weekly)',
    height=600
)

fig.update_layout(template='plotly_white', hovermode='x unified')
fig.show()

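# Calculate growth rates
# Caveat: the first or last calendar week may be only partially covered by the data,
# which would inflate or deflate these percentages.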
print("\nCategory Growth (First Week vs Last Week):")
first_week = weekly_category[weekly_category['week_start'] == weekly_category['week_start'].min()]
last_week = weekly_category[weekly_category['week_start'] == weekly_category['week_start'].max()]

for category in CATEGORY_COLORS.keys():
    first_count = first_week[first_week['category'] == category]['count'].values
    last_count = last_week[last_week['category'] == category]['count'].values
    
    if len(first_count) > 0 and len(last_count) > 0:
        growth = ((last_count[0] - first_count[0]) / first_count[0]) * 100
        print(f"  {category:15s}: {growth:+6.1f}%")


Category Growth (First Week vs Last Week):
  Motivation     : +6286.1%
  Inspiration    : +7012.3%
  Wellness       : +5952.1%
  Success        : +4944.1%
  Mindfulness    : +6343.8%
  Gratitude      : +7159.1%
  Growth         : +5804.3%
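
Growth figures in the thousands of percent often indicate that the first week holds only a few days of data. The sketch below shows, on invented data with a constant two quotes per day, how a partial first week distorts the comparison and how restricting it to complete weeks removes the effect. The `days_covered` column and the seven-day filter are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data: 2 quotes per day from a Sunday through the Sunday two weeks later,
# so the first Monday-to-Sunday week contains a single day.
days = pd.date_range('2025-08-31', '2025-09-14', freq='D')
toy = pd.DataFrame({'timestamp': days.repeat(2)})
toy['date'] = toy['timestamp'].dt.normalize()
toy['week_start'] = toy['timestamp'].dt.to_period('W').dt.start_time

weekly = toy.groupby('week_start').agg(
    count=('timestamp', 'size'),
    days_covered=('date', 'nunique'),
)

def pct_growth(series):
    """Percentage change from the first to the last value."""
    return (series.iloc[-1] - series.iloc[0]) / series.iloc[0] * 100

naive_growth = pct_growth(weekly['count'])
full_week_growth = pct_growth(weekly.loc[weekly['days_covered'] == 7, 'count'])

print(weekly)
print(f"Naive growth: {naive_growth:+.1f}%")
print(f"Full-week growth: {full_week_growth:+.1f}%")
```

With constant daily volume the naive comparison still reports large growth, while the full-week comparison reports none.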


## 10. Visualization 8: User Segment Comparison

**Insight Focus**: Behavioral differences across user segments.

In [21]:
# Calculate metrics by segment
segment_metrics = []

for segment in segments:
    # Users
    segment_users = users_df[users_df['segment'] == segment]['user_id'].tolist()
    
    # Quotes
    segment_quotes = quotes_df[quotes_df['user_id'].isin(segment_users)]
    
    # Sessions
    segment_sessions = sessions_df[sessions_df['user_id'].isin(segment_users)]
    
    # Interactions (non-view)
    segment_interactions = interactions_df[
        (interactions_df['user_id'].isin(segment_users)) & 
        (interactions_df['interaction_type'] != 'view')
    ]
    
    segment_metrics.append({
        'Segment': segment_labels[segment],
        'Users': len(segment_users),
        'Avg Quotes/User': len(segment_quotes) / len(segment_users),
        'Avg Sessions/User': len(segment_sessions) / len(segment_users),
        'Avg Duration (min)': segment_sessions['duration_minutes'].mean(),
        'Avg Interactions/Quote': len(segment_interactions) / len(segment_quotes) if len(segment_quotes) > 0 else 0
    })

segment_comparison = pd.DataFrame(segment_metrics)

# Create grouped bar chart
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=(
        'Avg Quotes per User',
        'Avg Sessions per User',
        'Avg Session Duration',
        'Avg Interactions per Quote'
    ),
    specs=[[{'type': 'bar'}, {'type': 'bar'}],
           [{'type': 'bar'}, {'type': 'bar'}]]
)

# Quotes per user
fig.add_trace(
    go.Bar(x=segment_comparison['Segment'], y=segment_comparison['Avg Quotes/User'],
           marker_color=colors, text=segment_comparison['Avg Quotes/User'].round(1),
           textposition='auto'),
    row=1, col=1
)

# Sessions per user
fig.add_trace(
    go.Bar(x=segment_comparison['Segment'], y=segment_comparison['Avg Sessions/User'],
           marker_color=colors, text=segment_comparison['Avg Sessions/User'].round(1),
           textposition='auto'),
    row=1, col=2
)

# Avg duration
fig.add_trace(
    go.Bar(x=segment_comparison['Segment'], y=segment_comparison['Avg Duration (min)'],
           marker_color=colors, text=segment_comparison['Avg Duration (min)'].round(1),
           textposition='auto'),
    row=2, col=1
)

# Interactions per quote
fig.add_trace(
    go.Bar(x=segment_comparison['Segment'], y=segment_comparison['Avg Interactions/Quote'],
           marker_color=colors, text=segment_comparison['Avg Interactions/Quote'].round(2),
           textposition='auto'),
    row=2, col=2
)

fig.update_layout(
    title_text='User Segment Behavioral Comparison',
    showlegend=False,
    height=700,
    template='plotly_white'
)

fig.show()

print("\nUser Segment Comparison Table:")
print(segment_comparison.to_string(index=False))


User Segment Comparison Table:
      Segment  Users  Avg Quotes/User  Avg Sessions/User  Avg Duration (min)  Avg Interactions/Quote
  Power Users    750        53.197333          24.745333           14.202270                1.319039
Regular Users   1750        22.357714          14.693714            9.521798                0.933574
 Casual Users   2500         7.534000           4.997200            5.542694                0.650757
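
The per-segment averages can also be computed without a loop. A minimal sketch on invented users and quotes, assuming the same `user_id` and `segment` columns as above; the merge and groupby approach is an alternative, not what the cell does.

```python
import pandas as pd

# Hypothetical users and quotes; column names follow the notebook.
users = pd.DataFrame({
    'user_id': [1, 2, 3, 4],
    'segment': ['power_user', 'casual_user', 'casual_user', 'casual_user'],
})
quotes = pd.DataFrame({'user_id': [1, 1, 1, 2, 3]})

# Users per segment, including users who never generated a quote.
users_per_segment = users.groupby('segment')['user_id'].nunique()

# Quotes per segment via a merge instead of an isin() filter per segment.
quotes_per_segment = (
    quotes.merge(users, on='user_id', how='left')
    .groupby('segment')
    .size()
)

# Division aligns on the segment index; segments with no quotes become 0.
avg_quotes_per_user = (quotes_per_segment / users_per_segment).fillna(0)
print(avg_quotes_per_user)
```

Dividing by all users in the segment, not only those with quotes, matches the `len(segment_users)` denominator used in the loop.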


## 11. Visualization 9: Engagement Correlation Heatmap

**Insight Focus**: Understanding relationships between different metrics.

In [22]:
# Prepare data for correlation analysis
correlation_data = quotes_enriched.merge(
    sessions_df[['user_id', 'duration_minutes']].groupby('user_id').mean().reset_index(),
    on='user_id',
    how='left'
)

# Select numeric columns for correlation
corr_features = correlation_data[[
    'quote_length',
    'engagement_count',
    'duration_minutes'
]].copy()

corr_features.columns = ['Quote Length', 'Engagement Count', 'Avg Session Duration']

# Calculate correlation matrix
corr_matrix = corr_features.corr()

# Create heatmap
fig = go.Figure(data=go.Heatmap(
    z=corr_matrix.values,
    x=corr_matrix.columns,
    y=corr_matrix.columns,
    colorscale='RdBu',
    zmid=0,
    text=corr_matrix.values,
    texttemplate='%{text:.3f}',
    textfont={"size": 14},
    colorbar=dict(title='Correlation')
))

fig.update_layout(
    title='Correlation Matrix: Engagement Metrics',
    height=500,
    template='plotly_white'
)

fig.show()

print("\nCorrelation Insights:")
print(f"  Quote Length ↔ Engagement: {corr_matrix.loc['Quote Length', 'Engagement Count']:.3f}")
print(f"  Session Duration ↔ Engagement: {corr_matrix.loc['Avg Session Duration', 'Engagement Count']:.3f}")
print(f"  Quote Length ↔ Session Duration: {corr_matrix.loc['Quote Length', 'Avg Session Duration']:.3f}")


Correlation Insights:
  Quote Length ↔ Engagement: -0.036
  Session Duration ↔ Engagement: 0.212
  Quote Length ↔ Session Duration: 0.003


## 12. Visualization 10: Treemap - Category Hierarchy

**Insight Focus**: Visual hierarchy of content categories and engagement.

In [23]:
# Prepare treemap data
treemap_data = quotes_enriched.groupby('category').agg({
    'quote_id': 'count',
    'engagement_count': 'sum'
}).reset_index()

treemap_data.columns = ['category', 'quotes', 'engagements']
treemap_data['engagement_per_quote'] = treemap_data['engagements'] / treemap_data['quotes']

# Create treemap
fig = px.treemap(
    treemap_data,
    path=['category'],
    values='quotes',
    color='engagement_per_quote',
    color_continuous_scale='Viridis',
    title='Category Breakdown: Size = Quote Volume, Color = Engagements per Quote',
    labels={'engagement_per_quote': 'Engagements/Quote'},
    height=600
)

fig.update_traces(
    textinfo='label+value+percent parent',
    textfont_size=14
)

fig.update_layout(template='plotly_white')
fig.show()

print("\nCategory Engagement Metrics:")
treemap_sorted = treemap_data.sort_values('engagement_per_quote', ascending=False)
for _, row in treemap_sorted.iterrows():
    print(f"  {row['category']:15s}: {row['engagement_per_quote']:.2f} engagements/quote | {row['quotes']:,} quotes | {row['engagements']:,.0f} total engagements")


Category Engagement Metrics:
  Motivation     : 1.22 engagements/quote | 24,478 quotes | 29,878 total engagements
  Inspiration    : 1.12 engagements/quote | 19,603 quotes | 21,865 total engagements
  Success        : 1.03 engagements/quote | 14,693 quotes | 15,085 total engagements
  Wellness       : 0.96 engagements/quote | 14,533 quotes | 13,998 total engagements
  Growth         : 0.89 engagements/quote | 6,834 quotes | 6,079 total engagements
  Mindfulness    : 0.85 engagements/quote | 9,896 quotes | 8,363 total engagements
  Gratitude      : 0.79 engagements/quote | 7,822 quotes | 6,143 total engagements
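
Engagements per quote is not the same as the engagement rate reported in the category analysis: the rate is the share of quotes with at least one non-view interaction, while this metric counts every interaction. The sketch below computes both on invented data for a single category.

```python
import pandas as pd

# Hypothetical quotes in one category with their non-view interaction counts.
toy = pd.DataFrame({
    'category': ['Motivation'] * 4,
    'engagement_count': [3, 0, 0, 1],
})
toy['has_engagement'] = toy['engagement_count'] > 0

summary = toy.groupby('category').agg(
    quotes=('engagement_count', 'size'),
    engagements=('engagement_count', 'sum'),
    engagement_rate=('has_engagement', 'mean'),
)
summary['engagement_per_quote'] = summary['engagements'] / summary['quotes']

print(summary)
```

Here half of the quotes were engaged with, yet there is one engagement per quote on average. The same distinction explains why Motivation shows a 67.0% engagement rate alongside 1.22 engagements per quote.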


## Summary

### Key Visualizations Created

1. **Temporal Analysis**: Daily trends and activity heatmaps
2. **Category Performance**: Volume and engagement metrics
3. **User Behavior**: Interaction funnels and session analysis
4. **Segment Comparison**: Behavioral differences across user types
5. **Correlation Analysis**: Relationships between metrics

### Technical Implementation

- **Interactive Charts**: Plotly for dynamic exploration
- **Statistical Plots**: Box plots and a correlation heatmap, also built with Plotly; Seaborn supplies only the styling defaults
- **Color Consistency**: Brand colors throughout for professional appearance
- **Multiple Perspectives**: Time, category, behavior, and segment views

### Next Steps

Proceed to `03_insights_report.ipynb` for narrative analysis and actionable recommendations.