# Final Project Summary Dashboard

This notebook presents a visual summary of the key insights from the Instacart user and product behavior analysis. It compiles findings from previous notebooks, highlighting user segments, retention behavior, and product ordering patterns in a concise, dashboard-style format.

In [46]:
# -----------------------------
# Import Libraries and Data
# -----------------------------
import sqlite3

import pandas as pd
import plotly.express as px
import plotly.io as pio

# Open an in-memory SQLite connection for ad hoc queries.
# The charts below read directly from the processed CSV files.
conn = sqlite3.connect(":memory:")

# Load all needed datasets
user_segments = pd.read_csv("../data/processed/user_segments_summary.csv")
retention = pd.read_csv("../data/processed/user_retention_periods.csv")         # period_offset vs. active_users counts
top_products = pd.read_csv("../data/processed/top_20_products.csv")
top_reordered = pd.read_csv("../data/processed/top_reordered_products.csv")
dept_orders = pd.read_csv("../data/processed/department_order_counts.csv")
basket_sizes = pd.read_csv("../data/processed/basket_sizes.csv")


### User Segments Summary

This chart shows how users are distributed across behavior-based segments derived with RFM (recency, frequency, monetary) logic. Most users fall into the “Loyal Users” and “One-Time Shoppers” segments, while high-value “Champions” make up a smaller but important group.

In [47]:
# user_segments was already loaded in the first cell, so it is reused here
fig_user_segments = px.bar(
    user_segments,
    x="segment_label",
    y="user_count",
    title="User Segments (RFM-Based)",
    labels={"segment_label": "Segment", "user_count": "Users"},
    color="user_count",
    color_continuous_scale="Blues"
)
fig_user_segments.update_layout(plot_bgcolor='white', title_font=dict(size=20), margin=dict(t=60))
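
The claim that most users sit in two segments is easier to check as percentages than as raw bar heights. The following is a minimal sketch, assuming the summary table has the `segment_label` and `user_count` columns used in the chart. The counts in `segments_demo` and the helper name `add_segment_share` are hypothetical and for illustration only.

```python
import pandas as pd

# Hypothetical counts for illustration only; the real values come from
# user_segments_summary.csv.
segments_demo = pd.DataFrame({
    "segment_label": ["Loyal Users", "One-Time Shoppers", "Champions"],
    "user_count": [500, 400, 100],
})


def add_segment_share(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy sorted by size, with each segment's share of all users in percent."""
    out = df.sort_values("user_count", ascending=False).reset_index(drop=True)
    out["share_pct"] = (out["user_count"] / out["user_count"].sum() * 100).round(1)
    return out


segment_shares = add_segment_share(segments_demo)
print(segment_shares)
```

Passing the real `user_segments` frame instead of `segments_demo` would give the shares behind the chart above.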

### User Retention Trends

Retention declines steadily after the third order, highlighting a key drop-off point where engagement strategies may be most effective. Users who place a second and third order tend to remain active for longer.

In [48]:
fig = px.line(
    retention,
    x="period_offset",
    y="active_users",
    markers=True,
    title="User Retention by Order Period",
    labels={"period_offset": "Orders Since First", "active_users": "Users"}
)
fig.update_traces(line=dict(width=3), marker=dict(size=8))
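
To locate the drop-off numerically rather than by eye, the raw counts can be converted into retention relative to the first period and into period-over-period change. This is a sketch under the assumption that the table has the `period_offset` and `active_users` columns plotted above. The numbers in `retention_demo` and the helper name `add_retention_rates` are hypothetical.

```python
import pandas as pd

# Hypothetical counts for illustration only; the real values come from
# user_retention_periods.csv.
retention_demo = pd.DataFrame({
    "period_offset": [0, 1, 2, 3, 4],
    "active_users": [1000, 800, 700, 650, 400],
})


def add_retention_rates(df: pd.DataFrame) -> pd.DataFrame:
    """Add retention relative to period 0 and the change versus the previous period (both in %)."""
    out = df.sort_values("period_offset").reset_index(drop=True)
    baseline = out["active_users"].iloc[0]
    out["retention_pct"] = (out["active_users"] / baseline * 100).round(1)
    previous = out["active_users"].shift(1)
    out["step_change_pct"] = ((out["active_users"] / previous - 1) * 100).round(1)
    return out


retention_rates = add_retention_rates(retention_demo)

# Period with the steepest relative loss of users (the first row has no predecessor and is skipped)
steepest = retention_rates["step_change_pct"].idxmin()
largest_drop_period = int(retention_rates.loc[steepest, "period_offset"])
print(retention_rates)
print("Largest drop at period:", largest_drop_period)
```

Relative change is used here instead of absolute change so that late periods with few remaining users are still comparable to early ones.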

### Top 20 Most Ordered Products

This chart displays the 20 products with the highest total order volume. These items represent the most commonly purchased goods on the platform and suggest which products drive the bulk of user activity. Insights from this chart can inform stocking priorities and product placement.

In [49]:
fig = px.bar(
    top_products.sort_values("total_orders", ascending=True),
    x="total_orders",
    y="product_name",
    orientation="h",
    title="Top 20 Most Ordered Products",
    labels={"total_orders": "Orders", "product_name": "Product"},
    color="total_orders",
    color_continuous_scale="Blues"
)
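fig.update_layout(yaxis=dict(tickfont=dict(size=11)), plot_bgcolor='white', margin=dict(t=60))
fig.update_traces(marker_line_color='black', marker_line_width=0.5)
# Apply all styling before exporting so the saved HTML matches the displayed figure.
# The file is written to ../all_visuals/ to match the other exported charts.
pio.write_html(fig, "../all_visuals/top_20_most_ordered_products.html", full_html=True, include_plotlyjs='cdn')
fig.show()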
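
How much of total volume the top products drive can be quantified as the share of all orders captured by the N most-ordered products. The sketch below assumes a table of order counts for every product (not only the top 20) with the `product_name` and `total_orders` columns used above. The data in `product_counts_demo` and the helper name `top_n_share` are hypothetical.

```python
import pandas as pd

# Hypothetical counts for illustration only; a real check would need
# order counts for the full catalogue, not just the top 20.
product_counts_demo = pd.DataFrame({
    "product_name": ["A", "B", "C", "D", "E"],
    "total_orders": [50, 30, 10, 6, 4],
})


def top_n_share(df: pd.DataFrame, n: int) -> float:
    """Return the fraction of all orders that fall on the n most-ordered products."""
    ranked = df["total_orders"].sort_values(ascending=False)
    return float(ranked.head(n).sum() / ranked.sum())


share_top2 = top_n_share(product_counts_demo, 2)
print(f"Top 2 products account for {share_top2:.0%} of orders")
```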

### Top 20 Most Reordered Products

This chart highlights products with the highest reorder rates among those with at least 100 prior orders. A high reorder rate indicates consistent demand and user loyalty toward certain products. These may be ideal candidates for promotional campaigns or automated reordering suggestions.

In [50]:
fig = px.bar(
    top_reordered.sort_values("reorder_rate", ascending=True),
    x="reorder_rate",
    y="product_name",
    orientation="h",
    title="Top 20 Most Reordered Products",
    labels={"reorder_rate": "Reorder Rate", "product_name": "Product"},
    color="reorder_rate",
    color_continuous_scale="Greens"
)
fig.update_layout(yaxis=dict(tickfont=dict(size=11)), plot_bgcolor='white', margin=dict(t=60))
fig.update_traces(marker_line_color='black', marker_line_width=0.5)
# Export after styling so the saved HTML includes the bar outlines
pio.write_html(fig, "../all_visuals/top_20_most_reordered_products.html", full_html=True, include_plotlyjs='cdn')
fig.show()
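
The minimum-order threshold matters because a product ordered only a handful of times can show a reorder rate near 100% by chance. The following sketch shows one way such a filtered reorder rate could be computed. It assumes an order-line table with one row per product per order and a 0/1 `reordered` flag; that layout, the demo data, and the helper name `reorder_rates` are assumptions, not the notebook's actual pipeline. The demo uses a threshold of 2 instead of 100 only because the toy table is tiny.

```python
import pandas as pd

# Hypothetical order lines for illustration only: one row per product per order,
# with reordered = 1 if the user had bought the product before.
reorder_lines_demo = pd.DataFrame({
    "product_name": ["Banana"] * 4 + ["Milk"] * 2 + ["Rare Spice"],
    "reordered": [1, 1, 1, 0, 1, 0, 1],
})


def reorder_rates(order_lines: pd.DataFrame, min_orders: int) -> pd.DataFrame:
    """Return the reorder rate per product, keeping products with at least min_orders rows."""
    stats = (
        order_lines.groupby("product_name")["reordered"]
        .agg(total_orders="count", reorder_rate="mean")
        .reset_index()
    )
    eligible = stats[stats["total_orders"] >= min_orders]
    return eligible.sort_values("reorder_rate", ascending=False).reset_index(drop=True)


rates = reorder_rates(reorder_lines_demo, min_orders=2)
print(rates)
```

In the demo, "Rare Spice" has a 100% reorder rate from a single order and is excluded by the threshold, which is the situation the filter is meant to guard against.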

### Basket Size Distribution

This histogram shows how many products users typically include in a single order. Most baskets contain fewer than 20 items, indicating small-to-medium-sized orders. Understanding this distribution helps align delivery operations and marketing with actual user behavior.

In [51]:
fig = px.histogram(
    basket_sizes,
    x="basket_size",
    nbins=30,
    title="Basket Size Distribution",
    labels={"basket_size": "Products per Order"},
    color_discrete_sequence=["#636EFA"]
)
fig.update_layout(plot_bgcolor='white', title_font=dict(size=20), margin=dict(t=60))
fig.update_traces(marker_line_color='black', marker_line_width=0.5)
# Export after styling so the saved HTML includes the bar outlines
pio.write_html(fig, "../all_visuals/basket_size_distribution.html", full_html=True, include_plotlyjs='cdn')
fig.show()
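
The statement about typical basket size can be checked by computing the share of orders below a given item count. This is a minimal sketch, assuming basket size is the number of order-line rows per `order_id`. The order-line layout, the demo data, and the helper name `share_below` are hypothetical. The demo uses a threshold of 3 only because the toy table is tiny.

```python
import pandas as pd

# Hypothetical order lines for illustration only: one row per product in an order.
basket_lines_demo = pd.DataFrame({
    "order_id": [1, 1, 1, 2, 2, 3],
    "product_id": [10, 11, 12, 10, 13, 14],
})

# Basket size = number of product rows per order
basket_sizes_demo = (
    basket_lines_demo.groupby("order_id")
    .size()
    .rename("basket_size")
    .reset_index()
)


def share_below(sizes: pd.Series, threshold: int) -> float:
    """Return the fraction of baskets with strictly fewer than threshold items."""
    return float((sizes < threshold).mean())


share_under_3 = share_below(basket_sizes_demo["basket_size"], 3)
print(basket_sizes_demo)
print(f"Share of baskets under 3 items: {share_under_3:.1%}")
```

With the real `basket_sizes` frame, `share_below(basket_sizes["basket_size"], 20)` would give the share of baskets under 20 items.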

## Final Observations

- Most Instacart users fall into the "Loyal Users" or "One-Time Shoppers" segments, while high-value "Champions" form a smaller group.
- Retention declines after the third order, making that point a likely target for re-engagement.
- A small subset of products accounts for the majority of orders and reorders, suggesting strong SKU concentration.
- Baskets are generally modest in size, reflecting consistent routine shopping behavior rather than bulk purchases.

These findings can support marketing, inventory, and user lifecycle decisions.
