
# Notebook 05 — Advanced Insights & Business Storytelling
This notebook generates advanced business insights from the sales dataset, including:
- Customer Segmentation
- Product Profitability Ranking
- Region Performance Mapping
- Trend Forecast Preparation
- Business Questions + Insights (Storytelling Mode)

This is part of a complete BI/Analytics portfolio for interviews.


In [12]:
import pandas as pd

# Load dataset
df = pd.read_csv('/content/data/sales_data.csv')

df.head()


Unnamed: 0,order_id,order_date,customer_id,category,sub_category,product,quantity,unit_price,sales,region
0,1001,2023-01-02,C001,Office Supplies,Binders,Elastic Binder,2,5.0,10.0,East
1,1002,2023-01-03,C002,Furniture,Chairs,Ergo Chair,1,150.0,150.0,West
2,1003,2023-01-04,C003,Technology,Phones,SmartPhone X,1,700.0,700.0,North
3,1004,2023-01-05,C001,Office Supplies,Paper,Copy Paper,10,3.5,35.0,East
4,1005,2023-01-06,C004,Technology,Laptops,UltraBook Pro,1,1200.0,1200.0,South


## Step 3 — Customer Segmentation (High / Medium / Low Value Customers)
We segment customers based on their total revenue contribution.
This analysis helps identify VIP customers and retention priorities.


In [11]:
# Total revenue per customer, using the existing 'sales' column
customer_revenue = df.groupby('customer_id')['sales'].sum().reset_index()

# Compute the quantile thresholds once instead of on every function call
q50 = customer_revenue['sales'].quantile(0.50)
q75 = customer_revenue['sales'].quantile(0.75)

def segment_customer(value):
    """Label a customer by total revenue: top quartile, above median, or the rest."""
    if value > q75:
        return 'High Value'
    elif value > q50:
        return 'Medium Value'
    else:
        return 'Low Value'

customer_revenue['segment'] = customer_revenue['sales'].apply(segment_customer)

customer_revenue.head(10)


Unnamed: 0,customer_id,sales,segment
0,C001,45.0,Low Value
1,C002,150.0,Low Value
2,C003,700.0,Medium Value
3,C004,1200.0,High Value
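
The same quantile rule can also be applied without a row-by-row function. The sketch below is one possible vectorized variant using `pd.cut`; the inline sample frame simply mirrors the output above, and the bin edges assume the 50th and 75th percentiles differ (`pd.cut` rejects duplicate edges by default).

```python
import numpy as np
import pandas as pd

# Sample totals copied from the output above (stand-in for the real groupby result)
customer_revenue = pd.DataFrame({
    "customer_id": ["C001", "C002", "C003", "C004"],
    "sales": [45.0, 150.0, 700.0, 1200.0],
})

# Compute both thresholds in one call
q50, q75 = customer_revenue["sales"].quantile([0.50, 0.75])

# Right-closed bins reproduce the original rule:
# sales > q75 -> High, sales > q50 -> Medium, otherwise Low
customer_revenue["segment"] = pd.cut(
    customer_revenue["sales"],
    bins=[-np.inf, q50, q75, np.inf],
    labels=["Low Value", "Medium Value", "High Value"],
)

print(customer_revenue)
```

With only four customers the quartile boundaries are interpolated between neighbouring values, so the labels are best read as relative ranks rather than fixed spending tiers.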


## Step 4 — Product Profitability Ranking
We rank products by their total revenue. The dataset has no cost column, so revenue serves as a proxy for profitability here.
This analysis helps the business identify:
- Top performing products
- Low revenue products that may need promotion
- Inventory planning decisions
- High-demand product categories


In [13]:
# Product wise total revenue
product_profit = df.groupby('product')['sales'].sum().reset_index()
# Sort products from highest to lowest revenue
product_profit = product_profit.sort_values(by='sales', ascending=False)

# Display top 10 products
product_profit.head(10)



Unnamed: 0,product,sales
4,UltraBook Pro,1200.0
3,SmartPhone X,700.0
2,Ergo Chair,150.0
0,Copy Paper,35.0
1,Elastic Binder,10.0
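
To see how concentrated revenue is, the ranking can be extended with each product's share of the total and a running cumulative share. This is a minimal sketch: the frame is rebuilt from the output above, and the column names `revenue_share` and `cumulative_share` are illustrative choices, not part of the original notebook.

```python
import pandas as pd

# Ranked product revenue copied from the output above (already sorted descending)
product_profit = pd.DataFrame({
    "product": ["UltraBook Pro", "SmartPhone X", "Ergo Chair",
                "Copy Paper", "Elastic Binder"],
    "sales": [1200.0, 700.0, 150.0, 35.0, 10.0],
})

# Share of total revenue per product
total_revenue = product_profit["sales"].sum()
product_profit["revenue_share"] = product_profit["sales"] / total_revenue

# Running total of the shares, top product first
product_profit["cumulative_share"] = product_profit["revenue_share"].cumsum()

print(product_profit.round(3))
```

The cumulative column makes it easy to read off how many products are needed to reach a given portion of revenue.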


## Step 5 — Region Performance Analysis
We analyze sales distribution across regions to understand:
- Which regions contribute the most revenue
- Areas needing marketing or operational improvements
- Geographic sales patterns for business expansion


In [9]:
# Region wise total revenue
region_performance = df.groupby('region')['sales'].sum().reset_index()

# Sort regions from highest to lowest revenue
region_performance = region_performance.sort_values(by='sales', ascending=False)

region_performance


Unnamed: 0,region,sales
2,South,1200.0
1,North,700.0
3,West,150.0
0,East,45.0
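
Regional totals hide which categories drive each region. One way to break them down, sketched here with the five sample rows shown by `df.head()`, is a region-by-category pivot; the variable name `region_category` is an assumption made for this example.

```python
import pandas as pd

# The five sample rows shown earlier, reduced to the columns needed here
df = pd.DataFrame({
    "region": ["East", "West", "North", "East", "South"],
    "category": ["Office Supplies", "Furniture", "Technology",
                 "Office Supplies", "Technology"],
    "sales": [10.0, 150.0, 700.0, 35.0, 1200.0],
})

# Rows are regions, columns are categories, cells are summed sales
region_category = df.pivot_table(
    index="region",
    columns="category",
    values="sales",
    aggfunc="sum",
    fill_value=0,
)

print(region_category)
```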


## Step 6 — Business Storytelling Insights

In this section, we convert raw numbers into meaningful insights for business stakeholders.
The goal is to answer:
- What is happening in the business?
- Why is it happening?
- What actions should the business take next?

These insights help business leaders make decisions, improve sales, reduce losses, and find opportunities.


In [14]:
# Automatic business insights generation

insights = {}

# 1. Best performing region
best_region = region_performance.iloc[0]
insights['best_region'] = f"The highest performing region is **{best_region['region']}** with total revenue of **{best_region['sales']}**."

# 2. Low performing region
low_region = region_performance.iloc[-1]
insights['low_region'] = f"The lowest performing region is **{low_region['region']}**, contributing only **{low_region['sales']}** in revenue."

# 3. Highest revenue product
best_product = product_profit.iloc[0]
insights['best_product'] = f"The best-selling product is **{best_product['product']}** with revenue of **{best_product['sales']}**."

# 4. Customer segments summary
high_value_count = customer_revenue[customer_revenue['segment'] == 'High Value'].shape[0]
medium_value_count = customer_revenue[customer_revenue['segment'] == 'Medium Value'].shape[0]
low_value_count = customer_revenue[customer_revenue['segment'] == 'Low Value'].shape[0]

insights['customer_split'] = (
    f"Customer segmentation results: **{high_value_count} High Value**, "
    f"**{medium_value_count} Medium Value**, and **{low_value_count} Low Value customers**."
)

# Display all insights
for value in insights.values():
    print(f"- {value}")


- The highest performing region is **South** with total revenue of **1200.0**.
- The lowest performing region is **East**, contributing only **45.0** in revenue.
- The best-selling product is **UltraBook Pro** with revenue of **1200.0**.
- Customer segmentation results: **1 High Value**, **1 Medium Value**, and **2 Low Value customers**.
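
Because `print` writes plain text, the `**` markers appear literally in the output above. Inside a notebook, the same strings could be rendered as formatted Markdown instead. The sketch below assumes IPython is available (as it is in Colab and Jupyter) and falls back to `print` otherwise; the two sample insights are copied from the printed output.

```python
# Two of the generated insights, copied from the printed output above
insights = {
    "best_region": "The highest performing region is **South** with total revenue of **1200.0**.",
    "low_region": "The lowest performing region is **East**, contributing only **45.0** in revenue.",
}

# Build one Markdown bullet list from all insight strings
markdown_report = "\n".join(f"- {text}" for text in insights.values())

try:
    # Available in Jupyter/Colab; renders the ** markers as bold text
    from IPython.display import Markdown, display
    display(Markdown(markdown_report))
except ImportError:
    # Plain-Python fallback: same text, unrendered
    print(markdown_report)
```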


**Business Storytelling Summary**

Based on the sales dataset, here is the business story that describes what is happening, why it is happening, and what actions the business should take:

1. Regional Performance Summary

Sales are unevenly distributed across regions: South leads with 1,200.0 in revenue, followed by North (700.0), West (150.0), and East (45.0).
The gap may reflect a stronger customer presence, higher demand, or more efficient operations in the leading regions, although this dataset alone cannot confirm the cause.
Low-performing regions such as East need attention in terms of marketing, product visibility, and logistics.

2. Product Profitability Insights

A few products generate far more revenue than the rest: UltraBook Pro (1,200.0) and SmartPhone X (700.0) together account for roughly 91% of total revenue.
These top-revenue products are the backbone of the business and should be supported with:

- Inventory priority
- Targeted advertising
- Special seasonal promotions

Low-revenue products may require:

- Marketing campaigns
- Price adjustments
- Bundling with high-performing items
- Removal from the catalog if they prove unprofitable

3. Customer Segmentation Analysis

Customers were segmented into High, Medium, and Low value groups based on their total spending.
Because the thresholds are quantiles of the same data, the segments are relative: High-Value customers are the top quartile of spenders, not customers above a fixed spending level.

Actionable insights:

- High-Value customers → Retention strategies, loyalty programs
- Medium-Value customers → Upsell & cross-sell
- Low-Value customers → Awareness campaigns, introductory offers

4. Time-Based Business Trends

Month-wise and other time-based patterns, derived from the `order_date` column, can reveal variations in sales behavior.
Understanding these patterns helps the business plan:

- Seasonal demand
- Stock management
- Promotions aligned with high-demand months
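
A month-wise view can be derived from the `order_date` column. The following is a minimal sketch on the five sample rows shown by `df.head()`; `monthly_sales` is a name introduced for this example, and the full dataset would be expected to span more than one month.

```python
import pandas as pd

# Sample rows from df.head(), reduced to the columns needed here
df = pd.DataFrame({
    "order_date": ["2023-01-02", "2023-01-03", "2023-01-04",
                   "2023-01-05", "2023-01-06"],
    "sales": [10.0, 150.0, 700.0, 35.0, 1200.0],
})

# Parse the date strings so the .dt accessor becomes available
df["order_date"] = pd.to_datetime(df["order_date"])

# Group by calendar month and sum revenue
monthly_sales = df.groupby(df["order_date"].dt.to_period("M"))["sales"].sum()

print(monthly_sales)
```

The sample rows all fall in January 2023, so they produce a single monthly total; seasonal patterns only become visible with several months of data.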

5. Weekend Behavioral Observations

Orders placed on weekends may show distinct patterns.
This behavior can guide:

- Weekend promotions
- Delivery planning
- Staffing decisions
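
Weekend behavior can be checked by flagging each order's day of the week. This sketch again uses the five sample rows; the `is_weekend` column and `weekend_summary` frame are hypothetical names chosen for illustration.

```python
import pandas as pd

# Sample rows from df.head(), reduced to the columns needed here
df = pd.DataFrame({
    "order_date": ["2023-01-02", "2023-01-03", "2023-01-04",
                   "2023-01-05", "2023-01-06"],
    "sales": [10.0, 150.0, 700.0, 35.0, 1200.0],
})
df["order_date"] = pd.to_datetime(df["order_date"])

# dayofweek runs from Monday=0 to Sunday=6, so 5 and 6 are the weekend
df["is_weekend"] = df["order_date"].dt.dayofweek >= 5

# Order count and revenue for weekday versus weekend orders
weekend_summary = df.groupby("is_weekend")["sales"].agg(["count", "sum"])

print(weekend_summary)
```

The five sample orders run from Monday to Friday, so the summary has only a weekday row; a weekend row appears once the data contains Saturday or Sunday orders.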

6. Overall Business Conclusion

The business has strong potential in specific regions and with specific products.
Focusing on the following areas is likely to yield the greatest revenue growth:

- High-performing products
- Strong regions
- High-value customers

Underperforming areas present clear opportunities for improvement through targeted strategies.