# Profit Analysis: What Can We Fix?

From notebook 02 we know that discounts kill profit. The question now: which products can we save by fixing the discount strategy, and which are broken no matter what?

A product that is strong at 0% discount but collapses above 20% is fixable.
A product that is weak even at 0% is structural: reducing discounts can't fix it.

This notebook shows the difference.

## 01. Setup

In [38]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

pd.options.display.float_format = '{:.2f}'.format

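df = pd.read_csv("sample_superstore_processed.csv")

# Two comparison buckets. Orders discounted between 0% and 20%
# (including exactly 20%) fall in neither bucket.
no_disc = df[df['Discount'] == 0]       # full-price orders
high_disc = df[df['Discount'] > 0.20]   # strictly above 20%; labelled "20%+" below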
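
The two filters above leave out every order discounted between 0% and 20%. A minimal sketch of how one might label each order with a discount band, so the bucket sizes can be checked before comparing them. The toy data and the band labels are assumptions for illustration and are not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for the real file; only the Discount column matters here.
toy = pd.DataFrame({'Discount': [0.0, 0.0, 0.1, 0.2, 0.3, 0.5, 0.8]})

# Bins are right-inclusive by default, so the bands are:
# exactly 0, (0, 0.20], and above 0.20 (the same cut as high_disc).
# The band labels are illustrative names, not from the notebook.
bands = pd.cut(
    toy['Discount'],
    bins=[-0.001, 0.0, 0.20, 1.0],
    labels=['none', 'light (0-20%]', 'heavy (>20%)'],
)

band_counts = bands.value_counts().sort_index()
print(band_counts)
```

On the real data, a small `heavy` count for a sub-category would be a reason to read its heavy-discount average with caution.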

## 02. Baseline: Profit at Full Price (0% Discount)

In [39]:
by_product = no_disc.groupby('Sub-Category')['Profit'].mean().round(2)
by_product = by_product.sort_values(ascending=False)

print('Average Profit per Order (0% Discount):')
print()
print(by_product)
print()
print('Strong: Copiers $1616, Machines $936, Tables $184')
print('Weak: Fasteners $5, Labels $19, Art $11')
print()
print('Question: which of these stay profitable at 20%+ discount?')

Average Profit per Order (0% Discount):

Sub-Category
Copiers       1616.19
Machines       935.79
Tables         184.39
Chairs         164.91
Binders        116.66
Phones         110.50
Bookcases      101.26
Appliances      85.55
Accessories     74.92
Storage         48.17
Envelopes       32.74
Paper           29.56
Furnishings     29.51
Labels          18.50
Supplies        14.69
Art             10.80
Fasteners        5.10
Name: Profit, dtype: float64

Strong: Copiers $1616, Machines $936, Tables $184
Weak: Fasteners $5, Labels $19, Art $11

Question: which of these stay profitable at 20%+ discount?


## 03. Compare: Baseline vs Heavy Discount


In [40]:
by_product_heavy = high_disc.groupby('Sub-Category')['Profit'].mean().round(2)

print('What Happens at 20%+ Discount:')
print()

for product in by_product.index:
    base = by_product[product]
    
    # Check if product has heavy discount data
    if product in by_product_heavy.index:
        heavy = by_product_heavy[product]
        loss = base - heavy
        
        print(f'{product}: ${base:.2f} → ${heavy:.2f} (lose ${loss:.2f})')
    else:
        print(f'{product}: no heavy discount orders')

print()
print('Of the products with 20%+ orders, only Copiers stays profitable.')
print('Binders, Machines, Tables go negative. That is fixable.')
print('Fasteners, Labels have thin baselines ($5, $19) and no 20%+ orders. Structural.')

What Happens at 20%+ Discount:

Copiers: $1616.19 → $242.55 (lose $1373.64)
Machines: $935.79 → $-557.65 (lose $1493.44)
Tables: $184.39 → $-174.42 (lose $358.81)
Chairs: $164.91 → $-42.64 (lose $207.55)
Binders: $116.66 → $-62.82 (lose $179.48)
Phones: $110.50 → $-58.59 (lose $169.09)
Bookcases: $101.26 → $-158.54 (lose $259.80)
Appliances: $85.55 → $-128.80 (lose $214.35)
Accessories: no heavy discount orders
Storage: no heavy discount orders
Envelopes: no heavy discount orders
Paper: no heavy discount orders
Furnishings: $29.51 → $-43.08 (lose $72.59)
Labels: no heavy discount orders
Supplies: no heavy discount orders
Art: no heavy discount orders
Fasteners: no heavy discount orders

Of the products with 20%+ orders, only Copiers stays profitable.
Binders, Machines, Tables go negative. That is fixable.
Fasteners, Labels have thin baselines ($5, $19) and no 20%+ orders. Structural.
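
The same baseline-versus-heavy comparison can also be built as one table instead of a loop. This is a sketch on made-up orders, assuming the notebook's column names (`Sub-Category`, `Discount`, `Profit`); the `drop` and `goes_negative` columns are hypothetical names added for illustration.

```python
import pandas as pd

# Toy orders; column names match the notebook, values are made up.
toy = pd.DataFrame({
    'Sub-Category': ['Copiers', 'Copiers', 'Tables', 'Tables', 'Art'],
    'Discount':     [0.0, 0.4, 0.0, 0.3, 0.0],
    'Profit':       [1000.0, 200.0, 150.0, -100.0, 10.0],
})

base = toy[toy['Discount'] == 0].groupby('Sub-Category')['Profit'].mean()
heavy = toy[toy['Discount'] > 0.20].groupby('Sub-Category')['Profit'].mean()

# Aligning on the index leaves NaN where a product has no heavy-discount orders.
comparison = pd.concat([base, heavy], axis=1, keys=['baseline', 'heavy'])
comparison['drop'] = comparison['baseline'] - comparison['heavy']
comparison['goes_negative'] = comparison['heavy'] < 0  # NaN compares as False

print(comparison.sort_values('baseline', ascending=False))
```

Products without heavy-discount orders show up as `NaN` rather than being skipped, which keeps "no data" visibly separate from "still profitable".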


## 04. Revenue Loss: Where Is the Damage?

From 02, we calculated how much profit is lost per product due to discounting.

In [41]:
# Revenue lost per product (from 02)
loss_per_product = {
    'Binders': 128324.13,
    'Machines': 98846.89,
    'Tables': 70722.23,
    'Phones': 65839.07,
    'Chairs': 65371.10,
    'Bookcases': 33276.55,
    'Copiers': 25319.63
}

total_loss = sum(loss_per_product.values())

print('Revenue Lost to Discounting:')
print()

sorted_losses = sorted(loss_per_product.items(), key=lambda x: x[1], reverse=True)

# Sum the three largest losses instead of hardcoding product names
top_3_total = sum(loss for _, loss in sorted_losses[:3])
top_3_pct = (top_3_total / total_loss) * 100

for product, loss in sorted_losses:
    pct = (loss / total_loss) * 100
    print(f'{product}: ${loss:,.0f} ({pct:.1f}%)')

print()
print(f'Top 3 products: ${top_3_total:,.0f}')
print(f'Total loss: ${total_loss:,.0f}')
print(f'Top 3 = {top_3_pct:.0f}% of total')
print()
print(f'Damage is concentrated. Fix top 3, fix about {top_3_pct:.0f}% of the listed loss.')

Revenue Lost to Discounting:

Binders: $128,324 (26.3%)
Machines: $98,847 (20.3%)
Tables: $70,722 (14.5%)
Phones: $65,839 (13.5%)
Chairs: $65,371 (13.4%)
Bookcases: $33,277 (6.8%)
Copiers: $25,320 (5.2%)

Top 3 products: $297,893
Total loss: $487,700
Top 3 = 61% of total

Damage is concentrated. Fix top 3, fix about 61% of the listed loss.
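
To see how quickly the loss accumulates, the same figures can be turned into a cumulative share. This sketch reuses the seven loss values from the cell above; the column names `share_pct` and `cumulative_pct` are illustrative.

```python
import pandas as pd

# Loss figures copied from the cell above (carried over from notebook 02).
losses = pd.Series({
    'Binders': 128324.13,
    'Machines': 98846.89,
    'Tables': 70722.23,
    'Phones': 65839.07,
    'Chairs': 65371.10,
    'Bookcases': 33276.55,
    'Copiers': 25319.63,
}).sort_values(ascending=False)

share = losses / losses.sum() * 100   # each product's share of the listed loss
cumulative = share.cumsum()           # running total, largest loss first

pareto = pd.DataFrame({
    'loss': losses,
    'share_pct': share,
    'cumulative_pct': cumulative,
})
print(pareto.round(1))
```

The third row of `cumulative_pct` is the top-3 share reported above, and the last row reaches 100% because only these seven products are included.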


## 05. Product-Level Analysis: Fixable vs Structural

In [42]:
print('COPIERS')
base = no_disc[no_disc['Sub-Category'] == 'Copiers']['Profit'].mean()
heavy = high_disc[high_disc['Sub-Category'] == 'Copiers']['Profit'].mean()
print(f'Baseline: ${base:.2f}')
print(f'At 20%+: ${heavy:.2f}')
print('Still profitable. Only product that can handle heavy discounts.')
print()

print('BINDERS')
print('Baseline: $117')
print('At 20%+: -$63')
print('Fixable - strong baseline, discount destroys it')
print()

print('MACHINES')
print('Baseline: $936')
print('At 20%+: -$558')
print('Fixable - strong baseline, discount destroys it')
print()

print('TABLES')
print('Baseline: $184')
print('At 20%+: -$174')
print('Fixable - moderate baseline, discount destroys it')
print()

print('FASTENERS')
print('Baseline: $5')
print('At 20%+: no orders')
print('Thin margin even at full price. Cutting discounts cannot help.')
print('Structural problem.')

COPIERS
Baseline: $1616.19
At 20%+: $242.55
Still profitable. Only product that can handle heavy discounts.

BINDERS
Baseline: $117
At 20%+: -$63
Fixable - strong baseline, discount destroys it

MACHINES
Baseline: $936
At 20%+: -$558
Fixable - strong baseline, discount destroys it

TABLES
Baseline: $184
At 20%+: -$174
Fixable - moderate baseline, discount destroys it

FASTENERS
Baseline: $5
At 20%+: no orders
Thin margin even at full price. Cutting discounts cannot help.
Structural problem.
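
The fixable-versus-structural split can be written down as an explicit rule. The sketch below uses per-order averages from section 03. The `THIN_MARGIN` cutoff of $20 and the label names are assumptions chosen for illustration; the notebook itself does not define a threshold.

```python
# Per-order average profit from section 03: (baseline, heavy).
# None means the product has no orders above 20% discount.
profits = {
    'Copiers': (1616.19, 242.55),
    'Machines': (935.79, -557.65),
    'Tables': (184.39, -174.42),
    'Binders': (116.66, -62.82),
    'Paper': (29.56, None),
    'Fasteners': (5.10, None),
}

THIN_MARGIN = 20.0  # hypothetical cutoff in dollars per order; not from the notebook


def classify(baseline, heavy, thin_margin=THIN_MARGIN):
    """Label a product by how its profit responds to heavy discounting."""
    if baseline < thin_margin:
        return 'structural'   # weak even at full price
    if heavy is None:
        return 'untested'     # no heavy-discount orders to compare against
    return 'fixable' if heavy < 0 else 'resilient'


labels = {name: classify(b, h) for name, (b, h) in profits.items()}
for name, label in labels.items():
    print(f'{name}: {label}')
```

Moving the cutoff changes which low-margin products count as structural, so the threshold is worth agreeing on before using the labels for decisions.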


## 06. Regional Comparison: Central vs West

In [43]:
print('BASELINE PROFIT BY REGION (0% Discount):')
print()

base_by_region = no_disc.groupby('Region')['Profit'].mean().round(2)

for region in sorted(base_by_region.index):
    print(f'{region}: ${base_by_region[region]:.2f} per order')

print()
print('From 02: Central gets 24% average discount, West gets 11%')
print()
print('Central has the best baseline, but discounting kills it.')
print('West has the weakest baseline, but discounts far less.')
print('Central should copy West strategy: reduce discount to 11%.')

BASELINE PROFIT BY REGION (0% Discount):

Central: $91.94 per order
East: $72.72 per order
South: $78.24 per order
West: $44.58 per order

From 02: Central gets 24% average discount, West gets 11%

Central has the best baseline, but discounting kills it.
West has the weakest baseline, but discounts far less.
Central should copy West strategy: reduce discount to 11%.
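
Putting baseline profit, actual profit, and average discount side by side makes the regional argument easier to check. This is a sketch on made-up orders, assuming the notebook's `Region`, `Discount`, and `Profit` columns; the `gap` column is a hypothetical name for the per-order profit given up relative to full price.

```python
import pandas as pd

# Toy orders; values are invented to illustrate the calculation only.
toy = pd.DataFrame({
    'Region':   ['Central', 'Central', 'Central', 'West', 'West', 'West'],
    'Discount': [0.0, 0.4, 0.3, 0.0, 0.1, 0.2],
    'Profit':   [90.0, -60.0, -30.0, 45.0, 30.0, 15.0],
})

# Average discount and average profit over all orders in each region
summary = toy.groupby('Region').agg(
    avg_discount=('Discount', 'mean'),
    avg_profit=('Profit', 'mean'),
)

# Baseline: average profit over full-price orders only
summary['baseline_profit'] = (
    toy[toy['Discount'] == 0].groupby('Region')['Profit'].mean()
)
summary['gap'] = summary['baseline_profit'] - summary['avg_profit']

print(summary.round(2))
```

A region with a high baseline and a large gap is the pattern the text attributes to Central.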


## 07. Volume Check: Do Discounts Drive Sales?

In [44]:
print('Quantity per Order: 0% vs 20%+ Discount')
print()

for product in ['Binders', 'Machines', 'Copiers']:
    qty_no_disc = no_disc[no_disc['Sub-Category'] == product]['Quantity'].mean()
    qty_high_disc = high_disc[high_disc['Sub-Category'] == product]['Quantity'].mean()
    
    if pd.notna(qty_high_disc):  # mean of an empty selection is NaN
        change = qty_high_disc - qty_no_disc
        print(f'{product}: {qty_no_disc:.2f} → {qty_high_disc:.2f} units (change: {change:+.2f})')
    else:
        print(f'{product}: no heavy discount orders')

print()
print('From 02: overall correlation is 0.009 (zero).')
print('Discounts don\'t drive volume.')
print('We lose profit without gaining sales.')

Quantity per Order: 0% vs 20%+ Discount

Binders: 3.83 → 4.01 units (change: +0.18)
Machines: 4.76 → 3.40 units (change: -1.36)
Copiers: 3.73 → 3.67 units (change: -0.06)

From 02: overall correlation is 0.009 (zero).
Discounts don't drive volume.
We lose profit without gaining sales.
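
The 0.009 figure comes from notebook 02 and is not recomputed here. For reference, a correlation of this kind could be computed along the following lines, assuming it relates `Discount` to `Quantity` per order. The toy data is invented, and the rank-based variant is an addition for illustration rather than something the notebook uses.

```python
import pandas as pd

# Toy orders, constructed so that quantity does not trend with discount.
toy = pd.DataFrame({
    'Discount': [0.0, 0.0, 0.2, 0.2, 0.4, 0.4],
    'Quantity': [3, 5, 2, 6, 4, 4],
})

# Pearson correlation: strength of the linear relationship
pearson = toy['Discount'].corr(toy['Quantity'])

# Rank correlation (Spearman), computed as Pearson on the ranks.
# It also picks up monotonic but non-linear relationships.
rank_corr = toy['Discount'].rank().corr(toy['Quantity'].rank())

print(f'Pearson: {pearson:.3f}')
print(f'Rank-based: {rank_corr:.3f}')
```

Both measures describe units per order. They say nothing about how many orders are placed, so order counts would need a separate check.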


## 08. Summary: What is Fixable?

FINDINGS:

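1. Damage is concentrated
   Binders, Machines, Tables = $297,893, which is 61% of the $487,700 lost across the seven products in section 04 (about 52% of the $566K total loss)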

2. These three are FIXABLE
   All have strong baselines at 0% discount
   All collapse at 20%+ discount
   Problem is discount strategy, not products

3. Copiers is different
   Stays profitable above 20% discount ($242.55 per order), though far below its $1,616.19 baseline

4. Weak baseline products (Fasteners, Labels)
   Thin margins even at 0% discount ($5.10 and $18.50 per order)
   Neither has any orders above 20% discount, so discounts are not the cause
   These are STRUCTURAL problems

5. Discounts don't drive volume
   A correlation of 0.009 means discounts reduce profit without a matching lift in volume
   Cutting discounts is unlikely to hurt sales

6. Central vs West
   Central: best baseline ($91.94 per order), heaviest discounting (24% average)
   West: weakest baseline ($44.58 per order), light discounting (11% average)
   Central should copy West's discount level

CONCLUSION: Most problems are fixable.
Change discount strategy, recover profit.
Don't worry about losing sales - discounts don't drive them anyway.

## 09. What's Next



Notebook 04 will build specific scenarios based on this analysis.

We know what's broken. Now we'll model: what happens if we fix it?

Three main scenarios:

SCENARIO 1: Stop discounting Binders and Machines completely
- How much profit recovery?
- What happens to volume? (The correlation from 02 is 0.009, so we expect minimal impact)

SCENARIO 2: Cap all discounts at 15%
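- Above 20% discount, average profit per order goes negative for every product except Copiers
- Hypothesis: a 15% cap keeps products profitable (this notebook only compares 0% with above 20%, so the scenario has to test it)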
- How much profit recovery at this cap?

SCENARIO 3: Central region strategy shift
- Copy West: reduce from 24% average discount to 11%
- Central baseline is $91.94 - good enough to sustain it
- How much does this single change improve Central profit?

For each scenario, we'll calculate:
- Profit impact (recovery amount)
- Volume impact (risk of losing sales)
- Combined effect on total business profit

The data from this notebook (0.009 volume correlation, baseline numbers, discount damage) provides the inputs for these models.
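
As a rough preview of the kind of calculation Scenario 1 involves, the sketch below estimates the profit recovered if heavy-discount orders were sold at full price instead. It is an assumption-laden illustration, not the model notebook 04 will use: the function name, the `retention` parameter (the share of orders that still happen without the discount), and the order count of 100 are all hypothetical. Only the per-order averages come from section 03.

```python
def estimated_recovery(n_orders, baseline_profit, heavy_profit, retention=1.0):
    """Profit change if heavy-discount orders earned the full-price average.

    n_orders: number of orders currently above 20% discount
    retention: fraction of those orders assumed to survive without a discount
    """
    profit_now = n_orders * heavy_profit
    profit_after = n_orders * retention * baseline_profit
    return profit_after - profit_now


N_HEAVY_ORDERS = 100  # hypothetical placeholder; the real count comes from the data

# Per-order averages from section 03
binders_full = estimated_recovery(N_HEAVY_ORDERS, 116.66, -62.82)
machines_full = estimated_recovery(N_HEAVY_ORDERS, 935.79, -557.65)

# Same Binders estimate if only 80% of the orders survive the price change
binders_80 = estimated_recovery(N_HEAVY_ORDERS, 116.66, -62.82, retention=0.8)

print(f'Binders, full retention: ${binders_full:,.2f}')
print(f'Binders, 80% retention: ${binders_80:,.2f}')
print(f'Machines, full retention: ${machines_full:,.2f}')
```

With `retention=1.0` the result is an upper bound, since it assumes no customer walks away when the discount is removed.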