# Retail Data – Business Questions (for `retail_data`)

**Assumed columns:** `[StoreID, Sales, Customers, Inventory, Returns]`

## Basic Performance Analysis
1. What is the **total revenue** generated across all stores?
2. Which store recorded the **highest sales**?
3. Which store recorded the **lowest sales**?
4. What is the **average daily sales** per store?
5. How many stores achieved sales **above ₹35,000**?

## Customer Insights
6. Which store had the **highest customer footfall**?
7. What is the **average number of customers** per store?
8. How many stores have **more than 450 customers**?
9. Which store has the **lowest customer count**?
10. Is there any store where **high sales do not correspond to high customers**?

## Inventory & Returns Analysis
11. What is the **total inventory** across all stores?
12. Which store has the **highest inventory stock**?
13. How many stores have **inventory less than 1300 units**?
14. Which store has the **highest return rate** (returns ÷ inventory)?
15. Which stores have **returns greater than 15 units**?

## Sales & Customer Correlation
16. Is there a **positive correlation** between sales and customers?
17. Which store has **high customer visits but low sales**?
18. Which store has **low customers but high sales**?
19. How many stores have **sales per customer above ₹90**?
20. Which stores could **benefit from customer acquisition campaigns** (low customers, high inventory)?

## Profitability & Growth
21. If profit margin is **20% of sales**, what is the **profit per store**?
22. Which store has the **highest profit**?
23. Which store has the **lowest profit**?
24. If sales increase by **10% next month**, what are the **new sales figures**?
25. How will **total profit** change with the 10% sales increase?

## Operational Strategy
26. Which stores need **inventory restocking** based on sales and current stock?
27. Which stores show **high returns percentage** and might require **quality checks**?
28. Which store is the **best performer overall** (sales, customers, inventory turnover)?
29. Which store is the **worst performer overall**?
30. Which stores can be considered for **expansion** based on strong sales and customer metrics?



In [3]:
import numpy as np

# Manually created array: [Store ID, Sales, Customers, Inventory, Returns]
retail_data = np.array([
    [1, 25000, 300, 1200, 10],
    [2, 32000, 450, 1500, 15],
    [3, 28000, 380, 1100, 8],
    [4, 41000, 500, 1600, 20],
    [5, 35000, 420, 1400, 12],
    [6, 27000, 360, 1300, 9],
    [7, 39000, 480, 1700, 18],
    [8, 30000, 400, 1250, 14],
    [9, 45000, 550, 1800, 25],
    [10, 33000, 410, 1450, 11]
])

print(retail_data)


[[    1 25000   300  1200    10]
 [    2 32000   450  1500    15]
 [    3 28000   380  1100     8]
 [    4 41000   500  1600    20]
 [    5 35000   420  1400    12]
 [    6 27000   360  1300     9]
 [    7 39000   480  1700    18]
 [    8 30000   400  1250    14]
 [    9 45000   550  1800    25]
 [   10 33000   410  1450    11]]
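
The cells that follow select columns by position (`retail_data[:, 1]` for sales, and so on) and often turn a row index into a store ID with `+1`. As a readability aid, here is a minimal sketch that names the columns and reads the store ID from the data itself. The constant names (`STORE_ID`, `SALES`, ...) are assumptions introduced for illustration and are not part of the original notebook.

```python
import numpy as np

# Same data as above: [Store ID, Sales, Customers, Inventory, Returns]
retail_data = np.array([
    [1, 25000, 300, 1200, 10],
    [2, 32000, 450, 1500, 15],
    [3, 28000, 380, 1100, 8],
    [4, 41000, 500, 1600, 20],
    [5, 35000, 420, 1400, 12],
    [6, 27000, 360, 1300, 9],
    [7, 39000, 480, 1700, 18],
    [8, 30000, 400, 1250, 14],
    [9, 45000, 550, 1800, 25],
    [10, 33000, 410, 1450, 11]
])

# Hypothetical column-index constants (not in the original notebook)
STORE_ID, SALES, CUSTOMERS, INVENTORY, RETURNS = range(5)

# Look up the store ID from column 0 instead of assuming ID == row index + 1
top_row = retail_data[:, SALES].argmax()
top_store = retail_data[top_row, STORE_ID]
print(top_store)
```

Reading the ID from column 0 keeps the answer correct even if the rows are later reordered or a store is removed.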


In [4]:
retail_data[:,1]

array([25000, 32000, 28000, 41000, 35000, 27000, 39000, 30000, 45000,
       33000])

In [6]:
# 1. What is the **total revenue** generated across all stores?
total_revenue = np.sum(retail_data[:, 1])
print(total_revenue)


335000


In [7]:
# 2. Which store recorded the **highest sales**?
highest_sales=retail_data[:,1].argmax()+1
print(highest_sales)

9


In [8]:
# 3. Which store recorded the **lowest sales**?
lowest_sales=retail_data[:,1].argmin()+1
print(lowest_sales)

1


In [9]:
# 4. What is the **average daily sales** per store?
average_daily_sales=retail_data[:,1].mean()
print(average_daily_sales)

33500.0


In [10]:
# 5. How many stores achieved sales **above ₹35,000**?
sales=retail_data[:,1]

count_above_35k=np.sum(sales>35000)
print(count_above_35k)

3


In [11]:
# 6. Which store had the **highest customer footfall**?
retail_data[:,2].argmax()+1

np.int64(9)

In [12]:
# 7. What is the **average number of customers** per store?
retail_data[:,2].mean()

np.float64(425.0)

In [13]:
# 8. How many stores have **more than 450 customers**?
customers=retail_data[:,2]

count_above_450=np.sum(customers>450)
print(count_above_450)

3


In [14]:
# 9. Which store has the **lowest customer count**?

# Use a separate name so the `customers` column array is not overwritten
lowest_customers_store=retail_data[:,2].argmin()+1
print(lowest_customers_store)

1


In [18]:
retail_data[:,1].mean()


np.float64(33500.0)

In [19]:
retail_data[:,2].mean()


np.float64(425.0)

In [16]:
# 10. Is there any store where **high sales do not correspond to high customers**?

# "High" and "low" are taken relative to the column averages
avg_sales = retail_data[:,1].mean()
avg_customers = retail_data[:,2].mean()
mismatch = retail_data[(retail_data[:,1] > avg_sales) & (retail_data[:,2] < avg_customers)]
mismatch

array([[    5, 35000,   420,  1400,    12]])

In [20]:
# 11. What is the **total inventory** across all stores?
total_inventory=np.sum(retail_data[:,3])
print(total_inventory)

14300


In [21]:
# 12. Which store has the **highest inventory stock**?
# Return the store ID (column 0) of the row with the largest inventory
retail_data[retail_data[:,3].argmax(), 0]

np.int64(9)

In [24]:
# 13. How many stores have **inventory less than 1300 units**?
inventory=retail_data[:,3]

inventory_less_than_1300=np.sum(inventory < 1300)
print(inventory_less_than_1300)


3


In [29]:
# 14. Which store has the **highest return rate** (returns ÷ inventory)?

# Extract columns
returns = retail_data[:, 4]
inventory = retail_data[:, 3]

# Calculate return rate
return_rate = returns / inventory

# Get the row index of the max return rate, then map it to the store ID
max_index = np.argmax(return_rate)

print(retail_data[max_index, 0])


9


In [31]:
# 15. Which stores have **returns greater than 15 units**?
returns=retail_data[:,4]

# List the store IDs rather than only counting them
stores_returns_above_15=retail_data[returns > 15][:, 0]
print(stores_returns_above_15)

[4 7 9]


In [35]:
# 16. Is there a **positive correlation** between sales and customers?
sales=retail_data[:,1]
customers=retail_data[:,2]

correlation_matrix=np.corrcoef(sales,customers)
print(correlation_matrix[0,1])

0.9593636912502479
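
As a cross-check on `np.corrcoef`, the same coefficient can be recomputed from the definition of Pearson's r. This is a minimal sketch, assuming the sales and customer columns are copied unchanged from `retail_data`; the names `sales_dev`, `customers_dev`, and `r_manual` are introduced here for illustration.

```python
import numpy as np

# Sales and customer columns copied from retail_data above
sales = np.array([25000, 32000, 28000, 41000, 35000, 27000, 39000, 30000, 45000, 33000])
customers = np.array([300, 450, 380, 500, 420, 360, 480, 400, 550, 410])

# Deviations of each value from its column mean
sales_dev = sales - sales.mean()
customers_dev = customers - customers.mean()

# Pearson r = sum of co-deviations / sqrt(product of the squared-deviation sums)
r_manual = (sales_dev * customers_dev).sum() / np.sqrt(
    (sales_dev ** 2).sum() * (customers_dev ** 2).sum()
)

print(r_manual)
# Confirm agreement with NumPy's built-in computation
print(np.isclose(r_manual, np.corrcoef(sales, customers)[0, 1]))
```

A value close to +1 indicates that stores with more customers also tend to record higher sales.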


In [40]:
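# 17. Which store has **high customer visits but low sales**?
# "High" and "low" are taken relative to the column averages
avg_customers = retail_data[:,2].mean()
avg_sales = retail_data[:,1].mean()
high_cust_low_sales = retail_data[(retail_data[:,2] > avg_customers) & (retail_data[:,1] < avg_sales)]
high_cust_low_sales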

array([[    2, 32000,   450,  1500,    15]])

In [46]:
# 18. Which store has **low customers but high sales**?
# Compute column averages to use as thresholds
avg_sales = retail_data[:, 1].mean()
avg_customers = retail_data[:, 2].mean()


# Condition: low customers, high sales
stores_low_cust_high_sales = retail_data[(retail_data[:,2] < avg_customers) & (retail_data[:,1] > avg_sales)]

print("Stores with low customers but high sales:\n", stores_low_cust_high_sales)


Stores with low customers but high sales:
 [[    5 35000   420  1400    12]]
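
Question 19 (sales per customer above ₹90) has no cell in this notebook. The following is a minimal sketch of one way to answer it, assuming the same `retail_data` array and that "sales per customer" means the sales column divided by the customers column. The variable names are introduced here for illustration.

```python
import numpy as np

# Same data as above: [Store ID, Sales, Customers, Inventory, Returns]
retail_data = np.array([
    [1, 25000, 300, 1200, 10],
    [2, 32000, 450, 1500, 15],
    [3, 28000, 380, 1100, 8],
    [4, 41000, 500, 1600, 20],
    [5, 35000, 420, 1400, 12],
    [6, 27000, 360, 1300, 9],
    [7, 39000, 480, 1700, 18],
    [8, 30000, 400, 1250, 14],
    [9, 45000, 550, 1800, 25],
    [10, 33000, 410, 1450, 11]
])

# 19. How many stores have sales per customer above ₹90?
sales = retail_data[:, 1]
customers = retail_data[:, 2]

# Element-wise ratio: average spend per customer at each store
sales_per_customer = sales / customers

# Count the stores whose ratio exceeds the ₹90 threshold
count_above_90 = np.sum(sales_per_customer > 90)

print(sales_per_customer.round(2))
print(count_above_90)
```

With this data the highest ratio is about ₹83.33 (stores 1 and 5), so no store clears the ₹90 threshold and the count is 0.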


In [48]:
# 20. Which stores could **benefit from customer acquisition campaigns** (low customers, high inventory)?
# Compute column averages to use as thresholds
avg_customers = retail_data[:, 2].mean()
avg_inventory = retail_data[:, 3].mean()



# Condition: low customers AND high inventory
stores_to_target = retail_data[(retail_data[:, 2] < avg_customers) & (retail_data[:, 3] > avg_inventory)]


print("Stores that could benefit from acquisition campaigns:\n", stores_to_target)


Stores that could benefit from acquisition campaigns:
 [[   10 33000   410  1450    11]]


In [49]:
# 21. If profit margin is **20% of sales**, what is the **profit per store**?
# Extract sales column
sales = retail_data[:, 1]

# Calculate profit per store
profit_per_store = sales * 0.20

print("Profit per store:\n", profit_per_store)


Profit per store:
 [5000. 6400. 5600. 8200. 7000. 5400. 7800. 6000. 9000. 6600.]


In [50]:
# 22. Which store has the **highest profit**?
# Extract sales column
sales = retail_data[:, 1]

# Calculate profit per store
profit_per_store = sales * 0.20

# Map the row index of the highest profit to its store ID
max_index = np.argmax(profit_per_store)
print(retail_data[max_index, 0])

9


In [51]:
# 23. Which store has the **lowest profit**?
sales = retail_data[:, 1]

# Calculate profit per store
profit_per_store = sales * 0.20

# Map the row index of the lowest profit to its store ID
min_index = np.argmin(profit_per_store)
print(retail_data[min_index, 0])

1


In [52]:
# 24. If sales increase by **10% next month**, what are the **new sales figures**?
# Extract sales column
sales = retail_data[:, 1]

# Calculate new sales after 10% increase
new_sales = sales * 1.10

print("New sales figures after 10% increase:\n", new_sales)


New sales figures after 10% increase:
 [27500. 35200. 30800. 45100. 38500. 29700. 42900. 33000. 49500. 36300.]


In [54]:
# 25. How will **total profit** change with the 10% sales increase?

# Extract sales column
sales = retail_data[:, 1]

# Old total profit
old_total_profit = (sales * 0.20).sum()

# New sales after 10% increase
new_sales = sales * 1.10

# New total profit
new_total_profit = (new_sales * 0.20).sum()

# Change in profit
profit_change = new_total_profit - old_total_profit

print("Old Total Profit:", old_total_profit)
print("New Total Profit:", new_total_profit)
print("Profit Change:", profit_change)


Old Total Profit: 67000.0
New Total Profit: 73700.0
Profit Change: 6700.0


In [55]:
# 26. Which stores need **inventory restocking** based on sales and current stock?
# Extract sales and inventory columns
sales = retail_data[:, 1]
inventory = retail_data[:, 3]

# Condition: sales greater than inventory.
# Caveat: sales are in rupees and inventory is in units, so every store passes this check.
stores_to_restock = retail_data[sales > inventory]

print("Stores needing restocking based on sales vs inventory:\n", stores_to_restock)


Stores needing restocking based on sales vs inventory:
 [[    1 25000   300  1200    10]
 [    2 32000   450  1500    15]
 [    3 28000   380  1100     8]
 [    4 41000   500  1600    20]
 [    5 35000   420  1400    12]
 [    6 27000   360  1300     9]
 [    7 39000   480  1700    18]
 [    8 30000   400  1250    14]
 [    9 45000   550  1800    25]
 [   10 33000   410  1450    11]]
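
Because the check above flags every store, a relative measure may be more informative. The sketch below is one possible alternative, not the notebook's original method: it flags stores whose sales-to-inventory ratio is above the average ratio, on the assumption that a high ratio means stock is being drawn down faster. The names `sales_to_stock` and `restock_ids` are introduced here for illustration.

```python
import numpy as np

# Same data as above: [Store ID, Sales, Customers, Inventory, Returns]
retail_data = np.array([
    [1, 25000, 300, 1200, 10],
    [2, 32000, 450, 1500, 15],
    [3, 28000, 380, 1100, 8],
    [4, 41000, 500, 1600, 20],
    [5, 35000, 420, 1400, 12],
    [6, 27000, 360, 1300, 9],
    [7, 39000, 480, 1700, 18],
    [8, 30000, 400, 1250, 14],
    [9, 45000, 550, 1800, 25],
    [10, 33000, 410, 1450, 11]
])

sales = retail_data[:, 1]
inventory = retail_data[:, 3]

# Rupees of sales per unit of stock held (an assumed proxy for stock pressure)
sales_to_stock = sales / inventory

# Flag stores whose ratio is above the average ratio across all stores
restock_ids = retail_data[sales_to_stock > sales_to_stock.mean()][:, 0]

print("Sales-to-stock ratio per store:", sales_to_stock.round(2))
print("Stores to prioritise for restocking:", restock_ids)
```

The average is only one choice of cutoff; a fixed business threshold or a percentile would work the same way.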


In [56]:
# 27. Which stores show **high returns percentage** and might require **quality checks**?
# Extract columns
sales = retail_data[:, 1]
returns = retail_data[:, 4]

# Calculate returns percentage
returns_percentage = (returns / sales) * 100

# Let's say threshold = 0.05%
threshold = 0.05

# Stores with high returns percentage
stores_high_returns = retail_data[returns_percentage > threshold]

print("Returns percentage per store:", returns_percentage)
print("\nStores with high returns percentage:\n", stores_high_returns)


Returns percentage per store: [0.04       0.046875   0.02857143 0.04878049 0.03428571 0.03333333
 0.04615385 0.04666667 0.05555556 0.03333333]

Stores with high returns percentage:
 [[    9 45000   550  1800    25]]
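
The cell above divides returned units by sales in rupees, whereas question 14 defines the return rate as returns ÷ inventory. For comparison, here is a minimal sketch that applies the question 14 definition and flags stores above the average rate. The average cutoff and the names `return_pct` and `flagged_ids` are assumptions made for illustration.

```python
import numpy as np

# Same data as above: [Store ID, Sales, Customers, Inventory, Returns]
retail_data = np.array([
    [1, 25000, 300, 1200, 10],
    [2, 32000, 450, 1500, 15],
    [3, 28000, 380, 1100, 8],
    [4, 41000, 500, 1600, 20],
    [5, 35000, 420, 1400, 12],
    [6, 27000, 360, 1300, 9],
    [7, 39000, 480, 1700, 18],
    [8, 30000, 400, 1250, 14],
    [9, 45000, 550, 1800, 25],
    [10, 33000, 410, 1450, 11]
])

inventory = retail_data[:, 3]
returns = retail_data[:, 4]

# Returned units as a percentage of units in stock (units / units)
return_pct = returns / inventory * 100

# Assumed cutoff: flag stores above the average return percentage
flagged_ids = retail_data[return_pct > return_pct.mean()][:, 0]

print("Return percentage per store:", return_pct.round(3))
print("Stores to consider for quality checks:", flagged_ids)
```

Store 9 is flagged under both definitions; the inventory-based rate additionally flags stores 2, 4, 7, and 8.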


In [59]:
# 28. Which store is the **best performer overall** (sales, customers, inventory turnover)?
import numpy as np

# Extract relevant columns
sales = retail_data[:, 1]
customers = retail_data[:, 2]
inventory = retail_data[:, 3]

# Calculate inventory turnover
inventory_turnover = sales / inventory

# Normalize each metric (0–1 scale)
sales_norm = (sales - sales.min()) / (sales.max() - sales.min())
customers_norm = (customers - customers.min()) / (customers.max() - customers.min())
turnover_norm = (inventory_turnover - inventory_turnover.min()) / (inventory_turnover.max() - inventory_turnover.min())

# Weighted score (equal weight for simplicity)
overall_score = sales_norm + customers_norm + turnover_norm

# Find best performer
best_index = np.argmax(overall_score)+1

print(f"Best performer overall is Store: {best_index}")


Best performer overall is Store: 9


In [60]:
# 29. Which store is the **worst performer overall**?
# Reuses overall_score from the previous cell
worst_index = np.argmin(overall_score)+1

print(f"Worst performer overall is Store: {worst_index}")


Worst performer overall is Store: 1
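
The best and worst stores are the two ends of a full ranking. The sketch below rebuilds the same equal-weight score and sorts every store by it. The helper `min_max` and the name `ranking` are introduced here for illustration; the scoring itself follows the cell for question 28.

```python
import numpy as np

# Same data as above: [Store ID, Sales, Customers, Inventory, Returns]
retail_data = np.array([
    [1, 25000, 300, 1200, 10],
    [2, 32000, 450, 1500, 15],
    [3, 28000, 380, 1100, 8],
    [4, 41000, 500, 1600, 20],
    [5, 35000, 420, 1400, 12],
    [6, 27000, 360, 1300, 9],
    [7, 39000, 480, 1700, 18],
    [8, 30000, 400, 1250, 14],
    [9, 45000, 550, 1800, 25],
    [10, 33000, 410, 1450, 11]
])

def min_max(values):
    """Rescale an array to the 0-1 range (hypothetical helper)."""
    return (values - values.min()) / (values.max() - values.min())

sales = retail_data[:, 1]
customers = retail_data[:, 2]
inventory_turnover = sales / retail_data[:, 3]

# Equal-weight sum of the three normalised metrics, as in question 28
overall_score = min_max(sales) + min_max(customers) + min_max(inventory_turnover)

# argsort is ascending, so reverse it to list the best store first
order = np.argsort(overall_score)[::-1]
ranking = retail_data[order, 0]

print("Stores ranked best to worst:", ranking)
```

The first and last entries of `ranking` correspond to the answers to questions 28 and 29.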


In [61]:
# 30. Which stores can be considered for **expansion** based on strong sales and customer metrics?
# Calculate averages
avg_sales = np.mean(retail_data[:, 1])
avg_customers = np.mean(retail_data[:, 2])

# Filter stores meeting both criteria
expansion_candidates = retail_data[
    (retail_data[:, 1] > avg_sales) &
    (retail_data[:, 2] > avg_customers)
][:, 0]

print("Stores suitable for expansion:", expansion_candidates)


Stores suitable for expansion: [4 7 9]
