# E-Commerce Sales Analysis using NumPy
## Analyzing customer orders, discounts, and revenue trends with NumPy
In this project, we analyze synthetic (randomly generated) sales data for an e-commerce store using NumPy.
We explore key business metrics such as total revenue, average discount, and average order value.
The project illustrates core NumPy data analysis techniques in a realistic business scenario.

In [2]:
import numpy as np



In [3]:
# Set random seed for reproducibility
np.random.seed(42)

# Number of records
num_records = 100  

# Generate random customer IDs (1000 to 1099; randint's upper bound is exclusive)
customer_ids = np.random.randint(1000, 1100, num_records)

# Product categories (random selection)
product_categories = np.random.choice(
    ['Electronics', 'Clothing', 'Home & Kitchen', 'Beauty', 'Sports'], num_records
)

# Generate random product prices (between $10 and $500)
product_prices = np.random.uniform(10, 500, num_records)

# Generate random discounts (0% to 30%)
discounts = np.random.uniform(0, 30, num_records)  

# Generate random quantities purchased (1 to 4; randint's upper bound is exclusive)
quantities = np.random.randint(1, 5, num_records)

# Generate orders per month (1 to 9; randint's upper bound is exclusive)
orders_per_month = np.random.randint(1, 10, num_records)

# Calculate total purchase value after discount
total_purchases = product_prices * quantities * (1 - discounts / 100)

# Combine into a single 2-D array (mixing strings and numbers coerces every value to a string)
sales_data = np.column_stack((
    customer_ids, product_categories, product_prices, discounts, quantities, total_purchases, orders_per_month
))

# Print first 5 rows
print("Sample Dataset:\n", sales_data[:5])

Sample Dataset:
 [['1051' 'Clothing' '133.37332495442843' '28.61785731007762' '3'
  '285.61421138779184' '4']
 ['1092' 'Electronics' '253.65176788726887' '27.445931706613457' '4'
  '736.1387076012455' '2']
 ['1014' 'Beauty' '157.43037181021714' '11.104761007663331' '1'
  '139.94810526721673' '1']
 ['1071' 'Beauty' '149.57184224495913' '0.46369849586602285' '2'
  '297.7565597244603' '7']
 ['1060' 'Beauty' '28.07460420372107' '27.849556877631763' '2'
  '40.51190267567154' '8']]
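
The sample output above shows every value in quotes: `np.column_stack` promotes mixed columns to a common string dtype, so the numeric columns of `sales_data` cannot be used for arithmetic without converting them back. One possible alternative is a structured array that keeps a separate dtype per field. This is a sketch on a hypothetical three-row sample; the field names and values are chosen for illustration and are not part of the dataset.

```python
import numpy as np

# Hypothetical three-row sample; values are illustrative, not taken from the dataset above.
customer_ids = np.array([1001, 1002, 1003])
categories = np.array(['Clothing', 'Beauty', 'Sports'])
prices = np.array([19.99, 5.50, 120.00])

# Plain column_stack coerces every column to a common string dtype.
stacked = np.column_stack((customer_ids, categories, prices))

# A structured array keeps one dtype per named field (field names are assumptions).
record_dtype = np.dtype([('customer_id', 'i8'), ('category', 'U20'), ('price', 'f8')])
records = np.empty(len(customer_ids), dtype=record_dtype)
records['customer_id'] = customer_ids
records['category'] = categories
records['price'] = prices

print("Stacked dtype kind:", stacked.dtype.kind)      # 'U' means Unicode strings
print("Sum of prices:", records['price'].sum())       # numeric sum works directly
```

With this layout, a column such as `records['price']` stays numeric, while `stacked[:, 2]` would need an explicit `astype(float)` first.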



What is the total revenue generated (after discounts)?

In [5]:
total_revenue = np.sum(total_purchases)
print("Total Revenue After Discounts: $", total_revenue)

Total Revenue After Discounts: $ 51007.343876437655


What is the average order value (AOV)?

In [7]:
average_order_value = np.mean(total_purchases)
print("Average Order Value (AOV): $", average_order_value)

Average Order Value (AOV): $ 510.07343876437653
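
Written out, the two metrics above follow directly from the `total_purchases` formula. The symbols are introduced here for illustration: $p_i$, $q_i$ and $d_i$ are the price, quantity and discount percentage of order $i$, and $N$ is the number of orders (`num_records`).

$$
R = \sum_{i=1}^{N} p_i \, q_i \left(1 - \frac{d_i}{100}\right), \qquad \mathrm{AOV} = \frac{R}{N}
$$

With $N = 100$, the AOV is the total revenue divided by 100, which matches the two printed values.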


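Which customer placed the largest single order?

In [9]:
# argmax over order values finds the largest single order, not a customer's combined spending
highest_order_index = np.argmax(total_purchases)
highest_order_customer = customer_ids[highest_order_index]
print("Customer with the largest single order:", highest_order_customer)

Customer with the largest single order: 1002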
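
Because customer IDs repeat across orders, the customer behind the largest single order is not necessarily the one with the highest combined spending. A minimal sketch of summing order values per customer uses `np.unique` with `return_inverse=True` together with `np.bincount`. The data below is hypothetical and chosen so that the two answers differ.

```python
import numpy as np

# Hypothetical order lines: customer 1002 has the largest single order,
# but customer 1001 spends more in total across two orders.
customer_ids = np.array([1001, 1002, 1001, 1003])
total_purchases = np.array([300.0, 450.0, 250.0, 100.0])

# Map each order to the index of its customer in the sorted unique-ID array.
unique_ids, inverse = np.unique(customer_ids, return_inverse=True)

# Sum order values per customer; bincount adds the weights that share an index.
spend_per_customer = np.bincount(inverse, weights=total_purchases)

top_spender = unique_ids[np.argmax(spend_per_customer)]
largest_single_order = customer_ids[np.argmax(total_purchases)]

print("Top spender by total:", top_spender)
print("Customer with largest single order:", largest_single_order)
```

The same pattern should work on the full `customer_ids` and `total_purchases` arrays without modification.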


What is the average discount applied?

In [11]:
average_discount = np.mean(discounts)
print("Average Discount Given: ", average_discount, "%")

Average Discount Given:  15.429087254459805 %


Which product category generated the highest revenue?

In [13]:
unique_categories = np.unique(product_categories)
category_revenue = {
    category: np.sum(total_purchases[product_categories == category])
    for category in unique_categories
}
highest_revenue_category = max(category_revenue, key=category_revenue.get)
print("Highest Revenue Generating Category:", highest_revenue_category)

Highest Revenue Generating Category: Electronics


How many unique customers made purchases?

In [15]:
unique_customers = np.unique(customer_ids)
print("Total Unique Customers:", len(unique_customers))

Total Unique Customers: 61


What is the most frequently purchased product category?

In [17]:
unique_categories, counts = np.unique(product_categories, return_counts=True)  
most_frequent_category = unique_categories[np.argmax(counts)]  
print("Most Frequently Purchased Category:", most_frequent_category)

Most Frequently Purchased Category: Electronics


Which customer placed the most orders per month?

In [19]:
# np.argmax returns only the first maximum; other records may tie for the same value
max_orders_index = np.argmax(orders_per_month)
customer_most_orders = customer_ids[max_orders_index]  
print("Customer with the most orders per month:", customer_most_orders)

Customer with the most orders per month: 1087
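
With 100 records and order counts limited to a small range of integers, several records are likely to share the maximum, and `np.argmax` reports only the first of them. A sketch of listing every tied customer, on hypothetical data that contains a tie:

```python
import numpy as np

# Hypothetical data with a tie: two customers reach the maximum of 9 orders.
customer_ids = np.array([1010, 1020, 1030, 1040])
orders_per_month = np.array([9, 3, 9, 5])

# np.argmax reports only the first position of the maximum.
first_only = customer_ids[np.argmax(orders_per_month)]

# Comparing against the maximum yields every tied position.
tied_positions = np.flatnonzero(orders_per_month == orders_per_month.max())
all_top_customers = np.unique(customer_ids[tied_positions])

print("argmax result:", first_only)
print("All customers at the maximum:", all_top_customers)
```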


What is the average quantity of items purchased per order?

In [21]:
average_quantity = np.mean(quantities)
print("Average Quantity Purchased per Order:", average_quantity)

Average Quantity Purchased per Order: 2.45


What percentage of orders had a discount of 20% or more?

In [23]:
high_discount_orders = np.sum(discounts >= 20)  
percentage_high_discount = (high_discount_orders / num_records) * 100  
print("Percentage of Orders with 20%+ Discount:", percentage_high_discount, "%")

Percentage of Orders with 20%+ Discount: 36.0 %
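
An equivalent shortcut is to take the mean of the boolean mask, since the mean of `True`/`False` values is the fraction of `True` entries. A small sketch on hypothetical discount values:

```python
import numpy as np

# Hypothetical discount percentages for five orders.
discounts = np.array([5.0, 20.0, 25.0, 12.5, 30.0])

# The mean of a boolean mask is the fraction of True entries.
share_high_discount = np.mean(discounts >= 20) * 100

print(f"Orders with 20%+ discount: {share_high_discount:.1f}%")
```

This avoids dividing by a separately tracked record count, which helps if the array is later filtered.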


Which product category had the lowest total revenue?

In [25]:
bottom_category = min(category_revenue, key=category_revenue.get)
print("Lowest Revenue-Generating Category:", bottom_category)

Lowest Revenue-Generating Category: Clothing
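
The highest- and lowest-revenue categories are the two ends of a single ranking, so sorting the `category_revenue` dictionary shows both at once along with everything in between. The sketch below rebuilds the dictionary on hypothetical order values; `ranking` is a name introduced here for illustration.

```python
import numpy as np

# Hypothetical orders; category names match the dataset, values are illustrative.
product_categories = np.array(['Beauty', 'Sports', 'Beauty', 'Clothing', 'Sports'])
total_purchases = np.array([40.0, 100.0, 60.0, 30.0, 150.0])

# Revenue per category, built with a boolean mask per category as in the notebook.
category_revenue = {
    str(category): float(np.sum(total_purchases[product_categories == category]))
    for category in np.unique(product_categories)
}

# Sort (category, revenue) pairs from highest to lowest revenue.
ranking = sorted(category_revenue.items(), key=lambda item: item[1], reverse=True)

for name, revenue in ranking:
    print(f"{name}: ${revenue:.2f}")
```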


What is the total revenue lost due to discounts?

In [27]:
total_revenue_without_discount = np.sum(product_prices * quantities)
total_discount_loss = total_revenue_without_discount - total_revenue

print("Total Revenue Lost Due to Discounts: $", round(total_discount_loss, 2))

Total Revenue Lost Due to Discounts: $ 8881.83
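
The loss can also be computed per order line as price × quantity × discount / 100, which should agree with the difference of totals used above. The sketch below checks this on hypothetical values and derives a revenue-weighted discount rate, which generally differs from the simple `np.mean(discounts)` because large orders carry more weight. The names `loss_per_line` and `effective_rate` are introduced here and are not part of the notebook.

```python
import numpy as np

# Hypothetical order lines.
product_prices = np.array([100.0, 50.0, 200.0])
quantities = np.array([2, 1, 1])
discounts = np.array([10.0, 0.0, 25.0])  # percent

gross = product_prices * quantities          # revenue before discounts
net = gross * (1 - discounts / 100)          # revenue after discounts

# Loss computed as a difference of totals, as in the cell above.
loss_by_difference = gross.sum() - net.sum()

# Loss computed directly per order line.
loss_per_line = gross * discounts / 100

# Revenue-weighted discount rate: total loss as a share of gross revenue.
effective_rate = loss_per_line.sum() / gross.sum() * 100

print("Methods agree:", np.isclose(loss_by_difference, loss_per_line.sum()))
print(f"Effective discount rate: {effective_rate:.2f}%")
print(f"Simple average discount: {np.mean(discounts):.2f}%")
```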
