# 🛒 Sales Data Analysis Project

Welcome to the Sales Analysis project using **Pandas** and **NumPy**.  
In this project, we will clean, analyze, and extract insights from sales data.  
We'll also convert the data to NumPy arrays, compute revenue, identify top performers, export the results to CSV, and plot revenue by category.

---

### 📊 What You Will Learn:
- Data cleaning using Pandas
- NumPy array manipulation
- Grouping and aggregation
- Revenue and performance analysis
- Exporting results to CSV


Import Libraries

In [None]:
import pandas as pd
import numpy as np

Load the Dataset 

In [None]:
# Read the raw sales data, then preview the first five rows
df = pd.read_csv("sales_data.csv")
df.head()

Data Cleaning

In [None]:
# Drop rows with missing Product or Category
df = df.dropna(subset=["Product", "Category"])

# Fill missing Price with mean
df["Price"] = df["Price"].fillna(df["Price"].mean())

# Fill missing Quantity with 0
df["Quantity"] = df["Quantity"].fillna(0)
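
To see what these three steps do without needing `sales_data.csv`, here is a minimal sketch on a small made-up DataFrame. The rows and the names `sample` and `cleaned` are hypothetical; only the column names come from the cell above.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; the real sales_data.csv may look different.
sample = pd.DataFrame({
    "Product": ["Pen", None, "Lamp", "Desk"],
    "Category": ["Office", "Office", "Home", "Home"],
    "Price": [2.0, 5.0, np.nan, 100.0],
    "Quantity": [10, 3, np.nan, 1],
})

# Same three steps as the cleaning cell, applied to the sample.
cleaned = sample.dropna(subset=["Product", "Category"]).copy()
cleaned["Price"] = cleaned["Price"].fillna(cleaned["Price"].mean())
cleaned["Quantity"] = cleaned["Quantity"].fillna(0)

print(cleaned)
```

Note that the mean price is computed after the rows with a missing Product or Category are dropped, so the dropped rows do not influence the fill value.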

Convert Pandas DataFrame to NumPy Array

In [None]:
np_array = df.to_numpy()
print("Converted DataFrame to NumPy array:")
print(np_array)
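
When a DataFrame mixes text and numeric columns, `to_numpy()` returns an array of generic Python objects, which is awkward for arithmetic. The sketch below, using a made-up `sample` frame, shows one possible way to pull only the numeric columns into a float array and compute revenue with NumPy. The data and variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one text column and two numeric columns.
sample = pd.DataFrame({
    "Product": ["Pen", "Lamp", "Desk"],
    "Price": [2.0, 20.0, 100.0],
    "Quantity": [10, 2, 1],
})

# Converting the whole frame yields an object array because of the text column.
mixed = sample.to_numpy()
print(mixed.dtype)

# Selecting only numeric columns gives a float array suited to vectorized math.
numeric = sample[["Price", "Quantity"]].to_numpy(dtype=float)

# Column 0 is Price, column 1 is Quantity; multiply element-wise.
revenue = numeric[:, 0] * numeric[:, 1]
print(revenue)
```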


Calculate Total Revenue

In [None]:
df["Revenue"] = df["Price"] * df["Quantity"]
df.head()

Revenue Analysis by Product and Category 

In [None]:
# Revenue per product
print("Total Revenue by Product:")
print(df.groupby("Product")["Revenue"].sum())

# Revenue per category
print("\nTotal Revenue by Category:")
print(df.groupby("Category")["Revenue"].sum())

Above-Average Price Products

In [None]:
average_price = df["Price"].mean()
print("Average Price:", average_price)

above_avg_products = df[df["Price"] > average_price][["Product", "Price"]]
print("\nProducts with Above-Average Prices:")
print(above_avg_products)


Top 3 Performing Products by Quantity

In [None]:
product_performance = df.groupby("Product")["Quantity"].sum()
top_products = product_performance.sort_values(ascending=False).head(3)
print("Top 3 Performing Products:")
print(top_products)
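
As an aside, `Series.nlargest` is a possible shorthand for the sort-then-head pattern above. A small sketch on made-up data (the `sample` rows are hypothetical) compares the two approaches:

```python
import pandas as pd

# Hypothetical sales rows; "Pen" appears twice to exercise the grouping.
sample = pd.DataFrame({
    "Product": ["Pen", "Lamp", "Pen", "Desk", "Chair"],
    "Quantity": [10, 2, 5, 1, 7],
})

totals = sample.groupby("Product")["Quantity"].sum()

# Approach used in the notebook: sort descending, keep the first three.
top3_sorted = totals.sort_values(ascending=False).head(3)

# Alternative: ask directly for the three largest totals.
top3_nlargest = totals.nlargest(3)

print(top3_nlargest)
```

With no ties in the totals, both approaches return the same products in the same order.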

Export Final Product Revenue to CSV

In [None]:
df.groupby("Product")["Revenue"].sum().to_csv("product_revenue.csv")
print("Product revenue data exported successfully.")
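
The success message above prints regardless of what was written, so it can be worth reading the file back to check it. The following sketch assumes a made-up `sample` frame and writes to a temporary directory rather than the project folder:

```python
import os
import tempfile

import pandas as pd

# Hypothetical revenue rows standing in for the real data.
sample = pd.DataFrame({
    "Product": ["Pen", "Lamp", "Pen"],
    "Revenue": [20.0, 40.0, 10.0],
})
revenue_by_product = sample.groupby("Product")["Revenue"].sum()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "product_revenue.csv")
    revenue_by_product.to_csv(path)

    # Read the file back, restoring Product as the index.
    restored = pd.read_csv(path, index_col="Product")["Revenue"]

print(restored)
```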

Visualization

In [None]:
import matplotlib.pyplot as plt

# Bar chart of revenue by category
df.groupby("Category")["Revenue"].sum().plot(
    kind="bar",
    title="Revenue by Category",
    ylabel="Revenue",
    xlabel="Category",
    colormap="viridis",
)
plt.grid(True)
plt.tight_layout()
plt.show()