# Data Exploration & Summarization

In this notebook, we explore the data and compute summary statistics.

## Exploratory Analysis

1. Total sales over time show an increasing trend, especially during festive months.
2. The East region has the highest revenue, followed by the West region.
3. The Electronics category performs best in October and December.
4. A sudden drop in sales is observed in April 2023.
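
One way to check observations 1 and 4 numerically is to look at the month-over-month percentage change in total sales. The sketch below is only an illustration: the helper name `month_over_month_change` is hypothetical, and the small `demo` frame is made up so that the block runs on its own. With the real data, pass the loaded `data` frame instead.

```python
import pandas as pd

def month_over_month_change(df, date_col="Order Date", value_col="Sales"):
    """Return monthly totals and their percentage change from the previous month."""
    monthly = df.groupby(df[date_col].dt.to_period("M"))[value_col].sum()
    return pd.DataFrame({
        "total": monthly,
        "pct_change": monthly.pct_change() * 100,  # first month has no predecessor (NaN)
    })

# Made-up rows for illustration only (not the notebook's data)
demo = pd.DataFrame({
    "Order Date": pd.to_datetime(["2023-02-10", "2023-03-05", "2023-03-20", "2023-04-15"]),
    "Sales": [100.0, 120.0, 80.0, 50.0],
})

changes = month_over_month_change(demo)
steepest_drop = changes["pct_change"].idxmin()  # month with the largest relative decline
print(changes)
print("Steepest month-over-month drop:", steepest_drop)
```

Months with no orders do not appear in the grouped result, so each change is measured against the previous month that has data.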

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the data and parse 'Order Date' so the .dt accessors used below work
data = pd.read_csv('sample_data.csv')
data['Order Date'] = pd.to_datetime(data['Order Date'])
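
Before computing statistics, it can help to confirm that the file contains the columns the later cells rely on and to count missing values. The following is a minimal sketch, not part of the original notebook: `check_columns` is a hypothetical helper, the column list is taken from the cells in this notebook, and the `demo` frame is made up so the block is self-contained.

```python
import pandas as pd

# Column names referenced by the cells in this notebook
REQUIRED_COLUMNS = ["Order Date", "Sales", "Units", "Price", "Region", "Category"]

def check_columns(df, required=REQUIRED_COLUMNS):
    """Return (missing column names, null counts for the required columns present)."""
    missing = [col for col in required if col not in df.columns]
    present = [col for col in required if col in df.columns]
    null_counts = df[present].isna().sum()
    return missing, null_counts

# Made-up frame for illustration only: no 'Category' column, one missing 'Sales' value
demo = pd.DataFrame({
    "Order Date": pd.to_datetime(["2023-01-05", "2023-01-09"]),
    "Sales": [250.0, None],
    "Units": [5, 3],
    "Price": [50.0, 40.0],
    "Region": ["East", "West"],
})

missing, null_counts = check_columns(demo)
print("Missing columns:", missing)
print(null_counts)
```

With the real data, `check_columns(data)` would be called right after loading.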

## Summary Statistics

In [None]:
# Group key: the calendar month of each order (e.g. 2023-04)
order_month = data['Order Date'].dt.to_period('M')

# Average monthly sales: total the sales per month, then average the monthly totals
avg_monthly_sales = data.groupby(order_month)['Sales'].sum().mean()
print("Average Monthly Sales:", round(avg_monthly_sales, 2))

# Maximum units sold in a single month
max_units = data.groupby(order_month)['Units'].sum().max()
print("Maximum Units Sold in a Month:", max_units)

# Pearson correlation between price and units sold, computed across rows
correlation = data['Price'].corr(data['Units'])
print("Correlation between Price and Units Sold:", round(correlation, 2))

## Line Chart - Total Sales Over Time

In [None]:
# Total sales per month; convert the period index to timestamps for plotting
monthly_sales = data.groupby(data['Order Date'].dt.to_period('M'))['Sales'].sum()
monthly_sales.index = monthly_sales.index.to_timestamp()

plt.figure(figsize=(10,5))
monthly_sales.plot(kind='line')
plt.title('Total Sales Over Time')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.grid(True)
plt.show()

## Bar Chart - Revenue by Region

In [None]:
region_sales = data.groupby('Region')['Sales'].sum().sort_values(ascending=False)

plt.figure(figsize=(8,5))
region_sales.plot(kind='bar')
plt.title('Revenue by Region')
plt.xlabel('Region')
plt.ylabel('Sales')
plt.show()

## Heatmap - Sales by Category and Month

In [None]:
# Rows are calendar months (1-12), so the same month from different years is summed together
heatmap_data = data.pivot_table(
    values='Sales',
    index=data['Order Date'].dt.month,
    columns='Category',
    aggfunc='sum',
)

plt.figure(figsize=(10,6))
sns.heatmap(heatmap_data, annot=True, fmt=".0f", cmap='YlGnBu')
plt.title('Sales by Category and Month')
plt.xlabel('Product Category')
plt.ylabel('Month')
plt.show()
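
The heatmap above groups by calendar month, which pools the same month across years. If the data spans more than one year, a year-month index keeps them apart. The sketch below shows one possible way to build such a table. The helper name and the `YearMonth` column are hypothetical, and the `demo` rows are made up for illustration.

```python
import pandas as pd

def sales_by_category_and_period(df):
    """Pivot summed Sales into a (year-month x Category) table."""
    # 'YearMonth' is a helper column introduced here, not a column of the source file
    with_period = df.assign(YearMonth=df["Order Date"].dt.to_period("M"))
    return with_period.pivot_table(
        values="Sales", index="YearMonth", columns="Category", aggfunc="sum"
    )

# Made-up rows for illustration only: two Aprils in different years
demo = pd.DataFrame({
    "Order Date": pd.to_datetime(["2022-04-01", "2023-04-01", "2023-04-20"]),
    "Category": ["Electronics", "Electronics", "Furniture"],
    "Sales": [100.0, 40.0, 10.0],
})

table = sales_by_category_and_period(demo)
print(table)
```

The resulting table should be usable in place of `heatmap_data` in the `sns.heatmap` call, with one row per year-month instead of one per calendar month.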