
# Task 7: Visualizing Sales Trends and Customer Behavior

**Objective:** Explore and communicate insights from the Superstore dataset using Python visualizations.

Dataset Path (update if needed):  
`C:\Users\admin\OneDrive\Desktop\skillyt\superstore(in).csv`


In [None]:

# Step 1: Imports
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Settings
sns.set_style("whitegrid")


In [None]:

# Step 2: Load dataset
file_path = r"C:\Users\admin\OneDrive\Desktop\skillyt\superstore(in).csv"
df = pd.read_csv(file_path, encoding='latin1')

print("Data shape:", df.shape)
df.head()


In [None]:

# Step 3: Basic EDA
# df.info() prints its summary directly and returns None, so no print() is needed
df.info()
df.describe(include="all")


In [None]:

# Step 4: Sales by Category
plt.figure(figsize=(8,6))
# errorbar=None replaces the deprecated ci=None (assumes seaborn >= 0.12)
sns.barplot(x="Category", y="Sales", data=df, estimator=sum, errorbar=None)
plt.title("Total Sales by Category")
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()


In [None]:

# Step 5: Sales by Region
plt.figure(figsize=(8,6))
# errorbar=None replaces the deprecated ci=None (assumes seaborn >= 0.12)
sns.barplot(x="Region", y="Sales", data=df, estimator=sum, errorbar=None)
plt.title("Total Sales by Region")
plt.tight_layout()
plt.show()


In [None]:

# Step 6: Sales Trend Over Time
df['Order Date'] = pd.to_datetime(df['Order Date'])
sales_trend = df.groupby('Order Date')['Sales'].sum().reset_index()

plt.figure(figsize=(12,6))
sns.lineplot(x="Order Date", y="Sales", data=sales_trend)
plt.title("Sales Trend Over Time")
plt.tight_layout()
plt.show()
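
Daily totals tend to be noisy, which can hide seasonal patterns. One possible refinement is to aggregate by calendar month before plotting. The sketch below uses a tiny made-up sample in place of `df`; the `monthly_sales` name and the `%m/%d/%Y` date format are assumptions, not something the dataset guarantees.

```python
import pandas as pd

# Tiny made-up sample standing in for the two Superstore columns used above.
sample = pd.DataFrame({
    "Order Date": ["1/3/2017", "1/20/2017", "2/5/2017", "2/5/2017"],
    "Sales": [100.0, 50.0, 30.0, 20.0],
})
# Assumed date format; adjust if the file stores dates differently.
sample["Order Date"] = pd.to_datetime(sample["Order Date"], format="%m/%d/%Y")

# Sum sales per calendar month; "MS" labels each bucket by its month start.
monthly_sales = (
    sample.set_index("Order Date")["Sales"]
    .resample("MS")
    .sum()
    .reset_index()
)
print(monthly_sales)
```

With the real data, the same chain applied to `df` should yield a frame that can be passed to `sns.lineplot(x="Order Date", y="Sales", data=monthly_sales)`.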


In [None]:

# Step 7: Profit vs Discount
plt.figure(figsize=(8,6))
# alpha makes overlapping points easier to read when many rows share a discount level
sns.scatterplot(x="Discount", y="Profit", data=df, alpha=0.4)
plt.title("Profit vs Discount")
plt.tight_layout()
plt.show()
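
A scatter plot of thousands of rows can be hard to read, because discounts usually take only a handful of distinct values. A rough complement is to summarize profit at each discount level. This is a minimal sketch on made-up numbers; `profit_by_discount`, `mean_profit`, and `orders` are illustrative names, and the values say nothing about the real dataset.

```python
import pandas as pd

# Made-up rows with the same column names as the Superstore data.
sample = pd.DataFrame({
    "Discount": [0.0, 0.0, 0.2, 0.2, 0.5, 0.8],
    "Profit": [40.0, 60.0, 10.0, 30.0, -20.0, -50.0],
})

# Mean profit and number of rows at each distinct discount level.
profit_by_discount = (
    sample.groupby("Discount")["Profit"]
    .agg(mean_profit="mean", orders="count")
    .reset_index()
)
print(profit_by_discount)
```

Run against `df`, a table like this shows whether average profit falls as the discount rises, and the `orders` column indicates how much data backs each level.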


In [None]:

# Step 8: Correlation Heatmap (numeric variables)
plt.figure(figsize=(10,6))
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap="coolwarm", fmt=".2f")
plt.title("Correlation Heatmap")
plt.tight_layout()
plt.show()
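
`df.corr(numeric_only=True)` includes every numeric column, which may pull in identifiers whose correlations carry no meaning. The sketch below drops such columns first. It assumes the file has `Row ID` and `Postal Code` columns, as many copies of the Superstore data do; the sample rows are made up, and `errors="ignore"` keeps the code working if those columns are absent.

```python
import pandas as pd

# Made-up rows; "Row ID" and "Postal Code" are assumed identifier columns.
sample = pd.DataFrame({
    "Row ID": [1, 2, 3, 4],
    "Postal Code": [42420, 90036, 33311, 90032],
    "Sales": [250.0, 15.0, 950.0, 22.0],
    "Discount": [0.0, 0.0, 0.45, 0.2],
    "Profit": [40.0, 7.0, -380.0, 2.5],
})

id_like = ["Row ID", "Postal Code"]  # hypothetical list; adjust to the real file

# Keep numeric columns only, then drop identifier-like ones if present.
numeric = sample.select_dtypes(include="number").drop(columns=id_like, errors="ignore")
corr = numeric.corr()
print(corr.round(2))
```

The resulting `corr` frame can be passed to `sns.heatmap` in place of `df.corr(numeric_only=True)`.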


In [None]:

# Step 9: Customer Segment Analysis
plt.figure(figsize=(8,6))
# errorbar=None replaces the deprecated ci=None (assumes seaborn >= 0.12)
sns.barplot(x="Segment", y="Sales", data=df, estimator=sum, errorbar=None)
plt.title("Total Sales by Customer Segment")
plt.tight_layout()
plt.show()



---
### Insights to Highlight:
- Which category/region generates the most sales?
- Are there seasonal or yearly spikes in sales?
- Does discounting improve or hurt profitability?
- Which customer segment drives the most revenue?
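
The "which generates the most" questions above can also be answered numerically rather than read off a chart. A minimal sketch, using made-up rows and a hypothetical helper named `top_by_sales`:

```python
import pandas as pd

# Made-up rows with the same column names as the Superstore data.
sample = pd.DataFrame({
    "Category": ["Furniture", "Technology", "Technology", "Office Supplies"],
    "Region": ["West", "East", "West", "South"],
    "Segment": ["Consumer", "Corporate", "Consumer", "Home Office"],
    "Sales": [200.0, 500.0, 350.0, 100.0],
})

def top_by_sales(frame, column):
    """Return (label, total sales) for the group with the highest total."""
    totals = frame.groupby(column)["Sales"].sum()
    return totals.idxmax(), float(totals.max())

for col in ["Category", "Region", "Segment"]:
    label, total = top_by_sales(sample, col)
    print(f"{col}: {label} ({total:,.2f})")
```

Passing `df` instead of `sample` should give the leading category, region, and segment for the real data.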

This notebook provides a foundation; you can extend it with interactive dashboards (e.g., Plotly, Power BI) for further exploration.
