# 📊 Feb 10: Sales Analysis Project

**Scenario**: You're analyzing sales data for TechMart, a retail electronics company.

**Goal**: Provide insights and recommendations for strategic decision-making.

## Setup and Data Generation

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta

# Set style
sns.set_theme(style="whitegrid", palette="husl")
plt.rcParams['figure.dpi'] = 100
%matplotlib inline

# Set random seed
np.random.seed(42)

print("✅ Libraries imported successfully!")

In [None]:
# Generate realistic sales dataset
n_orders = 2000

# Product catalog
products = {
    'Electronics': ['Laptop Pro', 'Tablet X', 'Smartphone Z', 'Monitor 4K', 'Desktop PC'],
    'Accessories': ['Wireless Mouse', 'Keyboard RGB', 'USB Hub', 'Webcam HD', 'Phone Case'],
    'Software': ['Office Suite', 'Antivirus Pro', 'Design Studio', 'Video Editor', 'Cloud Storage'],
    'Gaming': ['Gaming Console', 'VR Headset', 'Gaming Chair', 'Controller Pro', 'Gaming Keyboard'],
    'Audio': ['Headphones Pro', 'Speakers 5.1', 'Microphone USB', 'Earbuds Wireless', 'Soundbar']
}

# Generate dates (last 12 months)
start_date = datetime(2024, 2, 1)
dates = [start_date + timedelta(days=np.random.randint(0, 365)) for _ in range(n_orders)]

# Generate product data
categories = np.random.choice(list(products.keys()), n_orders, p=[0.35, 0.25, 0.15, 0.15, 0.10])
product_names = [np.random.choice(products[cat]) for cat in categories]

# Regions
regions = np.random.choice(['North', 'South', 'East', 'West', 'Central'], n_orders)

# Quantities (influenced by category)
quantities = []
for cat in categories:
    if cat in ['Electronics', 'Gaming']:
        quantities.append(np.random.randint(1, 4))  # Lower quantities for expensive items
    else:
        quantities.append(np.random.randint(1, 8))  # Higher for lower-priced categories (Accessories, Software, Audio)

# Unit prices (vary by category)
unit_prices = []
for cat in categories:
    if cat == 'Electronics':
        unit_prices.append(np.random.uniform(300, 1500))
    elif cat == 'Gaming':
        unit_prices.append(np.random.uniform(200, 800))
    elif cat == 'Audio':
        unit_prices.append(np.random.uniform(50, 400))
    elif cat == 'Software':
        unit_prices.append(np.random.uniform(30, 200))
    else:  # Accessories
        unit_prices.append(np.random.uniform(10, 100))

# Calculate total sales
total_sales = [q * p for q, p in zip(quantities, unit_prices)]

# Customer types
customer_types = np.random.choice(['New', 'Returning'], n_orders, p=[0.35, 0.65])

# Payment methods
payment_methods = np.random.choice(
    ['Credit Card', 'Debit Card', 'PayPal', 'Cash'], 
    n_orders, 
    p=[0.45, 0.30, 0.20, 0.05]
)

# Create DataFrame
df = pd.DataFrame({
    'OrderID': [f'ORD{i:05d}' for i in range(1, n_orders + 1)],
    'OrderDate': dates,
    'ProductCategory': categories,
    'ProductName': product_names,
    'Region': regions,
    'Quantity': quantities,
    'UnitPrice': unit_prices,
    'TotalSales': total_sales,
    'CustomerType': customer_types,
    'PaymentMethod': payment_methods
})

# Sort by date
df = df.sort_values('OrderDate').reset_index(drop=True)

# Add time-based columns
df['Year'] = df['OrderDate'].dt.year
df['Month'] = df['OrderDate'].dt.month
df['MonthName'] = df['OrderDate'].dt.strftime('%b')
df['Quarter'] = df['OrderDate'].dt.quarter
df['YearMonth'] = df['OrderDate'].dt.to_period('M')

print("✅ Dataset created successfully!")
print(f"\n📊 Dataset Shape: {df.shape}")
print(f"📅 Date Range: {df['OrderDate'].min().date()} to {df['OrderDate'].max().date()}")
print(f"💰 Total Revenue: ${df['TotalSales'].sum():,.2f}")
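
The price loop in the generation cell can also be written without an explicit `if`/`elif` chain. The sketch below is one possible vectorized variant, not the notebook's method: `PRICE_RANGES` and `sample_unit_prices` are hypothetical names, and it uses NumPy's `default_rng` generator, so it will not reproduce the exact numbers that `np.random.seed(42)` yields above.

```python
import numpy as np

# Price ranges copied from the if/elif chain in the generation cell
PRICE_RANGES = {
    "Electronics": (300, 1500),
    "Gaming": (200, 800),
    "Audio": (50, 400),
    "Software": (30, 200),
    "Accessories": (10, 100),
}

def sample_unit_prices(categories, rng):
    """Draw one uniform price per order from its category's range (hypothetical helper)."""
    lows = np.array([PRICE_RANGES[c][0] for c in categories], dtype=float)
    highs = np.array([PRICE_RANGES[c][1] for c in categories], dtype=float)
    # Generator.uniform broadcasts array-valued bounds element by element
    return rng.uniform(lows, highs)

rng = np.random.default_rng(42)
demo_categories = ["Electronics", "Accessories", "Audio", "Gaming", "Software"]
demo_prices = sample_unit_prices(demo_categories, rng)
print(dict(zip(demo_categories, demo_prices.round(2))))
```

Keeping the ranges in one dictionary means a new category only needs a new entry rather than another branch.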

## Step 1: Data Exploration

Let's start by understanding our dataset.

In [None]:
# Display first few rows
print("=" * 80)
print("FIRST 10 ROWS")
print("=" * 80)
display(df.head(10))

In [None]:
# Dataset information
print("=" * 80)
print("DATASET INFORMATION")
print("=" * 80)
df.info()  # info() prints its report directly and returns None, so print() would add a stray "None"

print("\n" + "=" * 80)
print("MISSING VALUES")
print("=" * 80)
print(df.isnull().sum())

print("\n" + "=" * 80)
print("DUPLICATE ORDERS")
print("=" * 80)
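# Compare OrderID values: full-row duplicates cannot occur while every OrderID is unique
print(f"Number of duplicate OrderIDs: {df['OrderID'].duplicated().sum()}")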

In [None]:
# Statistical summary
print("=" * 80)
print("STATISTICAL SUMMARY")
print("=" * 80)
display(df[['Quantity', 'UnitPrice', 'TotalSales']].describe())

In [None]:
# Categorical distributions
print("=" * 80)
print("CATEGORICAL DISTRIBUTIONS")
print("=" * 80)

print("\n📦 Product Categories:")
print(df['ProductCategory'].value_counts())

print("\n🌍 Regions:")
print(df['Region'].value_counts())

print("\n👥 Customer Types:")
print(df['CustomerType'].value_counts())

print("\n💳 Payment Methods:")
print(df['PaymentMethod'].value_counts())
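
Beyond missing values and duplicates, a few integrity checks can confirm that the columns agree with each other. The following is a minimal sketch under the assumption that the table uses the notebook's column names; `validate_orders` is a hypothetical helper, and the tiny sample deliberately contains one error per check.

```python
import numpy as np
import pandas as pd

def validate_orders(orders: pd.DataFrame) -> dict:
    """Count basic integrity problems in an orders table (hypothetical helper)."""
    expected_total = orders["Quantity"] * orders["UnitPrice"]
    return {
        # Repeated order identifiers
        "duplicate_order_ids": int(orders["OrderID"].duplicated().sum()),
        # Quantities that are zero or negative
        "non_positive_quantities": int((orders["Quantity"] <= 0).sum()),
        # Rows where TotalSales does not equal Quantity * UnitPrice
        "total_mismatches": int((~np.isclose(orders["TotalSales"], expected_total)).sum()),
    }

# Illustrative sample with one deliberate problem for each check
sample = pd.DataFrame({
    "OrderID": ["ORD00001", "ORD00002", "ORD00002"],
    "Quantity": [2, 0, 1],
    "UnitPrice": [10.0, 5.0, 20.0],
    "TotalSales": [20.0, 0.0, 25.0],
})
report = validate_orders(sample)
print(report)
```

Calling `validate_orders(df)` on the generated dataset should report zero for every check, since the generation cell builds unique order IDs, positive quantities, and totals computed directly from quantity and price.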

## Step 2: Sales Trends Analysis

**Question**: What is the overall sales trend? Are there seasonal patterns?

In [None]:
# Monthly sales aggregation
monthly_sales = df.groupby('YearMonth')['TotalSales'].agg(
    Total_Revenue='sum',
    Num_Orders='count',
    Avg_Order_Value='mean'
).reset_index()

monthly_sales['YearMonth'] = monthly_sales['YearMonth'].astype(str)

print("📊 Monthly Sales Summary:")
display(monthly_sales)

# Calculate growth rate
monthly_sales['Growth_Rate'] = monthly_sales['Total_Revenue'].pct_change() * 100

print(f"\n📈 Average Monthly Growth Rate: {monthly_sales['Growth_Rate'].mean():.2f}%")
print(f"🔝 Best Month: {monthly_sales.loc[monthly_sales['Total_Revenue'].idxmax(), 'YearMonth']}")
print(f"🔻 Worst Month: {monthly_sales.loc[monthly_sales['Total_Revenue'].idxmin(), 'YearMonth']}")
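
The mean of month-over-month percentage changes can look positive even when revenue ends where it started, because a rise and an equal-sized fall do not cancel in percentage terms. The sketch below contrasts that mean with a compound monthly growth rate on a toy series; `average_vs_compound_growth` is a hypothetical helper, and the three revenue values are made up for illustration.

```python
import pandas as pd

def average_vs_compound_growth(revenue: pd.Series) -> tuple:
    """Return (mean of period-over-period % changes, compound growth % per period)."""
    mean_growth = revenue.pct_change().mean() * 100
    n_steps = len(revenue) - 1
    # Compound rate: the constant per-period growth that links the first value to the last
    compound_growth = ((revenue.iloc[-1] / revenue.iloc[0]) ** (1 / n_steps) - 1) * 100
    return mean_growth, compound_growth

# Revenue doubles and then halves, ending exactly where it began
demo_revenue = pd.Series([100.0, 200.0, 100.0])
mean_growth, compound_growth = average_vs_compound_growth(demo_revenue)
print(f"Mean of monthly changes: {mean_growth:.1f}%")
print(f"Compound monthly growth: {compound_growth:.1f}%")
```

For the toy series the mean of the changes is 25% while the compound rate is 0%. In the notebook, the same function could be applied to `monthly_sales['Total_Revenue']` to put the printed average in context.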

In [None]:
# Visualization: Monthly Sales Trend
fig, ax = plt.subplots(figsize=(14, 6))

ax.plot(monthly_sales['YearMonth'], monthly_sales['Total_Revenue'], 
        marker='o', linewidth=2.5, markersize=8, color='#3498db')
ax.fill_between(range(len(monthly_sales)), monthly_sales['Total_Revenue'], 
                alpha=0.3, color='#3498db')

ax.set_xlabel('Month', fontsize=12, fontweight='bold')
ax.set_ylabel('Total Revenue ($)', fontsize=12, fontweight='bold')
ax.set_title('Monthly Sales Trend - TechMart 2024-2025', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3, linestyle='--')
ax.tick_params(axis='x', rotation=45)

# Format y-axis as currency
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

# Add average line
avg_revenue = monthly_sales['Total_Revenue'].mean()
ax.axhline(y=avg_revenue, color='red', linestyle='--', linewidth=2, alpha=0.7, label=f'Average: ${avg_revenue:,.0f}')
ax.legend(fontsize=10)

plt.tight_layout()
plt.show()

print("\n💡 Insight: Analyze the trend - is it growing, stable, or declining?")
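
The question for this step also asks about seasonal patterns, which monthly totals alone may not make obvious. One simple way to look for them is to aggregate by calendar quarter. This is a minimal sketch assuming the notebook's `OrderDate` and `TotalSales` columns; `seasonal_summary` is a hypothetical helper and the four sample orders are invented.

```python
import pandas as pd

def seasonal_summary(orders: pd.DataFrame) -> pd.DataFrame:
    """Revenue per calendar quarter and each quarter's share of total revenue."""
    quarterly = (
        orders.groupby(orders["OrderDate"].dt.to_period("Q"))["TotalSales"]
        .sum()
        .rename("Revenue")
        .to_frame()
    )
    quarterly["Share_%"] = quarterly["Revenue"] / quarterly["Revenue"].sum() * 100
    return quarterly

# Illustrative orders spread over three quarters of 2024
sample = pd.DataFrame({
    "OrderDate": pd.to_datetime(["2024-02-10", "2024-03-05", "2024-05-20", "2024-11-30"]),
    "TotalSales": [100.0, 300.0, 200.0, 400.0],
})
quarterly = seasonal_summary(sample)
print(quarterly.round(1))
```

Two caveats apply when running `seasonal_summary(df)` on the generated data. The date range starts in February 2024 and ends in January 2025, so the first and last quarters are only partly covered. The order dates are also drawn uniformly at random, so any apparent seasonality is most likely noise.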

## Step 3: Product Performance Analysis

**Question**: Which products and categories drive the most revenue?

In [None]:
# Sales by category
category_sales = df.groupby('ProductCategory').agg({
    'TotalSales': 'sum',
    'OrderID': 'count',
    'Quantity': 'sum'
}).rename(columns={'OrderID': 'NumOrders'}).sort_values('TotalSales', ascending=False)

category_sales['AvgOrderValue'] = category_sales['TotalSales'] / category_sales['NumOrders']
category_sales['RevenueShare'] = (category_sales['TotalSales'] / category_sales['TotalSales'].sum()) * 100

print("📦 Sales by Product Category:")
display(category_sales)

print(f"\n🏆 Top Category: {category_sales.index[0]} (${category_sales['TotalSales'].iloc[0]:,.2f})")

In [None]:
# Visualization: Sales by Category
fig, ax = plt.subplots(figsize=(10, 6))

colors = ['#3498db', '#2ecc71', '#f39c12', '#e74c3c', '#9b59b6']
bars = ax.bar(category_sales.index, category_sales['TotalSales'], 
              color=colors, edgecolor='black', linewidth=1.5)

ax.set_xlabel('Product Category', fontsize=12, fontweight='bold')
ax.set_ylabel('Total Revenue ($)', fontsize=12, fontweight='bold')
ax.set_title('Revenue by Product Category', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3, axis='y', linestyle='--')
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

# Add value labels
for bar in bars:
    height = bar.get_height()
    ax.text(bar.get_x() + bar.get_width()/2., height,
            f'${height/1000:.0f}K\n({height/category_sales["TotalSales"].sum()*100:.1f}%)',
            ha='center', va='bottom', fontsize=9, fontweight='bold')

plt.tight_layout()
plt.show()

In [None]:
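# Top 10 products by revenue (kept in ascending order so the best seller lands at the top of the bar chart)
top_products = df.groupby('ProductName')['TotalSales'].sum().sort_values(ascending=True).tail(10)

print("🏆 Top 10 Best-Selling Products:")
# Reverse the order for printing so that rank 1 is the highest-revenue product
for i, (product, sales) in enumerate(top_products.iloc[::-1].items(), 1):
    print(f"{i:2d}. {product:20s} - ${sales:,.2f}")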

In [None]:
# Visualization: Top 10 Products
fig, ax = plt.subplots(figsize=(10, 8))

bars = ax.barh(range(len(top_products)), top_products.values, color='#2ecc71', edgecolor='black')
ax.set_yticks(range(len(top_products)))
ax.set_yticklabels(top_products.index)
ax.set_xlabel('Total Revenue ($)', fontsize=12, fontweight='bold')
ax.set_title('Top 10 Best-Selling Products', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3, axis='x', linestyle='--')
ax.xaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

# Highlight top product
bars[-1].set_color('#f39c12')

# Add value labels
for i, (bar, value) in enumerate(zip(bars, top_products.values)):
    ax.text(value, i, f'  ${value/1000:.0f}K', va='center', fontsize=9, fontweight='bold')

plt.tight_layout()
plt.show()
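
To judge how concentrated revenue is among products, a cumulative share table can complement the top 10 chart. The sketch below is one way to compute it, assuming the notebook's `ProductName` and `TotalSales` columns; `cumulative_revenue_share` is a hypothetical helper and the sample rows are invented.

```python
import pandas as pd

def cumulative_revenue_share(orders: pd.DataFrame, key: str = "ProductName") -> pd.DataFrame:
    """Rank groups by revenue and add the running share of total revenue."""
    revenue = orders.groupby(key)["TotalSales"].sum().sort_values(ascending=False)
    table = revenue.to_frame("Revenue")
    table["CumShare_%"] = table["Revenue"].cumsum() / table["Revenue"].sum() * 100
    return table

# Illustrative orders for three products
sample = pd.DataFrame({
    "ProductName": ["Laptop Pro", "Laptop Pro", "Tablet X", "USB Hub", "USB Hub"],
    "TotalSales": [300.0, 200.0, 300.0, 150.0, 50.0],
})
pareto = cumulative_revenue_share(sample)
print(pareto.round(1))
```

Reading down the `CumShare_%` column shows how many products are needed to reach a given share of revenue, which can support the Product Performance insight in Step 8.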

## Step 4: Regional Analysis

**Question**: How do sales vary across regions?

In [None]:
# Sales by region
regional_sales = df.groupby('Region').agg({
    'TotalSales': 'sum',
    'OrderID': 'count',
    'Quantity': 'sum'
}).rename(columns={'OrderID': 'NumOrders'}).sort_values('TotalSales', ascending=False)

regional_sales['AvgOrderValue'] = regional_sales['TotalSales'] / regional_sales['NumOrders']

print("🌍 Sales by Region:")
display(regional_sales)

print(f"\n🏆 Top Region: {regional_sales.index[0]} (${regional_sales['TotalSales'].iloc[0]:,.2f})")
print(f"📊 Region with Highest AOV: {regional_sales['AvgOrderValue'].idxmax()} (${regional_sales['AvgOrderValue'].max():,.2f})")

In [None]:
# Visualization: Regional Performance
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(16, 6))

# Total sales by region
colors_region = ['#3498db', '#2ecc71', '#f39c12', '#e74c3c', '#9b59b6']
bars1 = ax1.bar(regional_sales.index, regional_sales['TotalSales'], 
                color=colors_region, edgecolor='black', linewidth=1.5)
ax1.set_xlabel('Region', fontsize=12, fontweight='bold')
ax1.set_ylabel('Total Revenue ($)', fontsize=12, fontweight='bold')
ax1.set_title('Total Revenue by Region', fontsize=13, fontweight='bold')
ax1.grid(True, alpha=0.3, axis='y')
ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

for bar in bars1:
    height = bar.get_height()
    ax1.text(bar.get_x() + bar.get_width()/2., height,
             f'${height/1000:.0f}K', ha='center', va='bottom', fontsize=9, fontweight='bold')

# Average order value by region
bars2 = ax2.bar(regional_sales.index, regional_sales['AvgOrderValue'], 
                color=colors_region, edgecolor='black', linewidth=1.5)
ax2.set_xlabel('Region', fontsize=12, fontweight='bold')
ax2.set_ylabel('Average Order Value ($)', fontsize=12, fontweight='bold')
ax2.set_title('Average Order Value by Region', fontsize=13, fontweight='bold')
ax2.grid(True, alpha=0.3, axis='y')
ax2.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x:.0f}'))

for bar in bars2:
    height = bar.get_height()
    ax2.text(bar.get_x() + bar.get_width()/2., height,
             f'${height:.0f}', ha='center', va='bottom', fontsize=9, fontweight='bold')

plt.tight_layout()
plt.show()

In [None]:
# Regional category preferences
regional_category = df.groupby(['Region', 'ProductCategory'])['TotalSales'].sum().unstack(fill_value=0)

print("🌍 Sales by Region and Category:")
display(regional_category)

# Find top category per region
print("\n🏆 Top Category per Region:")
for region in regional_category.index:
    top_cat = regional_category.loc[region].idxmax()
    top_sales = regional_category.loc[region].max()
    print(f"{region:10s}: {top_cat:15s} (${top_sales:,.2f})")
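
Absolute revenue makes large regions dominate the comparison. Converting each region's row to percentages shows the category mix on a common scale. This is a small sketch assuming the notebook's column names; `category_mix_by_region` is a hypothetical helper and the sample values are invented.

```python
import pandas as pd

def category_mix_by_region(orders: pd.DataFrame) -> pd.DataFrame:
    """Each region's revenue split across categories, as row percentages."""
    revenue = (
        orders.groupby(["Region", "ProductCategory"])["TotalSales"]
        .sum()
        .unstack(fill_value=0)
    )
    # Divide every row by its own total so that each region sums to 100
    return revenue.div(revenue.sum(axis=1), axis=0) * 100

sample = pd.DataFrame({
    "Region": ["North", "North", "South", "South", "South"],
    "ProductCategory": ["Electronics", "Audio", "Electronics", "Audio", "Audio"],
    "TotalSales": [900.0, 100.0, 200.0, 100.0, 100.0],
})
mix = category_mix_by_region(sample)
print(mix.round(1))
```

In the notebook, `category_mix_by_region(df)` would give the percentage counterpart of `regional_category`.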

In [None]:
# Visualization: Sales distribution by region
fig, ax = plt.subplots(figsize=(12, 6))

sns.boxplot(data=df, x='Region', y='TotalSales', palette='Set2', ax=ax)
ax.set_xlabel('Region', fontsize=12, fontweight='bold')
ax.set_ylabel('Order Value ($)', fontsize=12, fontweight='bold')
ax.set_title('Sales Distribution by Region', fontsize=14, fontweight='bold')
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x:.0f}'))

plt.tight_layout()
plt.show()

print("\n💡 Insight: Box plots show the distribution and identify outliers in each region.")
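
The points drawn beyond the whiskers can also be listed as rows. The sketch below applies the conventional 1.5 × IQR rule, which is the default whisker length in Matplotlib and Seaborn box plots, separately within each group. `iqr_outliers` is a hypothetical helper and the sample orders are invented.

```python
import pandas as pd

def iqr_outliers(orders: pd.DataFrame, group_col: str = "Region",
                 value_col: str = "TotalSales", k: float = 1.5) -> pd.DataFrame:
    """Return rows whose value lies more than k * IQR outside the quartiles of their group."""
    grouped = orders.groupby(group_col)[value_col]
    # transform() repeats each group's quartile on every row of that group
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    mask = (orders[value_col] < q1 - k * iqr) | (orders[value_col] > q3 + k * iqr)
    return orders[mask]

# Illustrative data: one unusually large order in the North region
sample = pd.DataFrame({
    "Region": ["North"] * 5 + ["South"] * 5,
    "TotalSales": [100.0, 110.0, 120.0, 130.0, 1000.0, 50.0, 60.0, 70.0, 80.0, 90.0],
})
outliers = iqr_outliers(sample)
print(outliers)
```

Running `iqr_outliers(df)` in the notebook would return the individual orders behind the outlier markers, which makes it possible to check their category and quantity.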

## Step 5: Customer Analysis

**Question**: How do new vs returning customers differ?

In [None]:
# Customer type analysis
customer_analysis = df.groupby('CustomerType').agg({
    'TotalSales': ['sum', 'mean', 'count'],
    'Quantity': 'sum'
})

customer_analysis.columns = ['Total_Revenue', 'Avg_Order_Value', 'Num_Orders', 'Total_Quantity']
customer_analysis['Revenue_Share'] = (customer_analysis['Total_Revenue'] / customer_analysis['Total_Revenue'].sum()) * 100

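print("👥 Customer Type Analysis:")
display(customer_analysis)

# Report the direction explicitly instead of assuming that returning customers spend more
aov_diff = customer_analysis.loc['Returning', 'Avg_Order_Value'] - customer_analysis.loc['New', 'Avg_Order_Value']
direction = 'more' if aov_diff >= 0 else 'less'
print(f"\n💰 Returning customers spend ${abs(aov_diff):.2f} {direction} per order on average")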

In [None]:
# Visualization: Customer Type Distribution
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6))

# Pie chart: Customer distribution
colors_customer = ['#3498db', '#2ecc71']
explode = (0.05, 0)
ax1.pie(customer_analysis['Num_Orders'], labels=customer_analysis.index, autopct='%1.1f%%',
        startangle=90, colors=colors_customer, explode=explode, shadow=True)
ax1.set_title('Customer Type Distribution', fontsize=13, fontweight='bold')

# Bar chart: Average order value
bars = ax2.bar(customer_analysis.index, customer_analysis['Avg_Order_Value'],
               color=colors_customer, edgecolor='black', linewidth=1.5)
ax2.set_xlabel('Customer Type', fontsize=12, fontweight='bold')
ax2.set_ylabel('Average Order Value ($)', fontsize=12, fontweight='bold')
ax2.set_title('Average Order Value by Customer Type', fontsize=13, fontweight='bold')
ax2.grid(True, alpha=0.3, axis='y')
ax2.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x:.0f}'))

for bar in bars:
    height = bar.get_height()
    ax2.text(bar.get_x() + bar.get_width()/2., height,
             f'${height:.2f}', ha='center', va='bottom', fontsize=10, fontweight='bold')

plt.tight_layout()
plt.show()

In [None]:
# Payment method analysis
payment_analysis = df.groupby('PaymentMethod').agg({
    'TotalSales': 'sum',
    'OrderID': 'count'
}).rename(columns={'OrderID': 'NumOrders'}).sort_values('TotalSales', ascending=False)

payment_analysis['Share'] = (payment_analysis['NumOrders'] / payment_analysis['NumOrders'].sum()) * 100

print("💳 Payment Method Preferences:")
display(payment_analysis)

# Visualization
fig, ax = plt.subplots(figsize=(10, 6))
colors_payment = ['#3498db', '#2ecc71', '#f39c12', '#e74c3c']
bars = ax.barh(payment_analysis.index, payment_analysis['NumOrders'], color=colors_payment, edgecolor='black')
ax.set_xlabel('Number of Orders', fontsize=12, fontweight='bold')
ax.set_title('Orders by Payment Method', fontsize=14, fontweight='bold')
ax.grid(True, alpha=0.3, axis='x')

for i, (bar, value, pct) in enumerate(zip(bars, payment_analysis['NumOrders'], payment_analysis['Share'])):
    ax.text(value, i, f'  {value} ({pct:.1f}%)', va='center', fontsize=10, fontweight='bold')

plt.tight_layout()
plt.show()
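
A gap in average order value between new and returning customers may simply reflect random variation, especially here, where customer type is assigned independently of order value. A permutation test is one way to gauge that. The sketch below is an illustration under the assumption that the table uses the notebook's `CustomerType` and `TotalSales` columns; `permutation_test_aov` is a hypothetical helper, and the sample data is constructed with an obvious gap.

```python
import numpy as np
import pandas as pd

def permutation_test_aov(orders: pd.DataFrame, n_permutations: int = 2000, seed: int = 0):
    """Two-sided permutation test for the Returning-minus-New gap in mean order value."""
    rng = np.random.default_rng(seed)
    is_returning = (orders["CustomerType"] == "Returning").to_numpy()
    sales = orders["TotalSales"].to_numpy()
    observed = sales[is_returning].mean() - sales[~is_returning].mean()

    extreme = 0
    for _ in range(n_permutations):
        # Shuffle the labels to mimic "customer type has no effect"
        shuffled = rng.permutation(is_returning)
        diff = sales[shuffled].mean() - sales[~shuffled].mean()
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value

# Illustrative data: returning customers clearly spend more
sample = pd.DataFrame({
    "CustomerType": ["New"] * 20 + ["Returning"] * 20,
    "TotalSales": [100.0 + i for i in range(20)] + [200.0 + i for i in range(20)],
})
observed, p_value = permutation_test_aov(sample)
print(f"Observed difference: ${observed:.2f}, p-value: {p_value:.4f}")
```

A small p-value suggests the gap is unlikely under random labelling. On the generated dataset a large p-value would be unsurprising, and it would be a reason to word the customer insight cautiously.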

## Step 6: Correlation Analysis

**Question**: What relationships exist between variables?

In [None]:
# Scatter plot: Quantity vs Total Sales
fig, ax = plt.subplots(figsize=(10, 6))

for category in df['ProductCategory'].unique():
    data = df[df['ProductCategory'] == category]
    ax.scatter(data['Quantity'], data['TotalSales'], alpha=0.6, s=50, label=category)

ax.set_xlabel('Quantity', fontsize=12, fontweight='bold')
ax.set_ylabel('Total Sales ($)', fontsize=12, fontweight='bold')
ax.set_title('Quantity vs Total Sales by Category', fontsize=14, fontweight='bold')
ax.legend(title='Category', fontsize=9)
ax.grid(True, alpha=0.3)
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x:.0f}'))

plt.tight_layout()
plt.show()

print("\n💡 Insight: Electronics and Gaming have higher sales per unit due to higher prices.")
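
The scatter plot shows the relationship visually; a correlation matrix puts numbers on it. This is a minimal sketch using `DataFrame.corr()`, which computes Pearson correlation by default. `numeric_correlations` is a hypothetical helper and the sample values are invented.

```python
import pandas as pd

def numeric_correlations(orders: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation matrix of the numeric order columns."""
    return orders[["Quantity", "UnitPrice", "TotalSales"]].corr()

# Illustrative orders; TotalSales is Quantity * UnitPrice, as in the notebook
sample = pd.DataFrame({
    "Quantity": [1, 2, 3, 4],
    "UnitPrice": [100.0, 50.0, 200.0, 20.0],
})
sample["TotalSales"] = sample["Quantity"] * sample["UnitPrice"]

corr = numeric_correlations(sample)
print(corr.round(2))
```

In the notebook, `numeric_correlations(df)` could be passed to `sns.heatmap(..., annot=True)` for a visual version. Because `TotalSales` is derived from the other two columns, some correlation with each of them is expected by construction.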

## Step 7: Comprehensive Dashboard

Let's create a professional dashboard with key metrics.

In [None]:
# Create comprehensive dashboard
fig = plt.figure(figsize=(18, 12))

# 1. Monthly trend
ax1 = plt.subplot(3, 3, 1)
ax1.plot(monthly_sales['YearMonth'], monthly_sales['Total_Revenue'], 
         marker='o', linewidth=2, color='#3498db')
ax1.set_title('Monthly Revenue Trend', fontweight='bold', fontsize=11)
ax1.tick_params(axis='x', rotation=45, labelsize=8)
ax1.grid(True, alpha=0.3)
ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

# 2. Category performance
ax2 = plt.subplot(3, 3, 2)
ax2.bar(category_sales.index, category_sales['TotalSales'], color=colors, edgecolor='black')
ax2.set_title('Revenue by Category', fontweight='bold', fontsize=11)
ax2.tick_params(axis='x', rotation=45, labelsize=8)
ax2.grid(True, alpha=0.3, axis='y')
ax2.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

# 3. Regional performance
ax3 = plt.subplot(3, 3, 3)
ax3.bar(regional_sales.index, regional_sales['TotalSales'], color=colors_region, edgecolor='black')
ax3.set_title('Revenue by Region', fontweight='bold', fontsize=11)
ax3.tick_params(axis='x', rotation=45, labelsize=8)
ax3.grid(True, alpha=0.3, axis='y')
ax3.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

# 4. Top products
ax4 = plt.subplot(3, 3, 4)
top_5 = df.groupby('ProductName')['TotalSales'].sum().sort_values(ascending=True).tail(5)
ax4.barh(range(len(top_5)), top_5.values, color='#2ecc71', edgecolor='black')
ax4.set_yticks(range(len(top_5)))
ax4.set_yticklabels(top_5.index, fontsize=8)
ax4.set_title('Top 5 Products', fontweight='bold', fontsize=11)
ax4.grid(True, alpha=0.3, axis='x')
ax4.xaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x/1000:.0f}K'))

# 5. Customer type
ax5 = plt.subplot(3, 3, 5)
ax5.pie(customer_analysis['Num_Orders'], labels=customer_analysis.index, 
        autopct='%1.1f%%', colors=colors_customer, startangle=90)
ax5.set_title('Customer Distribution', fontweight='bold', fontsize=11)

# 6. Payment methods
ax6 = plt.subplot(3, 3, 6)
ax6.bar(payment_analysis.index, payment_analysis['NumOrders'], color=colors_payment, edgecolor='black')
ax6.set_title('Payment Methods', fontweight='bold', fontsize=11)
ax6.tick_params(axis='x', rotation=45, labelsize=8)
ax6.grid(True, alpha=0.3, axis='y')

# 7. Sales distribution
ax7 = plt.subplot(3, 3, 7)
sns.boxplot(data=df, x='ProductCategory', y='TotalSales', palette='Set2', ax=ax7)
ax7.set_title('Sales Distribution by Category', fontweight='bold', fontsize=11)
ax7.tick_params(axis='x', rotation=45, labelsize=8)
ax7.set_xlabel('')
ax7.set_ylabel('Sales ($)', fontsize=9)

# 8. Quantity distribution
ax8 = plt.subplot(3, 3, 8)
sns.histplot(data=df, x='Quantity', bins=15, kde=True, color='#9b59b6', ax=ax8)
ax8.set_title('Quantity Distribution', fontweight='bold', fontsize=11)
ax8.set_xlabel('Quantity', fontsize=9)
ax8.set_ylabel('Frequency', fontsize=9)

# 9. Key metrics text
ax9 = plt.subplot(3, 3, 9)
ax9.axis('off')
metrics_text = f"""
KEY METRICS

Total Revenue:
${df['TotalSales'].sum():,.2f}

Total Orders:
{len(df):,}

Avg Order Value:
${df['TotalSales'].mean():,.2f}

Top Category:
{category_sales.index[0]}

Top Region:
{regional_sales.index[0]}

Returning Customers:
{customer_analysis.loc['Returning', 'Revenue_Share']:.1f}% of revenue
"""
ax9.text(0.1, 0.5, metrics_text, fontsize=11, verticalalignment='center',
         bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))

plt.suptitle('TechMart Sales Performance Dashboard', fontsize=16, fontweight='bold', y=0.995)
plt.tight_layout()
plt.show()

## Step 8: Insights & Recommendations

Based on the analysis, document your key findings and recommendations.

### 📊 Key Insights

**[Document your top 3-5 insights here based on the analysis above]**

1. **Sales Trend**:
   - [Your insight about the overall trend]
   - [Supporting data]

2. **Product Performance**:
   - [Your insight about top/bottom categories]
   - [Supporting data]

3. **Regional Patterns**:
   - [Your insight about regional differences]
   - [Supporting data]

4. **Customer Behavior**:
   - [Your insight about customer types]
   - [Supporting data]

5. **Opportunities**:
   - [Your insight about growth opportunities]
   - [Supporting data]
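
The "Supporting data" lines in the template can be filled from a handful of summary numbers. The sketch below gathers a few of them in one place, assuming the notebook's column names; `supporting_numbers` is a hypothetical helper, and the four sample orders are invented.

```python
import pandas as pd

def supporting_numbers(orders: pd.DataFrame) -> dict:
    """Collect headline figures that can back up written insights."""
    total = orders["TotalSales"].sum()
    by_category = orders.groupby("ProductCategory")["TotalSales"].sum()
    by_region = orders.groupby("Region")["TotalSales"].sum()
    by_customer = orders.groupby("CustomerType")["TotalSales"].sum()
    return {
        "top_category": by_category.idxmax(),
        "top_category_share_pct": round(by_category.max() / total * 100, 1),
        "top_region": by_region.idxmax(),
        "top_region_share_pct": round(by_region.max() / total * 100, 1),
        # .get() returns 0.0 if no returning customers are present
        "returning_revenue_share_pct": round(by_customer.get("Returning", 0.0) / total * 100, 1),
    }

sample = pd.DataFrame({
    "ProductCategory": ["Electronics", "Audio", "Electronics", "Accessories"],
    "Region": ["North", "South", "South", "North"],
    "CustomerType": ["Returning", "New", "Returning", "New"],
    "TotalSales": [600.0, 100.0, 200.0, 100.0],
})
numbers = supporting_numbers(sample)
print(numbers)
```

Calling `supporting_numbers(df)` in the notebook gives figures that can be quoted directly under each insight.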

---

### 💡 Recommendations

**[Provide 3-5 actionable recommendations]**

1. **[Recommendation 1]**
   - Why: [Reasoning]
   - Expected Impact: [Potential outcome]

2. **[Recommendation 2]**
   - Why: [Reasoning]
   - Expected Impact: [Potential outcome]

3. **[Recommendation 3]**
   - Why: [Reasoning]
   - Expected Impact: [Potential outcome]

---

### 🔍 Next Steps

**[Suggest additional analyses or actions]**

1. [Next step 1]
2. [Next step 2]
3. [Next step 3]

## 🎉 Project Complete!

Congratulations on completing your first mini project!

### What You've Accomplished

✅ Generated and explored a realistic synthetic sales dataset  
✅ Performed comprehensive sales analysis  
✅ Created professional visualizations  
✅ Identified patterns and trends  
✅ Generated actionable business insights  
✅ Built a complete analytical dashboard  

### Skills Demonstrated

- Data manipulation with Pandas
- Statistical analysis
- Data visualization with Matplotlib and Seaborn
- Business intelligence thinking
- Professional reporting

This project is **portfolio-ready**! Consider adding it to your GitHub or personal website.

### Next Project

Tomorrow you'll work on a **Data Cleaning Project** where you'll tackle messy, real-world data!