
# Sales Trends & Performance Dashboard: Chocolate Sales Analysis

## Overview
This project analyzes chocolate sales data to uncover product performance, regional trends, and opportunities for optimization. Raw sales data was cleaned and analyzed using Python (Pandas and NumPy), and prepared for visualization in Tableau.



## Objectives
- Clean and preprocess chocolate sales data for analysis.
- Analyze monthly trends, product performance, and regional sales.
- Evaluate salesperson contribution and identify outliers.
- Export data for dashboard visualization in Tableau.



## Tools & Technologies
- **Python**: Pandas, NumPy
- **Tableau**: Visualization and dashboard creation

```python
import pandas as pd
import numpy as np
```

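```python
# Load the raw export and normalize column names to snake_case.
df = pd.read_csv("Chocolate Sales.csv")
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")
# Strip "$" and thousands separators so amounts can be cast to float.
df['amount'] = df['amount'].replace(r'[\$,]', '', regex=True).astype(float)
# Parse day-month-year strings (%d-%b-%y); unparseable values become NaT.
df['date'] = pd.to_datetime(df['date'], format='%d-%b-%y', errors='coerce')
```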
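
Because `errors='coerce'` silently turns unparseable dates into `NaT`, it can help to count how many rows were affected before grouping by month. The snippet below is a minimal sketch on a small made-up series; the helper name `count_unparsed` and the sample values are hypothetical and not part of the original notebook.

```python
import pandas as pd

def count_unparsed(raw: pd.Series, fmt: str = "%d-%b-%y") -> int:
    """Return how many non-null raw values fail to parse with the given format."""
    parsed = pd.to_datetime(raw, format=fmt, errors="coerce")
    # A value was lost if it was present in the raw data but is NaT after parsing.
    return int((parsed.isna() & raw.notna()).sum())

# Hypothetical sample: two valid dates, one malformed value, one missing value.
sample = pd.Series(["04-Jan-22", "01-Aug-22", "2022/01/04", None])
n_bad = count_unparsed(sample)
print(n_bad)
```

A non-zero count would suggest inspecting the raw `date` column before trusting the monthly totals.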

```python
# Z-score of each order amount; np.std defaults to the population standard deviation (ddof=0).
df['z_score_amount'] = (df['amount'] - np.mean(df['amount'])) / np.std(df['amount'])
# Flag orders more than 3 standard deviations from the mean.
outliers = df[np.abs(df['z_score_amount']) > 3]
# Summary statistics for the order amount.
summary_stats = {
    'amount': {
        'mean': np.mean(df['amount']),
        'median': np.median(df['amount']),
        'std_dev': np.std(df['amount']),
        'range': np.ptp(df['amount'])
    }
}
summary_stats
```

Output:

```
{'amount': {'mean': np.float64(5652.308043875685),
  'median': np.float64(4868.5),
  'std_dev': np.float64(4100.566611892091),
  'range': np.float64(22043.0)}}
```
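
The z-score rule used above can be illustrated on toy data. This is a sketch under stated assumptions: the function name `zscore_outliers` and the sample amounts are made up, and it uses the population standard deviation (`ddof=0`) to match NumPy's default in the cell above.

```python
import numpy as np
import pandas as pd

def zscore_outliers(values: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Boolean mask of values whose absolute z-score exceeds the threshold."""
    arr = values.to_numpy(dtype=float)
    z = (arr - arr.mean()) / arr.std()  # ndarray.std uses ddof=0 by default
    return pd.Series(np.abs(z) > threshold, index=values.index)

# Hypothetical amounts: twenty ordinary orders and one very large one.
amounts = pd.Series([100.0] * 20 + [10_000.0])
mask = zscore_outliers(amounts)
print(amounts[mask])
```

Note that `pd.Series.std()` defaults to `ddof=1`, so swapping it in for `np.std` would shift the z-scores slightly.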

```python
# Calendar fields used for the monthly trend analysis.
df['month'] = df['date'].dt.month
df['year'] = df['date'].dt.year
```

```python
# Total sales per calendar month, in chronological order.
monthly_sales = df.groupby(['year', 'month'])['amount'].sum().reset_index().sort_values(['year', 'month'])
# Total sales by product, country, and salesperson, highest first.
top_products = df.groupby('product')['amount'].sum().sort_values(ascending=False).reset_index()
sales_by_country = df.groupby('country')['amount'].sum().sort_values(ascending=False).reset_index()
sales_by_person = df.groupby('sales_person')['amount'].sum().sort_values(ascending=False).reset_index()
# Average order value per country.
aov_country = df.groupby('country')['amount'].mean().sort_values(ascending=False).reset_index().rename(columns={'amount': 'average_order_value'})
# Pearson correlation between boxes shipped and order amount.
correlation = df[['boxes_shipped', 'amount']].corr().iloc[0, 1]
```
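
To read off the peak months from a table shaped like `monthly_sales`, one option is `DataFrame.nlargest`. The sketch below uses made-up totals instead of the real data, so only the mechanics carry over.

```python
import pandas as pd

# Hypothetical monthly totals with the same columns as monthly_sales.
monthly_sales = pd.DataFrame({
    "year":   [2022, 2022, 2022, 2022],
    "month":  [1, 2, 3, 4],
    "amount": [900.0, 400.0, 650.0, 700.0],
})

# Keep the three rows with the highest total amount, largest first.
top_months = monthly_sales.nlargest(3, "amount")
print(top_months)
```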

```python
# Export the cleaned data for use in Tableau.
df.to_csv("cleaned_chocolate_sales.csv", index=False)
```


## Key Insights

- **Peak Sales Months:** January, June, and July 2022 were the top-performing months, with January alone generating over $896K in chocolate sales. This may reflect strong post-holiday or New Year demand.

- **Best-Selling Products:** The top three revenue-generating products were:
  - *Smooth Sliky Salty* — $349,692
  - *50% Dark Bites* — $341,712
  - *White Choc* — $329,147  
  These items may represent core offerings driving overall business success.

- **Top Markets:** Australia led all countries with $1.14M in sales, followed by the UK and India, making these three the strongest markets in the dataset.

- **Boxes vs. Revenue:** There is essentially no linear correlation between the number of boxes shipped and the sales amount (Pearson correlation = **-0.0188**). This suggests that order revenue depends more on factors such as product price or bundle deals than on box count.
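
One way to probe the price explanation for the weak boxes-to-revenue correlation is to look at revenue per box for each product. The following is a rough sketch on invented rows: the column names match the cleaned data, but the values and the `revenue_per_box` column are assumptions for illustration.

```python
import pandas as pd

# Hypothetical orders: product A earns far more per box than product B.
orders = pd.DataFrame({
    "product":       ["A", "A", "B", "B"],
    "boxes_shipped": [10, 20, 100, 200],
    "amount":        [1000.0, 2000.0, 500.0, 1000.0],
})

# Revenue per box for each order, then the average per product.
orders["revenue_per_box"] = orders["amount"] / orders["boxes_shipped"]
price_by_product = orders.groupby("product")["revenue_per_box"].mean()

# Pearson correlation pooled across all orders vs. within each product.
overall_corr = orders["boxes_shipped"].corr(orders["amount"])
within_corr = orders.groupby("product")[["boxes_shipped", "amount"]].apply(
    lambda g: g["boxes_shipped"].corr(g["amount"])
)
print(price_by_product, overall_corr, within_corr, sep="\n")
```

In this toy data the within-product correlation is 1 while the pooled correlation is negative, which shows how price differences between products can mask a relationship between boxes and revenue.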



## Tableau Dashboard
[View Tableau Dashboard](https://public.tableau.com/app/profile/ein.cagle/viz/GlobalChocolateSalesDashboard_17443257086610/GlobalChocolateSalesDashboard)