Sales Data Analysis Project
📊 Project Overview
This project analyzes six years of pharmaceutical sales data. Using Python, it cleans the raw dataset, explores the data to uncover trends and patterns, and visualizes the results to support data-driven decision-making.
🎯 Objectives
Clean and preprocess the sales dataset for analysis
Perform Exploratory Data Analysis (EDA) to understand sales trends over time
Identify correlations and seasonal patterns in pharmaceutical product sales
Visualize key findings using charts and graphs for better interpretation
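
The cleaning objective can be sketched roughly as follows. This is a minimal illustration, not the code in analysis.py: the column names (`CategoryA`, `CategoryB`), the helper `clean_sales`, and the choice to treat negative or missing volumes as zero are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical raw sample; the real columns in sales_data.csv may differ.
raw = pd.DataFrame({
    "Year": [2014, 2014, 2014, 2015],
    "Month": [1, 1, 2, 1],
    "CategoryA": ["3.5", "3.5", None, "4.0"],   # numbers stored as text, one gap
    "CategoryB": [1.0, 1.0, 2.0, -1.0],         # one impossible negative volume
})


def clean_sales(df, value_cols):
    """Return a cleaned copy: duplicates dropped, sales columns numeric and non-negative."""
    out = df.drop_duplicates().copy()
    for col in value_cols:
        # Coerce text to numbers; anything unparseable becomes NaN.
        out[col] = pd.to_numeric(out[col], errors="coerce")
        # Assumption: a negative volume is a data-entry error, so mark it missing.
        out[col] = out[col].where(out[col] >= 0)
    # Assumption: missing volumes are filled with 0 (dropping the rows is another option).
    out[value_cols] = out[value_cols].fillna(0.0)
    return out.reset_index(drop=True)


cleaned = clean_sales(raw, ["CategoryA", "CategoryB"])
print(cleaned)
```

Whether to fill, drop, or interpolate missing values depends on how the gaps arose, so the fill step is the part most likely to need changing for the real dataset.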
📁 Dataset
The dataset contains sales data spanning six years, including:
Product category-level sales volumes
Time-based columns (Year, Month, Weekday, Hour)
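
With that layout, time-based aggregation is mostly a matter of grouping on the time columns. The sketch below assumes the time columns are named exactly `Year`, `Month`, `Weekday`, and `Hour`, that `Weekday` holds day names, and that the category columns are called `CategoryA` and `CategoryB`; the values are placeholders, not project data.

```python
import pandas as pd

# Hypothetical rows; "CategoryA"/"CategoryB" stand in for the real category columns.
sales = pd.DataFrame({
    "Year": [2014, 2014, 2014, 2015, 2015],
    "Month": [1, 1, 2, 1, 2],
    "Weekday": ["Monday", "Tuesday", "Monday", "Monday", "Friday"],
    "Hour": [9, 14, 9, 10, 16],
    "CategoryA": [2.0, 3.0, 1.0, 4.0, 2.5],
    "CategoryB": [0.5, 1.5, 1.0, 2.0, 0.0],
})
category_cols = ["CategoryA", "CategoryB"]

# Total volume per category for each (Year, Month) pair.
monthly = sales.groupby(["Year", "Month"])[category_cols].sum()

# Average volume per category for each weekday.
weekday_mean = sales.groupby("Weekday")[category_cols].mean()

print(monthly)
print(weekday_mean)
```

The `monthly` frame is indexed by (Year, Month), which makes it convenient to plot as a time series or to compare the same month across years.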
🛠 Tools & Technologies
Python: pandas, matplotlib, seaborn
Editor: Thonny (Python IDE)
Excel: for quick initial inspection
🗂 Project Structure
02_data_clean/: Cleaned version of the original dataset
03_notebooks/: EDA notebooks with visualizations
04_outputs/: Final outputs, charts, and summary insights
analysis.py: Core script for analysis
README_draft.txt: Draft notes and planning
sales_data.csv: Main dataset
📌 Key Findings
Seasonal spikes in sales during specific months and weekdays
Strong correlation between product categories and time-based features
Patterns that can inform inventory and marketing strategies
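
One way findings of this kind could be checked is with a simple seasonal index plus a correlation matrix. The sketch below runs on made-up toy numbers, not the project's results; the column names and the 1.5 spike threshold are assumptions chosen for illustration.

```python
import pandas as pd

# Toy monthly totals (invented for illustration only).
monthly = pd.DataFrame({
    "Month": [1, 2, 3, 4, 5, 6],
    "CategoryA": [10.0, 10.0, 10.0, 10.0, 10.0, 40.0],
    "CategoryB": [5.0, 5.0, 5.0, 5.0, 5.0, 20.0],
})
profile = monthly.set_index("Month")[["CategoryA", "CategoryB"]]

# Seasonal index: each month's volume relative to that category's average month.
seasonal_index = profile / profile.mean()

# Assumption: a month counts as a spike when it exceeds 1.5x the average.
spikes = seasonal_index[seasonal_index["CategoryA"] > 1.5].index.tolist()

# Pearson correlation between the category columns.
corr = profile.corr()

print(spikes)
print(corr)
```

A frame like `corr` can be passed to seaborn's `heatmap` function to get the usual correlation chart.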
Install required packages:
    pip install pandas matplotlib seaborn
Run the notebook or script files to reproduce the analysis and visualizations.