# Product Analysis

This analysis focuses on identifying the most popular products based on quantity sold. Understanding product popularity helps optimize inventory management, marketing efforts, and sales strategies.

---

## What We Will Explore:

### 1. Product Popularity Metrics
- **Total Quantity Sold**: Sum of all units sold per product.
- **Top-Selling Products**: Products with the highest sales volume.
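
As a rough illustration of the two metrics above, the sketch below sums `Quantity` per `ProductName` and keeps the highest totals. The small DataFrame is made-up sample data standing in for `df_cleaned`, and the variable names `quantity_per_product` and `top_products` are illustrative, not taken from the ETL pipeline.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real df_cleaned
df_cleaned = pd.DataFrame({
    "ProductName": ["mug", "lamp", "mug", "clock", "lamp", "mug"],
    "Quantity": [5, 2, 3, 1, 4, 2],
})

# Total Quantity Sold: sum of units per product, largest first
quantity_per_product = (
    df_cleaned.groupby("ProductName")["Quantity"]
    .sum()
    .sort_values(ascending=False)
)

# Top-Selling Products: the first N entries of the ranked totals
top_products = quantity_per_product.head(2)
print(top_products)
```

On the real dataset, `head(10)` would give the top 10 products used later in the chart.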

---

### 2. Import Cleaned Data from ETL Pipeline

We use the cleaned dataset (`df_cleaned`) that was prepared in the ETL pipeline. The following steps were already applied:
- Removed rows with missing `CustomerID` or `Description`.
- Removed cancelled orders (invoice numbers starting with `C`).
- Excluded records with negative or zero `Quantity` or `UnitPrice`.
- Converted `InvoiceDate` to datetime format.
- Created new columns: `TotalPrice`, `Year`, `Month`, `Day`, `Hour`, `Weekday`.
- Renamed `Description` to `ProductName`.
- Cleaned `ProductName` by stripping whitespace and converting to lowercase.
- Removed duplicate records.

These preprocessing steps ensure we analyze clean, accurate, and consistent data.
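
For reference, the steps listed above could look roughly like the sketch below. This is not the actual ETL code: the function name `clean_sales_data`, the invoice column name `InvoiceNo`, the definition `TotalPrice = Quantity * UnitPrice`, and storing `Weekday` as a day name are all assumptions made for illustration.

```python
import pandas as pd

def clean_sales_data(df_raw: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical re-creation of the ETL cleaning steps listed above."""
    # Remove rows with missing CustomerID or Description
    df = df_raw.dropna(subset=["CustomerID", "Description"]).copy()

    # Remove cancelled orders (assumed column name: InvoiceNo)
    df = df[~df["InvoiceNo"].astype(str).str.startswith("C")]

    # Keep only strictly positive Quantity and UnitPrice
    df = df[(df["Quantity"] > 0) & (df["UnitPrice"] > 0)].copy()

    # Convert InvoiceDate and derive the date-part columns
    df["InvoiceDate"] = pd.to_datetime(df["InvoiceDate"])
    df["TotalPrice"] = df["Quantity"] * df["UnitPrice"]  # assumed definition
    df["Year"] = df["InvoiceDate"].dt.year
    df["Month"] = df["InvoiceDate"].dt.month
    df["Day"] = df["InvoiceDate"].dt.day
    df["Hour"] = df["InvoiceDate"].dt.hour
    df["Weekday"] = df["InvoiceDate"].dt.day_name()  # assumed representation

    # Rename and normalize the product name
    df = df.rename(columns={"Description": "ProductName"})
    df["ProductName"] = df["ProductName"].str.strip().str.lower()

    # Remove duplicate records
    return df.drop_duplicates().reset_index(drop=True)


# Made-up raw rows exercising each rule
df_raw = pd.DataFrame({
    "InvoiceNo": ["536365", "536365", "C536379", "536380", "536381", "536382"],
    "Description": [" White Mug ", " White Mug ", "Lamp", None, "Clock", "Lamp"],
    "Quantity": [6, 6, -1, 2, 0, 3],
    "UnitPrice": [2.5, 2.5, 4.0, 1.0, 3.0, 4.0],
    "InvoiceDate": ["2010-12-01 08:26:00"] * 5 + ["2010-12-02 09:00:00"],
    "CustomerID": [17850.0, 17850.0, 17548.0, 13047.0, 13047.0, 12345.0],
})

df_cleaned = clean_sales_data(df_raw)
print(df_cleaned[["ProductName", "Quantity", "TotalPrice"]])
```

In this sample only two rows survive: one copy of the duplicated mug order and the final lamp order.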

---


### 3. Visualizations

To effectively understand product trends and identify bestsellers, we will use:

- **Seaborn**: A horizontal bar chart that displays the top 10 products ranked by the total quantity sold, using color to enhance readability and clarity.
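
A minimal sketch of such a chart is shown below, assuming `df_cleaned` has `ProductName` and `Quantity` columns as described earlier. The sample data, the figure size, the single bar color, and the axis labels are illustrative choices; a Seaborn palette could be used instead of one color.

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical sample rows standing in for the real df_cleaned
df_cleaned = pd.DataFrame({
    "ProductName": ["mug", "lamp", "mug", "clock", "lamp", "mug"],
    "Quantity": [5, 2, 3, 1, 4, 2],
})

top_n = 10  # the sample has fewer products, so all of them are shown

# Aggregate units per product and keep the top N, largest first
top_products = (
    df_cleaned.groupby("ProductName", as_index=False)["Quantity"]
    .sum()
    .sort_values("Quantity", ascending=False)
    .head(top_n)
)

# A numeric x and categorical y give a horizontal bar chart
fig, ax = plt.subplots(figsize=(10, 6))
sns.barplot(data=top_products, x="Quantity", y="ProductName",
            color="steelblue", ax=ax)
ax.set_xlabel("Total Quantity Sold")
ax.set_ylabel("Product")
ax.set_title(f"Top {top_n} Products by Quantity Sold")
fig.tight_layout()
```

Sorting before plotting places the best-selling product at the top of the chart, which usually makes the ranking easier to read.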



## Importing Libraries

Before we begin our product analysis, we need to import the essential Python libraries that will help us load, process, and visualize the data.

### Libraries Used:

- **pandas**: For data manipulation and analysis (e.g., grouping products, aggregating sales).
- **numpy**: For numerical operations and handling arrays.
- **matplotlib.pyplot**: The plotting interface of Matplotlib, a widely used third-party library for creating static visualizations like bar charts.
- **seaborn**: A higher-level visualization library built on top of matplotlib, useful for creating aesthetically pleasing charts such as horizontal bar plots.

### Display Settings:

- `%matplotlib inline`: Ensures that Matplotlib charts appear directly within the notebook cells.
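- `plt.style.use('ggplot')`: Applies Matplotlib's built-in `ggplot` style (gray background, white gridlines) to subsequent plots.
- `sns.set(style='whitegrid')`: Applies Seaborn's default theme with a white background and light gridlines. Because it runs after `plt.style.use('ggplot')`, it largely overrides the `ggplot` settings, so `whitegrid` is the style that plots will mostly show.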

These libraries provide the tools for data aggregation and visualization throughout this product analysis; the cleaning itself was already done in the ETL pipeline.


# Importing Libraries for Product Analysis
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Display settings for plots
%matplotlib inline
plt.style.use('ggplot')
sns.set(style='whitegrid')
