# Online Retail Dataset - Exploratory Data Analysis

**Author:** Shaik Sohail

This notebook performs Exploratory Data Analysis (EDA) on the Online Retail dataset. It covers data cleaning, visualization, and business insights.

In [None]:

# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Use seaborn's whitegrid theme for all plots
sns.set_theme(style="whitegrid")


## Load Dataset

In [None]:

# Load the dataset
df = pd.read_csv("OnlineRetail.csv", encoding="ISO-8859-1")
df.head()


## Data Inspection

In [None]:

# Basic info
df.info()
df.describe()
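
Before cleaning, it helps to see how many values are missing in each column, since `df.info()` only reports non-null counts. The following is a minimal sketch; the helper name `missing_summary` and the small sample frame are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

def missing_summary(frame: pd.DataFrame) -> pd.DataFrame:
    """Return per-column missing counts and percentages, most-missing first."""
    counts = frame.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(2),
    })
    return summary.sort_values("missing", ascending=False)

# Tiny synthetic frame standing in for the real data (values are made up).
sample = pd.DataFrame({
    "CustomerID": [17850.0, None, 13047.0, None],
    "Description": ["MUG", None, "LANTERN", "BAG"],
    "Quantity": [6, 2, 8, 1],
})
print(missing_summary(sample))
```

On the real data, `missing_summary(df)` would show which columns need attention before any rows are dropped.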


## Data Cleaning

In [None]:

# Drop rows with missing CustomerID
df = df.dropna(subset=["CustomerID"])

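# Keep only rows with positive Quantity and UnitPrice.
# .copy() makes the result an independent frame, so the column
# assignments below do not trigger SettingWithCopyWarning.
df = df[(df["Quantity"] > 0) & (df["UnitPrice"] > 0)].copy()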

# Convert InvoiceDate to datetime
df["InvoiceDate"] = pd.to_datetime(df["InvoiceDate"])

# Add TotalPrice column
df["TotalPrice"] = df["Quantity"] * df["UnitPrice"]

df.head()
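
The cleaning steps can also be wrapped in a function so the number of rows removed is easy to report and the raw frame stays untouched. This is a sketch under assumptions: the function name `clean_retail` and the synthetic sample rows are hypothetical, while the filters mirror the cell above.

```python
import pandas as pd

def clean_retail(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the same filters as the cleaning cell and return a new frame."""
    out = raw.dropna(subset=["CustomerID"])
    # Keep positive quantities and prices; copy so later assignments are safe.
    out = out[(out["Quantity"] > 0) & (out["UnitPrice"] > 0)].copy()
    out["InvoiceDate"] = pd.to_datetime(out["InvoiceDate"])
    out["TotalPrice"] = out["Quantity"] * out["UnitPrice"]
    return out

# Synthetic rows (made up): one valid, one missing CustomerID,
# one negative Quantity, one zero UnitPrice.
sample = pd.DataFrame({
    "CustomerID": [17850.0, None, 13047.0, 13047.0],
    "Quantity": [6, 2, -1, 3],
    "UnitPrice": [2.55, 1.00, 4.25, 0.0],
    "InvoiceDate": ["2010-12-01 08:26:00"] * 4,
})
cleaned = clean_retail(sample)
print(f"kept {len(cleaned)} of {len(sample)} rows")
```

Reporting the kept and dropped counts makes it clear how much of the dataset the later charts actually describe.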


## Top 10 Products by Quantity Sold

In [None]:

# Total units sold per product description, largest first
top_products = df.groupby("Description")["Quantity"].sum().sort_values(ascending=False).head(10)
plt.figure(figsize=(10,6))
sns.barplot(x=top_products.values, y=top_products.index, palette="viridis")
plt.title("Top 10 Products by Quantity Sold")
plt.xlabel("Quantity Sold")
plt.ylabel("Product Description")
plt.show()


## Top 10 Countries by Revenue

In [None]:

# Total revenue per country, largest first
top_countries = df.groupby("Country")["TotalPrice"].sum().sort_values(ascending=False).head(10)
plt.figure(figsize=(10,6))
sns.barplot(x=top_countries.values, y=top_countries.index, palette="coolwarm")
plt.title("Top 10 Countries by Revenue")
plt.xlabel("Revenue")
plt.ylabel("Country")
plt.show()
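
Absolute revenue can hide how dominant a single market is, so a percentage share is a useful companion to the bar chart. The sketch below assumes a hypothetical helper `revenue_share` and uses made-up numbers; it does not reproduce the notebook's actual figures.

```python
import pandas as pd

def revenue_share(frame: pd.DataFrame, by: str = "Country",
                  value: str = "TotalPrice") -> pd.Series:
    """Return each group's share of total revenue in percent, largest first."""
    totals = frame.groupby(by)[value].sum().sort_values(ascending=False)
    return (totals / totals.sum() * 100).round(1)

# Synthetic example (values are made up for illustration).
sample = pd.DataFrame({
    "Country": ["United Kingdom", "United Kingdom", "Germany", "France"],
    "TotalPrice": [60.0, 20.0, 15.0, 5.0],
})
share = revenue_share(sample)
print(share)
```

Calling `revenue_share(df).head(10)` on the cleaned data would give the share for each of the top 10 countries.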


## Monthly Revenue Trend

In [None]:

# Bucket each invoice into its calendar month (e.g. 2011-03)
df["Month"] = df["InvoiceDate"].dt.to_period("M")
# Sum revenue within each month
monthly_revenue = df.groupby("Month")["TotalPrice"].sum()

plt.figure(figsize=(12,6))
monthly_revenue.plot(kind="line", marker="o")
plt.title("Monthly Revenue Trend")
plt.xlabel("Month")
plt.ylabel("Revenue")
plt.show()
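
A monthly line chart can mislead if the first or last month is only partly covered, because a partial month looks like a drop in revenue. Whether that applies here is an assumption to verify on the actual file. The sketch below, with the hypothetical helper `month_coverage` and made-up dates, shows one way to check.

```python
import pandas as pd

def month_coverage(frame: pd.DataFrame) -> pd.DataFrame:
    """For each calendar month, report the first and last invoice day seen."""
    dates = frame["InvoiceDate"]
    grouped = dates.groupby(dates.dt.to_period("M"))
    coverage = pd.DataFrame({
        "first_day": grouped.min().dt.day,
        "last_day": grouped.max().dt.day,
    })
    # Length of each calendar month, for comparison with last_day.
    coverage["days_in_month"] = coverage.index.days_in_month
    return coverage

# Synthetic dates (made up): November fully covered, December cut short.
sample = pd.DataFrame({
    "InvoiceDate": pd.to_datetime(
        ["2011-11-01", "2011-11-30", "2011-12-01", "2011-12-09"]
    ),
})
coverage = month_coverage(sample)
print(coverage)
```

A month whose `last_day` is well below `days_in_month` is likely incomplete and may be better excluded from, or annotated on, the trend plot.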


## Business Insights


- The United Kingdom is the largest market by revenue, followed mainly by other European countries.
- Sales are concentrated: a small number of products accounts for a disproportionate share of units sold.
- Several low-priced products sell in very high volumes.
- Monthly revenue varies over the year, which suggests seasonal purchasing patterns.
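
The observation that a small number of products drives most sales can be checked with a cumulative share calculation. This is a minimal sketch: the helper `products_for_share`, the 80% threshold, and the sample values are illustrative assumptions rather than results from the notebook.

```python
import pandas as pd

def products_for_share(frame: pd.DataFrame, share: float = 0.8,
                       by: str = "Description",
                       value: str = "TotalPrice") -> tuple[int, int]:
    """Return (products needed to reach `share` of the total, product count)."""
    totals = frame.groupby(by)[value].sum().sort_values(ascending=False)
    cumulative = totals.cumsum() / totals.sum()
    # Products strictly below the target, plus the one that crosses it.
    needed = int((cumulative < share).sum()) + 1
    return min(needed, len(totals)), len(totals)

# Synthetic revenue per product (values are made up for illustration).
sample = pd.DataFrame({
    "Description": ["A", "A", "B", "C", "D"],
    "TotalPrice": [40.0, 30.0, 20.0, 5.0, 5.0],
})
needed, total = products_for_share(sample)
print(f"{needed} of {total} products reach 80% of revenue")
```

Passing `value="Quantity"` would measure concentration in units sold instead of revenue.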


## Conclusion

This analysis highlights the products and countries that drive sales and revenue, along with the monthly revenue trend. Businesses can use these insights to optimize inventory, marketing, and sales strategies.