# Online Retail - E-commerce Analysis & Dashboard

## Introduction and Problem Statement

### Project Objective
Analyzing e-commerce data to understand sales trends, customer behavior, and segmentation. The project integrates EDA, Machine Learning, and Power BI to provide valuable business insights.

### Dataset Description
Source: UCI Machine Learning Repository (https://archive.ics.uci.edu/ml/datasets/online+retail)

Period: December 1, 2010 - December 9, 2011

Attributes:

InvoiceNo - Invoice number

StockCode - Product code

Description - Product description

Quantity - Number of units sold

InvoiceDate - Purchase date

UnitPrice - Unit price

CustomerID - Customer identifier

Country - Customer's country


### Analysis Plan

Data loading and optimization

Exploratory Data Analysis (EDA)

Customer segmentation (RFM analysis)

Sales forecasting (ML)

Data preparation for Power BI

Dashboard creation in Power BI

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Data Loading and Optimization

In [3]:

df = pd.read_excel("Online Retail.xlsx")
df.to_csv("Online_Retail.csv", index=False)  


In [4]:
dtype_dict = {
    "InvoiceNo": "category",
    "StockCode": "category",
    "Description": "string",
    "Quantity": "int16",
    "UnitPrice": "float32",
    "CustomerID": "float32",
    "Country": "category"
}
df = pd.read_csv("Online_Retail.csv", dtype=dtype_dict, parse_dates=["InvoiceDate"])


In [None]:
df.head()

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850.0,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850.0,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom


### Exploratory Data Analysis (EDA)

In [7]:
df.isnull().sum()

InvoiceNo           0
StockCode           0
Description      1454
Quantity            0
InvoiceDate         0
UnitPrice           0
CustomerID     135080
Country             0
dtype: int64

In [8]:
df.describe()

Unnamed: 0,Quantity,InvoiceDate,UnitPrice,CustomerID
count,541909.0,541909,541909.0,406829.0
mean,9.55225,2011-07-04 13:34:57.156386048,4.611114,15287.689453
min,-15459.0,2010-12-01 08:26:00,-11062.05957,12346.0
25%,1.0,2011-03-28 11:34:00,1.25,13953.0
50%,3.0,2011-07-19 17:17:00,2.08,15152.0
75%,10.0,2011-10-19 11:27:00,4.13,16791.0
max,15459.0,2011-12-09 12:50:00,38970.0,18287.0
std,64.654892,,96.759857,1713.600342
