Retail Customer Behaviour Analysis

This project details the analysis conducted to identify customer behaviour in an e-commerce dataset. It uses Recency, Frequency and Spend (RFS) to understand customer behaviour in the dataset.

Customer segments were also identified using the KMeans clustering algorithm.

Customer behaviour analysis and segmentation using KMeans.

Project description: This is a data analysis project that involves reading, cleaning, exploring, and performing advanced analyses on a retail dataset. The project uses the KMeans clustering algorithm for identifying customer segments.

This project covers data cleaning, exploratory data analysis (EDA), data visualization, and customer segmentation using machine learning techniques.

1. Importing libraries and Loading the data

I used a dataset named OnlineRetail.csv, which contains transactional data from an online retail store. The primary goal was to explore the dataset, clean it, visualize important patterns, and perform customer segmentation.
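The loading and cleaning step could look roughly like the sketch below. It assumes the column names of the public Online Retail dataset (InvoiceNo, Quantity, InvoiceDate, UnitPrice, CustomerID); the helper name `load_and_clean` and the inline sample rows are hypothetical stand-ins, not taken from the notebook.

```python
import io

import pandas as pd


def load_and_clean(source):
    """Read retail transactions and drop rows with missing values (hypothetical helper)."""
    df = pd.read_csv(source, parse_dates=["InvoiceDate"])
    # Rows with any missing field (for example no CustomerID) are discarded.
    return df.dropna().reset_index(drop=True)


# Tiny inline sample standing in for OnlineRetail.csv (illustrative values only).
sample_csv = io.StringIO(
    "InvoiceNo,Quantity,InvoiceDate,UnitPrice,CustomerID\n"
    "536365,6,2010-12-01 08:26:00,2.55,17850\n"
    "536366,2,2010-12-01 08:28:00,1.85,\n"
    "536367,3,2010-12-02 09:00:00,4.25,13047\n"
)
df = load_and_clean(sample_csv)
print(df.shape)
```

With the real file, the same function would be called as `load_and_clean("OnlineRetail.csv")`.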

2. Data Visualization

After removing the null values from the dataset, I created two additional columns, 'Month' and 'Day of Week', based on the invoice date column. The goal was to see whether the day of the week, or other features of the date, revealed any patterns. Below are some visualizations showing trends and distributions.
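As a minimal sketch, the two date-derived columns can be built with the pandas `.dt` accessor. The sample dates are made up for illustration, and the exact encoding used in the notebook (month number versus month name, for example) is an assumption.

```python
import pandas as pd

# Illustrative invoice dates; the real values come from the InvoiceDate column.
df = pd.DataFrame({
    "InvoiceDate": pd.to_datetime(
        ["2010-12-01 08:26:00", "2011-01-04 10:00:00", "2011-11-13 15:30:00"]
    )
})

# Calendar month as an integer (1 = January, 12 = December).
df["Month"] = df["InvoiceDate"].dt.month
# Weekday name, e.g. "Wednesday".
df["Day of Week"] = df["InvoiceDate"].dt.day_name()
print(df)
```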

The bar plots show that there were no transactions on Saturday. Although the number of transactions was highest in the 11th month (November), the total amount spent was highest in the first month (January).

In addition, these visuals show the top 5 countries and the top stock items purchased.
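The aggregations behind bar plots like these might be computed as in the sketch below. The column names (InvoiceNo, Country, Quantity, UnitPrice), the definition of amount as quantity times unit price, and the toy rows are all assumptions made for illustration.

```python
import pandas as pd

# Toy transactions (illustrative values only).
df = pd.DataFrame({
    "InvoiceNo": ["A1", "A1", "A2", "A3", "A4"],
    "Month": [1, 1, 11, 11, 11],
    "Country": ["United Kingdom", "United Kingdom", "France", "Germany", "France"],
    "Quantity": [10, 5, 1, 2, 1],
    "UnitPrice": [20.0, 10.0, 3.0, 4.0, 5.0],
})
# Amount per line item (assumed definition: quantity times unit price).
df["Amount"] = df["Quantity"] * df["UnitPrice"]

# Number of distinct invoices per month.
transactions_per_month = df.groupby("Month")["InvoiceNo"].nunique()
# Total amount spent per month.
spend_per_month = df.groupby("Month")["Amount"].sum()
# The five countries with the most distinct invoices.
top_countries = df.groupby("Country")["InvoiceNo"].nunique().nlargest(5)

print(transactions_per_month, spend_per_month, top_countries, sep="\n\n")
```

Each resulting Series can be passed to a plotting call such as `Series.plot(kind="bar")`. In this toy data, as in the finding above, the month with the most transactions is not the month with the highest spend.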

3. Analysis using Recency, Frequency and Spend (RFS)

The Recency, Frequency, Monetary (RFM) model is a behaviour-based analysis technique used to segment customers by examining their transaction history. Here the monetary component is called Spend, hence RFS. Recency is the number of days since the customer's last purchase, Frequency is the number of transactions per customer, and Spend is the total amount spent per customer.

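# Most recent purchase date for each customer
last_transaction_date = df.groupby('CustomerID')['InvoiceDate'].max()
# Use the latest invoice date in the dataset as the reference point
reference_date = df['InvoiceDate'].max()
# Recency: days between the reference date and each customer's last purchase
days_difference = (reference_date - last_transaction_date).dt.days
days_difference = days_difference.reset_index().rename(columns={'InvoiceDate': 'recency'})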

Recency, Frequency and Spend are then merged into a single dataframe.

These boxplots show the outliers in the RFS data distributions.

The outliers were mostly in the Frequency and Spend columns and were removed before applying the KMeans algorithm.
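One way to compute all three metrics and combine them is sketched below. The column names InvoiceNo, Quantity and UnitPrice, the use of distinct invoices as the transaction count, and the toy data are assumptions; the notebook may define these differently.

```python
import pandas as pd

# Toy transactions for two customers (illustrative values only).
df = pd.DataFrame({
    "CustomerID": [1, 1, 1, 2, 2],
    "InvoiceNo": ["A1", "A1", "A2", "B1", "B2"],
    "InvoiceDate": pd.to_datetime(
        ["2011-01-01", "2011-01-01", "2011-03-01", "2011-02-01", "2011-02-15"]
    ),
    "Quantity": [2, 1, 3, 1, 4],
    "UnitPrice": [5.0, 10.0, 2.0, 8.0, 1.5],
})
df["Amount"] = df["Quantity"] * df["UnitPrice"]

by_customer = df.groupby("CustomerID")
# Recency: days between the latest date in the data and the customer's last purchase.
recency = (df["InvoiceDate"].max() - by_customer["InvoiceDate"].max()).dt.days.rename("recency")
# Frequency: number of distinct invoices per customer.
frequency = by_customer["InvoiceNo"].nunique().rename("frequency")
# Spend: total amount per customer.
spend = by_customer["Amount"].sum().rename("spend")

# Align the three Series on CustomerID and turn the index back into a column.
rfs = pd.concat([recency, frequency, spend], axis=1).reset_index()
print(rfs)
```

Keeping CustomerID as the first column matches the later `rfs.iloc[:, 1:]` selection, which skips the first column when building the feature matrix.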
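The README does not say how the outliers were removed. A common choice that matches what a boxplot displays is the 1.5 × IQR rule; the sketch below assumes that rule, and the helper name `remove_iqr_outliers` and the sample values are hypothetical.

```python
import pandas as pd


def remove_iqr_outliers(frame, columns, k=1.5):
    """Keep rows whose values lie within [Q1 - k*IQR, Q3 + k*IQR] for every listed column."""
    mask = pd.Series(True, index=frame.index)
    for col in columns:
        q1, q3 = frame[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        mask &= frame[col].between(q1 - k * iqr, q3 + k * iqr)
    return frame[mask]


# Toy RFS table in which customer 8 is an extreme buyer (illustrative values only).
rfs = pd.DataFrame({
    "CustomerID": range(1, 9),
    "recency": [5, 10, 20, 30, 40, 50, 60, 70],
    "frequency": [1, 2, 2, 3, 3, 4, 4, 90],
    "spend": [100, 120, 150, 160, 180, 200, 210, 9000],
})
rfs_clean = remove_iqr_outliers(rfs, ["frequency", "spend"])
print(rfs_clean)
```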

4. Feature scaling

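With the outliers removed, the feature matrix X was scaled using StandardScaler.

from sklearn.preprocessing import StandardScaler

# Use every column except the first (CustomerID) as features
X = rfs.iloc[:, 1:]
# Standardize each feature to zero mean and unit variance
scaler = StandardScaler()
X = scaler.fit_transform(X)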
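To make the effect of the scaling concrete, the sketch below checks that StandardScaler is equivalent to a manual z-score computed with the population standard deviation. The small matrix is made up for illustration and stands in for the recency, frequency and spend columns.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Three customers, three features on very different scales (illustrative values only).
X_raw = np.array([
    [10.0, 1.0, 100.0],
    [20.0, 3.0, 300.0],
    [60.0, 5.0, 800.0],
])

X_scaled = StandardScaler().fit_transform(X_raw)
# Manual z-score per column; np.std defaults to the population formula (ddof=0).
X_manual = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

print(np.allclose(X_scaled, X_manual))
```

Scaling matters here because KMeans relies on Euclidean distance, so an unscaled Spend column would otherwise dominate the other two features.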

5. Clustering customers using KMeans

I used the Yellowbrick cluster module to visualize the KMeans distortion score elbow. As shown below, the elbow occurs at k = 4, indicating 4 clusters.

I then fitted KMeans and updated the RFS dataframe with the clusters identified.
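The idea behind the elbow plot can be sketched without Yellowbrick by fitting KMeans for a range of k and recording scikit-learn's `inertia_` (the within-cluster sum of squared distances). This is a swapped-in illustration, not the notebook's code, and the synthetic blobs stand in for the scaled RFS matrix.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic, well-separated data standing in for the scaled RFS matrix X.
X_demo, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.5, random_state=42)

inertias = {}
for k in range(2, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_demo)
    # inertia_ is the sum of squared distances of samples to their closest centre.
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(f"k={k}: inertia={value:.1f}")
```

The elbow is the value of k after which the inertia stops dropping sharply; plotting `inertias` against k gives a curve of the same kind as the Yellowbrick visual.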

from sklearn.cluster import KMeans

# Fit KMeans with the 4 clusters suggested by the elbow plot
kmeans = KMeans(n_clusters=4, n_init='auto', random_state=42)
kmeans.fit(X)
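Writing the cluster labels back to the RFS dataframe could look like the sketch below. The column name `cluster` is an assumption, and the toy table uses 2 clusters only because it has 8 rows; the project itself uses 4.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Toy RFS table with two obvious groups (illustrative values only).
rfs = pd.DataFrame({
    "CustomerID": [1, 2, 3, 4, 5, 6, 7, 8],
    "recency": [2, 3, 200, 210, 5, 4, 190, 205],
    "frequency": [20, 22, 1, 2, 21, 19, 1, 2],
    "spend": [900.0, 950.0, 40.0, 35.0, 1000.0, 870.0, 50.0, 30.0],
})

X = StandardScaler().fit_transform(rfs.iloc[:, 1:])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)
# fit_predict returns one label per row, in the same order as the rows of rfs.
rfs["cluster"] = kmeans.fit_predict(X)
print(rfs)
```

This only works if `rfs` and `X` contain the same rows in the same order, so any outlier filtering has to happen before X is built.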


6. Results and findings: Recency, Frequency and Spend

Cluster 0: High Recency, Low Frequency, Low Spend

Cluster 1: Low Recency, High Frequency, Moderate Spend

Cluster 2: Low Recency, Low Frequency, Low Spend

Cluster 3: Low Recency, High Frequency, High Spend
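Profiles like these can be derived by comparing each cluster's mean with the overall mean, as in the sketch below. The README does not state which thresholds were used for High, Moderate and Low, so the comparison rule, the `cluster` column name and the sample values are assumptions. Note that high Recency means a long time since the last purchase.

```python
import numpy as np
import pandas as pd

# Toy clustered RFS table (illustrative values only).
rfs = pd.DataFrame({
    "recency": [200, 220, 3, 5],
    "frequency": [1, 2, 20, 24],
    "spend": [30.0, 50.0, 900.0, 1100.0],
    "cluster": [0, 0, 1, 1],
})
features = ["recency", "frequency", "spend"]

# Mean of each feature within each cluster.
profile = rfs.groupby("cluster")[features].mean()
# Label a cluster "High" on a feature when its mean exceeds the overall mean.
overall = rfs[features].mean()
labels = pd.DataFrame(
    np.where(profile > overall, "High", "Low"),
    index=profile.index,
    columns=profile.columns,
)
print(labels)
```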
