santrupt-shekhar-2004/RFM-ANALYSIS

RFM Analysis

GOAL OF PROJECT:

The purpose of this project is to conduct a customer segmentation analysis for an automotive bike company. Customer segmentation is performed by developing an RFM model. RFM (Recency, Frequency, Monetary) analysis is a behavior-based approach that groups customers into segments on the basis of their previous purchase transactions. In this analysis, customers were divided into 11 segments. The analysis helps determine which customer segments should be targeted to increase the company's sales revenue. The data quality assessment and the analysis are done using Python. The approach combines data visualization with analytics to give a broad view of customer behavior and preferences. Using the insights from the RFM model, the company can make informed decisions on resource allocation and tailor marketing campaigns to specific customer segments. Ultimately, the analysis serves as a strategic tool for improving business performance and sales in the competitive automotive bike market.

Data Quality Assessment and Data Cleaning

In the data cleaning step, the data quality of the following datasets was first assessed. The data quality issues listed below were observed, and the steps taken to mitigate each issue are noted:

1) CustomerDemographics.xlsx :

  • Irrelevant columns were present and were dropped from the dataset.
  • Missing values were present in 5 columns. These columns were dropped.
  • The gender column was not standardised. Based on the values available, the column was standardised to remove data inconsistency.
  • The Date of Birth column was transformed to create the new feature columns 'Age' and 'Age Group' to check for discrepancies in the age distribution. An outlier was observed and the record was removed.
  • The dataset was checked for duplicate records. No duplicate records were present.
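The gender standardisation, age derivation, outlier removal and duplicate check described above could be sketched with pandas as follows. This is a minimal illustration, not the project's code: the column names (`gender`, `DOB`), the raw gender spellings, the reference date and the age cut-off of 100 are all assumptions.

```python
import pandas as pd

# Hypothetical sample; column names and raw values are assumptions.
demo = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5],
    "gender": ["F", "Male", "Femal", "M", "U"],
    "DOB": pd.to_datetime(
        ["1980-05-01", "1975-03-12", "1990-11-30", "1843-12-21", "2001-07-04"]
    ),
})

# Map inconsistent labels onto one spelling per category.
gender_map = {"F": "Female", "Femal": "Female", "M": "Male", "U": "Unspecified"}
demo["gender"] = demo["gender"].replace(gender_map)

# Approximate age in whole years as of an assumed reference date.
ref = pd.Timestamp("2024-01-01")
demo["Age"] = ((ref - demo["DOB"]).dt.days // 365.25).astype(int)

# Remove implausible ages (the cut-off of 100 is an assumption).
demo = demo[demo["Age"] <= 100].copy()

# Bucket ages into ten-year groups such as "40-49".
bins = list(range(0, 101, 10))
labels = [f"{lo}-{lo + 9}" for lo in range(0, 100, 10)]
demo["Age Group"] = pd.cut(demo["Age"], bins=bins, right=False, labels=labels)

# Count fully duplicated rows.
n_dupes = int(demo.duplicated().sum())
```

Deriving 'Age Group' after the outlier filter keeps the implausible record from falling outside the bins.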

2) NewCustomerList.xlsx :

  • 5 irrelevant columns were present and were dropped from the dataset.
  • Missing values were present in 4 columns. Depending on the volume of missing values in each column, either the affected records were dropped or appropriate values were imputed in place of the missing values.
  • The Date of Birth column was transformed to create the new feature columns 'Age' and 'Age Group' to check for discrepancies in the age distribution.
  • There was no data inconsistency.
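The "drop or impute depending on volume" rule for missing values can be illustrated as below. The README does not state the cut-off or the imputed values, so the 15% threshold, the `"Missing"` placeholder and the column names here are assumptions chosen only for the sketch.

```python
import pandas as pd

# Hypothetical sample with two incomplete columns.
df = pd.DataFrame({
    "DOB": ["1980-05-01", None, "1990-11-30", "1985-01-15", "1979-08-09",
            "1992-02-20", "1968-10-10", "1999-12-31", "1971-04-04", "1988-06-06"],
    "job_industry_category": ["Health", None, None, "Retail", None,
                              "Health", "IT", None, "Health", "Retail"],
})

# Share of missing values per column.
missing_share = df.isna().mean()

THRESHOLD = 0.15  # assumed cut-off, not taken from the project

# Few missing values: drop the affected records.
drop_cols = [c for c, s in missing_share.items() if 0 < s <= THRESHOLD]
# Many missing values: impute a placeholder instead of losing rows.
fill_cols = [c for c, s in missing_share.items() if s > THRESHOLD]

df = df.dropna(subset=drop_cols).fillna({c: "Missing" for c in fill_cols})
```

For numeric columns a median or mean would be a more natural imputed value than a placeholder string.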

3) Transaction_data.xlsx :

  • The product_first_sold_date column was not in datetime format. The data type of this column was changed from int64 to datetime.
  • Missing values were present in 7 columns. Depending on the volume of missing values in each column, either the affected records were dropped or appropriate values were imputed in place of the missing values.
  • A new feature column 'Profit' was created as the difference between list price and standard price.
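A sketch of the date conversion and the 'Profit' feature follows. It assumes the int64 values are Excel serial day numbers, which is common for dates read from .xlsx files but is not confirmed by the README. The column names `list_price` and `standard_price` and the sample values are likewise assumptions.

```python
import pandas as pd

# Hypothetical sample rows.
tx = pd.DataFrame({
    "product_first_sold_date": [40909, 41245, 36526],
    "list_price": [71.49, 2091.47, 1793.43],
    "standard_price": [53.62, 388.92, 248.82],
})

# Assumption: integers are Excel serial days, counted from 1899-12-30.
tx["product_first_sold_date"] = pd.to_datetime(
    tx["product_first_sold_date"], unit="D", origin="1899-12-30"
)

# Profit is the difference between list price and standard price.
tx["Profit"] = tx["list_price"] - tx["standard_price"]
```

If the integers turned out to be Unix timestamps instead, the `origin` argument would be dropped and `unit` changed accordingly.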

4) CustomerAddress.xlsx :

  • The states column was not standardised. Based on the values available, the column was standardised to remove data inconsistency.
  • Certain customer IDs from the Customer Demographics table were missing from the Address table.
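Both address checks can be sketched with pandas: a mapping to standardise state names, and a left merge with `indicator=True` to list customer IDs that have no address record. The column names, the sample values and the choice of abbreviations as the standard form are assumptions.

```python
import pandas as pd

# Hypothetical samples of the two tables.
demo = pd.DataFrame({"customer_id": [1, 2, 3, 4]})
addr = pd.DataFrame({
    "customer_id": [1, 2, 4, 5],
    "state": ["New South Wales", "NSW", "Victoria", "QLD"],
})

# Standardise full state names to abbreviations (assumed direction).
addr["state"] = addr["state"].replace({"New South Wales": "NSW", "Victoria": "VIC"})

# Left merge keeps every demographics row; "_merge" flags rows with no match.
merged = demo.merge(addr, on="customer_id", how="left", indicator=True)
missing_ids = merged.loc[merged["_merge"] == "left_only", "customer_id"].tolist()
```

In this sample, `missing_ids` contains customer 3, which appears in the demographics table but not in the address table.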

Exploratory Data Analysis

1) New vs Old Customers Age Distribution

  • Most new customers are aged 40-49, while most old customers are aged 50-59.
  • For both customer types, the fewest customers fall in the under-20 and over-80 age groups.

2) Bike purchases over last 3 years by Gender

  • Most bike purchases over the last 3 years were made by female customers: approximately 51% of purchases, compared with 49% made by male customers.
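A percentage split like the one above can be computed by summing purchases per gender and dividing by the overall total. The sketch below uses made-up numbers and a hypothetical column name (`purchases_3y`); it does not reproduce the project's figures.

```python
import pandas as pd

# Made-up sample; values are for illustration only.
customers = pd.DataFrame({
    "gender": ["Female", "Male", "Female", "Male"],
    "purchases_3y": [10, 15, 50, 25],
})

# Total purchases per gender, then each gender's share of all purchases.
totals = customers.groupby("gender")["purchases_3y"].sum()
share_pct = (totals / totals.sum() * 100).round(1)
```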

3) New vs Old Customers Job Industry Distribution

  • Most new customers are from the Manufacturing and Financial Services sectors. A similar trend is observed among old customers.
  • The fewest customers are from the Agriculture and Telecom sectors.

4) Wealth Segmentation by Age Category

  • Across all age categories, the largest number of customers belongs to the 'Mass Customer' segment.

5) Cars owned by State

  • New South Wales has the largest number of people who do not own a car.

Customer Segmentation

RFM (Recency, Frequency, Monetary) analysis is a behavior-based approach that groups customers into segments on the basis of their previous purchase transactions. In this analysis, customers were divided into 11 segments:

  • Platinum Customers
  • Very Loyal Customers
  • Recent Customers
  • Potential Customers
  • Lost Customers
  • Losing Customers
  • Late Bloomer
  • High Risk Customers
  • Evasive Customers
  • Becoming Loyal
  • Almost lost Customers
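As a rough illustration of how recency, frequency and monetary values are derived from transactions and turned into scores, consider the sketch below. It is not the project's implementation: the column names, the sample data, the quartile scoring (1 to 4) and the combined three-digit score are assumptions. The README does not state the rules that map scores to the 11 segments, so the sketch stops at the score.

```python
import pandas as pd

# Hypothetical transactions; names and values are assumptions.
tx = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3, 4, 4, 4, 4],
    "transaction_date": pd.to_datetime([
        "2017-01-05", "2017-06-10", "2017-12-20",
        "2017-03-01", "2017-09-01",
        "2017-02-15",
        "2017-04-01", "2017-08-01", "2017-11-01", "2017-12-30",
    ]),
    "amount": [100, 200, 300, 50, 70, 40, 500, 400, 300, 200],
})

# Recency is measured back from the latest transaction in the data.
snapshot = tx["transaction_date"].max()

rfm = tx.groupby("customer_id").agg(
    recency=("transaction_date", lambda d: (snapshot - d.max()).days),
    frequency=("transaction_date", "count"),
    monetary=("amount", "sum"),
)

# Quartile scores; ranking first avoids duplicate bin edges on ties.
# Lower recency is better, so its labels run from 4 down to 1.
rfm["R"] = pd.qcut(rfm["recency"].rank(method="first"), 4, labels=[4, 3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["frequency"].rank(method="first"), 4, labels=[1, 2, 3, 4]).astype(int)
rfm["M"] = pd.qcut(rfm["monetary"].rank(method="first"), 4, labels=[1, 2, 3, 4]).astype(int)

# Combine the three scores into one number, e.g. R=4, F=4, M=4 gives 444.
rfm["rfm_score"] = rfm["R"] * 100 + rfm["F"] * 10 + rfm["M"]
```

Segment labels such as 'Platinum Customers' or 'Lost Customers' would then be assigned from ranges of `rfm_score`, using whatever thresholds the analysis defines.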

RFM Analysis: Scatter Plots

Recency vs Monetary :

The visualization shows that recent customers have purchased more products and generated relatively more revenue.

Frequency vs Monetary :

The visualization shows that customers in the Platinum, Very Loyal and Becoming Loyal segments purchase more frequently and therefore generate greater monetary value.

Documentation

Hi, I'm Santrupt Shekhar! 👋

🛠 Skills

Python, C++, Data Analytics, Web Scraping
