# 1.0  Project Overview
This project applies RFM analysis (Recency, Frequency, Monetary value) to gain insight into customers' behavior, preferences, and purchasing patterns for an automobile company. The goal is to drive targeted marketing, optimize inventory management, and support informed business decisions that fuel growth and profitability.

Beyond understanding customer behavior, the aim is to segment customers into distinct groups based on their characteristics. This segmentation, informed by RFM scores and enriched with demographic insights such as age and gender, makes it possible to tailor marketing, optimize product offerings, and turn customer relationships into drivers of success.

Here's how this is achieved:
 
■ RFM Analysis: By analyzing purchase data through the lens of Recency, Frequency, and Monetary value, customer segments with distinct purchase behaviors were identified. This shows which customers are the most loyal, high-spending, or recently engaged, enabling targeted marketing and retention strategies.

■ Profitable Product Identification: Customer behavior is used to pinpoint the most profitable products within the automobile market. This data informs inventory management and strategic marketing, ensuring resources are focused on products with the highest potential return.

■ Demographic Insights: Integrating customer demographics like age and gender into the analysis adds a crucial layer of understanding. This unveils age-specific preferences and purchase patterns, further refining the segmentation and product targeting strategies.

In essence, this project goes beyond simply understanding customer behavior. The aim was to unlock the full potential of customer data by segmenting customers into meaningful categories, so that business decisions can be tailored to their specific needs and preferences for maximum impact on growth and customer satisfaction.


# 2.0 Data Preparation and Cleaning
The initial step involves thorough data preparation and cleaning. The following datasets were evaluated and processed to ensure data accuracy and consistency:

■ CustomerDemographics.xlsx: Removed irrelevant columns. Addressed missing values by imputing appropriate values. Standardized gender column data for consistency. Transformed the Date of Birth column to create 'Age' and 'Age Group,' identifying and handling outliers. 

■ NewCustomerList.xlsx: Eliminated irrelevant columns. Handled missing values similar to the CustomerDemographics dataset. Transformed the Date of Birth column to align with the existing structure. 

■ Transaction_data.xlsx: Converted the 'product_first_sold_date' column to datetime format. Managed missing values and created a new 'Profit' feature based on list price and standard price. Checked for duplicate records. 

■ CustomerAddress.xlsx: Standardized states column data for consistency. Addressed discrepancies where certain customer IDs from CustomerDemographics were dropped in the Address table. 
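
The age and profit transformations described above could look roughly like the sketch below. It is illustrative only: the column names `DOB`, `list_price`, and `standard_cost`, the ten-year age bands, and the reference date passed as `as_of` are assumptions, and profit is assumed to be list price minus the standard price.

```python
import pandas as pd

def add_age_features(customers: pd.DataFrame, as_of: str) -> pd.DataFrame:
    """Add 'Age' and 'Age Group' from a date-of-birth column (assumed name: 'DOB')."""
    out = customers.copy()
    dob = pd.to_datetime(out["DOB"], errors="coerce")  # unparseable dates become NaT
    ref = pd.Timestamp(as_of)
    # Subtract one year when the birthday has not yet occurred in the reference year.
    before_birthday = (dob.dt.month > ref.month) | (
        (dob.dt.month == ref.month) & (dob.dt.day > ref.day)
    )
    out["Age"] = ref.year - dob.dt.year - before_birthday.astype(int)
    # Ten-year bands from 10-19 up to 90-99; ages outside them (outliers) become missing.
    edges = list(range(10, 101, 10))
    labels = [f"{lo}-{lo + 9}" for lo in edges[:-1]]
    out["Age Group"] = pd.cut(out["Age"], bins=edges, labels=labels, right=False)
    return out

def add_profit(transactions: pd.DataFrame) -> pd.DataFrame:
    """Add 'Profit' as list price minus standard price (assumed column names)."""
    out = transactions.copy()
    out["Profit"] = out["list_price"] - out["standard_cost"]
    return out
```

Rows whose `Age Group` comes out missing are the candidates for the outlier handling mentioned above; how they are treated (dropped or corrected) is a separate decision.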

# 3.0 Exploratory Data Analysis

Following data cleaning, exploratory analysis was conducted to derive insights into customer segments.

# 3.1 New vs Old Customers Age Distribution

![image-2.png](attachment:image-2.png)

Fig 3.1.1 - New and Old Customers Age Distribution

■ Most new customers fall in the 40-49 and 50-59 age groups, which form the peak of the distribution.

■ The distribution is somewhat skewed to the right, with a tail extending towards older age groups. This indicates that while the majority of new customers are in their 40s and 50s, there's also a notable segment of older individuals.

# 3.2 New vs Old Customers Job Industry Distribution

Examined job industry distribution among new and old customers including Financial Services, Manufacturing, Health, Sales, Retail, Property, IT, Entertainment, Agriculture and Telecommunications. 

![image-2.png](attachment:image-2.png)
Fig 3.2.1 New and Old Customers Job Industry Distribution


■ Old Customers: Manufacturing leads with 3962 old customers, followed by Financial Services (3803), Health (3048), Retail (1746), Property (1286), IT (707), Entertainment (694), Agriculture (566), and Telecommunications (352). Agriculture and Telecommunications have the lowest counts.

■ New Customers: Financial Services has the highest number of new customers (202), followed by Manufacturing (199), Health (152), Retail (78), Property (64), Entertainment (36), IT (36), Agriculture (26), and Telecommunications (25). Again, Agriculture and Telecommunications have the lowest new customer counts.

# 3.2.1 Insights
The company has a loyal customer base in the Manufacturing, Financial Services, and Health sectors, which are typically large and stable industries. This suggests that the company offers products or services that are reliable and consistent in these markets. The focus should be on attracting and retaining new customers.

The company has also seen a decline in customer acquisition in all sectors, which could indicate that it has lost its competitive edge or faced increased competition or external challenges such as the pandemic or the economic downturn.

The company has a relatively low customer presence in the IT, Entertainment, Agriculture, and Telecommunications sectors, which are potentially innovative and dynamic industries. This could imply that the company has missed opportunities and trends in these markets, or has not adapted its offerings to changing customer needs and preferences in these sectors.

The company has a moderate customer presence in the Retail and Property sectors, which are also important and diverse industries. The company could leverage its existing customer relationships to retain and grow its customer base in these sectors, or explore new ways to differentiate its products or services from its competitors.

# 3.3 Wealth Segmentation by Age Category
Explored wealth segmentation across different age categories. Observed the distribution of customers based on their wealth and age.

![image.png](attachment:image.png)
  Fig 3.3.1 - Wealth Segmentation by Age Category

■ New Customers: The overall distribution of customers skews towards older age groups, with the highest concentration in the 50-59 age group (103 customers). The number of customers steadily decreases from 50-59 onwards, except for a slight uptick in the 80-90 age group (45 customers). 
The highest number of new customers belongs to the Mass Customer segment across all age groups, followed by Affluent Customer and High Net Worth.
The number of new customers decreases as the age group increases, except for the 45-54 age group, which has a slight increase compared to the 35-44 age group.
The proportion of Affluent Customer and High Net Worth customers among the new customers is higher in older age groups than in younger age groups.
The Mass segment accounts for approximately 43% of new customers, the High Net Worth segment for 22%, and the Affluent segment for 21%.

■ Old Customers: A significant number of old customers belong to the Mass Customer segment in all age groups, especially in the over-54 age group, where the count reaches around 3200 customers.
The number of old customers increases as the age group increases, except for the less than 25 age group, which has a very low number of customers.
The proportion of Affluent Customer and High Net Worth customers among the old customers is lower in older age groups than in younger age groups.

■ Comparison: The new customers have a more balanced distribution of wealth segments than the old customers, who are mostly Mass Customers.
The new customers have a more diverse range of age groups than the old customers, who are mostly older than 54 years old.
The new customers have a higher proportion of Affluent Customer and High Net Worth customers in older age groups than the old customers, who have a higher proportion of Mass Customers in older age groups.

■ The Mass segment is a key driver of customer acquisition. Resource allocation and target market strategies should prioritize this segment, while tailored strategies could attract more customers from the Affluent and High Net Worth segments.

![image-3.png](attachment:image-3.png)

   Fig 3.3.2 - Total new and old customers by wealth
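
A cross-tabulation is one way to produce the counts and proportions discussed in this section. The sketch below is a minimal illustration; the column names `age_group` and `wealth_segment` are assumptions about the cleaned customer table.

```python
import pandas as pd

def wealth_by_age(customers: pd.DataFrame):
    """Return (counts, row shares) of wealth segments within each age group."""
    counts = pd.crosstab(customers["age_group"], customers["wealth_segment"])
    # normalize="index" divides each row by its total, giving the share of
    # each wealth segment inside an age group.
    shares = pd.crosstab(
        customers["age_group"], customers["wealth_segment"], normalize="index"
    )
    return counts, shares
```

Running it once on the new-customer table and once on the old-customer table gives directly comparable share tables.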

# 3.4 Top Ten Bicycle Brands

The total revenue for bicycles is $10.68 billion. The top five brands collectively account for over 80% of the total revenue.

![image-2.png](attachment:image-2.png)
 Fig 3.4.1 - The Top Five Brands


![image-4.png](attachment:image-4.png)
 Fig 3.4.2 - Top Brands by Average Profit

■ WeareA2B is the leading brand with revenue of $2.68 billion. WeareA2B has a dominant market share of 36.5%.

■ Solex is the second-place brand with revenue of $2.36 billion. Solex has a market share of 14.7%.

■ Trek Bicycles is the third-place brand with revenue of $1.80 billion. Trek Bicycles has a market share of 11.2%.

■ Giant Bicycles is the fourth-place brand with revenue of $1.54 billion. Giant Bicycles has a market share of 9.5%.

■ OHM Cycles is the fifth-place brand with revenue of $1.45 billion. OHM Cycles has a market share of 8.9%.

■ Norco Bicycles is the sixth-place brand with revenue of $841 million. Norco Bicycles has a market share of 5.2%.

■ WeareA2B is the most profitable brand in the bicycle market, with an average profit of $833 per bicycle. WeareA2B's high profitability is likely due to its strong brand recognition and its ability to command premium prices for its bicycles.
The average profit margin for bicycle brands is 29%. This is a relatively high profit margin, which suggests that the bicycle industry is a profitable one.
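
Revenue, market share, and average profit per brand can be derived with a single grouped aggregation. The sketch below assumes that revenue is the sum of `list_price` per brand and that a `Profit` column already exists; both the column names and the revenue definition are assumptions rather than the method confirmed by this report.

```python
import pandas as pd

def brand_summary(transactions: pd.DataFrame) -> pd.DataFrame:
    """Revenue, average profit, and market share (%) per brand."""
    summary = transactions.groupby("brand").agg(
        revenue=("list_price", "sum"),   # assumed revenue definition
        avg_profit=("Profit", "mean"),   # average profit per unit sold
    )
    # Market share = brand revenue as a percentage of total revenue.
    summary["market_share_pct"] = 100 * summary["revenue"] / summary["revenue"].sum()
    return summary.sort_values("revenue", ascending=False)
```

Because the shares are computed against the total of all brands, they always sum to 100%, which makes the output a convenient consistency check for figures quoted in text.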

# 3.5 Profit Distribution
The mean profit is $105,588, with a large standard deviation of $88,844, indicating significant variability in profit across products. The median profit is around $97,000, lower than the mean, which indicates that a few high-profit products are pulling the mean up.

![image.png](attachment:image.png)
Fig 3.5.1 - Product Analysis

The 25th percentile profit is $34,777, and the 75th percentile profit is $155,686. This means that half of the products have profits between $34,777 and $155,686. The most profitable product has a profit of $587,380, while the least profitable product has a profit of $4,782.

The distribution of profit is skewed to the right, meaning there are more products with lower profits than there are products with higher profits.
![image-2.png](attachment:image-2.png)
Fig 3.5.2 - Profit Distribution


■  There is a weak positive correlation between price and profit. This means that products with higher prices tend to have higher profits, but there are also many exceptions to this rule.
For example, there are some products with high prices that have low profits, and there are some products with low prices that have high profits.
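
The summary statistics, skew, and price-profit correlation quoted in this section can be reproduced along the following lines. This is a sketch under assumptions: the per-product table is assumed to have `Profit` and `list_price` columns, and Pearson correlation is assumed since the report does not name the measure.

```python
import pandas as pd

def profit_profile(products: pd.DataFrame) -> dict:
    """Summarize the profit distribution and its relation to price."""
    profit = products["Profit"]
    stats = profit.describe()  # count, mean, std, min, 25%, 50%, 75%, max
    return {
        "mean": stats["mean"],
        "std": stats["std"],
        "median": stats["50%"],
        "iqr_bounds": (stats["25%"], stats["75%"]),  # middle half of products
        "skew": profit.skew(),  # positive value means a right-skewed distribution
        # Pearson correlation by default (assumed measure).
        "price_profit_corr": products["list_price"].corr(products["Profit"]),
    }
```

A mean above the median together with a positive skew is the pattern described above: a small number of high-profit products pulls the mean upward.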

# 3.6 Bike Purchases in the Last 3 Years by Gender

![image-2.png](attachment:image-2.png)

Fig 3.6.1 - Bike Purchases in the Last 3 Years by Gender

![image.png](attachment:image.png)
Fig 3.6.2 - Bike Purchases in the Last 3 Years by Gender Visualization

The total number of bike-related purchases in the last 3 years is fairly close between genders, with 478,488 purchases by females and 468,943 by males. This suggests a relatively even distribution of bike buying behavior across genders.

While the numbers are very close, females make up a marginally higher percentage of bike purchasers, with 50.5% of the total compared to 49.5% for males.
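
The percentages above follow directly from the two totals. The sketch below shows the grouped sum and share calculation, then applies it to the reported totals as an arithmetic check. The column names `gender` and `past_3_years_bike_related_purchases` are assumptions about the customer table.

```python
import pandas as pd

def purchases_by_gender(customers: pd.DataFrame) -> pd.DataFrame:
    """Total bike-related purchases and percentage share per gender."""
    totals = customers.groupby("gender")["past_3_years_bike_related_purchases"].sum()
    result = totals.to_frame("purchases")
    result["share_pct"] = (100 * result["purchases"] / result["purchases"].sum()).round(1)
    return result

# Arithmetic check using the totals reported in this section.
reported = pd.DataFrame({
    "gender": ["Female", "Male"],
    "past_3_years_bike_related_purchases": [478_488, 468_943],
})
print(purchases_by_gender(reported))  # shares: Female 50.5, Male 49.5
```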

# 3.7 Cars Owned by States
The number of customers in New South Wales (NSW) is the highest, followed by Victoria (VIC) and Queensland (QLD).

![image-2.png](attachment:image-2.png)
Fig 3.7.1 - Cars Owned by States

In NSW, there are more customers who own cars than those who do not. In QLD and VIC, the number of customers who own cars is almost the same as those who do not.

![image.png](attachment:image.png)
Fig 3.7.2 - Cars Owned by States Visualization
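
A table like the one behind these figures can be built with a cross-tabulation of state against car ownership. The column names `state` and `owns_car` in this sketch are assumptions.

```python
import pandas as pd

def car_ownership_by_state(customers: pd.DataFrame) -> pd.DataFrame:
    """Count customers per state, split by car ownership, with totals."""
    # margins=True appends an 'All' row and column holding the totals.
    return pd.crosstab(customers["state"], customers["owns_car"], margins=True)
```

The `All` column gives the customer count per state, and the remaining columns give the owner versus non-owner split within each state.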


# 3.8 Understanding Customer Behavior - with RFM Analysis and Customer Segmentation

An RFM analysis, leveraging Recency, Frequency, and Monetary value, provided a behavior-based approach to segmenting customers based on their past purchase history. 

This analysis revealed 11 distinct customer segments: Top Tier Customer, Engaged Customer, Emerging Loyal Customer, Recent Customer, Prospective Customer, Delayed Adopter, Drifting Customer, High Risk Customer, Slipping Customer, Inactive Customer, and Churned Customer.

![image-2.png](attachment:image-2.png)
Fig 3.8.1 - Customer Segmentation

■ High-Value Segments:
Top Tier Customers: These customers exhibit the most recent purchases, frequent interactions, and highest spending, solidifying their top position.
Engaged Customers: Demonstrating consistent engagement with a strong purchase history, these customers are highly valuable.

■ Promising Segments:
Emerging Loyal Customers: Their increasing purchase frequency suggests they're on the path to becoming loyal customers.
Recent Customers: Newcomers with recent purchases show potential for long-term engagement.
Prospective Customers: While lacking recent purchases, they exhibit characteristics indicating future potential and warrant strategic outreach to convert them.

■ At-Risk Segments:
Delayed Adopters: While possessing potential, these customers haven't made recent purchases, requiring re-engagement efforts.
Drifting Customers: Decreasing purchase frequency or value indicates potential churn, necessitating attention.
High Risk Customers: Despite high spending, their infrequent transactions pose a potential churn risk, demanding proactive measures.

■ Other Segments:
Slipping Customers: Very recent purchases offer a last chance to engage before potential churn.
Inactive Customers: These customers haven't purchased recently, requiring reactivation efforts.
Churned Customers: Formerly active customers who haven't purchased in a long time and might be lost. Understanding their reasons for churn can inform future retention strategies, and targeted campaigns could win them back.

This segmentation allows for targeted marketing, customized retention strategies, and data-driven business decisions tailored to each customer group's distinct characteristics and behaviors. 
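
The RFM scoring step behind such a segmentation can be sketched as follows. Several choices here are assumptions and not taken from this report: the column names `customer_id` and `transaction_date`, the use of `list_price` as the monetary value, quartile scoring from 1 to 4, and the combined score `100*R + 10*F + M`. The score bands that map to the 11 segment names are not specified in the text, so no mapping is shown.

```python
import pandas as pd

def rfm_table(transactions: pd.DataFrame, snapshot: str,
              value_col: str = "list_price", bins: int = 4) -> pd.DataFrame:
    """Per-customer recency, frequency, monetary value and quantile scores."""
    tx = transactions.copy()
    tx["transaction_date"] = pd.to_datetime(tx["transaction_date"])
    snap = pd.Timestamp(snapshot)
    rfm = tx.groupby("customer_id").agg(
        recency=("transaction_date", lambda d: (snap - d.max()).days),  # days since last purchase
        frequency=("transaction_date", "count"),                        # number of transactions
        monetary=(value_col, "sum"),                                    # total value
    )
    up = list(range(1, bins + 1))    # higher value -> higher score
    down = list(range(bins, 0, -1))  # lower recency (more recent) -> higher score
    # Ranking with method="first" breaks ties so the quantile edges stay unique.
    rfm["r_score"] = pd.qcut(rfm["recency"].rank(method="first"), bins, labels=down).astype(int)
    rfm["f_score"] = pd.qcut(rfm["frequency"].rank(method="first"), bins, labels=up).astype(int)
    rfm["m_score"] = pd.qcut(rfm["monetary"].rank(method="first"), bins, labels=up).astype(int)
    # Combined score: one common convention, assumed here.
    rfm["rfm_score"] = 100 * rfm["r_score"] + 10 * rfm["f_score"] + rfm["m_score"]
    return rfm
```

With this convention a customer scoring 4 on all three dimensions gets 444 and one scoring 1 on all three gets 111, so segment labels can be assigned by cutting `rfm_score` into ordered bands.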

# 3.8.1 RFM ANALYSIS - Frequency VS Recency

![image.png](attachment:image.png)

Fig 3.8.1.1 - RFM ANALYSIS - Frequency VS Recency

The visualization segments customers into 11 categories based on their RFM score, ranging from "Top Tier Customer" (highest score) to "Churned Customer" (lowest score).

■ The size of the data points represents the RFM score, with larger points indicating higher scores and potentially more valuable customers.

■ The color gradient from blue to green to red represents the monetary value, with red indicating higher spending customers.

Insights:
■ Top right quadrant: This area concentrates "Top Tier" and "Engaged" customers, characterized by high recency, frequency, and monetary value. These are the most valuable customers and should be prioritized for retention and upselling efforts.

■ Bottom right quadrant: This area shows "Emerging Loyal" and "Recent" customers, who have high monetary value but lower recency or frequency. They might be new customers or at risk of churn, so it's important to engage them with targeted marketing campaigns.

■ Top left quadrant: This area contains "Prospective" and "Delayed Adopter" customers, who have shown some interest (recent purchases) but haven't spent much yet. They might be potential high-value customers, so consider offering them incentives or personalized recommendations.

■ Bottom left quadrant: This area represents "Drifting," "High Risk," "Slipping," and "Inactive" customers, indicating low engagement and potential churn. Targeted win-back campaigns or loyalty programs can re-engage them.

# 3.8.2 RFM ANALYSIS - Monetary VS Recency


![image.png](attachment:image.png)
Fig 3.8.2.1 - RFM ANALYSIS - Monetary VS Recency

■ Customers with higher RFM scores tend to have higher monetary value and more recent purchases. This suggests that these customers are more engaged with the business and are more likely to be profitable.

■ There is a positive correlation between recency and monetary value. This means that customers who have made purchases more recently tend to spend more money. This could be because they are still in the early stages of their customer lifecycle and are more likely to be trying out new products or services.

■ There are a few data points that are outliers. These outliers represent customers who have made very large purchases or who have not made a purchase in a long time.

# 3.9 TRENDS

# 3.9.1 MONTHLY TRENDS

![image.png](attachment:image.png)

Fig 3.9.1 - Monthly Trends

The highest monthly revenue of approximately 960,000 was recorded in October (2017-10), while the lowest, approximately 840,000, was recorded in March (2017-03).


# 3.9.2 QUARTERLY TRENDS

![image.png](attachment:image.png)
Fig 3.9.2 - Quarterly Trends

■ Stability in Q1 and Q2: The revenue remained relatively stable during the first half of the year (Q1 and Q2).

■ Sharp Increase from Q2 to Q4: From Q2 onwards, there is a noticeable upward trend in revenue.

■ Peak in Q4: The highest quarterly revenue, close to 2.74 million, was recorded in Q4 (the quarter beginning 2017-10).

■ The quarterly trends reinforce the observation of substantial revenue growth during the latter half of the year.

The upward trend in both monthly and quarterly revenue indicates positive business performance.

■ Resource Allocation: Allocate resources strategically during peak revenue periods (e.g., Q4).

■ Pricing Strategies: Consider adjusting pricing strategies based on revenue fluctuations.

■ Business Planning: Use these insights for business planning and decision-making.
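
Monthly and quarterly revenue series like those plotted in this section can be obtained by grouping transactions on calendar periods. In the sketch below, the column names `transaction_date` and `list_price`, and the use of `list_price` as revenue, are assumptions.

```python
import pandas as pd

def revenue_trends(transactions: pd.DataFrame):
    """Return (monthly, quarterly) revenue totals indexed by calendar period."""
    dates = pd.to_datetime(transactions["transaction_date"])
    # to_period buckets each date into its calendar month or quarter.
    monthly = transactions.groupby(dates.dt.to_period("M"))["list_price"].sum()
    quarterly = transactions.groupby(dates.dt.to_period("Q"))["list_price"].sum()
    return monthly, quarterly
```

The peak and trough periods are then available as `monthly.idxmax()` and `monthly.idxmin()`, and likewise for the quarterly series.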

# 10.0  Conclusion

By leveraging RFM analysis, demographic insights, and profitability assessments, the recommendations outlined above aim to guide the automobile company towards optimized marketing, enhanced customer relationships, and strategic decision-making. Continuous adaptation to market dynamics and a data-driven approach will be key to achieving sustained growth and profitability.