The aim of this project is to analyze the spending behavior of customer groups using various techniques.

asumandemireriden/mall_customer_analysis_Rprogramming

mall_customer_analysis_Rprogramming

Hypotheses

Hypothesis 1: Female customers have a higher spending score than male customers.

  • Aim 1: Group the data by gender and compare the spending scores of the two genders with a box plot.
  • Aim 2: Run a two-sample t-test to determine whether the difference in mean spending score between the genders is statistically significant.
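
The test in Aim 2 can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the project's own code. It assumes Welch's unequal-variance form of the two-sample t-test (the source does not say which variant was used), and the scores are made up for demonstration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Made-up spending scores, for illustration only (not the project's data).
female_scores = [55, 62, 48, 71, 39, 66]
male_scores = [50, 58, 45, 69, 41, 60]

t_stat, dof = welch_t(female_scores, male_scores)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A t statistic close to zero relative to the critical value for the computed degrees of freedom means the difference in means is not statistically significant.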

Hypothesis 2: There is a positive relationship between annual income and spending score.

  • Aim 1: Generate a scatter plot to visually examine the relationship between annual income and spending score.
  • Aim 2: Fit a linear regression to quantify the relationship between annual income and spending score.
  • Aim 3: Apply k-means clustering to find groups of customers with similar annual income and spending score.
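
The regression in Aim 2 could look roughly like the sketch below: an ordinary least squares fit of spending score on annual income, plus the Pearson correlation coefficient. The function name `fit_line` and the data are hypothetical, chosen only to show the computation.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Made-up values, for illustration only (not the project's data).
annual_income = [15, 30, 45, 60, 75, 90]
spending_score = [40, 75, 20, 55, 85, 15]

slope, intercept, r = fit_line(annual_income, spending_score)
print(f"slope = {slope:.3f}, intercept = {intercept:.2f}, r = {r:.3f}")
```

A slope and correlation near zero would indicate that income alone does not predict spending score in a linear way.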

Hypothesis 3: Middle-aged customers have higher spending scores than other age groups.

  • Aim 1: Categorize the age data into groups (Teenager, Young A-B, Middle Age A-B, Elder A-B, Above 70), visualize the spending score of each group as a box plot, and compute the mean spending score of each group.
  • Aim 2: Use correlation and linear regression to examine the relationship between age and spending score and to identify the age group with the highest spending score.
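
The grouping step in Aim 1 might be sketched like this. The source names the age groups but not their boundaries, so the cut points in `UPPER_BOUNDS` are purely hypothetical, as are the sample records.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import mean

# Hypothetical inclusive upper ages; the source does not give the real ranges.
UPPER_BOUNDS = [19, 29, 39, 49, 59, 64, 70]
LABELS = [
    "Teenager", "Young A", "Young B", "Middle Age A",
    "Middle Age B", "Elder A", "Elder B", "Above 70",
]


def age_group(age):
    """Map an age to its group label using the assumed boundaries."""
    return LABELS[bisect_left(UPPER_BOUNDS, age)]


def mean_score_by_group(records):
    """records: iterable of (age, spending_score) pairs."""
    scores = defaultdict(list)
    for age, score in records:
        scores[age_group(age)].append(score)
    return {label: mean(values) for label, values in scores.items()}


# Made-up records, for illustration only (not the project's data).
customers = [(18, 60), (25, 72), (27, 68), (45, 40), (52, 35), (71, 30)]
for label, avg in mean_score_by_group(customers).items():
    print(f"{label}: {avg}")
```

Comparing the per-group means side by side is the numeric counterpart of reading the box plot.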

Conclusion

Hypothesis 1

  • According to the box plot and the two-sample t-test, there is no significant difference between the spending scores of male and female customers.

Hypothesis 2

  • The analysis shows no positive relationship between annual income and spending score. The k-means clustering result shows that the clusters differ in both annual income and spending score.
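
For readers unfamiliar with k-means, the clustering step can be sketched as below. This is a bare-bones version of Lloyd's algorithm, not the project's implementation. The points, the number of clusters, and the starting centroids are all made up for illustration.

```python
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm; `centroids` is the initial guess (list of tuples)."""
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Made-up (annual income, spending score) pairs, for illustration only.
customers = [(15, 80), (18, 75), (20, 15), (22, 10), (85, 82), (90, 78)]
centers, labels = kmeans(customers, centroids=[(15, 80), (20, 15), (85, 82)])
print(centers)
print(labels)
```

Each resulting centroid summarizes one customer segment by its typical income and spending score, which is how clusters can differ even when the overall linear trend is flat.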

Hypothesis 3

  • The data do not support the claim that middle-aged customers have the highest spending score. The correlation matrix and the linear regression both indicate a negative relationship between age and spending score, and customers aged 20-40 have the highest spending scores.
