haataa/Identify_Customer_Segments

Table of Contents

  1. Project Motivation
  2. Installation
  3. File Descriptions
  4. Results
  5. Licensing, Authors, and Acknowledgements

Project Motivation

This project identifies segments of the population that form the core customer base for a mail-order sales company in Germany. These segments can then be used to direct marketing campaigns towards the audiences with the highest expected rate of return. The data was provided by Bertelsmann Arvato Analytics.

Installation

The libraries needed by this project are provided by the Anaconda distribution of Python. Packages include numpy, pandas, matplotlib, seaborn, ast (part of the Python standard library), sklearn, and scipy. The code should run with no issues using Python versions 3.*.
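As a quick sanity check before opening the notebook, the snippet below sketches one way to confirm that the packages listed above can be imported in the current environment. The helper name `check_packages` is hypothetical and not part of this repository; the check uses only the standard library.

```python
import importlib.util

# Import names of the packages listed in the Installation section.
# Note: ast ships with Python itself, and scikit-learn is imported as "sklearn".
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn", "ast", "sklearn", "scipy"]


def check_packages(names=REQUIRED):
    """Return a dict mapping each import name to True if it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any package reported as missing can usually be installed through Anaconda or pip before running the notebook.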

File Descriptions

  1. The Data folder contains the data needed in this project:
  • Udacity_AZDIAS_Subset.csv: Demographics data for the general population of Germany; 891211 persons (rows) x 85 features (columns).
  • Udacity_CUSTOMERS_Subset.csv: Demographics data for customers of a mail-order company; 191652 persons (rows) x 85 features (columns).
  • Data_Dictionary.md: Detailed information file about the features in the provided datasets.
  • AZDIAS_Feature_Summary.csv: Summary of feature attributes for demographics data; 85 features (rows) x 4 columns.
  2. The Identify_Customer_Segments notebook records the details of the project.
  3. The img folder contains the result screenshots.
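The files described above can be loaded with pandas roughly as follows. This is a minimal sketch, not the notebook's own code: the helper names `load_demographics` and `report_shape`, the `Data/` path prefix, and the semicolon separator are assumptions, so adjust them to match the actual files.

```python
import pandas as pd


def load_demographics(path, sep=";"):
    """Read one demographics CSV into a DataFrame.

    The semicolon separator is an assumption; pass sep="," if the file
    turns out to be comma-delimited.
    """
    return pd.read_csv(path, sep=sep)


def report_shape(name, df):
    """Format the shape in the same style as the file descriptions."""
    rows, cols = df.shape
    return f"{name}: {rows} persons (rows) x {cols} features (columns)"


# Example usage (paths are assumed, so the calls are left commented out):
# azdias = load_demographics("Data/Udacity_AZDIAS_Subset.csv")
# customers = load_demographics("Data/Udacity_CUSTOMERS_Subset.csv")
# print(report_shape("AZDIAS", azdias))
# print(report_shape("CUSTOMERS", customers))
```

Comparing the printed shapes against the row and column counts listed above is a simple way to confirm the files were parsed with the right separator.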

Results

A blog post about the project can be found on Medium.

Customers are grouped into 10 clusters, a number chosen from the plot of SSE against the number of clusters (Screenshot 4).
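The SSE curve behind this choice can be sketched as below. This is an illustration under stated assumptions rather than the notebook's actual code: it assumes scikit-learn's `KMeans` is the clustering method, uses its `inertia_` attribute as the SSE, and runs on synthetic blobs instead of the Arvato data. The helper name `sse_per_k` is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans


def sse_per_k(X, k_values, seed=0):
    """Fit KMeans for each k and return {k: SSE}.

    SSE here is KMeans.inertia_: the sum of squared distances from each
    sample to its closest cluster center.
    """
    sse = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
        sse[k] = model.inertia_
    return sse


# Synthetic stand-in data: three well-separated 2-D blobs (not the real data).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

sse = sse_per_k(X, range(1, 7))
for k, value in sse.items():
    print(k, round(value, 1))
```

Plotting the values of `sse` against `k` gives the kind of curve shown in the screenshot; the number of clusters is typically read off where the curve stops dropping sharply.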

The cluster distribution differs markedly between the general population and the company's customers (Screenshot 1).

To show the difference for each cluster, I drew another graph (Screenshot 2).
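One way to quantify this comparison is to compute the share of each cluster in both groups and subtract them. The sketch below uses made-up labels purely for illustration; the function names are hypothetical and the notebook may compute this differently.

```python
from collections import Counter


def cluster_proportions(labels, n_clusters):
    """Return the fraction of samples assigned to each cluster 0..n_clusters-1."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(n_clusters)]


def proportion_differences(customer_labels, population_labels, n_clusters):
    """Customer share minus population share, per cluster.

    Positive values mark clusters over-represented among customers;
    negative values mark under-represented clusters.
    """
    cust = cluster_proportions(customer_labels, n_clusters)
    pop = cluster_proportions(population_labels, n_clusters)
    return [c - p for c, p in zip(cust, pop)]


# Made-up cluster labels for illustration only (not project results).
population = [0, 0, 1, 1, 2, 2, 2, 2]
customers = [0, 0, 0, 1]
diffs = proportion_differences(customers, population, n_clusters=3)
print(diffs)
```

In this toy example cluster 0 is over-represented among customers and cluster 2 is under-represented, which is the kind of signal a per-cluster difference graph makes visible.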

To visualize the clusters, I plotted the cluster distribution against two important features: GREEN_AVANTGARDE and FINANZ_MINIMALIST (Screenshot 3).

Licensing, Authors, and Acknowledgements

This project was completed as part of the Udacity Data Scientist Nanodegree.
