The technique of separating consumers into distinct categories depending on specific characteristics is known as customer segmentation

Udaybhan19/Customer_Segmentation_with_k-mean_algorithm-

customer-segmentation-python:

This project applies customer segmentation to customer data from the company Ulabox and derives conclusions and data-driven ideas from the results.

About Ulabox:

Ulabox is the most successful pure-player online grocer in Spain. It brings in more than €1 million in monthly revenue and reports customer satisfaction above 95%.

Dataset

File: [customer_segmentation_2.csv] GitHub: [https://github.com/ulabox/datasets/tree/master/data]

The customer_segmentation_2.csv dataset contains an anonymized subset of 30k orders from the beginning of 2017. All kinds of customers (around 10k) are represented: from urban and rural areas, and from first-timers to loyal customers. The customer_segmentation_final_datasets.csv file is the final dataset produced by modifying customer_segmentation_2.csv.

Data dictionary

The dataset contains 30k samples with the following features:

  • customer: Anonymized customer ID.
  • order: Order ID, starting from zero.
  • total_items: The number of items purchased in the order.
  • discount%: The total discount received, as a percentage. For instance, if the customer saves €20 on a €100 order (that is, they paid €80), this field contains 20.
  • weekday: Day of the week when the order was paid. 1=Monday, 7=Sunday.
  • hour: The hour of the day the purchase was made, from 00 to 23.
  • Categories' partials: Percentage of money spent in each of the website's 8 main categories:
    • Food%: Non-perishable food, for example: rice, cooking oil, snacks, cookies, sauces, canned food.
    • Fresh%: Fresh and frozen food, for example: fresh tuna, fruits, frozen pizza, salads, meat.
    • Drinks%: All kinds of beverages, for example: water, juices, wine, alcoholic drinks, milk, soy drinks.
    • Home%: Products for the home, from toilet paper to small appliances.
    • Beauty%: Personal hygiene and makeup items, for example: shampoo, shaving foam, cosmetics.
    • Health%: Health products that can be sold in Spain without a medical prescription, for example: diet pills, condoms, toothpaste.
    • Baby%: Useful articles if you have a baby, for example: diapers, baby food, baby care.
    • Pets%: Items for dogs, cats and other pets, for example: food, toys, litter.
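
The value ranges in this data dictionary can be checked before clustering. The sketch below is an illustration only, not code from this repository: the function names and the `tolerance` parameter are hypothetical, and it assumes that the eight category partials of an order sum to roughly 100.

```python
# Column names taken from the data dictionary above.
CATEGORY_COLUMNS = ["Food%", "Fresh%", "Drinks%", "Home%",
                    "Beauty%", "Health%", "Baby%", "Pets%"]


def discount_percent(list_price, amount_paid):
    """Percent saved relative to the undiscounted order total."""
    if list_price <= 0:
        raise ValueError("list_price must be positive")
    return 100.0 * (list_price - amount_paid) / list_price


def validate_order(row, tolerance=1.0):
    """Return a list of problems found in one order row (empty if none).

    `row` maps column names to values, for example a row produced by
    csv.DictReader. `tolerance` is a hypothetical allowance for rounding
    in the category partials.
    """
    problems = []
    if not 1 <= int(row["weekday"]) <= 7:
        problems.append("weekday outside 1..7")
    if not 0 <= int(row["hour"]) <= 23:
        problems.append("hour outside 0..23")
    # Assumption: the eight partials describe the whole order, so they
    # should add up to about 100.
    category_sum = sum(float(row[name]) for name in CATEGORY_COLUMNS)
    if abs(category_sum - 100.0) > tolerance:
        problems.append(f"category partials sum to {category_sum:.2f}, not 100")
    return problems


# The example from the data dictionary: a 100 euro order paid at 80 euros.
print(discount_percent(100, 80))  # 20.0

# An invented order row, with values as strings the way a CSV reader yields them.
sample_order = {
    "customer": "0", "order": "0", "total_items": "12", "discount%": "20",
    "weekday": "3", "hour": "18",
    "Food%": "40", "Fresh%": "30", "Drinks%": "20", "Home%": "10",
    "Beauty%": "0", "Health%": "0", "Baby%": "0", "Pets%": "0",
}
print(validate_order(sample_order))
```

Rows that return a non-empty list can be inspected or dropped before the features are passed to a clustering algorithm.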

Customer segmentation

In customer segmentation, we group similar customers into the same cluster and analyse each cluster. This can reveal information such as:

  1. Who the company's most valuable customers are.
  2. What kinds of customers the company has.
  3. Which groups to address with targeted marketing and other marketing strategies.
  4. Sometimes, a potential white space in the marketplace that no company has yet occupied. There is room to get creative here.

Clustering

Clustering is the process of putting similar data points into the same cluster. Many algorithms can do this, for example agglomerative hierarchical clustering, k-means clustering, and Gaussian mixture models. This project segments the data using k-means clustering.
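
As a rough illustration of how k-means groups data points, here is a minimal pure-Python sketch of Lloyd's algorithm, the standard k-means iteration. It is not the code used in this project; a library implementation such as scikit-learn's `KMeans` would be the more usual choice for the real dataset. The function names, the `seed` parameter, and the four toy orders (Food%, Fresh%, Drinks% shares) are invented for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` into `k` groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise the centroids with k distinct points drawn at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centroids are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Invented toy orders: (Food%, Fresh%, Drinks%) shares of spending.
toy_orders = [(60, 30, 10), (55, 35, 10), (10, 20, 70), (15, 15, 70)]
centroids, labels = kmeans(toy_orders, k=2)
print(labels)     # the two food-heavy orders share one label, the two drink-heavy orders the other
print(centroids)
```

The cluster numbers themselves are arbitrary; only which points share a label is meaningful. Because k-means relies on distances, features on very different scales (for example total_items versus the percentage columns) are usually rescaled first.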

Thank you for your time :)
