Customer-Segmentation

About

Customer segmentation is the process of dividing customers into groups based on common characteristics so companies can market to each group effectively and appropriately.

Context and Problem Statement

You own a supermarket mall and, through membership cards, you have some basic data about your customers: Customer ID, age, gender, annual income and spending score. Spending Score is a value you assign to each customer based on parameters you define, such as customer behavior and purchasing data. You want to segment the customers into clusters so that you can target them efficiently and generate more sales.

Dataset Used

The dataset used for this project is the Mall Customers Dataset. It has 5 features. The features and a short description of each are listed in the table below:

Column Name	Description
Customer ID	A unique identifier assigned to each customer
Gender	Gender of the customer (Male or Female)
Age	Age of the customer
Annual Income	Annual income of the customer, in thousands
Spending Score	A score assigned to each customer based on their buying behaviour and net spend

Algorithm Used

K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms. Unsupervised algorithms make inferences from datasets using only input vectors, without referring to known (labelled) outcomes. The objective of K-means is simple: group similar data points together and discover underlying patterns. To achieve this, K-means looks for a fixed number (k) of clusters in a dataset, where a cluster is a collection of data points aggregated together because of certain similarities. You define a target number k, which is the number of centroids you want in the dataset. A centroid is the imaginary or real location representing the center of a cluster. Each data point is allocated to exactly one cluster so as to reduce the within-cluster sum of squares. In other words, the K-means algorithm identifies k centroids and then allocates every data point to the nearest one, while keeping the within-cluster sum of squares as small as possible. The ‘means’ in K-means refers to averaging the data, that is, finding the centroid.
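
The description above can be illustrated with a minimal sketch using base R's `kmeans()` function from the `stats` package. It is not taken from the project's code. The data frame `customers`, its column names, the synthetic values, the choice of `k = 3` and the `nstart = 25` setting are all assumptions made for illustration; the real Mall Customers data would be substituted in practice.

```r
# Hypothetical stand-in data (NOT the real Mall Customers dataset):
# three synthetic groups described by two numeric features.
set.seed(42)
customers <- data.frame(
  annual_income  = c(rnorm(50, mean = 25, sd = 5),
                     rnorm(50, mean = 60, sd = 5),
                     rnorm(50, mean = 95, sd = 5)),
  spending_score = c(rnorm(50, mean = 80, sd = 6),
                     rnorm(50, mean = 50, sd = 6),
                     rnorm(50, mean = 20, sd = 6))
)

# Assumed number of clusters for this sketch.
k <- 3

# Run K-means from 25 random starts and keep the best solution,
# i.e. the one with the smallest total within-cluster sum of squares.
fit <- kmeans(customers, centers = k, nstart = 25)

# Centroids: the mean of each feature within each cluster.
print(fit$centers)

# Number of customers allocated to each cluster.
print(table(fit$cluster))

# Squared Euclidean distance from every customer to every centroid
# (one row per customer, one column per centroid).
x <- as.matrix(customers)
sq_dist <- sapply(seq_len(k), function(j) {
  rowSums(sweep(x, 2, fit$centers[j, ])^2)
})

# Index of the nearest centroid for each customer.
nearest <- apply(sq_dist, 1, which.min)

# Check that each customer sits in the cluster of its nearest centroid.
print(all(nearest == fit$cluster))

# Within-cluster sum of squares, recomputed by hand from the distances.
wss_by_hand <- sum(sq_dist[cbind(seq_len(nrow(x)), fit$cluster)])
print(c(by_hand = wss_by_hand, from_kmeans = fit$tot.withinss))
```

The hand-computed distances are included only to make the two ideas in the text concrete: every point belongs to the cluster of its nearest centroid, and the quantity being minimised is the total within-cluster sum of squares reported in `fit$tot.withinss`.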

Future Scope

The future scope of this project is to build a user-friendly web interface using the Shiny package offered by RStudio. With Shiny, we can create polished interactive interfaces for our R project.

BhakeSart/Customer-Segmentation
