Market data Analysis using K-means Clustering

An unsupervised machine learning model built with Python

What is it all about?

We take a company's dataset of all purchases made, together with customer details.
From this data we create clusters to understand it better.
Here we form 5 clusters, each shown in a different colour:
1. Red
2. Green
3. Blue
4. Black
5. Violet
All the coordinates are labelled accordingly. Here I use 3 labels:
1. Customer groups
2. Spending scores (1-100)
3. Annual income
This gives a clear picture of the customer-sales data.
It can then be analysed to make better decisions that increase profit and meet user needs.

Packages used

NumPy
pandas
seaborn
Matplotlib
scikit-learn (sklearn)

The Process

The dataset I used for this project is from Kaggle.

A little peek into the dataset
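A first look at the data could be sketched as follows. The column names and the sample rows are assumptions made for illustration; in the project the data would come from Mall_Customers.csv rather than an in-memory string.

```python
import io

import pandas as pd

# Stand-in for Mall_Customers.csv: made-up rows with an ASSUMED column layout.
sample_csv = io.StringIO(
    "CustomerID,Gender,Age,Annual Income (k$),Spending Score (1-100)\n"
    "1,Male,19,15,39\n"
    "2,Male,21,15,81\n"
    "3,Female,20,16,6\n"
    "4,Female,23,16,77\n"
    "5,Female,31,17,40\n"
)

# In the project this would be: pd.read_csv("Mall_Customers.csv")
customer_data = pd.read_csv(sample_csv)

print(customer_data.head())   # first rows of the table
print(customer_data.shape)    # (number of rows, number of columns)
customer_data.info()          # column types and non-null counts
```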

Check for missing data in the CSV file. Missing values would feed false data into the model and reduce its accuracy.
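A minimal sketch of such a check, using a small made-up DataFrame with deliberately missing values. Dropping incomplete rows is only one possible way to handle gaps and is shown here as an assumption, not as what the notebook does.

```python
import numpy as np
import pandas as pd

# Made-up data with two deliberately missing values (np.nan).
customer_data = pd.DataFrame({
    "Annual Income (k$)": [15, 16, np.nan, 18],
    "Spending Score (1-100)": [39, 81, 6, np.nan],
})

# Count missing values in each column.
missing_per_column = customer_data.isnull().sum()
print(missing_per_column)

# One possible fix (an assumption): drop every row that has a missing value.
cleaned = customer_data.dropna()
print(len(cleaned), "complete rows remain")
```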

Slicing the columns used for clustering by position

x=customer_data.iloc[:,[3,4]].values
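To show what this slice returns, here is a small sketch on a made-up DataFrame. It assumes that positions 3 and 4 hold annual income and spending score; the column names are illustrative.

```python
import pandas as pd

# Made-up rows; the column order is an ASSUMPTION about the CSV layout.
customer_data = pd.DataFrame({
    "CustomerID": [1, 2, 3],
    "Gender": ["Male", "Female", "Female"],
    "Age": [19, 21, 20],
    "Annual Income (k$)": [15, 15, 16],
    "Spending Score (1-100)": [39, 81, 6],
})

# All rows, columns at positions 3 and 4, converted to a NumPy array.
x = customer_data.iloc[:, [3, 4]].values

print(x.shape)  # one row per customer, two feature columns
print(x)
```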

Find the WCSS value for each candidate number of clusters and store it in a list

WCSS (Within-Cluster Sum of Squares): the sum of squared distances between each point and the centroid of its cluster
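The WCSS loop might look like the sketch below. The synthetic points, the range of 1 to 10 clusters, and n_init=10 are assumptions for illustration; scikit-learn exposes the WCSS of a fitted model as the inertia_ attribute.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D points around five hypothetical (income, score) centres.
rng = np.random.default_rng(0)
centres = np.array([[25, 20], [25, 80], [55, 50], [88, 17], [88, 82]])
x = np.vstack([c + rng.normal(scale=4.0, size=(40, 2)) for c in centres])

wcss = []
for k in range(1, 11):
    kmeans = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    kmeans.fit(x)
    wcss.append(kmeans.inertia_)  # WCSS of the fitted model

# Cross-check the definition on the last model: squared distance of every
# point to the centroid of the cluster it was assigned to, summed up.
assigned_centroids = kmeans.cluster_centers_[kmeans.labels_]
manual_wcss = float(((x - assigned_centroids) ** 2).sum())

for k, value in enumerate(wcss, start=1):
    print(k, round(value, 1))
```

Plotting wcss against the number of clusters gives the elbow graph used to pick the cluster count.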

Plotting WCSS against the number of clusters gives the elbow graph.

Look for the sharp bends in the curve: each one marks a significant drop in WCSS.

Train the KMeans model:

kmeans = KMeans(n_clusters=5,init='k-means++',random_state=0)

Predicting with the trained model returns a list of numbers (one cluster label per customer), which is hard to read on its own.
So we draw a scatter plot of all the clusters and their centroids.
Each cluster gets its own colour so the groups are easy to tell apart by their x,y coordinates.
The graph is then plotted with Matplotlib.
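The training, prediction, and plotting steps above can be sketched as follows. The data is synthetic, and n_init=10, the cyan centroid marker, the Agg backend, and the output filename customer_groups.png are assumptions made so the sketch runs on its own.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D points around five hypothetical (income, score) centres.
rng = np.random.default_rng(0)
centres = np.array([[25, 20], [25, 80], [55, 50], [88, 17], [88, 82]])
x = np.vstack([c + rng.normal(scale=4.0, size=(40, 2)) for c in centres])

kmeans = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=0)
y = kmeans.fit_predict(x)  # one integer label (0 to 4) per point

colours = ["red", "green", "blue", "black", "violet"]
fig, ax = plt.subplots(figsize=(8, 8))
for cluster_id, colour in enumerate(colours):
    members = x[y == cluster_id]  # points assigned to this cluster
    ax.scatter(members[:, 0], members[:, 1], s=50, c=colour,
               label=f"Cluster {cluster_id + 1}")

# Mark the centroids on top of the clusters.
ax.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
           s=100, c="cyan", marker="X", label="Centroids")
ax.set_title("Customer groups")
ax.set_xlabel("Annual income")
ax.set_ylabel("Spending score (1-100)")
ax.legend()
fig.savefig("customer_groups.png")
```

Which colour ends up on which group depends on the label order KMeans assigns, so the mapping should be read off the plot rather than assumed.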

Conclusion

Visualising the data lets us interpret the clusters, for example:
- Blue = low income and low spending
- Purple = low income and high spending
- Green = high income and low spending
- Black = high income and high spending
The market can attract the Blue group by offering discounts.
The market can target the Green group, who have money but are not buying much.
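The same reading can be backed by numbers. As a rough sketch on synthetic data, the average income and spending score of each cluster can be tabulated; the column names annual_income and spending_score are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Made-up 2-D points around five hypothetical (income, score) centres.
rng = np.random.default_rng(0)
centres = np.array([[25, 20], [25, 80], [55, 50], [88, 17], [88, 82]])
x = np.vstack([c + rng.normal(scale=4.0, size=(40, 2)) for c in centres])

labels = KMeans(n_clusters=5, init="k-means++", n_init=10,
                random_state=0).fit_predict(x)

# Mean income and spending score per cluster, rounded for readability.
profile = (
    pd.DataFrame(x, columns=["annual_income", "spending_score"])
    .assign(cluster=labels)
    .groupby("cluster")
    .mean()
    .round(1)
)
print(profile)
```

A cluster with a high mean income and a low mean score corresponds to the "high income, low spending" group described above.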

Applications

Netflix making suggestions to groups of people who watch a particular genre more often
Google Ads personalisation


vilasrhegde/Marketdata
