This project analyses data from a wholesale distributor and segments its customers by the annual amount each buyer spends on the different product categories.
The work is divided into three main parts:
- Data exploration and identification of outliers through Mahalanobis distance.
- Dimensionality reduction: Multidimensional Scaling (MDS) and Principal Component Analysis (PCA).
- Clustering: k-means and model-based clustering.
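
To make the three parts concrete, here is a minimal sketch of the pipeline in Python with NumPy. It is an illustration under stated assumptions, not the project's actual code. The data are synthetic (two hypothetical spending profiles over three made-up categories, plus one planted extreme customer). The outlier cutoff is assumed to be the 97.5% chi-square quantile for 3 degrees of freedom. The helper names `mahalanobis_sq`, `pca_scores` and `kmeans` are hypothetical. MDS and model-based clustering are left out for brevity, and running k-means on the PCA scores is a choice made for this sketch only.

```python
import numpy as np

def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row from the sample mean."""
    centered = X - X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    # Row-wise quadratic form: centered[i] @ cov_inv @ centered[i]
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)

def pca_scores(X, n_components=2):
    """Project the centred data onto its first principal components (via SVD)."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def kmeans(X, k, n_iter=100):
    """Lloyd's algorithm with a deterministic farthest-first initialisation."""
    centroids = [X[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen centroid
        nearest = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[nearest.argmax()])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids

# Synthetic stand-in for the customer data (hypothetical, not the real dataset)
rng = np.random.default_rng(42)
group_a = rng.normal(loc=[2.0, 8.0, 3.0], scale=0.5, size=(50, 3))
group_b = rng.normal(loc=[8.0, 2.0, 7.0], scale=0.5, size=(50, 3))
planted_outlier = np.array([[30.0, 30.0, -20.0]])
X = np.vstack([group_a, group_b, planted_outlier])
true_group = np.array([0] * 50 + [1] * 50 + [-1])

# 1) Outlier identification: flag rows whose squared distance exceeds the cutoff
CUTOFF = 9.348  # assumed: 97.5% chi-square quantile, 3 degrees of freedom
d2 = mahalanobis_sq(X)
keep = d2 <= CUTOFF
X_clean = X[keep]

# 2) Dimensionality reduction: first two principal component scores
scores = pca_scores(X_clean, n_components=2)

# 3) Clustering: k-means with k=2 on the reduced data
labels, centroids = kmeans(scores, k=2)
```

The farthest-first initialisation is used here only to keep the sketch deterministic; random restarts or k-means++ are common alternatives. Because the covariance estimate includes the outlier itself, a robust covariance estimator may be preferable on real data.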