## Domain

This is a canonical dataset taken from the UCI Machine Learning Repository. 

It records the annual spending of a Portuguese grocery wholesaler's customers across several product categories. The values are in "monetary units".

## Problem

While this dataset includes labels, it is typically used for clustering or other unsupervised learning work. Here we will keep the labels for comparison, but we will explore the dataset through EDA and unsupervised learning.

Two columns will serve as labels for this set: `Region` and `Channel`.
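As a minimal sketch of how the kept labels could be used for comparison, a cross-tabulation of cluster assignments against a label column shows how closely the two agree. The vectors below are made-up toy values, not taken from the dataset, and `cluster_id` is a hypothetical name.

```r
# Toy stand-ins: hypothetical label and cluster vectors, not real data.
channel    <- factor(c(1, 1, 2, 2, 1, 2, 1, 2))
cluster_id <- c(1, 1, 2, 2, 1, 2, 2, 2)

# Rows are the known labels, columns are the cluster assignments.
agreement <- table(Channel = channel, Cluster = cluster_id)
agreement

# Share of each channel that falls in each cluster (rows sum to 1).
prop.table(agreement, margin = 1)
```

A table dominated by one large cell per row would suggest the clusters track that label; a flat table would suggest they capture something else.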

## Solution

We will be working toward a customer segmentation generated by a cluster analysis.
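One possible shape for such a cluster analysis is sketched below. It assumes k-means as the algorithm, a log transform followed by standardization, and `k = 2`; none of these choices come from the text above. The data frame `toy` is synthetic and only stands in for the spending columns.

```r
set.seed(42)  # k-means starts from random centers, so fix the seed

# Hypothetical stand-in for two spending columns: two synthetic groups.
toy <- data.frame(
  Fresh   = c(rlnorm(20, meanlog = 9, sdlog = 0.3),
              rlnorm(20, meanlog = 7, sdlog = 0.3)),
  Grocery = c(rlnorm(20, meanlog = 7, sdlog = 0.3),
              rlnorm(20, meanlog = 9, sdlog = 0.3))
)

# Log-transform and standardize so no single column dominates the distances.
toy_scaled <- scale(log(toy))

# Assumed choice: 2 clusters, 25 random restarts.
fit <- kmeans(toy_scaled, centers = 2, nstart = 25)

# Segment sizes and the cluster centers (in scaled units).
table(fit$cluster)
fit$centers
```

Each value in `fit$cluster` is the segment assigned to the corresponding row, which is what a customer segmentation ultimately delivers.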

## Data

The following analysis shows:

- There are 440 rows and 8 variable columns in the dataset. Two of these columns, `Region` and `Channel`, are the label columns.

- The remaining six columns hold integer values:
   - `Fresh`
   - `Milk`
   - `Grocery`
   - `Frozen`
   - `Detergents_Paper`
   - `Delicatessen`


In [None]:
# Load the wholesale customers data; the first row holds the column names
customers <- read.table('Wholesale_customers_data.csv', sep = ",", header = TRUE)

In [None]:
head(customers)

In [None]:
dim(customers); str(customers)

In [None]:
# Channel and Region are categorical codes, so store them as factors
customers$Channel <- factor(customers$Channel)
customers$Region <- factor(customers$Region)

In [None]:
str(customers)

In [None]:
cust_sum <- summary(customers)
cust_sum

In [None]:
# Keep only the numeric spending columns (the factor columns are dropped)
customer_features <- Filter(is.numeric, customers)

In [None]:
library(repr)
options(repr.plot.width=20, repr.plot.height=6)

In [None]:
# One row per feature, with its mean, median, and standard deviation
sum_vals <- data.frame(feature = colnames(customer_features))
sum_vals['mean_'] <- sapply(customer_features, mean)
sum_vals['median_'] <- sapply(customer_features, median)
sum_vals['sd_'] <- sapply(customer_features, sd)
sum_vals

In [None]:
library(reshape2)

In [None]:
library(ggplot2)

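# Reshape to long format (one row per feature/statistic pair), then plot
# the three statistics side by side for each feature
ggplot(melt(sum_vals, id.vars = "feature"), aes(x = feature, y = value, fill = variable)) + 
    geom_bar(stat = "identity", width = 0.5, position = "dodge")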

## Benchmark

As this is an EDA and unsupervised learning task, we will not define an explicit benchmark.

## Metrics

We will not define a metric for this project at this time.