## **Applying Hierarchical Clustering**

## Step 1: Import Required Libraries and Configure Settings

- Import the pandas, NumPy, and matplotlib.pyplot libraries
- Configure matplotlib settings


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

## Step 2: Load the Dataset

- Load the data set Mall_customers.csv
- Display the first few rows of the data set
- Extract the annual income and spending score columns and store them in a variable called df1


In [None]:
df = pd.read_csv('Mall_customers.csv')

In [None]:
df.head()

__Observations:__
- Here, we are using customer data from a mall.
- We have their gender, age, income, and spending score.

In [None]:
df.info()

__Observation:__
- Here, we can see that every column contains only non-null values, so there are no missing values.
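
The `df.info()` summary can be cross-checked with an explicit count of missing values per column. The sketch below runs on a small hypothetical DataFrame; its name `sample`, its column names, and its values are assumptions made for illustration and are not taken from Mall_customers.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; column names and values are assumptions.
sample = pd.DataFrame({
    "Age": [19, 21, np.nan, 23],
    "Annual Income (k$)": [15, 15, 16, np.nan],
    "Spending Score (1-100)": [39, 81, 6, 77],
})

# Count missing values per column; all zeros would mean no missing data.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Single boolean answer: does the frame contain any missing value at all?
has_missing = bool(sample.isna().any().any())
print(has_missing)
```

Running the same two calls on `df` should report zero missing values in every column if the `df.info()` reading above holds.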


- Create df1, a NumPy array holding the annual income and spending score columns (column positions 3 and 4)

In [None]:
df1 = df.iloc[:, 3:5].values

In [None]:
df1

__Observation:__
- Here, we can see the annual income and spending score data.
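
Selecting columns by position works only as long as the column order in the file does not change. As a hedged alternative, the sketch below selects the same two columns by name on a hypothetical stand-in DataFrame. The column names used here are assumptions; check `df.columns` for the actual names before applying this to the real data.

```python
import pandas as pd

# Hypothetical stand-in frame; the column names are assumed, not read from the file.
sample = pd.DataFrame({
    "CustomerID": [1, 2, 3],
    "Gender": ["Male", "Female", "Female"],
    "Age": [19, 21, 20],
    "Annual Income (k$)": [15, 15, 16],
    "Spending Score (1-100)": [39, 81, 6],
})

# Positional selection: columns at positions 3 and 4.
by_position = sample.iloc[:, 3:5].values

# Name-based selection: the same columns, independent of column order.
by_name = sample[["Annual Income (k$)", "Spending Score (1-100)"]].values

# Both approaches give the same array for this column layout.
print((by_position == by_name).all())
```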

## Step 3: Create a Dendrogram Using Scipy

- Import the scipy.cluster.hierarchy library
- Create a dendrogram of the data set


In [None]:
import scipy.cluster.hierarchy as shc

plt.figure(figsize=(10, 7))
plt.title("Customer Dendrogram (Ward linkage)")
dend = shc.dendrogram(shc.linkage(df1, method='ward'))

In [None]:
plt.figure(figsize=(10, 7))
plt.title("Customer Dendrogram (complete linkage)")
dend = shc.dendrogram(shc.linkage(df1, method='complete'))

In [None]:
plt.figure(figsize=(10, 7))
plt.title("Customer Dendrogram (single linkage)")
dend = shc.dendrogram(shc.linkage(df1, method='single'))

In [None]:
plt.figure(figsize=(10, 7))
plt.title("Customer Dendrogram (average linkage)")
dend = shc.dendrogram(shc.linkage(df1, method='average'))

__Observations:__
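- Here, we can see the dendrograms for the customer data, one for each linkage method (Ward, complete, single, and average).
- The height of each merge is the distance between the two clusters being joined, so the shape of the tree changes with the linkage method.
- Branches drawn in the same color belong to the same cluster: blue represents one cluster, red represents another, and all of the green branches together form a third.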
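
A dendrogram can also be turned into flat cluster labels by cutting the tree. The sketch below illustrates this with `scipy.cluster.hierarchy.fcluster` on synthetic data; the array `points`, its five blob centers, and the choice of five clusters are assumptions made for illustration and do not come from the mall data.

```python
import numpy as np
import scipy.cluster.hierarchy as shc

# Synthetic stand-in data: five well-separated blobs of 20 points each (assumed values).
rng = np.random.default_rng(0)
centers = np.array([[20, 20], [20, 80], [55, 50], [90, 20], [90, 80]])
points = np.vstack([c + rng.normal(scale=3.0, size=(20, 2)) for c in centers])

# Build the linkage matrix with Ward's method, as in the first dendrogram.
Z = shc.linkage(points, method="ward")

# Cut the tree so that at most five flat clusters remain; labels start at 1.
flat_labels = shc.fcluster(Z, t=5, criterion="maxclust")
print(np.unique(flat_labels))
```

Replacing `points` with `df1` would apply the same cut to the customer data; the number of clusters passed as `t` is a choice guided by reading the dendrogram.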


## Step 4: Perform Agglomerative Clustering to Identify More Clusters

- Import AgglomerativeClustering from sklearn.cluster
- Fit and predict the clusters using agglomerative clustering
- Display the predicted cluster labels


In [None]:
from sklearn.cluster import AgglomerativeClustering

# The `metric` parameter requires scikit-learn 1.2 or later; older versions call it `affinity`.
model = AgglomerativeClustering(n_clusters=5, metric='euclidean', linkage='ward')
labels_ = model.fit_predict(df1)

In [None]:
labels_

__Observation:__
- Here, we can see the predicted labels: each customer is assigned to one of five clusters, numbered 0 through 4.
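
A raw label array is hard to read at a glance, so it can help to count how many customers fall into each cluster. The sketch below does this with `np.unique` on a short hypothetical label array; `example_labels` and its values are assumptions standing in for the real output of `fit_predict`.

```python
import numpy as np

# Hypothetical label array standing in for the output of fit_predict.
example_labels = np.array([4, 3, 4, 3, 1, 1, 2, 0, 2, 0, 1])

# Unique cluster ids and the number of points assigned to each.
cluster_ids, counts = np.unique(example_labels, return_counts=True)
sizes = dict(zip(cluster_ids.tolist(), counts.tolist()))
print(sizes)
```

Passing `labels_` instead of `example_labels` would give the cluster sizes for the customer data.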

## Step 5: Plot the Clusters

- Create a scatter plot of the data set with colors corresponding to the predicted cluster labels.


In [None]:
plt.figure(figsize=(10, 7))
plt.scatter(df1[:, 0], df1[:, 1], c=model.labels_, cmap='rainbow')
plt.xlabel("Annual Income")
plt.ylabel("Spending Score")

In [None]:
df.head()

In [None]:
df['cluster'] = labels_

In [None]:
df.drop(columns=['CustomerID','Gender']).groupby(["cluster"]).agg(["mean","median"])

__Observations:__
- In the scatter plot, we can see five separate groups of customers, one for each cluster found by agglomerative clustering.
- The clusters are represented by red, green, blue, violet, and yellow.
