# Practical Exercises with Clustering Models (Exercise Solutions)
This final section works through three practical exercises in building, tuning, and evaluating clustering models on real and synthetic datasets. Each exercise reinforces concepts from earlier in the chapter and applies a clustering technique to a different scenario, giving readers hands-on experience they can reuse in their own ML projects.

## Exercise 1: Clustering with K-Means on the Iris Dataset
In this example, we'll apply K-Means clustering to the well-known Iris dataset and evaluate the results using multiple metrics.

### Implementation Steps:

```python
# Load libraries
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Load the dataset (y_true is used only for evaluation, never for fitting)
iris = load_iris()
X = iris.data
y_true = iris.target

# Create and train the K-Means model
# n_init is set explicitly because its default differs across scikit-learn versions
kmeans = KMeans(n_clusters=3, n_init=10, random_state=2024)
y_kmeans = kmeans.fit_predict(X)

# Evaluate the clustering
sil_score = silhouette_score(X, y_kmeans)    # internal metric: no labels needed
ari = adjusted_rand_score(y_true, y_kmeans)  # external metric: agreement with species labels

print(f"Silhouette Score: {sil_score:.3f}")
print(f"Adjusted Rand Index: {ari:.3f}")

# Visualize the cluster assignments (2-D PCA projection of the four features)
X_pca = PCA(n_components=2).fit_transform(X)
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y_kmeans, cmap='viridis', s=50)
plt.title("K-Means Clustering on Iris Dataset")
plt.xlabel("PCA Component 1")
plt.ylabel("PCA Component 2")
plt.show()
```
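
The exercise above fixes `n_clusters=3` because the number of Iris species is known. When it is not, one common tuning approach is to fit K-Means for several candidate values of `k` and compare silhouette scores. The following is a minimal sketch under that assumption; the candidate range `2` to `6` and the names `scores` and `best_k` are illustrative choices, not part of the original exercise.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score

X = load_iris().data

# Fit K-Means for each candidate k and record the silhouette score
scores = {}
for k in range(2, 7):  # illustrative range of candidate cluster counts
    labels = KMeans(n_clusters=k, n_init=10, random_state=2024).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

for k, score in scores.items():
    print(f"k={k}: silhouette={score:.3f}")

# The k with the highest silhouette score is one candidate, not a definitive answer
best_k = max(scores, key=scores.get)
print(f"Highest silhouette score at k={best_k}")
```

The silhouette score rewards well-separated clusters, so the highest-scoring `k` need not match the number of species: two of the Iris species overlap heavily in feature space. Treat the result as one signal to weigh alongside domain knowledge and other metrics.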

## Exercise 2: Comparing DBSCAN and K-Means on Moons Data
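This exercise demonstrates how DBSCAN can outperform K-Means on data with non-convex shapes. K-Means assigns each point to its nearest centroid, so it tends to cut the two interleaved moons with a straight boundary, whereas DBSCAN groups points by density and can follow the curved shapes.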

### Implementation Steps:

```python
# Load libraries
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler

# Create and scale the dataset (generated labels are discarded here)
X, _ = make_moons(n_samples=300, noise=0.1, random_state=2024)
X = StandardScaler().fit_transform(X)

# Apply K-Means and DBSCAN
# n_init is set explicitly because its default differs across scikit-learn versions
kmeans = KMeans(n_clusters=2, n_init=10, random_state=2024)
y_kmeans = kmeans.fit_predict(X)

# DBSCAN infers the number of clusters itself; points labeled -1 are noise
dbscan = DBSCAN(eps=0.3, min_samples=5)
y_dbscan = dbscan.fit_predict(X)

# Visualize the clustering results side by side
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

ax1.scatter(X[:, 0], X[:, 1], c=y_kmeans, cmap='viridis', s=50)
ax1.set_title("K-Means Clustering")

ax2.scatter(X[:, 0], X[:, 1], c=y_dbscan, cmap='plasma', s=50)
ax2.set_title("DBSCAN Clustering")

plt.show()
```
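
The plots give a visual comparison only. Because `make_moons` also returns the generating labels, the comparison can be quantified with the Adjusted Rand Index, as in Exercise 1. The sketch below assumes the same data and hyperparameters as above; the names `ari_kmeans`, `ari_dbscan`, `n_noise`, and `n_clusters` are introduced here for illustration.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import StandardScaler

# Keep the generating labels this time so the clusterings can be scored
X, y_true = make_moons(n_samples=300, noise=0.1, random_state=2024)
X = StandardScaler().fit_transform(X)

y_kmeans = KMeans(n_clusters=2, n_init=10, random_state=2024).fit_predict(X)
y_dbscan = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

# ARI compares each clustering to the true moons.
# DBSCAN's noise label (-1) is treated as its own group in this comparison.
ari_kmeans = adjusted_rand_score(y_true, y_kmeans)
ari_dbscan = adjusted_rand_score(y_true, y_dbscan)

# Summarize what DBSCAN found: cluster count (excluding noise) and noise points
n_noise = int((y_dbscan == -1).sum())
n_clusters = len(set(y_dbscan) - {-1})

print(f"K-Means ARI: {ari_kmeans:.3f}")
print(f"DBSCAN ARI:  {ari_dbscan:.3f}")
print(f"DBSCAN clusters: {n_clusters}, noise points: {n_noise}")
```

An ARI near 1 indicates that a method recovered the two moons almost exactly. DBSCAN's result depends strongly on `eps` and `min_samples`, so if it reports a cluster count other than two or many noise points, those parameters are the first thing to adjust.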

## Exercise 3: Clustering High-Dimensional Data with PCA + GMM
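This exercise combines dimensionality reduction with a probabilistic clustering approach using Gaussian Mixture Models (GMMs). The Wine dataset has 13 features; we standardize them, project them onto two principal components, and fit a three-component GMM in the reduced space.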

### Implementation Steps:

```python
# Load libraries
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Load and preprocess the dataset (standardize so no feature dominates PCA)
data = load_wine()
X = StandardScaler().fit_transform(data.data)

# Apply PCA for dimensionality reduction (13 features -> 2 components)
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

# Fit a Gaussian Mixture Model and assign each sample to its most likely component
gmm = GaussianMixture(n_components=3, random_state=2024)
y_gmm = gmm.fit_predict(X_pca)

# Visualize the clustered output
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y_gmm, cmap='viridis', s=50)
plt.title("GMM Clustering with PCA on Wine Dataset")
plt.xlabel("PCA Component 1")
plt.ylabel("PCA Component 2")
plt.show()
```
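
Two follow-up checks are useful with this pipeline: how much variance the two principal components retain, and whether three mixture components is a reasonable choice. The sketch below reports the retained variance, compares candidate component counts using the Bayesian Information Criterion (lower is better), and prints the soft assignments that make a GMM probabilistic. The candidate range `1` to `6` and the names `retained`, `bic`, `best_k`, and `probs` are illustrative assumptions, not part of the original exercise.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

X = StandardScaler().fit_transform(load_wine().data)

# Share of total variance kept by the 2-D projection
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
retained = pca.explained_variance_ratio_.sum()
print(f"Variance retained by 2 components: {retained:.1%}")

# Compare candidate component counts by BIC (lower is better)
bic = {}
for k in range(1, 7):  # illustrative range of candidate component counts
    model = GaussianMixture(n_components=k, random_state=2024).fit(X_pca)
    bic[k] = model.bic(X_pca)
    print(f"n_components={k}: BIC={bic[k]:.1f}")

best_k = min(bic, key=bic.get)
print(f"Lowest BIC at n_components={best_k}")

# Soft assignments: each row holds one sample's membership probabilities
gmm = GaussianMixture(n_components=best_k, random_state=2024).fit(X_pca)
probs = gmm.predict_proba(X_pca)
print(probs[:3].round(3))
```

Samples whose highest membership probability is well below 1 lie between components, which is information a hard assignment from `fit_predict` does not expose.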