<a href="https://colab.research.google.com/github/Scottymichaelmillerguy/Clustering_Algorithms/blob/main/Clustering_Algorithms.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Clustering Algorithms

## From ML Algorithms to GenAI & LLMs by Aman Kharwal

# K-Means

Here is how to implement the K-means algorithm in Python using scikit-learn:

In [1]:
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Generating synthetic data for clustering
X, _ = make_blobs(n_samples=100, centers=3, random_state=42)

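# Creating an instance of the KMeans algorithm
# (n_init is set explicitly because its default differs across scikit-learn versions)
kmeans = KMeans(n_clusters=3, init='random', n_init=10, random_state=42)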

# Fitting the algorithm to the data
kmeans.fit(X)

# Obtaining the cluster labels for each data point
labels = kmeans.labels_

Here is how to visualize the clusters:

In [2]:
import plotly.express as px
import pandas as pd

# Creating a DataFrame with the data and cluster labels
df = pd.DataFrame(X, columns=['x', 'y'])
df['cluster'] = labels.astype(str)

# Visualizing the clusters
fig = px.scatter(df, x='x', y='y', color='cluster', title='K-means Clustering')
fig.show()
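
A fitted K-means model exposes more than the labels. The following is a minimal sketch, not part of the original notebook, of how the learned centroids, the inertia, and `predict` might be inspected. The `*_demo` names and the two `new_points` are hypothetical and chosen only for illustration. Setting `n_init=10` explicitly is an assumption made to keep behavior stable across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Same synthetic data as in the K-means cell; separate names so X and labels are not overwritten
X_demo, _ = make_blobs(n_samples=100, centers=3, random_state=42)

# Fit K-means with 3 clusters (n_init given explicitly; an assumption, see the lead-in)
km_demo = KMeans(n_clusters=3, init='random', n_init=10, random_state=42).fit(X_demo)

# One centroid per cluster, one coordinate per feature
centers = km_demo.cluster_centers_

# Sum of squared distances from every point to its assigned centroid
inertia = km_demo.inertia_

# Assign unseen points to the nearest learned centroid (hypothetical points)
new_points = np.array([[0.0, 0.0], [-5.0, 5.0]])
new_labels = km_demo.predict(new_points)

print("centroid array shape:", centers.shape)
print("inertia:", inertia)
print("labels for new points:", new_labels)
```

Lower inertia means tighter clusters, but it always decreases as `n_clusters` grows, so it is only useful for comparing runs with the same number of clusters or for spotting where the improvement levels off.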

# DBSCAN

Here is how to implement the DBSCAN algorithm in Python using scikit-learn:

In [3]:
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Generating synthetic data for clustering
X, _ = make_blobs(n_samples=100, centers=3, random_state=42)

# Creating an instance of the DBSCAN class
dbscan = DBSCAN(eps=0.5, min_samples=5)

# Fitting the DBSCAN model to the data
dbscan.fit(X)

# Accessing the labels assigned to each data point (noise points are labeled -1)
labels = dbscan.labels_

Here is how to visualize the clusters:

In [4]:
df = pd.DataFrame(X, columns=['x', 'y'])
df['cluster'] = labels.astype(str)

# Visualizing the clusters
fig = px.scatter(df, x='x', y='y', color='cluster', title='DBSCAN Clustering')
fig.show()
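
DBSCAN does not take a cluster count. The number of clusters and the number of noise points (label `-1`) both fall out of `eps` and `min_samples`. The sketch below, which is an illustration and not part of the original notebook, counts both for a few `eps` values. The helper `summarize_dbscan`, the `X_demo` name, and the `eps` values other than 0.5 are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs

# Same synthetic data as in the DBSCAN cell, under a separate name
X_demo, _ = make_blobs(n_samples=100, centers=3, random_state=42)

def summarize_dbscan(data, eps, min_samples=5):
    """Return (number of clusters, number of noise points) for one DBSCAN run."""
    found = DBSCAN(eps=eps, min_samples=min_samples).fit(data).labels_
    n_noise = int(np.sum(found == -1))            # -1 marks noise
    n_clusters = len(set(found.tolist()) - {-1})  # distinct labels, excluding noise
    return n_clusters, n_noise

# Compare a few neighborhood radii (illustrative values)
for eps in (0.5, 1.0, 1.5):
    n_clusters, n_noise = summarize_dbscan(X_demo, eps)
    print(f"eps={eps}: {n_clusters} clusters, {n_noise} noise points")
```

If most points come back as noise, `eps` is probably too small for the spread of the data; increasing it or lowering `min_samples` is the usual first adjustment.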

# Agglomerative Clustering

Here is how to implement the agglomerative clustering algorithm in Python using scikit-learn:

In [5]:
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Generating synthetic data for clustering (centers=3 is the default, stated here for clarity)
X, _ = make_blobs(n_samples=100, centers=3, random_state=42)

# Perform Agglomerative Clustering
agglomerative = AgglomerativeClustering(n_clusters=3)
labels = agglomerative.fit_predict(X)


Here is how to visualize the clusters:

In [6]:
import pandas as pd
import plotly.express as px

df = pd.DataFrame(X, columns=['x', 'y'])
df['cluster'] = labels.astype(str)

# Visualizing the clusters
fig = px.scatter(df, x='x', y='y', color='cluster', title='Agglomerative Clustering')
fig.show()
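
The cell above relies on the default linkage criterion, which in scikit-learn is `'ward'`. As a rough illustration that is not part of the original notebook, the sketch below fits the same data with each available linkage and reports the resulting cluster sizes. The names `X_demo`, `demo_labels`, and `cluster_sizes` are hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Same kind of synthetic data as in the agglomerative cell, under a separate name
X_demo, _ = make_blobs(n_samples=100, centers=3, random_state=42)

cluster_sizes = {}
for linkage in ("ward", "complete", "average", "single"):
    # The linkage criterion decides which pair of clusters is merged at each step
    model = AgglomerativeClustering(n_clusters=3, linkage=linkage)
    demo_labels = model.fit_predict(X_demo)
    # Count how many points ended up in each of the 3 clusters
    cluster_sizes[linkage] = np.bincount(demo_labels).tolist()
    print(f"{linkage:>8}: {cluster_sizes[linkage]}")
```

Very uneven sizes under one linkage, for example a cluster containing a single point, usually indicate that the criterion is a poor match for the shape of the data.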

# BIRCH Clustering

Here is how to implement the BIRCH clustering algorithm in Python using scikit-learn:

In [7]:
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs

# Generating synthetic data for clustering
X, _ = make_blobs(n_samples=100, centers=3, random_state=42)

# BIRCH Clustering
birch = Birch(threshold=0.5, branching_factor=50)
birch.fit(X)

# Obtaining cluster labels
labels = birch.labels_

Here is how to visualize the clusters:

In [8]:
df = pd.DataFrame(X, columns=['x', 'y'])
df['cluster'] = labels.astype(str)

# Visualizing the clusters
fig = px.scatter(df, x='x', y='y', color='cluster', title='BIRCH Clustering')
fig.show()
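
BIRCH works in two stages: it first compresses the data into subclusters (controlled by `threshold` and `branching_factor`), then groups those subclusters into the final clusters (`n_clusters`, which defaults to 3 in scikit-learn). The sketch below is an illustration, not part of the original notebook, of how the two stages might be inspected and how new points can be labeled. The `*_demo` names and `new_points` are hypothetical.

```python
import numpy as np
from sklearn.cluster import Birch
from sklearn.datasets import make_blobs

# Same kind of synthetic data as in the BIRCH cell, under a separate name
X_demo, _ = make_blobs(n_samples=100, centers=3, random_state=42)

# n_clusters=3 is the scikit-learn default, written out here for clarity
birch_demo = Birch(threshold=0.5, branching_factor=50, n_clusters=3).fit(X_demo)

# Stage 1: centroids of the subclusters stored in the leaves of the tree
n_subclusters = birch_demo.subcluster_centers_.shape[0]

# Stage 2: final cluster labels after the subclusters are grouped
n_final = len(np.unique(birch_demo.labels_))

# Label unseen points via their nearest subcluster (hypothetical points)
new_points = np.array([[0.0, 0.0], [-5.0, 5.0]])
new_labels = birch_demo.predict(new_points)

print("subclusters:", n_subclusters)
print("final clusters:", n_final)
print("labels for new points:", new_labels)
```

A smaller `threshold` generally yields more, tighter subclusters, which costs memory but gives the second stage a finer summary to work with.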

# Mean Shift Clustering

Here is how to implement the Mean Shift clustering algorithm in Python using scikit-learn:

In [9]:
from sklearn.cluster import MeanShift
from sklearn.datasets import make_blobs

# Generating synthetic data for clustering
X, _ = make_blobs(n_samples=100, centers=3, random_state=42)

# Applying Mean Shift clustering
# (with no bandwidth given, scikit-learn estimates one from the data)
ms = MeanShift()
ms.fit(X)

# Obtaining the cluster labels
labels = ms.labels_
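
Mean Shift takes no cluster count; the number of clusters follows from the kernel bandwidth. As a hedged sketch that is not part of the original notebook, the code below makes the bandwidth explicit with `estimate_bandwidth` and reads the number of clusters from the fitted model. The `*_demo` names are hypothetical, and `quantile=0.3` is simply the function's default written out.

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

# Same kind of synthetic data as in the Mean Shift cell, under a separate name
X_demo, _ = make_blobs(n_samples=100, centers=3, random_state=42)

# Estimate a bandwidth from pairwise distances (quantile=0.3 is the default)
bandwidth = estimate_bandwidth(X_demo, quantile=0.3)

# Fit Mean Shift with the explicit bandwidth
ms_demo = MeanShift(bandwidth=bandwidth).fit(X_demo)

# Each row of cluster_centers_ is one mode found by the algorithm
n_clusters = ms_demo.cluster_centers_.shape[0]

print("estimated bandwidth:", bandwidth)
print("clusters found:", n_clusters)
print("distinct labels:", np.unique(ms_demo.labels_))
```

A smaller `quantile` gives a smaller bandwidth and tends to produce more clusters, while a larger one merges nearby modes.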