### DBSCAN 
- DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is an unsupervised clustering algorithm that groups data points by density, which makes it particularly effective at identifying clusters of arbitrary shape and at handling noise (outliers). DBSCAN requires two parameters: eps (the maximum distance between two points for them to be considered neighbors) and min_samples (the minimum number of points, counting the point itself, required to form a dense region).
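
To make the two parameters concrete, here is a minimal NumPy-only sketch of the neighborhood test they define. It is an illustration of the idea rather than scikit-learn's actual implementation, and it assumes Euclidean distance and that a point counts as its own neighbor. The names `points`, `neighbor_counts`, and `is_core` are made up for this sketch; the data and parameter values are the ones used in this section's example.

```python
import numpy as np

# Toy data and parameters taken from this section's example
points = np.array([[1, 2], [2, 2], [2, 3], [8, 7], [8, 8], [25, 80]])
eps, min_samples = 3, 2

# Pairwise Euclidean distances between all points (n x n matrix)
diffs = points[:, None, :] - points[None, :, :]
dists = np.sqrt((diffs ** 2).sum(axis=-1))

# Number of points within eps of each point (assumption: the point itself is counted)
neighbor_counts = (dists <= eps).sum(axis=1)

# A point lies in a dense region if it has at least min_samples such neighbors
is_core = neighbor_counts >= min_samples

print("Neighbor counts:", neighbor_counts)
print("Dense (core) points:", is_core)
```

Under these assumptions, the isolated point [25, 80] has only itself within eps, so it is the only point that does not sit in a dense region.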

In [1]:
from sklearn.cluster import DBSCAN
import numpy as np

In [3]:
x = np.array([[1, 2], [2, 2], [2, 3], [8, 7], [8, 8], [25, 80]])

# initialize DBSCAN and fit the model
dbscan = DBSCAN(eps=3, min_samples=2)  # a point seeds a cluster if at least min_samples points lie within eps distance of it
# eps is the maximum distance between two samples for one to be considered as in the neighborhood of the other.
# min_samples is the number of samples in a neighborhood (including the point itself) for a point to be considered as a core point.

dbscan.fit(x) 

labels = dbscan.labels_ # labels are the cluster labels assigned to each point
print("Labels:", labels)

# DBSCAN assigns the label -1 to noise points, which are points that do not belong to any cluster
# we can also get the core samples, which are the points that have at least min_samples points within eps distance
core_samples = dbscan.core_sample_indices_  # these are the indices of the core samples
print("Core samples:", core_samples)


Labels: [ 0  0  0  1  1 -1]
Core samples: [0 1 2 3 4]


In [4]:
# the output above shows that the first three points belong to one cluster (label 0), the next two points belong to another cluster (label 1), and the last point is considered noise (label -1)
# the number of clusters found by the algorithm can be derived from the labels by counting the distinct values other than -1
# the core samples are the points with at least min_samples points within eps distance (here, every point except the noise point), and the labels are the cluster labels assigned to each point
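
As a rough sketch of how the number of clusters could be read off the labels: count the distinct label values and leave out -1, which marks noise. To stay self-contained, the labels are hard-coded here from the output printed above; in the notebook itself you would use `dbscan.labels_` directly. The names `n_clusters` and `n_noise` are illustrative only.

```python
import numpy as np

# Labels copied from the output above; in the notebook, use dbscan.labels_ instead
labels = np.array([0, 0, 0, 1, 1, -1])

# Distinct cluster labels, excluding the noise label -1
n_clusters = len(set(labels.tolist()) - {-1})

# Number of points labelled as noise
n_noise = int(np.sum(labels == -1))

print("Number of clusters:", n_clusters)
print("Number of noise points:", n_noise)
```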
