# 🧠 DBSCAN for Anomaly Detection

Hello guys 👋  

We are going to continue our discussion of **Anomaly Detection**.

In this video, we will see how to perform **Anomaly Detection using DBSCAN Clustering**.

---

## 🔁 Quick Recap

We have already discussed the **theoretical concept** of DBSCAN and how its clustering actually happens.

The most important feature of DBSCAN is that it can create **clusters in a non-linearly separable dataset**, which is its biggest advantage: clusters can take arbitrary shapes, such as rings or crescents.

Because DBSCAN groups points by **density**, it handles these **non-linear structures** and leaves any point in a sparse region out of every cluster. Those leftover points are the **outliers**, which is what makes DBSCAN useful for **anomaly detection**.

---

## ⚙️ Key Concepts

DBSCAN categorizes data points into three types:

- <span style="color:#e63946;">Core Point</span>  
- <span style="color:#ff9f1c;">Border Point</span>  
- <span style="color:#2ec4b6;">Noise / Outlier</span>  

For anomaly detection, the **Noise or Outliers** play a crucial role, as they represent **abnormal data points**.

---

## 🧩 How DBSCAN Works

DBSCAN uses two hyperparameters:

1. **Epsilon (ε)** → Radius of the neighborhood around each point  
2. **MinPts** → Minimum number of points required within that radius to form a dense region

---

### 🔴 Core Point

A point is a <span style="color:#e63946;">core point</span> if:

$$
\text{Number of points within radius } \varepsilon \geq \text{MinPts}
$$

Example:  
If `MinPts = 4` and a point has 6 neighbors within ε,  
then it is a **core point**.
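
The core-point condition can be checked by simply counting neighbors. Here is a minimal sketch in plain Python. The helper names `count_neighbors` and `is_core_point` and the toy points are made up for illustration, and the count includes the point itself, which is the convention scikit-learn uses for `min_samples`.

```python
from math import dist

def count_neighbors(points, index, eps):
    """Count points within distance eps of points[index], including the point itself."""
    center = points[index]
    return sum(1 for p in points if dist(center, p) <= eps)

def is_core_point(points, index, eps, min_pts):
    """A point is a core point if its eps-neighborhood holds at least min_pts points."""
    return count_neighbors(points, index, eps) >= min_pts

# Toy data (hypothetical): a tight group of five points plus one far-away point
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1), (5.0, 5.0)]

print(is_core_point(points, 0, eps=0.15, min_pts=4))  # center of the group
print(is_core_point(points, 5, eps=0.15, min_pts=4))  # isolated point
```

With these toy values, the center of the group has five points in its neighborhood and passes the check, while the isolated point only finds itself and fails.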

---

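### 🟡 Border Point

A point is a <span style="color:#ff9f1c;">border point</span> if it is not a core point itself but lies within ε of a core point:

$$
\text{Number of points within radius } \varepsilon < \text{MinPts}
\quad \text{and} \quad
\text{the point is within } \varepsilon \text{ of a core point}
$$

Example:  
If `MinPts = 4` and a point has only 3 neighbors within ε, one of which is a core point,  
then it is a **border point**.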
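
To make the border-point idea concrete, here is a small sketch in plain Python. The helpers `neighbors`, `is_core`, and `is_border` and the toy points are illustrative assumptions, not library functions. The sketch treats a border point as one that is not dense enough to be core but sits within ε of some core point.

```python
from math import dist

def neighbors(points, index, eps):
    """Indices of all points within eps of points[index] (the point itself included)."""
    return [j for j, p in enumerate(points) if dist(points[index], p) <= eps]

def is_core(points, index, eps, min_pts):
    """Core point: at least min_pts points in its eps-neighborhood."""
    return len(neighbors(points, index, eps)) >= min_pts

def is_border(points, index, eps, min_pts):
    """Border point: not core itself, but within eps of at least one core point."""
    if is_core(points, index, eps, min_pts):
        return False
    return any(is_core(points, j, eps, min_pts) for j in neighbors(points, index, eps))

# Toy data (hypothetical): four points packed together, one on the fringe, one isolated
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.25, 0.0), (3.0, 3.0)]

for i in range(len(points)):
    print(i, "core" if is_core(points, i, 0.16, 4) else "",
          "border" if is_border(points, i, 0.16, 4) else "")
```

Here the fringe point at `(0.25, 0.0)` has too few neighbors to be core, but one of them is a core point, so it counts as a border point. The isolated point is neither core nor border.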

---

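### 🔵 Outlier (Noise Point)

A point is considered an <span style="color:#2ec4b6;">outlier</span> (or **noise point**) if it is neither a core point nor a border point:

$$
\text{Number of points within radius } \varepsilon < \text{MinPts}
\quad \text{and} \quad
\text{no core point lies within radius } \varepsilon
$$

A completely isolated point is the simplest case, but a point with a few neighbors is still noise if none of them is a core point.

These are the **abnormal points**, that is, our **anomalies**.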
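
The three point types can also be read off a fitted scikit-learn model. The sketch below is one way to do it, assuming a small made-up dataset: `core_sample_indices_` gives the core points, label `-1` marks noise, and whatever is left inside a cluster is a border point.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Toy data (hypothetical): a dense group, one fringe point, one isolated point
X = np.array([
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],  # dense group
    [0.25, 0.0],                                      # fringe of the group
    [3.0, 3.0],                                       # far from everything
])

db = DBSCAN(eps=0.16, min_samples=4).fit(X)

core_mask = np.zeros(len(X), dtype=bool)
core_mask[db.core_sample_indices_] = True   # indices of the core points
noise_mask = db.labels_ == -1               # scikit-learn labels noise as -1
border_mask = ~core_mask & ~noise_mask      # in a cluster, but not core

print("core:  ", np.flatnonzero(core_mask))
print("border:", np.flatnonzero(border_mask))
print("noise: ", np.flatnonzero(noise_mask))
```

For this data, points 0 to 3 come out as core, point 4 as border, and point 5 as noise, so only the isolated point would be reported as an anomaly.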

---

## 🧠 Visualization

In DBSCAN clustering:

- **Red points** → Core points  
- **Yellow points** → Border points  
- **Blue points** → Noise / Outliers  

In anomaly detection, our **focus is on the outliers**, as they represent the data points that deviate significantly from normal patterns.

---

## 📊 Practical Example

Let's look at a practical example in Python 👇

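```python
from sklearn.datasets import make_circles
from sklearn.cluster import DBSCAN
import matplotlib.pyplot as plt

# Generate two concentric circles with Gaussian noise
# (random_state is fixed here only to make the run reproducible)
X, y = make_circles(n_samples=750, factor=0.5, noise=0.05, random_state=42)

# Apply DBSCAN: eps is the neighborhood radius, min_samples is MinPts
db = DBSCAN(eps=0.10, min_samples=4)
labels = db.fit_predict(X)  # label -1 marks noise points (outliers)

# Plot clustered points colored by cluster, and outliers as black crosses
is_outlier = labels == -1
plt.scatter(X[~is_outlier, 0], X[~is_outlier, 1], c=labels[~is_outlier], cmap='viridis')
plt.scatter(X[is_outlier, 0], X[is_outlier, 1], c='black', marker='x', label='Outliers')
plt.title("DBSCAN Clustering - Anomaly Detection")
plt.legend()
plt.show()
```
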
### Explanation

- `eps = 0.10` → defines the neighborhood radius (ε)
- `min_samples = 4` → defines how many nearby points (MinPts) are needed to form a dense region
- Points with `label = -1` are outliers

In the output plot, you will see:

- Clusters (colored groups)
- Outliers (points labeled as `-1`), usually scattered around the edges of the clusters
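
Since the outliers are just the points labeled `-1`, pulling them out is a one-line mask. Here is a short sketch on the same kind of data as the example above. The `random_state` value and the variable names are assumptions added for reproducibility and readability.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.cluster import DBSCAN

# Same settings as the example; random_state is an added assumption
X, _ = make_circles(n_samples=750, factor=0.5, noise=0.05, random_state=42)

labels = DBSCAN(eps=0.10, min_samples=4).fit_predict(X)

outlier_mask = labels == -1
anomalies = X[outlier_mask]              # coordinates of the flagged points
n_clusters = len(set(labels) - {-1})     # -1 is noise, not a cluster

print(f"clusters found: {n_clusters}")
print(f"anomalies flagged: {outlier_mask.sum()} of {len(X)}")
```

The `anomalies` array can then be inspected, saved, or joined back to the original records for review.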

---

## 📌 Observations

- DBSCAN can detect non-linear clusters effectively
- Points with label `-1` are outliers/anomalies
- You can tune ε and MinPts to control cluster density and sensitivity to noise
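
To get a feel for how ε affects sensitivity to noise, one option is to sweep a few values and count the points labeled `-1`. This is only a rough sketch: the candidate radii and the `random_state` are illustrative assumptions, not recommended settings.

```python
from sklearn.datasets import make_circles
from sklearn.cluster import DBSCAN

X, _ = make_circles(n_samples=750, factor=0.5, noise=0.05, random_state=42)

# Candidate radii are illustrative values, not recommendations
eps_values = [0.03, 0.05, 0.10, 0.20]
noise_counts = []
for eps in eps_values:
    labels = DBSCAN(eps=eps, min_samples=4).fit_predict(X)
    n_noise = int((labels == -1).sum())
    n_clusters = len(set(labels) - {-1})
    noise_counts.append(n_noise)
    print(f"eps={eps:.2f}  clusters={n_clusters}  noise points={n_noise}")
```

With MinPts held fixed, a larger ε can only turn noise points into cluster members, never the other way around, so the noise count does not increase as ε grows. A very small ε flags many normal points as anomalies, while a very large ε can merge separate clusters and hide real outliers.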

---

## 🌟 Summary

| Type | Condition | Meaning |
|------|-----------|---------|
| <span style="color:#e63946;">Core Point</span> | Points ≥ MinPts within ε | Dense region |
| <span style="color:#ff9f1c;">Border Point</span> | Points < MinPts within ε, but within ε of a core point | Edge of a cluster |
| <span style="color:#2ec4b6;">Outlier</span> | Points < MinPts within ε, and no core point within ε | Anomaly / Noise |