# Anomaly Detection

Anomaly detection is a technique used in data analysis to identify patterns that do not conform to expected behavior. These non-conforming patterns are often referred to as anomalies or outliers. Detecting these anomalies is critical in various applications, such as fraud detection, network security, and fault detection.

## Basic Concept

In anomaly detection, we are given a dataset in which most instances are considered "normal". However, a small percentage of this data is anomalous. The main objective is to distinguish the anomalous instances from the normal ones.

For a simple visualization, consider plotting a dataset on a 2D plane. Most data points cluster around a region representing the "normal" data. Points that lie significantly far away from this cluster can be considered as anomalies.

## Mathematical Formulation

Given a dataset \( X \) with instances \( x_1, x_2, ..., x_m \), the aim is to determine a function \( f \) such that:

$$
f(x_i) = 
\begin{cases} 
1 & \text{if } x_i \text{ is an anomaly} \\
0 & \text{otherwise} 
\end{cases}
$$

Typically, in an unsupervised learning context, the function \( f \) is determined based on the statistical properties of the dataset, without having labeled instances.

## Gaussian Distribution

One common approach to anomaly detection is to assume that the "normal" data comes from a particular distribution, usually the Gaussian distribution. For a single feature, where \( x_i \) denotes the value of that feature in instance \( i \):

1. Compute its mean \( \mu \) and variance \( \sigma^2 \).
2. Calculate the probability \( p(x_i) \) using the Gaussian distribution formula:

$$
p(x_i; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x_i - \mu)^2}{2\sigma^2}}
$$

If \( p(x_i) \) falls below a chosen threshold \( \varepsilon \), \( x_i \) is flagged as a likely anomaly.
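The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration for a single feature, not a tuned detector: the helper names `gaussian_pdf` and `flag_anomalies`, the synthetic data, and the threshold value `epsilon=1e-3` are all assumptions made for the example.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma2):
    """Univariate Gaussian density p(x; mu, sigma^2)."""
    return np.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

def flag_anomalies(x, epsilon):
    """Fit mu and sigma^2 on x, then flag values whose density is below epsilon."""
    mu = x.mean()        # step 1: mean of the feature
    sigma2 = x.var()     # step 1: variance of the feature
    p = gaussian_pdf(x, mu, sigma2)  # step 2: density of each value
    return p < epsilon, p

# Hypothetical data: 500 "normal" readings around 10, plus one extreme value.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(10.0, 1.0, size=500), [25.0]])

# epsilon is an assumed threshold; in practice it has to be chosen for the data.
mask, p = flag_anomalies(x, epsilon=1e-3)
print("flagged values:", x[mask])
```

The threshold controls the trade-off: a larger `epsilon` flags more points (more false alarms), while a smaller one flags fewer (more missed anomalies).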

## Challenges and Considerations

- **Choice of Features**: The choice of features can significantly impact the performance of the anomaly detection algorithm. Features should be chosen in such a way that anomalous instances receive a low probability.

- **High Dimensionality**: As the number of features increases, the computational cost can become prohibitive, and the curse of dimensionality can make instances seem equidistant, making it harder to spot anomalies.

- **Scalability**: Some algorithms can be computationally intensive, making it challenging to use them on large datasets.
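The "equidistant" effect mentioned under High Dimensionality can be seen in a small experiment. The sketch below assumes points drawn uniformly from the unit hypercube and measures how much the farthest neighbor of one point differs from its nearest neighbor; the function name `relative_contrast` and the sample sizes are chosen only for illustration.

```python
import numpy as np

def relative_contrast(n_points, n_dims, seed=0):
    """Return (d_max - d_min) / d_min for distances from one point to all others."""
    rng = np.random.default_rng(seed)
    X = rng.random((n_points, n_dims))        # uniform points in [0, 1]^n_dims
    d = np.linalg.norm(X[1:] - X[0], axis=1)  # Euclidean distances to the first point
    return (d.max() - d.min()) / d.min()

# As the number of dimensions grows, the contrast between the nearest and
# farthest point shrinks, so distance alone separates points less clearly.
for n_dims in (2, 10, 100, 1000):
    print(n_dims, relative_contrast(500, n_dims))
```

When this ratio is small, every point is roughly as far away as every other, which is why distance-based anomaly scores tend to lose resolution as features are added.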

## Conclusion

Anomaly detection plays a vital role in many applications where identifying rare events that deviate from the norm is crucial. By understanding the underlying patterns and distributions in data, we can better spot and react to these anomalies.


## Libraries

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import OneClassSVM
```

## Example

Here's a simple example using the OneClassSVM from scikit-learn for anomaly detection. This model is an unsupervised algorithm that learns a decision function for anomaly detection: classifying new data as similar or different from the training set.

For this example, we'll generate synthetic data, with the majority of the points drawn from two Gaussian clusters and a minority (the anomalies) from a uniform distribution:

```python
# Generating synthetic data
np.random.seed(42)
X_normal = 0.3 * np.random.randn(100, 2)
X_anomaly = np.random.uniform(low=-4, high=4, size=(20, 2))
X = np.r_[X_normal + 2, X_normal - 2, X_anomaly]

# Training the OneClassSVM
clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
clf.fit(X)

# Predict if a particular sample is an outlier or not
y_pred = clf.predict(X)

# Plotting
plt.scatter(X[:, 0], X[:, 1], c=y_pred, edgecolors='k', cmap=plt.cm.Paired)
plt.title("Anomaly Detection using OneClassSVM")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.show()
```

In the resulting plot, you'll see two clusters of points, which represent our "normal" data, with the anomalies scattered around them. The color of each point shows the OneClassSVM prediction: `predict` returns -1 for points the model considers anomalies and +1 for points it considers normal, so the two classes appear in the colors at opposite ends of the `Paired` colormap.
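Because the data here is synthetic, the predictions can also be checked numerically rather than only by eye. The sketch below rebuilds the same dataset and compares the predictions with the known origin of each point. The `y_true` array is an assumption introduced for this check: it is derived from the order in which the data was stacked and is never shown to the model.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Rebuild the synthetic data from the example (same seed, same construction).
rng = np.random.RandomState(42)
X_normal = 0.3 * rng.randn(100, 2)
X_anomaly = rng.uniform(low=-4, high=4, size=(20, 2))
X = np.r_[X_normal + 2, X_normal - 2, X_anomaly]

# Assumed ground truth for illustration: +1 for the 200 cluster points,
# -1 for the 20 uniform points (same label convention as predict()).
y_true = np.r_[np.ones(200, dtype=int), -np.ones(20, dtype=int)]

clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1).fit(X)
y_pred = clf.predict(X)              # -1 = predicted anomaly, +1 = predicted normal
scores = clf.decision_function(X)    # lower values mean "more anomalous"

n_flagged = int((y_pred == -1).sum())
n_caught = int(((y_pred == -1) & (y_true == -1)).sum())
print(f"flagged {n_flagged} of {len(X)} points; "
      f"{n_caught} of them came from the uniform distribution")
```

Perfect agreement should not be expected: some uniform points can land close to a cluster, and `nu` bounds the fraction of training points treated as outliers regardless of how many anomalies are actually present.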