# Anomaly Detection

Anomaly detection refers to the problem of finding patterns in data that do not conform to expected behavior. Depending on the application domain, these nonconforming patterns are referred to as anomalies, outliers, discordant observations, exceptions, aberrations, surprises, peculiarities, or contaminants.

## Point Anomalies  
## Collective Anomalies  
## Contextual Anomalies  

## Challenges

A straightforward anomaly detection approach is to define a region representing normal behavior and declare any observation that does not belong to this region an anomaly. Several factors make this approach very challenging:

- Defining a normal region that encompasses every behavior is difficult.
- The boundary between normal and anomalous behavior is not precise.
- When anomalies result from malicious actions, adversaries often adapt so that their anomalous observations appear normal, which makes defining normal behavior even more difficult.
- In many domains normal behavior keeps evolving and a current notion of normal behavior might not be sufficiently representative in the future.
- The exact notion of an anomaly is different for different application domains (e.g., medical biometrics vs. stock market).
- Availability of labeled data for training/validation of models used by anomaly detection techniques is usually a major issue.
- Often the data contains noise that tends to be similar to the actual anomalies and hence is difficult to distinguish and remove.
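
The region-based approach described above can be sketched in a few lines. This is a minimal illustration rather than a recommended detector: it assumes a single numeric feature, a reference sample that is known to be normal, and a normal region defined as the mean plus or minus `k` standard deviations. The function names, the data, and the choice `k=3` are all hypothetical.

```python
from statistics import mean, stdev


def fit_normal_region(reference, k=3.0):
    """Return (low, high): mean +/- k sample standard deviations.

    Assumption: `reference` contains only normal behavior.
    """
    mu = mean(reference)
    sigma = stdev(reference)
    return mu - k * sigma, mu + k * sigma


def find_anomalies(observations, region):
    """Return the observations that fall outside the normal region."""
    low, high = region
    return [x for x in observations if not (low <= x <= high)]


# Hypothetical sensor readings assumed to be normal.
reference = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 9.7, 10.0]
region = fit_normal_region(reference)

# New observations to check against the region.
new_observations = [10.1, 9.9, 14.5, 10.2, 5.0]
anomalies = find_anomalies(new_observations, region)
print(anomalies)
```

With these made-up values the region is roughly 9.4 to 10.6, so 14.5 and 5.0 are flagged. The challenges listed above show up directly: the result depends on how the reference sample and `k` are chosen, and the region goes stale if normal behavior drifts.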

## Nature of Input Data

- The nature of attributes determines the applicability of anomaly detection techniques.
- Identify the lowest level of aggregation at which an anomaly is defined (transaction, record, measurement, etc.).
- The input might instead be provided as pairwise distances between instances, in the form of a distance or similarity matrix.
- Always take into consideration the scale and data type of every feature.
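
To illustrate why feature scale matters when the input is turned into a distance matrix, here is a small sketch. It assumes numeric features only, Euclidean distance, and z-score standardization. The records (age in years, income in currency units) and all function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def standardize(rows):
    """Rescale every column to zero mean and unit variance."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]


def distance_matrix(rows):
    """Pairwise Euclidean distances between instances."""
    return [
        [sqrt(sum((a - b) ** 2 for a, b in zip(r1, r2))) for r2 in rows]
        for r1 in rows
    ]


def nearest_neighbor_distances(matrix):
    """For each instance, the distance to its closest other instance."""
    return [
        min(d for j, d in enumerate(row) if j != i)
        for i, row in enumerate(matrix)
    ]


# Hypothetical records: the last one has an unusual age.
records = [[25, 40_000], [27, 40_400], [26, 40_200], [60, 40_300]]

raw_nn = nearest_neighbor_distances(distance_matrix(records))
scaled_nn = nearest_neighbor_distances(distance_matrix(standardize(records)))
print(raw_nn)
print(scaled_nn)
```

In the raw matrix the income column dominates, so the 60-year-old record looks close to its neighbors. After standardization it has the largest nearest-neighbor distance of the four records.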

# Unsupervised Learning

$$
\renewcommand{\like}{{\cal L}}
\renewcommand{\loglike}{{\ell}}
\renewcommand{\err}{{\cal E}}
\renewcommand{\dat}{{\cal D}}
\renewcommand{\hyp}{{\cal H}}
\renewcommand{\Ex}[2]{E_{#1}[#2]}
\renewcommand{\x}{{\mathbf x}}
\renewcommand{\v}[1]{{\mathbf #1}}
$$

![](../../img/Fruits.jpg)

Unlike supervised learning, unsupervised learning is used with data sets that have no historical labels. An unsupervised learning algorithm explores the data to find its internal structure. Mathematically, we do not have any $y$ or **label**; rather, we consider the whole training data as a feature table $\x$. This kind of learning works well for transactional data; for instance, it can help identify customer segments and clusters with certain attributes, which is often used in content personalization.
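
As a concrete example of finding structure without labels, the sketch below groups customers into segments with k-means (Lloyd's algorithm), which is only one of many possible clustering methods. The customer data (orders per month, average basket value), the choice of two clusters, and the fixed initial centroids are assumptions made for illustration.

```python
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return labels, centroids


# Hypothetical feature table: no label column, only features.
customers = [(1, 20), (2, 22), (1, 25), (9, 80), (10, 85), (8, 78)]

# Fixed starting centroids keep the example deterministic.
labels, centroids = k_means(customers, [customers[0], customers[3]])
print(labels)
print(centroids)
```

On this toy data the first three customers end up in one segment and the last three in the other. In practice the features would need scaling first, and the number of clusters is a modeling choice rather than something the algorithm discovers.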


![](../../img/Machine_learning_3.jpg)

![](../../img/recommender-systems.jpg)

<img src='../../img/amazon.png' height="600" width="800">

![](../../img/Outliers.jpeg) 

<img src='../../img/outlier.jpeg' height="300" width="500">

![](../../img/anomaly.png) 

![](../../img/topic.png) 

#### Online recommendations, identification of data outliers, and text topic segmentation are all examples of unsupervised learning.