# Anomaly Detection

This notebook demonstrates basic usage of CapyMOA for anomaly detection tasks.

---

*More information about CapyMOA can be found in* https://www.capymoa.org

**last update on 25/07/2024**

## 1. Unsupervised Anomaly Detection for data streams

* Recent research has focused on unsupervised anomaly detection for data streams, because labeled training data is often difficult to obtain.
* Rather than calling the built-in evaluation functions, we first write a basic **test-then-train loop** from scratch to evaluate the model's performance: each instance is scored before the learner trains on it.
* Note that lower scores indicate a higher anomaly likelihood.

In [4]:
from capymoa.datasets import ElectricityTiny
from capymoa.anomaly import HalfSpaceTrees
from capymoa.evaluation import AnomalyDetectionEvaluator

# Build the detector and the evaluator from the stream's schema
stream = ElectricityTiny()
schema = stream.get_schema()
learner = HalfSpaceTrees(schema)
evaluator = AnomalyDetectionEvaluator(schema)

# Test-then-train: score each instance first, then use it to update the model
while stream.has_more_instances():
    instance = stream.next_instance()
    score = learner.score_instance(instance)
    evaluator.update(instance.y_index, score)
    learner.train(instance)

auc = evaluator.auc()
print(f"AUC: {auc:.2f}")

AUC: 0.54
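
To make the AUC figure and the score convention concrete, here is a minimal, library-free sketch. It is not CapyMOA's implementation: `pairwise_auc` is a hypothetical helper, the labels and scores are made-up toy values, and the sketch assumes that label 1 marks an anomaly. It computes AUC as the fraction of (anomaly, normal) pairs in which the anomaly is ranked as more anomalous, and shows why scores where lower means more anomalous should be flipped before such a ranking.

```python
from typing import Sequence


def pairwise_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (anomaly, normal) pairs ranked correctly.

    Assumes label 1 marks an anomaly and that a HIGHER score means
    "more anomalous"; ties count as half a correct pair.
    """
    anomalies = [s for y, s in zip(labels, scores) if y == 1]
    normals = [s for y, s in zip(labels, scores) if y == 0]
    if not anomalies or not normals:
        raise ValueError("AUC needs at least one anomaly and one normal instance")
    correct = 0.0
    for a in anomalies:
        for n in normals:
            if a > n:
                correct += 1.0
            elif a == n:
                correct += 0.5
    return correct / (len(anomalies) * len(normals))


# Toy data (made up for illustration): here a LOWER score = more anomalous.
labels = [0, 0, 1, 0, 1]
raw_scores = [0.9, 0.8, 0.2, 0.7, 0.75]

# Flip the orientation so that higher = more anomalous before ranking.
flipped = [1.0 - s for s in raw_scores]

auc_flipped = pairwise_auc(labels, flipped)
auc_raw = pairwise_auc(labels, raw_scores)
print(f"AUC with flipped scores: {auc_flipped:.2f}")
print(f"AUC with raw scores: {auc_raw:.2f}")
```

On this toy data the flipped scores give 0.83 and the raw scores give 0.17. The two values sum to 1 because reversing the ranking reverses the outcome of every pair.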


## 2. High-level evaluation functions

* CapyMOA provides `prequential_evaluation_anomaly` as a high-level function for assessing anomaly detectors.


### 2.1 ```prequential_evaluation_anomaly```
In this example, we run ```prequential_evaluation_anomaly``` for HalfSpaceTrees on the Electricity stream and inspect the metrics per window. The windowed AUC can then be plotted with ```plot_windowed_results```.

In [7]:
from capymoa.evaluation.visualization import plot_windowed_results
from capymoa.datasets import Electricity
from capymoa.anomaly import HalfSpaceTrees
from capymoa.evaluation import prequential_evaluation_anomaly

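stream = Electricity()
hst = HalfSpaceTrees(schema=stream.get_schema())

# Prequential evaluation, reporting windowed metrics every 4500 instances
results = prequential_evaluation_anomaly(stream=stream, learner=hst, window_size=4500, optimise=True)

# Show the metrics computed for each window
results.windowed.metrics_per_window()

# Uncomment to plot the windowed AUC:
# plot_windowed_results(results, metric="AUC")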

| Window | classified instances | AUC | sAUC | Accuracy | Kappa | Periodical holdout AUC | Pos/Neg ratio | G-Mean | Recall | KappaM |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4500.0 | 0.424887 | 0.101194 | 0.499333 | -0.057376 | 0.0 | 1.542373 | 0.453178 | 0.601832 | -0.275764 |
| 1 | 9000.0 | 0.487969 | 0.121366 | 0.506222 | -0.006369 | 0.424887 | 1.176015 | 0.483169 | 0.612664 | -0.159103 |
| 2 | 13500.0 | 0.468598 | 0.124515 | 0.497333 | -0.015334 | 0.487969 | 1.085264 | 0.477359 | 0.613151 | -0.133267 |
| 3 | 18000.0 | 0.411306 | 0.105785 | 0.483556 | -0.072973 | 0.468598 | 1.384738 | 0.44772 | 0.585534 | -0.180444 |
| 4 | 22500.0 | 0.39822 | 0.092782 | 0.486 | -0.099953 | 0.411306 | 1.552467 | 0.421006 | 0.612715 | -0.199938 |
| 5 | 27000.0 | 0.356802 | 0.081525 | 0.474444 | -0.119603 | 0.39822 | 1.406417 | 0.398894 | 0.63308 | -0.233055 |
| 6 | 31500.0 | 0.430827 | 0.104884 | 0.541556 | 0.007289 | 0.356802 | 1.567028 | 0.473213 | 0.675646 | -0.088983 |
| 7 | 36000.0 | 0.402884 | 0.095072 | 0.518 | -0.029875 | 0.430827 | 1.50139 | 0.457435 | 0.647908 | -0.152191 |
| 8 | 40500.0 | 0.44244 | 0.100413 | 0.529333 | 0.00848 | 0.402884 | 1.306509 | 0.466961 | 0.693998 | -0.120569 |
| 9 | 45000.0 | 0.428407 | 0.100096 | 0.479778 | -0.069896 | 0.44244 | 1.190847 | 0.437049 | 0.626738 | -0.217052 |
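
The output above has one row per window of 4500 instances. As a rough illustration of that idea, and not a description of CapyMOA internals, the sketch below splits a toy stream into fixed-size windows and computes one metric value per window. `windowed_metric` and `accuracy` are hypothetical helpers, the data is made up, and dropping the trailing partial window is a simplifying assumption.

```python
from typing import Callable, List, Sequence, Tuple


def windowed_metric(
    labels: Sequence[int],
    predictions: Sequence[int],
    window_size: int,
    metric: Callable[[Sequence[int], Sequence[int]], float],
) -> List[Tuple[int, float]]:
    """Return (instances seen so far, metric over that window) per full window."""
    rows = []
    # Step through the stream one full window at a time; a trailing
    # partial window is ignored in this sketch (an assumption).
    for start in range(0, len(labels) - window_size + 1, window_size):
        end = start + window_size
        rows.append((end, metric(labels[start:end], predictions[start:end])))
    return rows


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of positions where the prediction equals the label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Toy stream of 10 instances (made-up values), evaluated in windows of 4.
labels = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]
predictions = [0, 1, 1, 0, 1, 0, 0, 0, 0, 1]

rows = windowed_metric(labels, predictions, window_size=4, metric=accuracy)
for seen, acc in rows:
    print(f"{seen:>3} instances: accuracy {acc:.2f}")
```

Reading metrics per window rather than as a single cumulative value makes it easier to see where in the stream a detector's performance changes.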
