Merge pull request #428 from yzhao062/development
v1.0.4
yzhao062 committed Jul 29, 2022
2 parents c2839e0 + fe5eb15 commit 0027221
Showing 22 changed files with 1,660 additions and 83 deletions.
3 changes: 3 additions & 0 deletions CHANGES.txt
@@ -164,4 +164,7 @@ v<1.0.2>, <06/21/2022> -- Add GMM detector (#402).
v<1.0.2>, <06/23/2022> -- Add ADBench Benchmark.
v<1.0.3>, <06/27/2022> -- Change default generation to new behaviors (#409).
v<1.0.3>, <07/04/2022> -- Add AnoGAN (#412).
+v<1.0.4>, <07/29/2022> -- General improvement of code quality and test coverage.
+v<1.0.4>, <07/29/2022> -- Add LUNAR (#413).
+v<1.0.4>, <07/29/2022> -- Add LUNAR (#415).

10 changes: 9 additions & 1 deletion README.rst
@@ -233,7 +233,7 @@ Key Attributes of a fitted model:
ADBench Benchmark
^^^^^^^^^^^^^^^^^

-We just released a 36-page, the most comprehensive `anomaly detection benchmark paper <https://www.andrew.cmu.edu/user/yuezhao2/papers/22-preprint-adbench.pdf>`_.
+We just released a 36-page, the most comprehensive `anomaly detection benchmark paper <https://www.andrew.cmu.edu/user/yuezhao2/papers/22-preprint-adbench.pdf>`_ [#Han2022ADBench]_.
The fully `open-sourced ADBench <https://github.com/Minqi824/ADBench>`_ compares 30 anomaly detection algorithms on 55 benchmark datasets.

The organization of **ADBench** is provided below:
@@ -353,6 +353,8 @@ Neural Networks SO_GAAL Single-Objective Generative Adversarial
Neural Networks MO_GAAL Multiple-Objective Generative Adversarial Active Learning 2019 [#Liu2019Generative]_
Neural Networks DeepSVDD Deep One-Class Classification 2018 [#Ruff2018Deep]_
Neural Networks AnoGAN Anomaly Detection with Generative Adversarial Networks 2017 [#Schlegl2017Unsupervised]_
+Graph-based R-Graph Outlier detection by R-graph 2017 [#You2017Provable]_
+Graph-based LUNAR LUNAR: Unifying Local Outlier Detection Methods via Graph Neural Networks 2022 [#Goodge2022Lunar]_
=================== ================== ====================================================================================================== ===== ========================================


@@ -579,8 +581,12 @@ Reference
.. [#Goldstein2012Histogram] Goldstein, M. and Dengel, A., 2012. Histogram-based outlier score (hbos): A fast unsupervised anomaly detection algorithm. In *KI-2012: Poster and Demo Track*\ , pp.59-63.
+.. [#Goodge2022Lunar] Goodge, A., Hooi, B., Ng, S.K. and Ng, W.S., 2022, June. Lunar: Unifying local outlier detection methods via graph neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence.
.. [#Gopalan2019PIDForest] Gopalan, P., Sharan, V. and Wieder, U., 2019. PIDForest: Anomaly Detection via Partial Identification. In Advances in Neural Information Processing Systems, pp. 15783-15793.
+.. [#Han2022ADBench] Han, S., Hu, X., Huang, H., Jiang, M. and Zhao, Y., 2022. ADBench: Anomaly Detection Benchmark. arXiv preprint arXiv:2206.09426.
.. [#Hardin2004Outlier] Hardin, J. and Rocke, D.M., 2004. Outlier detection in the multiple cluster setting using the minimum covariance determinant estimator. *Computational Statistics & Data Analysis*\ , 44(4), pp.625-638.
.. [#He2003Discovering] He, Z., Xu, X. and Deng, S., 2003. Discovering cluster-based local outliers. *Pattern Recognition Letters*\ , 24(9-10), pp.1641-1650.
@@ -633,6 +639,8 @@ Reference
.. [#Wang2020adVAE] Wang, X., Du, Y., Lin, S., Cui, P., Shen, Y. and Yang, Y., 2019. adVAE: A self-adversarial variational autoencoder with Gaussian anomaly prior knowledge for anomaly detection. *Knowledge-Based Systems*.
+.. [#You2017Provable] You, C., Robinson, D.P. and Vidal, R., 2017. Provable self-representation based outlier detection in a union of subspaces. In Proceedings of the IEEE conference on computer vision and pattern recognition.
.. [#Zhao2018XGBOD] Zhao, Y. and Hryniewicki, M.K. XGBOD: Improving Supervised Outlier Detection with Unsupervised Representation Learning. *IEEE International Joint Conference on Neural Networks*\ , 2018.
.. [#Zhao2019LSCP] Zhao, Y., Nasrullah, Z., Hryniewicki, M.K. and Li, Z., 2019, May. LSCP: Locally selective combination in parallel outlier ensembles. In *Proceedings of the 2019 SIAM International Conference on Data Mining (SDM)*, pp. 585-593. Society for Industrial and Applied Mathematics.
2 changes: 1 addition & 1 deletion docs/benchmark.rst
@@ -4,7 +4,7 @@ Benchmarks
Latest ADBench (2022)
---------------------

-We just released a 36-page, the most comprehensive `anomaly detection benchmark paper <https://www.andrew.cmu.edu/user/yuezhao2/papers/22-preprint-adbench.pdf>`_.
+We just released a 36-page, the most comprehensive `anomaly detection benchmark paper <https://www.andrew.cmu.edu/user/yuezhao2/papers/22-preprint-adbench.pdf>`_ :cite:`a-han2022adbench`.
The fully `open-sourced ADBench <https://github.com/Minqi824/ADBench>`_ compares 30 anomaly detection algorithms on 55 benchmark datasets.

The organization of **ADBench** is provided below:
2 changes: 2 additions & 0 deletions docs/index.rst
@@ -200,6 +200,8 @@ Neural Networks SO_GAAL Single-Objective Generative Adversarial A
Neural Networks MO_GAAL Multiple-Objective Generative Adversarial Active Learning 2019 :class:`pyod.models.mo_gaal.MO_GAAL` :cite:`a-liu2019generative`
Neural Networks DeepSVDD Deep One-Class Classification 2018 :class:`pyod.models.deep_svdd.DeepSVDD` :cite:`a-ruff2018deepsvdd`
Neural Networks AnoGAN Anomaly Detection with Generative Adversarial Networks 2017 :class:`pyod.models.anogan.AnoGAN` :cite:`a-schlegl2017unsupervised`
+Graph-based R-Graph Outlier detection by R-graph 2017 :class:`pyod.models.rgraph.RGraph` :cite:`you2017provable`
+Graph-based LUNAR LUNAR: Unifying Local Outlier Detection Methods via Graph Neural Networks 2022 :class:`pyod.models.lunar.LUNAR` :cite:`a-goodge2022lunar`
=================== ================ ====================================================================================================== ===== =================================================== ======================================================


19 changes: 18 additions & 1 deletion docs/pyod.models.rst
@@ -34,7 +34,7 @@ pyod.models.auto\_encoder\_torch module

.. automodule:: pyod.models.auto_encoder_torch
:members:
-:undoc-members:
+:exclude-members: inner_autoencoder
:show-inheritance:
:inherited-members:

@@ -209,6 +209,15 @@ pyod.models.loci module
:show-inheritance:
:inherited-members:

+pyod.models.lunar module
+------------------------
+
+.. automodule:: pyod.models.lunar
+:members:
+:exclude-members: SCORE_MODEL, WEIGHT_MODEL
+:show-inheritance:
+:inherited-members:
+
pyod.models.lscp module
-----------------------

@@ -267,6 +276,14 @@ pyod.models.pca module
:show-inheritance:
:inherited-members:

+pyod.models.rgraph module
+-------------------------
+
+.. automodule:: pyod.models.rgraph
+:members:
+:undoc-members:
+:show-inheritance:
+:inherited-members:

pyod.models.rod module
----------------------
25 changes: 25 additions & 0 deletions docs/zreferences.bib
@@ -433,4 +433,29 @@ @inproceedings{schlegl2017unsupervised
pages={146--157},
year={2017},
organization={Springer}
}
+
+@inproceedings{goodge2022lunar,
+title={Lunar: Unifying local outlier detection methods via graph neural networks},
+author={Goodge, Adam and Hooi, Bryan and Ng, See-Kiong and Ng, Wee Siong},
+booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
+volume={36},
+number={6},
+pages={6737--6745},
+year={2022}
+}
+
+@article{han2022adbench,
+title={ADBench: Anomaly Detection Benchmark},
+author={Han, Songqiao and Hu, Xiyang and Huang, Hailiang and Jiang, Mingqi and Zhao, Yue},
+journal={arXiv preprint arXiv:2206.09426},
+year={2022}
+}
+
+@inproceedings{you2017provable,
+title={Provable self-representation based outlier detection in a union of subspaces},
+author={You, Chong and Robinson, Daniel P and Vidal, Ren{\'e}},
+booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
+pages={3395--3404},
+year={2017}
+}
Binary file modified examples/ALL.png
34 changes: 18 additions & 16 deletions examples/compare_all_models.py
@@ -37,6 +37,10 @@
from pyod.models.ocsvm import OCSVM
from pyod.models.pca import PCA
from pyod.models.lscp import LSCP
+from pyod.models.inne import INNE
+from pyod.models.gmm import GMM
+from pyod.models.kde import KDE
+from pyod.models.lmdd import LMDD

# TODO: add neural networks, LOCI, SOS, COF, SOD

@@ -87,26 +91,20 @@
contamination=outliers_fraction),
'Average KNN': KNN(method='mean',
contamination=outliers_fraction),
-# 'Median KNN': KNN(method='median',
-# contamination=outliers_fraction),
'Local Outlier Factor (LOF)':
LOF(n_neighbors=35, contamination=outliers_fraction),
-# 'Local Correlation Integral (LOCI)':
-# LOCI(contamination=outliers_fraction),
'Minimum Covariance Determinant (MCD)': MCD(
contamination=outliers_fraction, random_state=random_state),
'One-class SVM (OCSVM)': OCSVM(contamination=outliers_fraction),
'Principal Component Analysis (PCA)': PCA(
contamination=outliers_fraction, random_state=random_state),
-# 'Stochastic Outlier Selection (SOS)': SOS(
-# contamination=outliers_fraction),
'Locally Selective Combination (LSCP)': LSCP(
detector_list, contamination=outliers_fraction,
random_state=random_state),
-# 'Connectivity-Based Outlier Factor (COF)':
-# COF(n_neighbors=35, contamination=outliers_fraction),
-# 'Subspace Outlier Detection (SOD)':
-# SOD(contamination=outliers_fraction),
+'INNE': INNE(contamination=outliers_fraction),
+'GMM': GMM(contamination=outliers_fraction),
+'KDE': KDE(contamination=outliers_fraction),
+'LMDD': LMDD(contamination=outliers_fraction),
}

# Show all detectors
@@ -125,7 +123,7 @@
X = np.r_[X, np.random.uniform(low=-6, high=6, size=(n_outliers, 2))]

# Fit the model
-plt.figure(figsize=(15, 12))
+plt.figure(figsize=(15, 16))
for i, (clf_name, clf) in enumerate(classifiers.items()):
print()
print(i + 1, 'fitting', clf_name)
@@ -139,11 +137,11 @@

Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]) * -1
Z = Z.reshape(xx.shape)
-subplot = plt.subplot(3, 4, i + 1)
+subplot = plt.subplot(4, 4, i + 1)
subplot.contourf(xx, yy, Z, levels=np.linspace(Z.min(), threshold, 7),
cmap=plt.cm.Blues_r)
-a = subplot.contour(xx, yy, Z, levels=[threshold],
-linewidths=2, colors='red')
+# a = subplot.contour(xx, yy, Z, levels=[threshold],
+# linewidths=2, colors='red')
subplot.contourf(xx, yy, Z, levels=[threshold, Z.max()],
colors='orange')
b = subplot.scatter(X[:-n_outliers, 0], X[:-n_outliers, 1], c='white',
@@ -152,8 +150,12 @@
s=20, edgecolor='k')
subplot.axis('tight')
subplot.legend(
-[a.collections[0], b, c],
-['learned decision function', 'true inliers', 'true outliers'],
+[
+# a.collections[0],
+b, c],
+[
+# 'learned decision function',
+'true inliers', 'true outliers'],
prop=matplotlib.font_manager.FontProperties(size=10),
loc='lower right')
subplot.set_xlabel("%d. %s (errors: %d)" % (i + 1, clf_name, n_errors))
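
The layout changes in this file go together: a 3x4 grid holds twelve panels, and the four added detectors (INNE, GMM, KDE, LMDD) need a 4x4 grid, with the figure height growing from 12 to 16 inches. A minimal sketch of deriving both values from the number of detectors instead of hard-coding them. The helpers `grid_shape` and `figure_size` and the 4-inch row height are assumptions made for illustration; they are not part of PyOD or of this commit.

```python
import math


def grid_shape(n_plots, n_cols=4):
    """Return (rows, cols) with enough cells for n_plots subplots."""
    if n_plots < 1:
        raise ValueError("n_plots must be positive")
    return math.ceil(n_plots / n_cols), n_cols


def figure_size(n_plots, n_cols=4, width=15, row_height=4):
    """Return (width, height) in inches; height grows by row_height per row."""
    n_rows, _ = grid_shape(n_plots, n_cols)
    return width, n_rows * row_height


if __name__ == "__main__":
    # 12 panels reproduce the old layout, 16 panels the new one.
    for n in (12, 16):
        print(n, grid_shape(n), figure_size(n))
```

With helpers like these, the script could call `plt.figure(figsize=figure_size(len(classifiers)))` and `plt.subplot(*grid_shape(len(classifiers)), i + 1)`, so adding a detector would not require touching the layout constants.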
1 change: 0 additions & 1 deletion examples/lscp_example.py
@@ -17,7 +17,6 @@

from pyod.models.lscp import LSCP
from pyod.models.lof import LOF
-from pyod.utils.utility import standardizer
from pyod.utils.data import generate_data
from pyod.utils.data import evaluate_print
from pyod.utils.example import visualize
52 changes: 52 additions & 0 deletions examples/lunar_example.py
@@ -0,0 +1,52 @@
# -*- coding: utf-8 -*-
"""Example of using LUNAR for outlier detection
"""
# Author: Adam Goodge <a.goodge@u.nus.edu>
#

from __future__ import division
from __future__ import print_function

import os
import sys

# temporary solution for relative imports in case pyod is not installed
# if pyod is installed, no need to use the following line
sys.path.append(os.path.abspath(os.path.join(os.path.dirname("__file__"), '..')))

from pyod.models.lunar import LUNAR
from pyod.utils.data import generate_data
from pyod.utils.data import evaluate_print

if __name__ == "__main__":
    contamination = 0.1  # percentage of outliers
    n_train = 5000  # number of training points
    n_test = 1000  # number of testing points
    n_features = 100  # number of features

    # Generate sample data
    X_train, X_test, y_train, y_test = \
        generate_data(n_train=n_train,
                      n_test=n_test,
                      n_features=n_features,
                      contamination=contamination,
                      random_state=42)

    # train LUNAR detector
    clf_name = 'LUNAR'
    clf = LUNAR()
    clf.fit(X_train)

    # get the prediction labels and outlier scores of the training data
    y_train_pred = clf.labels_  # binary labels (0: inliers, 1: outliers)
    y_train_scores = clf.decision_scores_  # raw outlier scores

    # get the prediction on the test data
    y_test_pred = clf.predict(X_test)  # outlier labels (0 or 1)
    y_test_scores = clf.decision_function(X_test)  # outlier scores

    # evaluate and print the results
    print("\nOn Training Data:")
    evaluate_print(clf_name, y_train, y_train_scores)
    print("\nOn Test Data:")
    evaluate_print(clf_name, y_test, y_test_scores)
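
The example reads binary labels from `labels_` and `predict` alongside raw scores from `decision_scores_` and `decision_function`. A minimal sketch of how a contamination fraction can turn raw scores into such labels, assuming the cut-off is chosen so that roughly that fraction of the training scores lies above it. `score_threshold` and `label_scores` are hypothetical helpers written for illustration, not PyOD functions, and PyOD's own thresholding may differ in detail (for example in how the percentile is interpolated).

```python
def score_threshold(scores, contamination=0.1):
    """Return a cut-off that about `contamination` of the scores exceed."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0 < contamination <= 0.5:
        raise ValueError("contamination must be in (0, 0.5]")
    ordered = sorted(scores)
    # Number of points to flag, rounded down.
    n_outliers = int(len(ordered) * contamination)
    # The largest score that is still treated as an inlier.
    return ordered[len(ordered) - n_outliers - 1]


def label_scores(scores, threshold):
    """Label each score: 1 (outlier) if above the threshold, else 0."""
    return [1 if s > threshold else 0 for s in scores]


if __name__ == "__main__":
    train_scores = [0.1, 0.2, 0.15, 0.3, 0.25, 0.12, 0.18, 0.22, 0.28, 5.0]
    thr = score_threshold(train_scores, contamination=0.1)
    print("threshold:", thr)
    print("labels:", label_scores(train_scores, thr))
```

The threshold is computed once from training scores and then reused for new data, which is why a test point can be labeled without refitting.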