Randomization variants of the Isolation Forest anomaly detection algorithm.

selimfirat/iforest-variants

Isolation Forest Variants

This repository contains reproducible code for an ablation study of the role of randomization in the Isolation Forest anomaly detection algorithm.

Currently, the repository contains the following isolation forest variants:

  • Isolation Forest (three implementations of the original algorithm: this repository's own, scikit-learn's, and PyOD's)
  • Extended Isolation Forest (see the Citations section below)
  • An Isolation Forest variant that samples an attribute value for the split instead of choosing the split value at random as in the original version
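The difference between the original split rule and the sampling variant can be illustrated with a small one-dimensional sketch. This is not the code from this repository: the function names (`uniform_split`, `sampled_split`, `isolation_depth`, `mean_depth`) are hypothetical, and the reading of "samples an attribute value" as "pick one of the observed values as the split point" is an assumption.

```python
import random


def uniform_split(values, rng):
    """Original iForest rule: draw the split point uniformly between min and max."""
    return rng.uniform(min(values), max(values))


def sampled_split(values, rng):
    """Assumed variant rule: use one of the observed attribute values as the split point."""
    return rng.choice(values)


def isolation_depth(values, target, choose_split, rng, max_depth=50):
    """Count the random splits applied until `target` is alone in its partition."""
    current = list(values)
    depth = 0
    while len(current) > 1 and depth < max_depth:
        if min(current) == max(current):
            break  # only duplicates remain, nothing left to split
        split = choose_split(current, rng)
        # Keep only the side of the split that contains the target.
        if target < split:
            current = [v for v in current if v < split]
        else:
            current = [v for v in current if v >= split]
        depth += 1
    return depth


def mean_depth(values, target, choose_split, trials=500, seed=0):
    """Average isolation depth of `target` over several independent random runs."""
    rng = random.Random(seed)
    total = sum(isolation_depth(values, target, choose_split, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    cluster = [rng.random() for _ in range(50)]  # dense points in [0, 1)
    data = cluster + [10.0]                      # one obvious outlier
    inlier = sorted(cluster)[25]
    for name, rule in [("uniform", uniform_split), ("sampled", sampled_split)]:
        print(name, "outlier:", mean_depth(data, 10.0, rule),
              "inlier:", mean_depth(data, inlier, rule))
```

Under the uniform rule the split point depends on the range of the data, so a distant point tends to be cut off in very few splits. Under the sampled rule the split point depends on where the observations lie, so the depth is driven by ranks rather than distances. The experiments in `experiments.py` are the authoritative comparison; this sketch only shows the mechanism.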

Requirements

  • Anaconda3

Setup

  • conda env create -f environment.yml
  • conda activate iforest-variants
  • python experiments.py

Citations

If you use iForest, please consider citing the following references:

@inproceedings{liu2008isolation,
  title={Isolation forest},
  author={Liu, Fei Tony and Ting, Kai Ming and Zhou, Zhi-Hua},
  booktitle={2008 Eighth IEEE International Conference on Data Mining},
  pages={413--422},
  year={2008},
  organization={IEEE}
}
@article{liu2012isolation,
  title={Isolation-based anomaly detection},
  author={Liu, Fei Tony and Ting, Kai Ming and Zhou, Zhi-Hua},
  journal={ACM Transactions on Knowledge Discovery from Data (TKDD)},
  volume={6},
  number={1},
  pages={3},
  year={2012},
  publisher={ACM}
}

If you use Extended Isolation Forest, please consider citing the following reference:

@article{2018arXiv181102141H,
  title={{Extended Isolation Forest}},
  author={{Hariri}, S. and {Carrasco Kind}, M. and {Brunner}, R.~J.},
  journal={ArXiv e-prints},
  archivePrefix={arXiv},
  eprint={1811.02141},
  keywords={Computer Science - Machine Learning, Statistics - Machine Learning},
  year={2018},
  month=nov,
  adsurl={http://adsabs.harvard.edu/abs/2018arXiv181102141H},
  adsnote={Provided by the SAO/NASA Astrophysics Data System}
}

If you use the scikit-learn or PyOD implementation of iForest, please consider citing the following references:

@article{pedregosa2011scikit,
  title={Scikit-learn: Machine learning in Python},
  author={Pedregosa, Fabian and Varoquaux, Ga{\"e}l and Gramfort, Alexandre and Michel, Vincent and Thirion, Bertrand and Grisel, Olivier and Blondel, Mathieu and Prettenhofer, Peter and Weiss, Ron and Dubourg, Vincent and others},
  journal={Journal of machine learning research},
  volume={12},
  number={Oct},
  pages={2825--2830},
  year={2011}
}
@article{zhao2019pyod,
  title={PyOD: A python toolbox for scalable outlier detection},
  author={Zhao, Yue and Nasrullah, Zain and Li, Zheng},
  journal={arXiv preprint arXiv:1901.01588},
  year={2019}
}
