This repository contains reproducible code for an ablation study on the role of randomization in the Isolation Forest anomaly detection algorithm.
Currently, the repository contains the following isolation forest variants:
- Isolation Forest (three implementations of the original algorithm: mine, scikit-learn's, and PyOD's.)
- Extended Isolation Forest (See citations section below.)
- An Isolation Forest variant that samples an attribute value instead of choosing it randomly as in the original version.
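
The last variant can be pictured with a small sketch. This is only one reading of the description above, not the repository's actual code: it assumes the original rule draws the split point uniformly between the attribute's minimum and maximum, and that the variant instead picks one of the observed attribute values. The function names `uniform_split` and `sampled_split` are hypothetical.

```python
import numpy as np


def uniform_split(values, rng):
    """Original iForest rule: split point drawn uniformly in [min, max]."""
    return rng.uniform(values.min(), values.max())


def sampled_split(values, rng):
    """Assumed variant rule: split point is one of the observed values."""
    return rng.choice(values)


rng = np.random.default_rng(0)
# Toy attribute column: four clustered points and one far-away point.
values = np.array([0.1, 0.2, 0.25, 0.3, 9.0])

u = uniform_split(values, rng)
s = sampled_split(values, rng)
print("uniform split:", u)
print("sampled split:", s)
```

On this toy column, the uniform rule places most splits in the empty gap between 0.3 and 9.0, while the sampled rule lands on a clustered value four times out of five. Under the stated assumption, that difference in where the randomness is spent is what the variant changes.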
Requirements:
- Anaconda3
conda env create -f environment.yml
conda activate iforest-variants
python experiments.py
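
What `experiments.py` does internally is not shown here. As a minimal sketch of how the scikit-learn baseline among the implementations listed above can be fit and scored, assuming a synthetic two-dimensional dataset (the data and parameter values are illustrative, not the repository's):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Illustrative data: 200 Gaussian inliers plus 5 points far from the cluster.
inliers = rng.normal(0.0, 1.0, size=(200, 2))
outliers = rng.uniform(6.0, 8.0, size=(5, 2))
X = np.vstack([inliers, outliers])

model = IsolationForest(n_estimators=100, random_state=42)
model.fit(X)

# score_samples returns higher values for normal points, lower for anomalies.
scores = model.score_samples(X)
print("mean inlier score: ", scores[:200].mean())
print("mean outlier score:", scores[200:].mean())
```

Fixing `random_state` keeps the run repeatable, which matters when the quantity under study is the randomization itself.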
If you use iForest, please consider citing the following references:
@inproceedings{liu2008isolation,
title={Isolation forest},
author={Liu, Fei Tony and Ting, Kai Ming and Zhou, Zhi-Hua},
booktitle={2008 Eighth IEEE International Conference on Data Mining},
pages={413--422},
year={2008},
organization={IEEE}
}
@article{liu2012isolation,
title={Isolation-based anomaly detection},
author={Liu, Fei Tony and Ting, Kai Ming and Zhou, Zhi-Hua},
journal={ACM Transactions on Knowledge Discovery from Data (TKDD)},
volume={6},
number={1},
pages={3},
year={2012},
publisher={ACM}
}
If you use Extended iForest, please consider citing the following reference:
@ARTICLE{2018arXiv181102141H,
author = {{Hariri}, S. and {Carrasco Kind}, M. and {Brunner}, R.~J.},
title = "{Extended Isolation Forest}",
journal = {ArXiv e-prints},
archivePrefix = "arXiv",
eprint = {1811.02141},
keywords = {Computer Science - Machine Learning, Statistics - Machine Learning},
year = 2018,
month = nov,
adsurl = {http://adsabs.harvard.edu/abs/2018arXiv181102141H},
adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}
If you use the scikit-learn or PyOD implementation of iForest, please consider citing the following references:
@article{pedregosa2011scikit,
title={Scikit-learn: Machine learning in Python},
author={Pedregosa, Fabian and Varoquaux, Ga{\"e}l and Gramfort, Alexandre and Michel, Vincent and Thirion, Bertrand and Grisel, Olivier and Blondel, Mathieu and Prettenhofer, Peter and Weiss, Ron and Dubourg, Vincent and others},
journal={Journal of Machine Learning Research},
volume={12},
number={Oct},
pages={2825--2830},
year={2011}
}
@article{zhao2019pyod,
title={PyOD: A Python toolbox for scalable outlier detection},
author={Zhao, Yue and Nasrullah, Zain and Li, Zheng},
journal={arXiv preprint arXiv:1901.01588},
year={2019}
}