zhongyuchen/outlier-detection

Outlier Detection

[Badges: build status, Python version, Apache License]

Detect outliers with 3 methods: LOF, DBSCAN and one-class SVM

Prerequisites

  • Required packages can be installed with the following command:
pip install -r requirements.txt

Data

  • consumption_data.xls is provided. It has 4 columns and 940 entries. The first column is the entry ID, which is ignored when detecting outliers, so each data entry is 3-dimensional.
  • Load the data as a NumPy array of shape (940, 3) with the following code (see dataset.py for the implementation):
from dataset import get_dataset

data = get_dataset()
  • Data visualization:

[Figure: visualization of the data]
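
The column handling described above can be sketched as follows. This is not the code in dataset.py: the `drop_id_column` helper and the random stand-in table are assumptions made only for illustration, since the spreadsheet contents are not reproduced here.

```python
import numpy as np


def drop_id_column(table: np.ndarray) -> np.ndarray:
    """Hypothetical helper: return every column except the first (the entry ID)."""
    return table[:, 1:].astype(float)


# Stand-in for the spreadsheet: 940 rows, one ID column plus 3 feature columns.
rng = np.random.default_rng(0)
ids = np.arange(940).reshape(-1, 1)
features = rng.normal(size=(940, 3))
table = np.hstack([ids, features])

data = drop_id_column(table)
print(data.shape)  # (940, 3)
```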

Methods

For detailed descriptions, see report.pdf.

Density-based method: LOF (local outlier factor)

  • Check out lof.py for implementation.
  • Result:

[Figure: LOF result]
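
As a rough illustration of the density-based approach, the sketch below runs scikit-learn's `LocalOutlierFactor` on synthetic 3-dimensional points. It is not the code in lof.py; the synthetic data, `n_neighbors=20`, and `contamination=0.05` are assumptions chosen for the example, not the repository's settings.

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

# Synthetic stand-in data: one dense cloud plus three far-away points.
rng = np.random.default_rng(0)
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 3))
far_points = np.array([
    [10.0, 10.0, 10.0],
    [-10.0, 10.0, -10.0],
    [10.0, -10.0, 10.0],
])
X = np.vstack([inliers, far_points])

# Assumed parameters; tune them for real data.
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.05)
labels = lof.fit_predict(X)              # 1 = inlier, -1 = outlier
scores = -lof.negative_outlier_factor_   # larger value = more anomalous

outlier_idx = np.where(labels == -1)[0]
print("flagged indices:", outlier_idx)
```

A point is flagged when its local density is much lower than that of its neighbors, so the three far-away points receive the highest scores here.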

Cluster-based method: DBSCAN

  • Check out dbscan.py for implementation.
  • Result:

[Figure: DBSCAN result]
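
A minimal sketch of the cluster-based idea, using scikit-learn's `DBSCAN` on synthetic data: points that DBSCAN cannot assign to any cluster get the label -1 and are treated as outliers. This is not the code in dbscan.py; the data, `eps=1.0`, and `min_samples=5` are assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic stand-in data: two tight clusters plus two isolated points.
rng = np.random.default_rng(0)
cluster_a = rng.normal(loc=0.0, scale=0.3, size=(100, 3))
cluster_b = rng.normal(loc=5.0, scale=0.3, size=(100, 3))
isolated = np.array([[2.5, 2.5, 10.0], [-6.0, 8.0, -6.0]])
X = np.vstack([cluster_a, cluster_b, isolated])

# Assumed parameters: neighborhood radius and minimum points per core point.
labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)

outlier_mask = labels == -1              # DBSCAN marks noise points with -1
n_clusters = len(set(labels.tolist()) - {-1})
print("clusters:", n_clusters, "outliers:", int(outlier_mask.sum()))
```

The result is sensitive to `eps` and `min_samples`: a smaller `eps` marks more points as noise, while a larger one can merge clusters and hide outliers.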

Classification-based method: One-class SVM

  • Check out svdd.py for implementation.
  • Result with Gaussian kernel:

[Figure: one-class SVM result, Gaussian (RBF) kernel]

  • Result with linear kernel:

[Figure: one-class SVM result, linear kernel]
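
The two kernel choices can be compared with a sketch like the one below, which uses scikit-learn's `OneClassSVM` on synthetic data. It is not the code in svdd.py; the data, `nu=0.05`, and `gamma="scale"` are assumptions for illustration.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic stand-in data: a single 3-dimensional Gaussian cloud.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))

models = {}
results = {}
for kernel in ("rbf", "linear"):
    # nu controls roughly how large a fraction of points may fall outside
    # the learned region; gamma is ignored by the linear kernel.
    model = OneClassSVM(kernel=kernel, nu=0.05, gamma="scale").fit(X)
    models[kernel] = model
    results[kernel] = model.predict(X)   # 1 = inlier, -1 = outlier

for kernel, labels in results.items():
    print(kernel, "outliers:", int((labels == -1).sum()))

# A point far from the training cloud, scored with the Gaussian-kernel model.
far_label = models["rbf"].predict(np.array([[8.0, 8.0, 8.0]]))[0]
print("far point label (rbf):", far_label)
```

The Gaussian kernel can enclose the data in a closed region, whereas the linear kernel can only separate the data from the origin with a hyperplane, so the two kernels can flag quite different sets of points.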

Author

Zhongyu Chen
