OutlierDetectionJL/OutlierDetectionData.jl

OutlierDetectionData

OutlierDetectionData.jl is a package to download and read common outlier detection datasets. This package is a part of OutlierDetection.jl, the outlier detection ecosystem for Julia.

API Overview

The API is currently simple: each dataset collection has its own namespace. A dataset collection, such as ODDS, bundles multiple outlier detection datasets. Each collection provides the following methods:

List all available datasets in the collection:

list()

List the subset of datasets whose names start with prefix, given as a string or a regular expression:

list(prefix::Union{AbstractString, Regex})

Load a single dataset by name. The file is downloaded automatically if it does not already exist locally. Currently, the data is returned as a tuple containing X::DataFrame and y::Vector{Int}, where X holds the features with one observation per row and y holds the labels, with "normal" indicating inliers and "outlier" indicating outliers.

load(name::AbstractString)

Example:

The following example shows how you can load the "cardio" dataset from the ODDS collection.

using OutlierDetectionData: ODDS

X, y = ODDS.load("cardio")
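
Once a dataset is loaded, a common first step is to check how many observations are labeled as outliers. The following is a minimal sketch, not part of the package: outlier_fraction is a hypothetical helper, and the stand-in label vector replaces a real call to ODDS.load so that the snippet runs without a download. The value that marks an outlier is passed as a keyword because it depends on how the labels are encoded.

```julia
# Hypothetical helper (not provided by OutlierDetectionData.jl):
# fraction of labels equal to the value that marks an outlier.
outlier_fraction(y; outlier="outlier") = count(==(outlier), y) / length(y)

# With the package installed, the labels could come from:
#   using OutlierDetectionData: ODDS
#   X, y = ODDS.load("cardio")
# Here a small stand-in vector is used instead.
y = ["normal", "outlier", "normal", "normal"]

println(outlier_fraction(y))  # prints 0.25
```

Passing a different marker, for example outlier_fraction(y; outlier=-1), would cover an integer encoding of the labels.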

Available Collections:

The available collections are:

  • ODDS, Outlier Detection DataSets, Shebuti Rayana, 2016
  • ELKI, On the Evaluation of Unsupervised Outlier Detection, Campos et al., 2016
  • TSAD, The UCR Time Series Archive, Dau et al., 2018

For the TSAD collection, the class with the fewest members is chosen as the anomaly class, and all other classes are treated as normal. If multiple classes share the fewest members, the lexically first of them is chosen.
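
The TSAD labeling rule can be illustrated with a short sketch. This is not the package's implementation: anomaly_class and binarize are hypothetical names, and the code assumes that the lexical tie-break applies among the classes sharing the smallest member count and that class labels are strings.

```julia
# Hypothetical helpers illustrating the rule; not the package's own code.

# Return the class with the fewest members. Among classes tied for the
# fewest members, the lexically first one is returned.
# Assumes `labels` is non-empty.
function anomaly_class(labels::AbstractVector{<:AbstractString})
    counts = Dict{String,Int}()
    for label in labels
        counts[label] = get(counts, label, 0) + 1
    end
    fewest = minimum(values(counts))
    return minimum(k for (k, v) in counts if v == fewest)
end

# Map the anomaly class to "outlier" and every other class to "normal".
function binarize(labels)
    anomaly = anomaly_class(labels)
    return [l == anomaly ? "outlier" : "normal" for l in labels]
end

labels = ["b", "a", "c", "c"]    # "a" and "b" tie with one member each
println(anomaly_class(labels))   # prints a
println(binarize(labels))
```

Because the comparison is lexical, string labels such as "10" sort before "2".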

Licenses

Please make sure that you check and accept the licenses of the individual datasets before publishing your work. This package is licensed under the terms of the MIT license.
