Skip to content

ankane/rcf-cli

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Random Cut Forest CLI

Command-line interface for Random Cut Forest anomaly detection

Build Status

Installation

Download the latest version:

You can also install it with Homebrew:

brew install ankane/brew/rcf

Getting Started

Prepare a CSV. Each row should be a data point. Here’s an example with three dimensions:

$ tail input.csv
5.0001,0.0006,-0.0084
-4.9998,0.0041,0.0025
-5.0096,-0.0085,-0.0007
4.9980,-0.0049,-0.0029
5.0083,0.0078,-0.0029
0.0001,-0.0052,0.0003
5.0051,0.0081,-0.0068
5.0044,-0.0030,-0.0016
-5.0028,-0.0067,-0.0056
4.9921,0.0092,0.0049

And run:

rcf < input.csv > output.csv

The program outputs the anomaly score as a new column. Anomalous data points are given a larger score.

$ tail output.csv
5.0001,0.0006,-0.0084,0.7980522827802555
-4.9998,0.0041,0.0025,0.7071138288972222
-5.0096,-0.0085,-0.0007,1.018823150444355
4.9980,-0.0049,-0.0029,0.7408316990007481
5.0083,0.0078,-0.0029,0.8277566916765301
0.0001,-0.0052,0.0003,2.8337001378872664
5.0051,0.0081,-0.0068,0.7709170753009097
5.0044,-0.0030,-0.0016,0.7204096423354174
-5.0028,-0.0067,-0.0056,0.8198039786393572
4.9921,0.0092,0.0049,0.9185893539878859

Options

  -d, --delimiter <DELIMITER>        The character used as a field delimiter [default: ,]
      --header-row                   Pass if the data contains a header row
  -n, --number-of-trees <TREES>      Number of trees to use in the forest [default: 100]
  -r, --random-seed <RANDOM_SEED>    Random seed to use [default: 42]
  -s, --sample-size <SAMPLE_SIZE>    Points to keep in sample for each tree [default: 256]
  -c, --shingle-cyclic               Use cyclic shingles instead of linear shingles
  -g, --shingle-size <SHINGLE_SIZE>  Shingle size to use [default: 1]
  -w, --window-size <WINDOW_SIZE>    Window size of the sample or 0 for no window [default: 0]
  -h, --help                         Print help information
  -V, --version                      Print version information

References

Credits

This project is modeled after the Java CLI.

History

View the changelog

Contributing

Everyone is encouraged to help improve this project. Here are a few ways you can help:

To get started with development:

git clone https://github.com/ankane/rcf-cli.git
cd rcf-cli
cargo run --release < example-data/input.csv