Everything ROC and Precision-Recall curves.
afbarnard Added support for ties in the ranking.
* Fixes issue #16.
* Updated build method to look for tied scores in the ranking and to
  keep accumulating counts when that is the case.  This involved
  changing the build method to directly build the lists of positive and
  negative counts.  This approach was easier and terser than the
  previous idea of building and interpreting a list of run lengths.  No
  changes to the Curve class were necessary because it already handles
  arbitrary (but monotonic) lists of positive and negative counts.
* Added tests for building curves (rankings) from scores and labels
  where the scores have ties.
* Added corresponding tests for primitives builder.
* Updated dependencies in makefile.
* Updated copyrights and relevant meta-info.
Latest commit 1c5c3d4 Jan 20, 2015

All About Roc

Description

Roc is software for generating and working with ROC (Receiver Operating Characteristic) and PR (Precision-Recall) curves. These curves are typically used to evaluate classification approaches in areas such as Machine Learning, Statistics, Medicine, and Epidemiology. Students, scientists, and researchers are the target audience of this software.
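
For readers new to these curves, the following is a minimal sketch of how ROC and PR points can be derived from a ranking of labels. It is not the Roc library's API: the class name `CurvePointsSketch` and its methods are hypothetical, and the sketch assumes a ranking ordered from most to least confident, with no tied scores and at least one positive and one negative label.

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of the Roc library. */
class CurvePointsSketch {

    /** Counts the positive labels in the ranking. */
    static int countPositives(boolean[] ranking) {
        int count = 0;
        for (boolean label : ranking) {
            if (label) count++;
        }
        return count;
    }

    /**
     * ROC points as {false positive rate, true positive rate}, one per
     * prefix of the ranking, starting with the empty prefix at (0, 0).
     */
    static double[][] rocPoints(boolean[] ranking) {
        int totalPos = countPositives(ranking);
        int totalNeg = ranking.length - totalPos;
        double[][] points = new double[ranking.length + 1][];
        points[0] = new double[] {0.0, 0.0};
        int tp = 0;
        int fp = 0;
        for (int i = 0; i < ranking.length; i++) {
            if (ranking[i]) tp++; else fp++;
            points[i + 1] = new double[] {(double) fp / totalNeg, (double) tp / totalPos};
        }
        return points;
    }

    /**
     * PR points as {recall, precision}, one per non-empty prefix of the
     * ranking (precision is undefined for the empty prefix).
     */
    static double[][] prPoints(boolean[] ranking) {
        int totalPos = countPositives(ranking);
        double[][] points = new double[ranking.length][];
        int tp = 0;
        for (int i = 0; i < ranking.length; i++) {
            if (ranking[i]) tp++;
            points[i] = new double[] {(double) tp / totalPos, (double) tp / (i + 1)};
        }
        return points;
    }

    public static void main(String[] args) {
        // true = positive example; the first entry is the most confident.
        boolean[] ranking = {true, true, false, true, false};
        System.out.println(Arrays.deepToString(rocPoints(ranking)));
        System.out.println(Arrays.deepToString(prPoints(ranking)));
    }
}
```

Each prefix of the ranking corresponds to one classification threshold, which is why both curves are built from running counts of true and false positives.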

The goal of this project is to provide software for evaluating ROC and PR curves that correctly implements both traditional and recent approaches, in languages that fit flexibly into each investigator's environment.

Roc, the name of the software, is pronounced "rock" like its namesake, roc, an enormous, legendary bird of prey.

Quick Links

Downloads

License

Roc is free, open source software. It is released under the BSD 2-Clause License (also known as the FreeBSD License). See the file LICENSE.txt in your distribution (or on GitHub) for details.

Features and Project Maturity

Roc is version 0.1.0.

This software is still in the alpha stages of design and development. However, it is mature enough for the authors to use it as part of their everyday workflow.

It is released as a library and as a command-line interface (CLI) front-end for the library. The table below contains a summary of features.

Status codes: S = stable, T = tested, I = implemented, P = planned, NP = not planned, ? = undecided, NA = not applicable. Language codes: J = Java, P2 = Python 2.x.

Feature Description           Library Status  CLI Status
-------------------           --------------  ----------
ROC curves
. Points                      J:T  P2:P       J:T  P2:P
. Area                        J:T  P2:P       J:T  P2:P
. Maximum area (convex hull)  J:T  P2:P       J:P  P2:P
. Aggregation (averaging)     J:P  P2:P       J:?  P2:?
. Confidence bounds           J:P  P2:P       J:?  P2:?
. Clipping                    J:P  P2:P       J:?  P2:?
PR curves
. Points                      J:T  P2:P       J:T  P2:P
. Area                        J:T  P2:P       J:T  P2:P
. Maximum area (convex hull)  J:I  P2:P       J:P  P2:P
. Aggregation (averaging)     J:P  P2:P       J:?  P2:?
. Confidence bounds           J:P  P2:P       J:?  P2:?
. Clipping                    J:P  P2:P       J:?  P2:?
. Minimum awareness           J:P  P2:P       J:?  P2:?
Plotting                      J:NP P2:P       J:NP P2:P
Inputs
. Ranking of labels           J:T  P2:P       J:T  P2:P
. Scores, labels              J:T  P2:P       J:T  P2:P
. Score-label pairs           J:P  P2:P       J:T  P2:P
. Example weights             J:P  P2:P       J:P  P2:P
Convenience
. File I/O                    J:P  P2:P       NA
Ranking Statistics
. Mann-Whitney-U              J:T  P2:P       J:?  P2:?
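
To illustrate how the "Area" and "Mann-Whitney-U" rows relate, here is a minimal sketch that computes ROC area from scores and labels with tied scores, and checks it against the Mann-Whitney U statistic. It is an illustration under stated assumptions, not the Roc library's implementation: the class `TiedAucSketch` and its methods are hypothetical, higher scores are assumed to rank first, and the inputs are assumed to contain at least one positive and one negative example.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical illustration only; not part of the Roc library. */
class TiedAucSketch {

    /**
     * ROC area by the trapezoid rule. Examples with tied scores are
     * consumed as one group, so a tie yields a single diagonal segment.
     */
    static double rocArea(final double[] scores, boolean[] labels) {
        // Sort example indices by descending score (assumed convention).
        Integer[] order = new Integer[scores.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, new Comparator<Integer>() {
            public int compare(Integer a, Integer b) {
                return Double.compare(scores[b.intValue()], scores[a.intValue()]);
            }
        });

        int totalPos = 0;
        for (boolean label : labels) if (label) totalPos++;
        int totalNeg = labels.length - totalPos;

        double area = 0.0; // accumulated in units of (negatives x positives)
        int tp = 0;
        int fp = 0;
        int i = 0;
        while (i < order.length) {
            int prevTp = tp;
            int prevFp = fp;
            double score = scores[order[i].intValue()];
            // Keep accumulating counts while the score is tied.
            while (i < order.length && scores[order[i].intValue()] == score) {
                if (labels[order[i].intValue()]) tp++; else fp++;
                i++;
            }
            // Trapezoid between the previous and current count points.
            area += (double) (fp - prevFp) * (tp + prevTp) / 2.0;
        }
        return area / ((double) totalPos * totalNeg);
    }

    /**
     * Mann-Whitney U for the positives: each positive-negative pair adds
     * 1 if the positive scores higher and 0.5 if the scores are tied.
     * Quadratic time; intended only as a cross-check on small inputs.
     */
    static double mannWhitneyU(double[] scores, boolean[] labels) {
        double u = 0.0;
        for (int p = 0; p < scores.length; p++) {
            if (!labels[p]) continue;
            for (int n = 0; n < scores.length; n++) {
                if (labels[n]) continue;
                if (scores[p] > scores[n]) u += 1.0;
                else if (scores[p] == scores[n]) u += 0.5;
            }
        }
        return u;
    }

    public static void main(String[] args) {
        double[] scores = {0.9, 0.8, 0.8, 0.6, 0.4};
        boolean[] labels = {true, true, false, false, true};
        int pos = 3;
        int neg = 2;
        System.out.println("area      = " + rocArea(scores, labels));
        System.out.println("U/(P * N) = " + mannWhitneyU(scores, labels) / (pos * neg));
    }
}
```

The two printed values agree because ROC area equals U divided by the number of positive-negative pairs. Grouping tied scores before drawing each trapezoid is what gives a tied pair its half credit.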

This software is designed and tested to support 1 million total examples. It probably works on many more, but its performance and accuracy have not been tested at larger scales.

Requirements

  • Java 6 (or later)

Development Requirements

If you want to develop this software, there are some additional requirements.

  • Standard Linux core utilities
  • GNU Make
  • JUnit >= 4.6
  • Hamcrest >= 1.3 (if not already included in your JUnit release)

Note that certain JUnit versions contain some Hamcrest classes and so may conflict with (override) those from Hamcrest. If you encounter missing Hamcrest symbols, try placing Hamcrest ahead of JUnit on the class path or updating JUnit.

Java Library, JAR, CLI

The Java library provides an API for working with ROC and PR curves in your Java programs. It is distributed as a Java archive (JAR) containing source code, bytecode, and documentation. The JAR can be obtained on the releases page. To include the library in your Java project, just place the JAR in a convenient location and include it in your classpath. You can browse the documentation by extracting it from the JAR or by viewing the latest version on GitHub.

The JAR also contains the command-line interface which can be run like this:

java -jar roc-0.1.0.jar --help

Contact

Please search the existing documentation (this README, the Javadoc, the wiki, and existing issues) before contacting us. If you do not find an answer there, open an issue to report a bug or ask a question.

Copyright (c) 2014 Roc Project. This is free software. See LICENSE.txt for details.
