A re-implementation of the minimum redundancy maximum relevance (mRMR) feature selection algorithm with emphasis on greatly increased performance (1000x or greater on large data sets) and an improved user interface.
LICENSE
Makefile
README
attribute_information.hpp
dataset.hpp
matrix.hpp
mrmr.cpp
tests.cpp
typedef.hpp

README

Improved mRMR is a re-implementation of the minimum redundancy maximum relevance (mRMR) feature selection algorithm with emphasis on greatly increased performance (1000x or greater on large data sets) and an improved user interface. There are no disadvantages to using this utility instead of the original release by Hanchuan Peng, and the benefits include:

- results identical to the original mRMR implementation by Hanchuan Peng, except that the relative rank of attributes with tied scores is not necessarily preserved, which is statistically inconsequential
- incorporation of all improvements from the Fast-MRMR implementation by Sergio Ramírez
- additional performance improvements, such as skipping mutual information computation for zero-entropy attributes, and careful selection and use of data structures
- output for each attribute includes its selection rank, entropy, mutual information with the class attribute, and mRMR score in an easily parsed format friendly to downstream manipulation
- operates directly on original textual data, requiring no transformation into a one-time binary representation
- robust data set parser fails gracefully with bad input and reports the location of the first error
- modular support in the code for arbitrary discretization routines, with several examples already provided and implemented
- support to output the result of parsing and discretization so that it can be verified and analyzed with external tools
- supports stream-based processing, and can operate equally well reading a data set from standard input, in a pipeline, from a named pipe, or with process substitution
- standard option processing with GNU getopt_long, following POSIX conventions, including full-featured -h/--help capability, informative error messages, graceful failure, and sensible defaults
- high-quality C++14 compliant code base
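
For orientation, the quantities named in the list above (entropy, mutual information with the class attribute, and the mRMR score) can be sketched in a few lines of C++. This is a minimal illustration and not the code in mrmr.cpp: it assumes attributes are already discretized to integers, it assumes the difference form of the criterion (relevance minus mean redundancy with the attributes selected so far), and the names Column, entropy, mutual_information, and mrmr_select are hypothetical. It also shows the zero-entropy shortcut: a constant attribute has zero mutual information with every other attribute, so it is never scored.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <map>
#include <utility>
#include <vector>

// Hypothetical type: one discretized attribute, one integer value per sample.
using Column = std::vector<int>;

// Shannon entropy H(X) in bits, estimated from value counts.
double entropy(const Column& x) {
    std::map<int, std::size_t> counts;
    for (int v : x) ++counts[v];
    const double n = static_cast<double>(x.size());
    double h = 0.0;
    for (const auto& kv : counts) {
        const double p = kv.second / n;
        h -= p * std::log2(p);
    }
    return h;
}

// Mutual information I(X;Y) in bits, from joint and marginal counts.
double mutual_information(const Column& x, const Column& y) {
    assert(x.size() == y.size());
    std::map<int, std::size_t> cx, cy;
    std::map<std::pair<int, int>, std::size_t> cxy;
    for (std::size_t i = 0; i < x.size(); ++i) {
        ++cx[x[i]];
        ++cy[y[i]];
        ++cxy[{x[i], y[i]}];
    }
    const double n = static_cast<double>(x.size());
    double mi = 0.0;
    for (const auto& kv : cxy) {
        const double pxy = kv.second / n;
        const double px = cx[kv.first.first] / n;
        const double py = cy[kv.first.second] / n;
        mi += pxy * std::log2(pxy / (px * py));
    }
    return mi;
}

// Greedy mRMR (assumed difference form). Returns attribute indices in
// selection order; at most k of them, fewer if candidates run out.
std::vector<std::size_t> mrmr_select(const std::vector<Column>& attrs,
                                     const Column& cls, std::size_t k) {
    const std::size_t m = attrs.size();
    std::vector<double> relevance(m, 0.0), redundancy_sum(m, 0.0);
    std::vector<bool> candidate(m, false);
    for (std::size_t j = 0; j < m; ++j) {
        // Zero-entropy (constant) attributes carry no information: skip them.
        if (entropy(attrs[j]) > 0.0) {
            candidate[j] = true;
            relevance[j] = mutual_information(attrs[j], cls);
        }
    }
    std::vector<std::size_t> selected;
    while (selected.size() < k) {
        double best_score = -std::numeric_limits<double>::infinity();
        std::size_t best = m;  // m means "none found"
        for (std::size_t j = 0; j < m; ++j) {
            if (!candidate[j]) continue;
            const double redundancy =
                selected.empty() ? 0.0 : redundancy_sum[j] / selected.size();
            const double score = relevance[j] - redundancy;
            if (score > best_score) { best_score = score; best = j; }
        }
        if (best == m) break;  // no candidates left
        candidate[best] = false;
        selected.push_back(best);
        // Only the newly selected attribute adds redundancy terms.
        for (std::size_t j = 0; j < m; ++j)
            if (candidate[j])
                redundancy_sum[j] += mutual_information(attrs[j], attrs[best]);
    }
    return selected;
}
```

Keeping a running redundancy sum per candidate means each round computes mutual information only against the attribute just selected, rather than against the whole selected set. Ties in this sketch go to the lowest index, which is an arbitrary choice.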


1. To build, enter the project directory and type 'make'.
2. To run from the project directory, type './mrmr -h' for usage information and additional help.
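
As a rough illustration of the option handling and stream-based input described above, the sketch below parses -h/--help with getopt_long and then chooses between standard input and a named file. It is not taken from mrmr.cpp. The Options struct, the function names, and the convention that a single optional positional argument names the input (with an empty path or "-" meaning standard input) are all assumptions made for the example; run './mrmr -h' for the utility's actual options.

```cpp
#include <getopt.h>  // getopt_long: a GNU extension, also available on the BSDs and macOS

#include <cassert>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical option set for this sketch.
struct Options {
    bool help = false;
    std::string input_path;  // empty or "-" means standard input (assumed convention)
};

// Parse -h/--help plus one optional positional argument.
// Returns false on an unrecognized option; getopt_long has by then
// already printed a diagnostic to standard error.
bool parse_options(int argc, char* argv[], Options& out) {
    assert(argv != nullptr);
    static const option long_opts[] = {
        {"help", no_argument, nullptr, 'h'},
        {nullptr, 0, nullptr, 0},  // terminator required by getopt_long
    };
    optind = 1;  // start scanning at the first argument
    int c;
    while ((c = getopt_long(argc, argv, "h", long_opts, nullptr)) != -1) {
        switch (c) {
            case 'h': out.help = true; break;
            default:  return false;  // '?' for an unknown option
        }
    }
    if (optind < argc) out.input_path = argv[optind];
    return true;
}

// Pick the input stream: standard input when no path (or "-") is given,
// otherwise the named file. Because the result is a plain std::istream,
// the same parsing code serves files, pipelines, named pipes, and
// process substitution alike.
std::istream& open_input(const Options& opts, std::ifstream& file) {
    if (opts.input_path.empty() || opts.input_path == "-") return std::cin;
    file.open(opts.input_path);
    return file;
}
```

A main function built on this would call parse_options first, print usage and exit if help is set, and otherwise hand the stream returned by open_input to the data set parser, checking the stream state to report a file that could not be opened.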