A Java implementation of INFUSE
Java
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
.settings
src/main
.checkstyle
.classpath
.gitignore
.project
LICENSE
infuse.jar
pom.xml
readme.md

readme.md

INFUSE

A Java implementation of INFUSE.

The project can be build in eclipse or using maven. A runnable pre-built jar can be found here. When just running the project an example dataset is loaded. However, the following command line arguments are accepted to load other datasets:

  • -json <file> loads a dataset from a single JSON file. The file is assumed to have the structure of example.json.
{
  "featureSelection": [ // defines the order the feature selection algorithms are displayed
    "FeatureSelectionFisherScore", // the names of the algorithms serve as id
    …
  ],
  "classification": [ // all classification results
    {
      "auc": "0.6322", // the area under curve / displayed quality measure
      "fs": "FeatureSelectionFisherScore", // the feature selection algorithm
      "classifier": "ClassificationTree", // the classifier
      "fold": "0" // the cross-validation fold
    },
    …
  ],
  "features": [ // all features
    {
      "ranks": [ // the ranks of this feature
        {
          "rank": null, // this feature was not picked by the algorithm
          "fs": "FeatureSelectionFisherScore", // the feature selection algorithm
          "fold": "0" // the cross-validation fold
        },
        …
        {
          "rank": "87", // this feature was ranked 87th by the algorithm
          "fs": "FeatureSelectionInformationGain", // the feature selection algorithm
          "fold": "0" // the cross-validation fold
        },
        …
      ],
      "subtype": "ProblemList", // the subtype of the feature
      "name": "feature#0", // the name of the feature
      "type": "DIAGNOSIS" // the type of the feature
    },
    …
  ]
}
  • -old <path> <prefix> loads a dataset using the old legacy format.

Optionally, the following argument can be appended in order to anonymize or convert datasets.

  • -anon <output> anonymizes the dataset and saves it to output in the new JSON format.

Pull requests are highly appreciated.