The OpenmlWeka package

OpenML Weka Connector

A package for uploading Weka experiments to OpenML. It works in combination with the OpenML Apiconnector (available on Maven Central; version >= 1.0.14) and Weka (available on Maven Central; version >= 3.9.0).

Downloading datasets from OpenML

The following code example downloads a specific set of OpenML datasets and loads them into the Weka data format (weka.core.Instances), which can then be used for offline development and experimentation.

public static void downloadData() throws Exception {
  // Fill in the API key (obtainable from your OpenML profile)
  String apikey = "<FILL_IN_OPENML_API_KEY>";
  
  // Instantiate the OpenmlConnector object 
  // requires artifact org.openml.apiconnector (version >= 1.0.14) from Maven Central
  OpenmlConnector openml = new OpenmlConnector(apikey);
  
  // Download the OpenML object containing the `OpenML100' benchmark set
  Study s = openml.studyGet("OpenML100", "data");
  
  // Loop over all the datasets
  for (Integer dataId : s.getDataset()) {
    // DataSetDescription is an OpenML object containing meta-information about the dataset
    DataSetDescription dsd = openml.dataGet(dataId);
    
    // getDataset downloads the raw dataset file from OpenML
    File datasetFile = dsd.getDataset(apikey);
    
    // Converts this file into the Weka format
    Instances dataset = new Instances(new FileReader(datasetFile));
    System.out.println("Downloaded " + dsd.getName());
    System.out.println("numObservations = " + dataset.numInstances() + "; numFeatures = " + dataset.numAttributes());
  }
}
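
OpenML serves these datasets as ARFF files, which is why `new Instances(new FileReader(datasetFile))` can parse them directly. For a quick look at a downloaded file without loading all of its rows, the header can be scanned with the standard library alone. The sketch below is an illustration only: `ArffHeaderPeek` is a hypothetical helper, not part of Weka or the Apiconnector, and it handles just the common cases (case-insensitive `@attribute` keyword, optional quotes around the name), so it is no substitute for Weka's own parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists attribute names from an ARFF header.
class ArffHeaderPeek {
  private static final String ATTRIBUTE = "@attribute";
  private static final String DATA = "@data";

  static List<String> attributeNames(Reader source) {
    List<String> names = new ArrayList<>();
    try (BufferedReader reader = new BufferedReader(source)) {
      String line;
      while ((line = reader.readLine()) != null) {
        String trimmed = line.trim();
        // The header ends where the data section begins.
        if (trimmed.regionMatches(true, 0, DATA, 0, DATA.length())) {
          break;
        }
        // Skip comments, @relation and blank lines.
        if (!trimmed.regionMatches(true, 0, ATTRIBUTE, 0, ATTRIBUTE.length())) {
          continue;
        }
        String rest = trimmed.substring(ATTRIBUTE.length()).trim();
        if (rest.isEmpty()) {
          continue;
        }
        char first = rest.charAt(0);
        if (first == '\'' || first == '"') {
          // Quoted name: take everything up to the matching quote.
          int close = rest.indexOf(first, 1);
          names.add(close > 0 ? rest.substring(1, close) : rest.substring(1));
        } else {
          // Unquoted name: ends at the first whitespace.
          names.add(rest.split("\\s+")[0]);
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return names;
  }
}
```

Inside the loop above, `ArffHeaderPeek.attributeNames(new FileReader(datasetFile))` would return the column names of each downloaded dataset.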

Uploading Weka experiments

The following code example downloads a specific set of OpenML tasks (dubbed the OpenML100), runs a NaiveBayes classifier on each of them, and uploads the resulting runs to OpenML.

public static void runTasksAndUpload() throws Exception {
  // Fill in the API key (obtainable from your OpenML profile)
  String apikey = "<FILL_IN_APIKEY>";
  
  // WekaConfig lets us enable or disable various Weka-specific options
  WekaConfig config = new WekaConfig();
  
  // Instantiate the OpenmlConnector object 
  // requires artifact org.openml.apiconnector (version >= 1.0.14) from Maven central
  OpenmlConnector openml = new OpenmlConnector(apikey);
  
  // Download the OpenML object containing the `OpenML100' benchmark set
  Study s = openml.studyGet("OpenML100", "tasks");
  
  // Loop over all the tasks
  for (Integer taskId : s.getTasks()) {
    // create a Weka classifier to run on the task
    Classifier classifier = new NaiveBayes();
    
    // execute the task (can take a while, depending on the classifier / dataset combination)
    int runId = RunOpenmlJob.executeTask(openml, config, taskId, classifier);
    
    // After several minutes, the evaluation measures will be available on the server
    System.out.println("Available on " + openml.getApiUrl() + "run/" + runId);
    
    // Download the run from the server:
    Run run = openml.runGet(runId);
  }
}
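
Running a classifier over a whole benchmark set can take a long time, and in the loop above a single exception would abort all remaining tasks. One possible way to keep going is to wrap each task in a try/catch and record successes and failures separately. The following is a minimal sketch under assumptions: `BatchRunner`, `TaskRunner` and `BatchResult` are hypothetical names introduced for illustration, and `TaskRunner` merely stands in for a call such as `RunOpenmlJob.executeTask(openml, config, taskId, classifier)`.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: runs every task and keeps going when one fails.
class BatchRunner {
  /** Stand-in for the call that executes one task and returns its run id. */
  interface TaskRunner {
    int run(int taskId) throws Exception;
  }

  /** Run ids of successful tasks and error messages of failed ones, keyed by task id. */
  static final class BatchResult {
    final Map<Integer, Integer> runIds = new LinkedHashMap<>();
    final Map<Integer, String> failures = new LinkedHashMap<>();
  }

  static BatchResult runAll(List<Integer> taskIds, TaskRunner runner) {
    BatchResult result = new BatchResult();
    for (int taskId : taskIds) {
      try {
        result.runIds.put(taskId, runner.run(taskId));
      } catch (Exception e) {
        // Record the failure and continue with the next task.
        result.failures.put(taskId, String.valueOf(e.getMessage()));
      }
    }
    return result;
  }
}
```

In the example above, the runner could be the lambda `taskId -> RunOpenmlJob.executeTask(openml, config, taskId, new NaiveBayes())`. Afterwards, `result.failures` lists the tasks worth retrying.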

Obtaining experimental results from OpenML

OpenML contains a large number of experiments, available to everyone. To obtain and analyse these results, use the OpenML Apiconnector; its GitHub page includes a demonstration.

How to cite

If you found this package useful, please cite: J. N. van Rijn, Massively Collaborative Machine Learning, Leiden University, 2016. If you used OpenML in a scientific publication, please check out the OpenML citation policy.