Statistical Learning Benchmarks

Overview

To assess how well a statistical learning algorithm generalizes, it is vital to evaluate it on a variety of diverse, feature-rich datasets. This package provides a simple interface to many common benchmark datasets, including the Penn Machine Learning Benchmarks (Olson et al., 2017, arXiv:1703.00512), the University of California-Irvine Machine Learning Repository, and MNIST (LeCun et al., doi:10.1109/5.726791), allowing users to examine performance across many disparate contexts. The package also provides utilities for data cleaning, data preparation, and cross-validation.

Repo Contents

  • R: R package code.
  • docs: package documentation, and usage of the slbR package on many real and simulated data examples.
  • man: package manual for help in R session.
  • tests: R unit tests written using the testthat package.
  • vignettes: R vignettes for R session html help pages.

System Requirements

Hardware Requirements

The slbR package requires only a standard computer with enough RAM to support the operations defined by a user. For minimal performance, this will be a computer with about 2 GB of RAM. For optimal performance, we recommend a computer with the following specs:

RAM: 16+ GB
CPU: 4+ cores, 3.3+ GHz/core

The runtimes reported below were measured on a computer with the recommended specs (16 GB RAM, 4 cores at 3.3 GHz) and a 25 Mbps internet connection.
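As a rough way to compare a machine against the recommended specs, the following shell sketch reads the total memory and core count on Linux. It is not part of the slbR package: the helper names `meets_spec` and `report` are hypothetical, it assumes `/proc/meminfo` and `nproc` are available (as on Ubuntu), and it does not check clock speed.

```shell
#!/bin/sh
# Sketch: compare this machine with the recommended specs (16 GB RAM, 4 cores).
# Assumes Linux with /proc/meminfo and coreutils' nproc; helper names are hypothetical.

# meets_spec <have> <want>: succeeds when have >= want (whole numbers).
meets_spec() {
  [ "$1" -ge "$2" ]
}

# report <label> <have> <want>: prints a one-line summary for one resource.
report() {
  if meets_spec "$2" "$3"; then
    echo "$1: $2 (meets recommended $3)"
  else
    echo "$1: $2 (below recommended $3)"
  fi
}

if [ -r /proc/meminfo ] && command -v nproc >/dev/null 2>&1; then
  # MemTotal is in kB. Round up to whole GB, since the kernel reports
  # slightly less than the installed amount.
  mem_gb=$(awk '/^MemTotal:/ {printf "%d", ($2 + 1048575) / 1048576}' /proc/meminfo)
  report "RAM (GB)" "$mem_gb" 16
  report "CPU cores" "$(nproc)" 4
fi
```

A machine below these values can still run the package; as noted above, the minimum is about 2 GB of RAM.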

Software Requirements

OS Requirements

The development version of the package has been tested on the following systems:

Linux: Ubuntu 16.04
Mac OSX:
Windows:

Installing R version 3.4.2 on Ubuntu 16.04

The latest version of R can be installed by adding the CRAN repository to apt:

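# Add the CRAN repository for Ubuntu 16.04 (xenial) to the apt sources
echo "deb http://cran.rstudio.com/bin/linux/ubuntu xenial/" | sudo tee -a /etc/apt/sources.list
# Fetch the repository signing key and register it with apt
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
# Refresh the package index and install R
sudo apt-get update
sudo apt-get install r-base r-base-dev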

which should install in about 20 seconds.
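To confirm that the installation produced a recent enough R, a check along the following lines may help. This is a sketch under assumptions: the helper `version_ge` is hypothetical, it relies on GNU `sort -V` (present on Ubuntu), and it assumes the first line of `R --version` has the form `R version X.Y.Z (date) ...`, which does not hold for development builds.

```shell
#!/bin/sh
# Sketch: check that the R on PATH is at least the version named in this section.

# version_ge <a> <b>: succeeds when version a >= version b.
# Uses GNU sort's version ordering: b must be the smaller (or equal) of the two.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="3.4.2"

if command -v R >/dev/null 2>&1; then
  # Third field of the first line is the version number for release builds.
  installed=$(R --version | head -n 1 | awk '{print $3}')
  if version_ge "$installed" "$required"; then
    echo "R $installed found (>= $required)"
  else
    echo "R $installed found, older than $required"
  fi
else
  echo "R not found on PATH"
fi
```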

Installation Guide

Package dependencies

Users should install the following packages prior to installing slbR, from an R terminal:

install.packages(c('readr', 'httr'))

which will install in about 15 seconds on a recommended machine.

If you encounter a problem that you believe is caused by software versioning, please open an Issue.

Package Installation

From an R session, type:

library(devtools)  # if missing, run install.packages('devtools') first
install_github('neurodata/slb', force=TRUE)  # install slbR

The package should take approximately 60 seconds to install on a recommended computer.

Demo

As an example, load all classification datasets from the PMLB repository:

library(slb)
data <- slb.load.datasets(repositories="pmlb", task="classification")