Statistical Learning Benchmarks
In assessing the generalizability of a statistical learning algorithm, it is vital to consider a variety of diverse, feature-rich datasets. This package provides a simple interface to many common benchmark datasets, including the Penn Machine Learning Benchmarks (Olson, 2017; arXiv:1703.00512), the University of California-Irvine Machine Learning Repository, and MNIST (LeCun et al.; doi:10.1109/5.726791), allowing users to examine performance across many disparate contexts. Additionally, we provide utilities for data cleaning, data preparation, and cross-validation.
- docs: package documentation and usage of the slbR package on many real and simulated data examples.
- man: package manual for help in R session.
- R unit tests.
- R vignettes for R session HTML help pages.
The slbR package requires only a standard computer with enough RAM to support the operations defined by the user. For minimal performance, this will be a computer with about 2 GB of RAM. For optimal performance, we recommend a computer with the following specs:
RAM: 16+ GB
CPU: 4+ cores, 3.3+ GHz/core
The runtimes below were generated using a computer with the recommended specs (16 GB RAM, 4 cores @ 3.3 GHz) and an internet connection speed of 25 Mbps.
The package development version is tested on Linux operating systems. It has been tested on the following systems:
Linux: Ubuntu 16.04
Mac OSX:
Windows:
Installing R version 3.4.2 on Ubuntu 16.04
The latest version of R can be installed by adding the latest repository to the apt sources list:
sudo echo "deb http://cran.rstudio.com/bin/linux/ubuntu xenial/" | sudo tee -a /etc/apt/sources.list
gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
gpg -a --export E084DAB9 | sudo apt-key add -
sudo apt-get update
sudo apt-get install r-base r-base-dev
which should install in about 20 seconds.
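Once R is installed, it may be worth confirming from an R session that the interpreter is at least the version this guide targets. The following is a minimal sketch using base R only; the helper name `r_version_ok` is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of slbR): TRUE if the running R
# interpreter is at least the version given in `minimum`.
r_version_ok <- function(minimum = "3.4.2") {
  getRversion() >= package_version(minimum)
}

print(R.version.string)  # human-readable version of the running interpreter
print(r_version_ok())    # compares against the version named in this guide
```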
Users should install the package dependencies from an R session prior to installing slbR; these will install in about 15 seconds on a recommended machine.
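To see which prerequisites are still missing before installing, a small check along the following lines can help. This is only a sketch: the helper `missing_packages` is hypothetical, and the vector of names is an assumption that contains just devtools, which the installation step loads. Extend it with whatever other dependencies apply.

```r
# Hypothetical helper (not part of slbR): return the subset of `pkgs`
# that cannot be loaded from the current library paths.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!installed])
}

# Assumed list: devtools is used by the installation step.
needed <- missing_packages(c("devtools"))
if (length(needed) > 0) {
  message("Not installed: ", paste(needed, collapse = ", "))
  # install.packages(needed)  # uncomment to install from CRAN
}
```

The `install.packages` call is left commented out so that running the check never changes the local library by accident.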
If you are having an issue that you believe to be tied to software versioning issues, please drop us an Issue.
From an R session, type:
require(devtools)
install_github('neurodata/slb', force=TRUE)  # install slbR
The package should take approximately 60 seconds to install on a recommended computer.
As an example, load all classification datasets from the PMLB repository:
library(slb)
data <- slb.load.datasets(repositories="pmlb", task="classification")
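A typical next step after loading a benchmark is to estimate held-out performance with k-fold cross-validation. The structure of the object returned by `slb.load.datasets` is not described here, so the sketch below uses the built-in `iris` data as a stand-in and relies on base R only. The helper `make_folds` is hypothetical and is not one of the package's own cross-validation utilities.

```r
# Hypothetical helper (not part of slbR): assign each of n observations
# to one of k folds of (nearly) equal size. Note: sets the global RNG seed.
make_folds <- function(n, k = 5, seed = 1) {
  set.seed(seed)
  sample(rep(seq_len(k), length.out = n))
}

# Stand-in data: two iris species, so a logistic regression applies.
df <- droplevels(iris[iris$Species != "setosa", ])
folds <- make_folds(nrow(df), k = 5)

# Held-out accuracy of a simple classifier on each fold.
accuracy <- vapply(seq_len(5), function(i) {
  train <- df[folds != i, ]
  test  <- df[folds == i, ]
  fit   <- glm(Species ~ Sepal.Length + Sepal.Width,
               data = train, family = binomial)
  # Probability of the second factor level for each held-out row.
  prob  <- predict(fit, newdata = test, type = "response")
  pred  <- levels(df$Species)[1 + (prob > 0.5)]
  mean(pred == test$Species)
}, numeric(1))

print(mean(accuracy))  # average accuracy across the five folds
```

Repeating the same loop over each loaded dataset would give one cross-validated score per benchmark, which is the kind of comparison across contexts the package is meant to support.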