mlr: Machine Learning in R
Type Name Latest commit message Commit time
.github Enabling github templates for Issues and PRs (#1893) Aug 10, 2017
R Replace purrr functions with base/BBmisc (#2491) Nov 19, 2018
addon tutorial: fix navbar logo Sep 24, 2018
data Base: Add createSpatialResamplingPlots() (#2373) Jul 17, 2018
docs Deploy from Travis build 13248 [ci skip] Dec 14, 2018
inst Update master to latest CRAN version, travis and tic tweaks (#2429) Sep 13, 2018
man-roxygen Remove ggvis functions (#2202) Mar 2, 2018
man Travis: Re-enable auto-deployment of docs (#2480) Nov 10, 2018
pkgdown Add inline ToC to vignettes (#2440) Sep 27, 2018
src Appveyor: Remove "package installation workaround" (#2464) Oct 24, 2018
tests fix test_regr_laGP.R (#2505) Dec 14, 2018
thirdparty Move from mkdocs to pkgdown and integrate mlr-tutorial into mlr (#2123) Mar 12, 2018
todo-files Re-add missing dependencies (#2295) Jun 26, 2018
vignettes pkgdown: account for 6fdcc62 and 52851bf Nov 19, 2018
.Rbuildignore Update master to latest CRAN version, travis and tic tweaks (#2429) Sep 13, 2018
.editorconfig added editorconfig Sep 3, 2014
.gitignore Tutorial fixes (#2344) Jun 29, 2018
.ignore renamed ignore file [ci skip] Jan 13, 2017
.travis.yml Travis: Re-enable auto-deployment of docs (#2480) Nov 10, 2018
DESCRIPTION Replace purrr functions with base/BBmisc (#2491) Nov 19, 2018
LICENSE Update master to latest CRAN version, travis and tic tweaks (#2429) Sep 13, 2018
NAMESPACE Deploy from Travis build 13187 [ci skip] Nov 19, 2018
NEWS update auto-generated documentation [ci skip] Nov 10, 2018
NEWS.md clean NEWS Nov 12, 2018
README.md README: update URL to blog Nov 6, 2018
_pkgdown.yml http -> https Oct 9, 2018
appveyor.yml Remove _R_CHECK_FORCE_SUGGESTS_ from Travis (#2467) Nov 4, 2018
mlr.Rproj Update mlr.Rproj, update man/ (#2336) Jun 27, 2018
tic.R pkgdown 1.2 broken for tables (r-lib/pkgdown#910), we need the dev ve… Nov 22, 2018

README.md

Machine Learning in R


devtools::install_github("mlr-org/mlr")

mlr - How to Cite and Citing Publications

Please cite our JMLR paper [bibtex].

Some parts of the package were created as part of other publications. If you use these parts, please cite the relevant work appropriately. An overview of all mlr related publications can be found here.

A list of publications that cite mlr can be found in the wiki.

Introduction

R does not define a standardized interface for all its machine learning algorithms. Therefore, for any non-trivial experiments, you need to write lengthy, tedious and error-prone wrappers to call the different algorithms and unify their respective output.

Additionally, you need to implement infrastructure to resample your models, optimize hyperparameters, select features, cope with pre- and post-processing of data and compare models in a statistically meaningful way. As this becomes computationally expensive, you might want to parallelize your experiments as well. This often forces users to make crummy trade-offs in their experiments due to time constraints or a lack of expert programming skills.

mlr provides this infrastructure so that you can focus on your experiments! The framework provides supervised methods like classification, regression and survival analysis along with their corresponding evaluation and optimization methods, as well as unsupervised methods like clustering. It is written in a way that you can extend it yourself or deviate from the implemented convenience methods and construct your own complex experiments or algorithms.
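As a rough illustration of the unified interface described above, here is a minimal sketch of one train/predict/evaluate cycle. The choice of the built-in `iris` data, the `classif.rpart` learner and the every-third-row holdout split are assumptions made for this example only; they are not prescribed by the README.

```r
library(mlr)

# Task: predict the iris species from the four measurements.
# (iris and the target name are illustrative choices.)
task = makeClassifTask(data = iris, target = "Species")

# Learner: "classif.rpart" wraps rpart::rpart().
learner = makeLearner("classif.rpart")

# Simple holdout split: every third row is kept for testing.
n = getTaskSize(task)
test.set = seq(3, n, by = 3)
train.set = setdiff(seq_len(n), test.set)

# Fit on the training rows, predict on the held-out rows.
model = train(learner, task, subset = train.set)
pred = predict(model, task = task, subset = test.set)

# Mean misclassification error and accuracy on the held-out rows.
perf = performance(pred, measures = list(mmce, acc))
print(perf)
```

Swapping in a different learner should only require changing the string passed to `makeLearner()`; the remaining calls stay the same.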

Furthermore, the package is nicely connected to the OpenML R package and its online platform, which aims to support collaborative machine learning online and makes it easy to share datasets as well as machine learning tasks, algorithms and experiments in order to support reproducible research.

Features

  • Clear S3 interface to R classification, regression, clustering and survival analysis methods
  • Possibility to fit, predict, evaluate and resample models
  • Easy extension mechanism through S3 inheritance
  • Abstract description of learners and tasks by properties
  • Parameter system for learners to encode data types and constraints
  • Many convenience methods and generic building blocks for your machine learning experiments
  • Resampling methods like bootstrapping, cross-validation and subsampling
  • Extensive visualizations for e.g. ROC curves, predictions and partial predictions
  • Benchmarking of learners for multiple data sets
  • Easy hyperparameter tuning using different optimization strategies, including potent configurators like iterated F-racing (irace) or sequential model-based optimization
  • Variable selection with filters and wrappers
  • Nested resampling of models with tuning and feature selection
  • Cost-sensitive learning, threshold tuning and imbalance correction
  • Wrapper mechanism to extend learner functionality in complex and custom ways
  • Combine different processing steps into a complex data mining chain that can be jointly optimized
  • OpenML connector for the Open Machine Learning server
  • Extension points to integrate your own stuff
  • Parallelization is built-in
  • Unit-testing
  • Detailed tutorial
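
Several of the features listed above (resampling, hyperparameter tuning and benchmarking) can be sketched in a few lines. This is only a sketch under stated assumptions: the `iris` data, the 3-fold cross-validation, the small `cp`/`maxdepth` grid and the two learners compared are illustrative choices, not recommendations from the mlr authors.

```r
library(mlr)
set.seed(1)

task = makeClassifTask(data = iris, target = "Species")

# Resampling strategy shared by all steps: 3-fold cross-validation.
rdesc = makeResampleDesc("CV", iters = 3)

# 1. Resample a single learner.
res = resample("classif.rpart", task, rdesc,
  measures = list(mmce), show.info = FALSE)
print(res$aggr)

# 2. Tune two rpart hyperparameters with a grid search.
#    The grid values are arbitrary and chosen for illustration.
ps = makeParamSet(
  makeDiscreteParam("cp", values = c(0.01, 0.05, 0.1)),
  makeDiscreteParam("maxdepth", values = c(2L, 5L))
)
ctrl = makeTuneControlGrid()
tuned = tuneParams("classif.rpart", task = task, resampling = rdesc,
  par.set = ps, control = ctrl, measures = list(mmce), show.info = FALSE)
print(tuned$x)

# 3. Benchmark two learners on the same task and resampling.
lrns = list(
  makeLearner("classif.rpart"),
  makeLearner("classif.featureless")
)
bmr = benchmark(lrns, task, rdesc, measures = list(mmce), show.info = FALSE)
bmr.df = getBMRAggrPerformances(bmr, as.df = TRUE)
print(bmr.df)
```

Note that the tuning result here is evaluated on the same folds that were used to select the hyperparameters; for an unbiased estimate, the nested resampling mentioned in the list would be needed.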

News

Changes to the package are listed in the NEWS file shipped with the package.

Get in Touch

Please use the issue tracker for problems, questions and feature requests. Don't email in most cases, as we forget these mails.

We also do not hate beginners and it is perfectly valid to mark an issue as "Question". However, simple usage questions are better suited to Stack Overflow, using the 'mlr' tag.

Please don't forget that all of us work in academia and put a lot of work into this project, simply because we like it, not because we are specifically paid for it.

We also welcome pull requests or new developers. Just make sure that you have a glance at our mlr coding guidelines beforehand.

mlr-tutorial

With the start of v2.13 we switched from mkdocs to pkgdown. All source files are now located in this repo under vignettes/.

Modification of a tutorial section:

If you want to modify/add a tutorial section, please follow these steps:

  1. Open the respective source file, e.g. task.Rmd.
  2. Follow the style guide while editing:
    • Reference mlr functions as <function()>, e.g. makeLearner().
    • Reference external functions as package::function(), e.g. kernlab::ksvm().
    • Reference other tutorial pages with <name_of_vignette>.html, e.g. [bagging](bagging.html).
    • Always start a new sentence with a new line.
    • If you want to insert a paragraph, skip one line.
    • Always insert exactly one empty line before and after a code chunk, header, figure or a table.
    • Referencing images is a bit tricky since we need to ensure that they look good in both the HTML and PDF version. Put your image into vignettes/tutorial/devel/pdf/img/ and see the examples in resampling.Rmd, nested_resampling.Rmd or handling_of_spatial_data.Rmd.
  3. Make sure that the .Rmd file is working on its own, i.e. compile it as a single file (preferably using build_article("<vignette-name>")) and see if everything works. Put required packages in the setup chunk at the beginning of the tutorial.

Rendering the tutorial locally:

If you want to view the complete pkgdown site locally, run pkgdown::build_site(lazy = TRUE). You don't have to render the complete site every time you change one tutorial. The lazy = TRUE argument ensures that only pages that have changed are rebuilt. Also, if you have built the whole site once, you can just build the vignettes again by using build_articles(lazy = TRUE). More specifically, if you are working on one vignette, you can run build_article("<vignette-name>"). You do not need to pass the .Rmd extension when using build_article().
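
The rendering options above could be wrapped in a small helper, sketched below. The function name `rebuild_tutorial` is hypothetical and is not part of mlr or pkgdown; the sketch assumes that pkgdown is installed and that the working directory is the root of the mlr repository.

```r
# Hypothetical convenience wrapper around the pkgdown calls described above.
# With no argument it lazily rebuilds all vignettes; with a vignette name
# (given without the .Rmd extension) it rebuilds only that one.
rebuild_tutorial = function(vignette = NULL) {
  if (is.null(vignette)) {
    pkgdown::build_articles(lazy = TRUE)
  } else {
    pkgdown::build_article(vignette)
  }
}

# Example calls, to be run from the repository root:
# rebuild_tutorial("task")  # rebuild a single vignette
# rebuild_tutorial()        # lazily rebuild all vignettes
```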

Important: Do not commit any file in docs/ as the rendering will be done by Travis!

Adding a new section:

Edit _pkgdown.yml and add the new section at the appropriate place.

Issues and Pull Requests:

If you want to open an issue or pull request that is related to mlr-tutorial, label it with tutorial and mention jakob-r or pat-s if you need help.