mlr3: Machine Learning in R - next generation

mlr3

A clean, object-oriented rewrite of mlr.

Why a rewrite?

mlr was first released to CRAN in 2013, and its core design and architecture date back even further. Years of added features have led to feature creep, which makes mlr hard to maintain and hard to extend. While mlr was nicely extensible in some parts (learners, measures, etc.), other parts were less easy to extend from the outside. In addition, many helpful R libraries did not exist when mlr was created, and including them would require non-trivial API changes.

Design principles

  • Only the basic building blocks for machine learning are implemented in this package.
  • Focus on computation here. Visualization and other non-core functionality can go in extra packages.
  • Overcome the limitations of R's S3 classes with the help of R6.
  • Embrace R6, clean OO-design, object state-changes and reference semantics. This might be less "traditional R", but seems to fit mlr nicely.
  • Embrace data.table for fast and convenient data frame computations.
  • Combine data.table and R6; for this, we will make heavy use of list columns in data.tables.
  • Once the API is fixed, both advanced techniques and implementations for different learners will be implemented in extra packages to reduce the maintenance burden.
  • Be light on dependencies. mlr3 requires the following packages:
    • mlr3misc: Miscellaneous functions used in multiple mlr3 extension packages. Developed by the mlr team. No extra recursive dependencies.
    • R6: Reference class objects. No recursive dependencies.
    • backports: Ensures backward compatibility with older R releases. Developed by members of the mlr team. No recursive dependencies.
    • bit: Efficient storage of logical vectors. No recursive dependencies.
    • checkmate: Fast argument checks. Developed by members of the mlr team. No extra recursive dependencies.
    • data.table: Extension of R's data.frame. No recursive dependencies.
    • digest: Hash digests. No recursive dependencies.
    • logger: Logging facility. No recursive dependencies.
    • Metrics: Package which implements performance measures. No recursive dependencies.
    • paradox: Descriptions for parameters and parameter sets. Developed by the mlr team. No extra recursive dependencies.
  • Additional functionality that comes with extra dependencies:
    • For parallelization, mlr3 utilizes the future and future.apply packages.
    • To capture the output of third-party learners for logging, evaluate is used. Alternatively, callr starts a new R session to completely isolate the learner from the running session.
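
The design principles above lean on two ideas that may be unfamiliar to users of "traditional R": R6 reference semantics and list columns in data.tables. The following is a minimal sketch of both, not code taken from mlr3. The `Counter` class and all variable names are hypothetical and exist only for illustration; only `R6::R6Class()`, `$clone()` and `data.table()` are real API.

```r
library(R6)
library(data.table)

# Hypothetical class for illustration only; it is not part of mlr3.
Counter = R6Class("Counter",
  public = list(
    count = 0,
    add = function(n = 1) {
      # State change happens in place on the object itself.
      self$count = self$count + n
      invisible(self)
    }
  )
)

a = Counter$new()
b = a                # reference semantics: a and b are the same object
b$add(5)             # modifying b also changes a

cloned = a$clone()   # an explicit clone is an independent copy
cloned$add(1)        # does not affect a

# A list column lets a data.table hold one R6 object per row.
tab = data.table(id = c("first", "second"), obj = list(a, cloned))

# Extract a field from every object stored in the list column.
counts = vapply(tab$obj, function(x) x$count, numeric(1))
```

Because assignment does not copy an R6 object, `a$count` is 5 after `b$add(5)`, while the clone evolves separately. This is the "object state-changes" trade-off the list mentions: less copying and a cleaner OO design, at the cost of having to call `$clone()` whenever an independent copy is needed.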

State of the project

This package is currently a work in progress. Do not use it in production. The API will change.

Already implemented:

Besides a short introduction, the webpage provides a function reference.

WiP

While mlr3 implements the building blocks for machine learning, some of the advanced features of the monolithic mlr are now shipped in multiple extension packages.