
Roxygen Guide

Michel Lang edited this page Oct 27, 2019 · 21 revisions

When we started mlr3, there was no official way to document R6 classes with roxygen2. This style guide describes our agreement on how to document R6 classes for mlr3 packages with roxygen/markdown.

Examples can be found in the function reference; the corresponding source files are linked at the top of the respective manual pages.



The title should be in title case (cf. tools::toTitleCase()):

  • All principal words capitalized
  • Articles, conjunctions or prepositions are all lowercase
#' @title Regression Task
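
To check a candidate title, tools::toTitleCase() can be called directly. A minimal sketch, where the input string is only an illustration:

```r
# Convert a lowercase candidate title to title case.
title = tools::toTitleCase("regression task")
print(title)
```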


The usage section for R6 objects is not useful and should be suppressed.

#' @usage NULL


Specifies the format of the object.

  • Base classes:

    #' @format [R6::R6Class] object.
  • Derived classes:

    #' @format [R6::R6Class] object inheriting from [Task]/[TaskSupervised].


If the class inherits from another class which is implemented in the same package but in a different file, the collation order must be controlled explicitly:

#' @include TaskSupervised.R


Describes how to instantiate the class. To mimic R's "Usage", this section should start with a code block (three backticks) that assigns a new instance to a variable with a short name:

#' @section Construction:

#' ```
#' t = TaskSupervised$new(id, task_type, backend, target)
#' ```

In the following, refer to the instance as t, e.g. t$id.

Next, each argument of the constructor needs to be documented in an itemize environment. The first line of each bullet point consists of the argument name in backticks, followed by the separator "::" and the expected type of the argument. If the argument is of dynamic type, try to list all possible input types. Use additional lines to further describe the purpose of the argument.

#' * `id` :: `character(1)`\cr
#'   Name of the task.
#' * `backend` :: ([DataBackend] | `data.frame()` | ...)\cr
#'   Either a [DataBackend], or any object which is convertible to a DataBackend with `as_data_backend()`.
#'   E.g., a `data.frame()` will be converted to a [DataBackendDataTable].


Describe all fields of the R6 class, in a bullet list. This is done in the same format as the arguments in the constructor.

#' @section Fields:
#' * `formula` :: `formula()`\cr
#'   Constructs a [stats::formula], e.g. `[target] ~ [feature_1] + [feature_2] + ... + [feature_k]`, using
#'   the active features of the task.


Describes public methods, again as a bullet list. The first line of each bullet point specifies the signature of the method. The second line contains the classes of all arguments in backticks (linked if they are not classes from default R packages; for vector types, possibly with the length given in parentheses). Alternative types are separated by vertical bars (|), arguments are separated by commas, and the whole argument list is surrounded by parentheses. Put spaces before and after |, and after a comma, but not after (, before ), or before a comma. The signature definition ends with an arrow (->), followed by the return type. Use the return type self if the method is a mutator. Functions without input ("nullary functions") have () as in-type; functions returning invisible(NULL) (e.g. print, plot) have `NULL` as return type.

#' * `missings(cols = NULL)`\cr
#'   `character()` -> named `integer()`\cr
#'   Returns the number of missing observations for each column in `cols`.
#'   Argument `cols` defaults to all columns with role "target" or "feature".
#' * `data(rows = NULL, cols = NULL, format = NULL)`\cr
#'   (`integer()` | `character()`, `character()`, `character(1)`) -> `any`\cr
#'   Returns a slice of the data from the [DataBackend] in the format specified by `format`
#'   (depending on the [DataBackend], but usually a [data.table::data.table()]).
#'   Rows are subsetted to only contain observations with role "use".
#'   Columns are filtered to only contain features with roles "target" and "feature".
#'   If invalid `rows` or `cols` are specified, an exception is raised.
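
The remaining cases from the rules above (a mutator, a nullary method, and a method returning invisible(NULL)) might be documented as in the following sketch. The methods `rename()` and `n_missing()` and their descriptions are hypothetical and only illustrate the notation; they are not taken from an actual mlr3 class.

```r
# Hypothetical methods, shown only to illustrate the signature notation.
#' @section Methods:
#' * `rename(old, new)`\cr
#'   (`character()`, `character()`) -> `self`\cr
#'   Renames columns in place (mutator, hence return type `self`).
#' * `n_missing()`\cr
#'   () -> `integer(1)`\cr
#'   Nullary method which returns the total number of missing values.
#' * `print()`\cr
#'   () -> `NULL`\cr
#'   Prints a short summary and returns `invisible(NULL)`.
```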

Fields and Methods of Superclasses

Do not re-document fields or methods of a superclass you inherit from in your current class, neither manually nor by using an "inherit*" roxygen tag.

S3 Methods

Describes S3 methods which can be applied to instantiated objects, using the same format as regular methods.

#' * ``\cr
#'   [Task] -> [data.table::data.table()]\cr
#'   Returns the data set as `data.table()`.


If there are multiple related classes, it is advised to group them together into families:

#' @family Task

Each member of the family will get cross-references to other family members in a "See also" section.


Export only if you want to expose the class to the user. If the class is intended to be used by another package, but not by the user, add the keyword internal to hide it from the documentation overview:

#' @export
#' @keywords internal


Keep examples simple and fast. Consider adding some comments, but don't overdo it. A good example has nearly self-documenting code with a comment header for each section of 3-5 lines. If you need other suggested packages in an example, load them with library() at the beginning of the example rather than referring to them with :: every time.

#' # construct a new task
#' task = Task$new("iris", task_type = "classif", backend = iris)
#' # retrieve the number of rows
#' task$nrow
#' # retrieve the number of columns
#' task$ncol

Text Style Guide

  • Refer to fields and methods with a prepended dollar sign to avoid confusion with regular variables or functions ($id). If you have introduced a new variable in the constructor, use it before the dollar (t$id) to easily disambiguate objects.
  • Refer to methods with a prepended dollar and an appended empty pair of parentheses to avoid confusion with regular variables or functions (t$missings()).
  • Make use of explicit links for:
    • Every function that is not an R6 method.
    • Any type / class that is not in R default packages. Don't link things like character(1) or character().
    • Every type, class, function that is not in R default packages in a member variable type description or function type description.


See Task.R.
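
For a compact view of how the pieces of this guide fit together, here is a minimal sketch of a documented base class. The class `Counter`, its field `value`, and its methods `add()` and `reset()` are hypothetical and exist only for illustration; the sketch assumes the R6 package is installed.

```r
library(R6)

# Hypothetical toy class; not part of any mlr3 package.

#' @title Simple Counter
#'
#' @usage NULL
#' @format [R6::R6Class] object.
#'
#' @description
#' Toy class used to illustrate the documentation layout.
#'
#' @section Construction:
#' ```
#' cnt = Counter$new(start = 0L)
#' ```
#'
#' * `start` :: `integer(1)`\cr
#'   Initial value of the counter.
#'
#' @section Fields:
#' * `value` :: `integer(1)`\cr
#'   Current value of the counter.
#'
#' @section Methods:
#' * `add(n = 1L)`\cr
#'   `integer(1)` -> `self`\cr
#'   Increments `$value` by `n`.
#' * `reset()`\cr
#'   () -> `self`\cr
#'   Sets `$value` back to 0.
#'
#' @export
#' @examples
#' # construct a counter and increment it
#' cnt = Counter$new()
#' cnt$add(2L)$value
Counter = R6Class("Counter",
  public = list(
    value = NULL,
    initialize = function(start = 0L) {
      self$value = as.integer(start)
    },
    add = function(n = 1L) {
      self$value = self$value + as.integer(n)
      invisible(self) # mutators return self, which allows chaining
    },
    reset = function() {
      self$value = 0L
      invisible(self)
    }
  )
)

# construct a counter and chain two mutator calls
cnt = Counter$new()
cnt$add(2L)$add()
cnt$value
```

Because both mutators return `self` invisibly, calls can be chained, which matches the `self` return type used in the method signatures.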