
Native input specification

aL3xa edited this page Jan 30, 2013 · 17 revisions

DISCLAIMER

This document is meant for developers' reference and for any pair of curious eyes that would like an early-bird preview of the new input specification and of rapport's course of development. Feel free to rant if you think that something we're working on is pure rubbish.


Rationale: old inputs suck because they differ from native R classes. Instead of rolling our own inventions, why not embrace the (implicit) conventions of the R ecosystem and define inputs according to them? Writing templates should be straightforward and rapport should make you warm and fuzzy inside.

Introduction

Native input specification leverages the classes of R objects, so defining template inputs no longer relies on custom conventions. The following specification describes the current state of the implementation along with the conventions used in it. Unlike the old input specification, which relied on a rather cumbersome custom syntax, the new input specification is 100% pure YAML, and as such should be more intuitive and native to R.

Inputs

Inputs can be divided into two categories:

  • dataset inputs (non-standalone) - inputs that match a named element of the object provided in the data formal argument of the rapport call. This is usually a data.frame object, but it can/should be any object that allows subsetting by name. Dataset inputs cannot have a default value, so they always require user input. Note that this is not the same as the required input attribute.
  • standalone inputs - inputs that don't depend on the object passed in data. They accept a (default) value attribute in the template definition, and the user can override that value by passing an appropriate object (of matching class) in the rapport call.
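A minimal sketch of the two categories, side by side. The top-level inputs key, the list layout and both input names are assumptions made for illustration; only the attribute names come from this page.

```yaml
# Hypothetical template header fragment; the `inputs` key and the
# list-of-mappings layout are assumed, not confirmed by this page.
inputs:
  # Dataset input: must match a named element of the object passed
  # in `data`, so it carries no default value.
  - name: nwhours
    label: Number of hours
    class: numeric
    standalone: FALSE

  # Standalone input: independent of `data`, so it may define a
  # default value that the user can override in the rapport call.
  - name: show_plot
    label: Show plot
    class: logical
    standalone: TRUE
    value: TRUE
```

Here nwhours would have to be supplied by the user as a name found in data, while show_plot falls back to TRUE unless a logical value is passed for it.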

General input options

The following options are available for all inputs (except inputs of class option, which we'll discuss separately):

  • name (character string) - input name. Cannot be blank, as mapped inputs will be assigned to that name in the evaluation environment.

  • label (character string) - input label. Can be blank, but it's useful to have something in there in order to get pretty output in a report (e.g. Number of hours is far more descriptive than nwhours). Defaults to an empty string.

  • description (character string) - input description. It can be blank, but sometimes it's convenient to have a lengthy description of the provided input. Defaults to an empty string.

  • class (character string) - defines the class of an input (d'uh). It can be omitted (defaults to any), but most of the time you'll find it useful to fine-tune inputs. Currently supported classes are: any (default), character, complex, factor, integer, logical, numeric, option, raw. Apart from option (which is custom, and we may soon replace it with something more native to R), all the others are basic R classes.

  • required (logical value) - whether the input is required (defaults to FALSE). If TRUE, a dataset input must match a subset of the object provided in the data argument, while a standalone input must either get a value from the user or have a default value defined in the template.

  • standalone (logical value) - whether the input is standalone, i.e. independent of the object provided in data. Defaults to FALSE (a dataset input).

  • length - provides a set of rules that restrict the input's length. It accepts several forms, e.g.

    • if an integer value is provided, the input must have exactly that length:
length: 10

which is equivalent to (and will be parsed as):

length:
  exactly: 10
    • from and to attributes can be passed to define a range that the input's length must fall into:
length:
  from: 1
  to: 10

Either from or to can be omitted, and sane defaults will be set implicitly: an omitted to defaults to Inf, and an omitted from defaults to 1.

length:
  from: 1

is identical to:

length:
  from: 1
  to: Inf

or:

length:
  to: 10

is equivalent to:

length:
  from: 1
  to: 10
  • value (a vector of the matching class) - the default value of an input. Only available for standalone inputs, and must match the class and length restrictions of the input.
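Putting the general options together, a standalone input could look roughly like the sketch below. The input name, label and description are made up, and writing the default as a YAML sequence assumes that a sequence is read in as an R vector.

```yaml
# Hypothetical standalone input using only the general options.
- name: cutoffs
  label: Cutoff points
  description: Lower and upper cutoff used to bin the scores
  class: numeric
  required: FALSE
  standalone: TRUE
  length: 2            # shorthand for "exactly: 2"
  value: [0.25, 0.75]  # assumed to become a length-2 numeric vector
```

The default has two numeric elements, so it agrees with both the class and the length rule, as the value attribute demands.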

Class-specific options

character

  • regexp (character value) - a regular expression that all values in the input should match. regexp is omitted by default, and the check is performed only if the attribute is non-NULL.
  • nchar - sets restrictions on the number of characters. Accepts the same options as the length attribute, only this time the number of characters is checked. nchar is omitted from the input definition by default. Checks are performed only if the attribute is properly defined.
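As an illustration, a character input restricted by both attributes might be written as follows. The input name, the pattern and the default are hypothetical; the pattern is single-quoted so that YAML does not interpret any characters in it.

```yaml
# Hypothetical character input with a pattern and a character-count range.
- name: prefix
  label: File name prefix
  class: character
  standalone: TRUE
  length: 1
  regexp: '^[A-Za-z]+$'  # letters only
  nchar:                 # same format as the length attribute
    from: 3
    to: 20
  value: report
```

The default report consists of six letters, so it would satisfy the pattern and the nchar range as described above.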

numeric, integer

  • limit (a named list with min and max attributes) - checks whether the values of numeric/integer inputs fall into the range defined by min and max. Both min and max should be length-one vectors of the appropriate class. limit is omitted by default, and checks are performed only if the limit attribute is provided.
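A short sketch of a numeric input with a limit. The input name and the numbers are invented, and this page does not say whether the bounds are inclusive, so the default is kept well inside the range.

```yaml
# Hypothetical numeric input bounded by limit.
- name: conf_level
  label: Confidence level
  class: numeric
  standalone: TRUE
  length: 1
  limit:
    min: 0.5   # length-one numeric
    max: 0.99  # length-one numeric
  value: 0.95
```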

factor

  • nlevels (integer value) - defines the number of levels a given factor is allowed to have. The attribute is omitted by default, and checks are performed only if it is non-NULL. NOTE TO SELF: this is lame, nlevels should accept the same format as nchar and/or length.
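For completeness, a dataset input that expects a two-level factor could be declared along these lines (the input name and label are hypothetical):

```yaml
# Hypothetical dataset input: a factor column with exactly two levels.
- name: group
  label: Treatment group
  class: factor
  required: TRUE
  standalone: FALSE
  nlevels: 2  # single integer; ranges are not supported in this draft
```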
