CRAN Task View: Package Development

Do not edit this README by hand. See CONTRIBUTING.md.

Maintainer: Thomas J. Leeper
Contact: thosjleeper at gmail.com
Version: 2017-06-16
URL: https://CRAN.R-project.org/view=PackageDevelopment

Packages provide a mechanism for loading optional code, data, and documentation as needed. At the very minimum only a text editor and an R installation are needed for package creation. Nonetheless many useful tools and R packages themselves have been provided to ease or improve package development. This Task View focuses on these tools/R packages, grouped by topics.

The main reference for package development is the "Writing R Extensions" manual. For further documentation and tutorials, see the "Related links" section below.

If you think that some packages or tools are missing from the list, feel free to e-mail me or contribute directly to the Task View by submitting a pull request on GitHub.

Many thanks to Christopher Gandrud, Cristophe Dutang, Darren Norris, Dirk Eddelbuettel, Gabor Grothendieck, Gregory Jefferis, John Maindonald, Luca Braglia, Spencer Graves, Tobias Verbeke, and the R-core team for contributions.

First steps

Searching for Existing Packages

Before starting a new package it's worth searching for already available packages, both from a developer's standpoint ("do not reinvent the wheel") and from a user's one (many packages implementing the same or similar procedures can be confusing). If a package addressing the same functionality already exists, you may consider contributing to it instead of starting a new one.

  • utils::RSiteSearch() searches for keywords or phrases in help pages (all CRAN packages except those for Windows only, plus some from Bioconductor), vignettes, or task views, using the search engine at http://search.r-project.org/. A convenient wrapper around RSiteSearch() that adds hit ranking is the findFn() function from sos.
  • RSeek searches for keywords or phrases in books, task views, support lists, functions/packages, blogs, etc.
  • Rdocumentation searches for keywords or phrases in help pages for all CRAN and some Bioconductor/GitHub packages. RDocumentation (GitHub) provides an R client for the site.
  • Crantastic! maintains an up-to-date and tagged directory of packages on CRAN. The Managed R Archive Network from Revolution Analytics is a CRAN mirror that additionally provides visualizations of package dependency trees.
  • http://www.r-pkg.org/ is an unofficial CRAN mirror that provides a relatively complete archive of packages and read-only access to package sources on GitHub.
  • CRANberries provides a feed of new, updated, and removed packages for CRAN.
  • If you're looking to create a package but want ideas for what sorts of packages are in demand, rOpenSci maintains a wishlist for science-related packages and a TODO list of web services and data APIs in need of packaging.

Initializing an R package

  • utils::package.skeleton() automates some of the setup for a new source package. It creates directories, saves the supplied functions, data, and R code files to the appropriate places, and creates skeleton help files and a Read-and-delete-me file describing further steps in packaging.
  • create() from devtools is similar to package.skeleton() except that it allows DESCRIPTION entries to be specified and doesn't create source code and data files from global environment objects or sourced files.
  • Non-devtools alternatives also exist. kitten() from pkgKitten allows one to specify the main DESCRIPTION entries and doesn't create source code and data files from global environment objects or sourced files. It's used to initialize a simple package that passes R CMD check cleanly. skeletor provides another non-devtools skeleton-building function with a wider set of defaults and options.
  • mason provides a fun, interactive tool for creating a package based on a variety of inputs.
  • Rcpp.package.skeleton() from Rcpp extends package.skeleton() with support for C++ via Rcpp: it modifies, e.g., DESCRIPTION and NAMESPACE accordingly, creates examples if needed, and lets the user specify (with a character vector of paths) which C++ files to include in the src directory. The user can also set the main DESCRIPTION entries.
  • mvbutils provides a variety of useful functions for development which include tools for managing and analyzing the development environment, auto-generating certain function types, and visualizing a function dependency graph. pagerank (not on CRAN) can calculate a package's PageRank from its dependency graph.
  • swagger (not on CRAN) uses the Swagger JSON web service API specification to automatically generate an R client package for a web service API.
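
As a minimal sketch of the first bullet above, the following uses utils::package.skeleton() to scaffold a package inside a temporary directory. The package name demoPkg and the function add_one are hypothetical placeholders chosen only for illustration.

```r
# Put the toy function in its own environment so that nothing is picked up
# from the global workspace by accident.
env <- new.env()
env$add_one <- function(x) x + 1

# Scaffold the package under a throwaway parent directory.
pkg_parent <- tempfile("skeleton-demo-")
dir.create(pkg_parent)

utils::package.skeleton(
  name = "demoPkg",       # hypothetical package name
  list = "add_one",       # names of the objects to include
  environment = env,      # where those objects live
  path = pkg_parent
)

# Inspect what was generated (DESCRIPTION, NAMESPACE, R/, man/, ...).
pkg_files <- list.files(file.path(pkg_parent, "demoPkg"), recursive = TRUE)
print(pkg_files)
```

The generated Read-and-delete-me file lists the remaining manual steps, such as completing the help files.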

R packages require a Version string in the DESCRIPTION file. Traditionally, packages have been versioned using a MAJOR.MINOR-PATCH format, sometimes using the version's date as the PATCH component. More recently, semantic versioning has become common. semver (GitHub) provides tools to parse and manipulate semantic version strings.
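
Base R can already parse and compare version strings of this kind. The sketch below uses package_version() and utils::compareVersion(); the version numbers themselves are made up for illustration.

```r
# Parse version strings as they might appear in a DESCRIPTION file.
v_dash <- package_version("1.2-3")   # MAJOR.MINOR-PATCH style
v_dot  <- package_version("1.10.0")  # three dot-separated components

# Comparison is component-wise rather than lexical: 10 > 2 in the minor slot.
newer <- v_dot > v_dash

# compareVersion() returns -1, 0, or 1 for a pair of version strings.
cmp <- utils::compareVersion("1.2-3", "1.10.0")
```

Comparing the strings directly with `<` would sort "1.10.0" before "1.2-3", which is why the dedicated helpers are preferable.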

When initializing a package, it is worth considering how it should be licensed. CRAN provides a list of the most commonly used software licences for R packages. osi (GitHub) provides a more comprehensive list in a standardized format.

Programming Paradigms

R is foremost a functional programming language with dynamic typing, but has three built-in forms of object-oriented programming as well as additional object-oriented paradigms available in add-on packages.

  • The built-in S3 system relies on generic functions (e.g., summary) that employ a distinct method for an object of a given class (i.e., it is possible to implement class-specific methods for a given generic function). If a package implements new object classes, it is common to implement methods for commonly used generics such as print, summary, etc. These methods must be registered in the package's NAMESPACE file. R.methodsS3 aims to simplify the creation of S3 generic functions and S3 methods.
  • S4 is a more formalized form of object orientation that is available through methods. S4 classes have formal definitions and can dispatch methods based on multiple arguments (not just the first argument, as in S3). S4 is notable for its use of the @ symbol to extract slots from S4 objects. John Chambers's "How S4 Methods Work" tutorial may serve as a useful introduction.
  • Reference classes were introduced in R 2.12.0 and are also part of methods. They offer a distinct paradigm from S3 and S4 due to the fact that reference class objects are mutable and that methods belong to objects, not generic functions.
  • aoos and R.oo are other packages facilitating object-oriented programming. R6 (GitHub) provides an alternative to reference classes without a dependency on methods.
  • proto provides a prototype-based object-oriented programming paradigm.
  • rtype provides a strong type system.
  • argufy (not on CRAN) provides a syntax for creating functions with strictly typed arguments, among other possible checks.
  • lambda.r, lambdaR (not on CRAN), and purrr provide interfaces for creating lambda (anonymous) functions.
  • functools (GitHub) provides higher-order functions (Map, Reduce, etc.) common in functional programming.
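
The three built-in object systems described above can be contrasted in a few lines. This is an illustrative sketch only; the generic names (area, norm2) and class names (circle, square, Point, Counter) are hypothetical.

```r
library(methods)

# S3: a generic dispatches on the class attribute of its first argument.
area <- function(shape, ...) UseMethod("area")
area.circle <- function(shape, ...) pi * shape$r^2
area.square <- function(shape, ...) shape$side^2

circle <- structure(list(r = 1), class = "circle")
square <- structure(list(side = 2), class = "square")

# S4: classes have formal definitions and slots are accessed with @.
setClass("Point", slots = c(x = "numeric", y = "numeric"))
setGeneric("norm2", function(p) standardGeneric("norm2"))
setMethod("norm2", "Point", function(p) sqrt(p@x^2 + p@y^2))
p <- new("Point", x = 3, y = 4)

# Reference classes: objects are mutable and methods belong to the object.
Counter <- setRefClass(
  "Counter",
  fields = list(n = "numeric"),
  methods = list(
    increment = function() {
      n <<- n + 1        # modifies the field in place
      invisible(.self)
    }
  )
)
cnt <- Counter$new(n = 0)
cnt$increment()
```

After this runs, area(square) is 4, norm2(p) is 5, and cnt$n is 1; note that cnt changed without being reassigned, unlike the S3 and S4 objects.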

Another feature of R is the ability to rely on both standard and non-standard evaluation of function arguments. Non-standard evaluation is seen in commonly used functions like library and subset and can also be used in packages.

  • substitute() provides the most straightforward interface to non-standard evaluation of function arguments.
  • lazyeval (GitHub) aims to help developers design packages with parallel function implementations that follow both standard and non-standard evaluation.
  • An increasingly popular form of non-standard evaluation involves chained expressions or "pipelines". magrittr provides the %>% chaining operator that passes the results of one expression evaluation to the next expression in the chain, as well as other similar piping operators. pipeR offers a larger set of pipe operators. assertr and ensurer provide (fairly similar) testing frameworks for pipelines.
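
A small base R sketch of non-standard evaluation with substitute() follows. The helper names show_expr and with_df are hypothetical and exist only for this illustration.

```r
# Capture the unevaluated argument expression alongside its value.
show_expr <- function(x) {
  expr <- substitute(x)
  list(code = deparse(expr), value = x)
}
res <- show_expr(1 + 2)

# Evaluate a captured expression with data frame columns in scope,
# similar in spirit to what subset() does.
with_df <- function(df, expr) {
  eval(substitute(expr), df, parent.frame())
}
sums <- with_df(data.frame(a = 1:3, b = 4:6), a + b)
```

Here res$code holds the text of the call as typed ("1 + 2") while res$value holds its result, and sums is computed from the columns a and b without quoting them.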

Dependency Management

Packages that depend on other packages need to be vigilant about changes to the functionality, behaviour, or API of those packages.

  • backports (GitHub) provides reimplementations of functions added to base R packages since v3.0.0, making them available in older versions of R. This gives package developers the ability to reduce or eliminate a dependency on specific versions of R itself.
  • packrat (GitHub) provides facilities for creating local package repositories to manage and check dependencies.
  • checkpoint relies on the Revolution Analytics MRAN repository to access packages from specified dates.
  • pacman (GitHub) can install, uninstall, load, and unload various versions of packages from CRAN and GitHub.
  • GRANBase (GitHub) provides some sophisticated tools for managing dependencies and testing packages conditional on changes.
  • Two packages currently provide non-standard ways to import objects from packages (e.g., to assign those objects different names from the names used in their host packages). import (GitHub) can import numerous objects from a namespace and assign arbitrary names. modules (not on CRAN) provides functionality for importing alternative non-package code from Python-like "modules".
  • Rocker is an initiative to create Docker configurations for R and packages. containerit (not on CRAN) can be used to package an R workspace and all dependencies as a Docker container.
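
Without any add-on package, a dependency on a particular R version or on an optional package can be softened with a guarded fallback. The sketch below is one possible pattern, not the mechanism used by any of the packages above; trimws() serves only as an example of a function that may be absent from older R versions, and strip_ws and the package name are hypothetical.

```r
# Use base::trimws() when the running R provides it; otherwise fall back to
# a small local reimplementation (simplified behaviour, illustration only).
strip_ws <- if (exists("trimws", envir = baseenv(), inherits = FALSE)) {
  get("trimws", envir = baseenv())
} else {
  function(x) gsub("^\\s+|\\s+$", "", x)
}

# Optional package dependencies can be probed without attaching them.
# The package name here is deliberately one that should not exist.
has_optional <- requireNamespace("aPackageThatIsNotInstalled", quietly = TRUE)

cleaned <- strip_ws("  padded  ")
```

Code that relies on has_optional can then branch to a slower or reduced code path instead of failing at load time.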

Source Code

Foreign Language Interfaces

  • R's base functions system(), system2(), and (on Windows) shell.exec() provide interfaces for calling system functions. sys (GitHub) and processx (not on CRAN) provide unified, platform-independent APIs for running system processes.
  • inline eases adding code in C, C++, or Fortran to R. It takes care of the compilation, linking and loading of embedded code segments that are stored as R strings.
  • Rcpp offers a number of C++ classes that make transferring R objects to C++ functions (and back) easier. RInside provides C++ classes for embedding R within C++ applications.
  • rGroovy integrates with the Groovy scripting language.
  • rJava provides a low-level interface to Java similar to the .Call interface for C and C++. helloJavaWorld provides an example rJava-based package. jvmr (archived on CRAN) provides a bi-directional interface to Java, Scala, and related languages, while rscala is designed specifically for Scala.
  • rustr provides bindings to Rust.
  • reach (not on CRAN) and matlabr provide rough interfaces to Matlab.
  • rPython, rJython, PythonInR, rpy2 (not on CRAN), and SnakeCharmR (not on CRAN) provide interfaces to Python. reticulate (GitHub) is a relatively recent interface built by RStudio.
  • RJulia (not on CRAN) provides an interface with Julia, as does XRJulia. RCall embeds R within Julia.
  • RStata is an interface with Stata. RCall embeds R in Stata.
  • tcltk, which is a package built in to R, provides a general interface to Tcl, useful especially for accessing Tcl/tk (for graphical interfaces). after (not on CRAN) uses tcltk to run R code in a separate event loop.
  • V8 offers an embedded JavaScript engine, useful for building packages around JavaScript libraries. js provides additional tools for working with JavaScript code.
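
As a sketch of the system2() interface from the first bullet, the following launches a child process and captures its standard output. To stay self-contained it calls the Rscript executable that ships with R rather than a foreign-language tool; this assumes Rscript is present under R.home("bin").

```r
# Locate the Rscript executable that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Run a one-line program in a child process and capture its stdout as a
# character vector (one element per output line).
out <- system2(
  rscript,
  args = c("--vanilla", "-e", shQuote("cat(1 + 1, fill = TRUE)")),
  stdout = TRUE
)
```

The same call shape applies to any other command-line program; shQuote() guards the expression against interpretation by the shell.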

The knitr package, which supplies various foreign language engines, can also be used to generate documents that call python, awk, ruby, haskell, bash, perl, dot, tikz, sas, coffeescript, and polyglot.

Writing packages that involve compiled code requires a developer toolchain. If developing on Windows, this requires Rtools, which is updated with each R minor release.

Debugging

  • log4r (GitHub) and logging provide logging functionality in the style of log4j.
  • loggr (not on CRAN) aims to provide a simplified logging interface without the need for withCallingHandlers() expressions.
  • rollbar reports messages and errors to Rollbar, a web service.
  • rchk provides tools for identifying memory-protection bugs in C code, including base R and packages.
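
For context, this is roughly what logging with a withCallingHandlers() expression looks like in base R, the pattern that simplified logging interfaces aim to hide. The variable names and the message text are hypothetical.

```r
log_lines <- character()

result <- withCallingHandlers(
  {
    warning("disk almost full")  # a condition raised somewhere in the code
    "finished"                   # value returned once execution resumes
  },
  warning = function(w) {
    # Record the warning, then resume execution without printing it.
    log_lines <<- c(log_lines, paste("WARN:", conditionMessage(w)))
    invokeRestart("muffleWarning")
  }
)
```

Unlike tryCatch(), the calling handler does not unwind the stack, so the block runs to completion and result is still "finished".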

Code Analysis and Formatting

  • codetools provides a number of low-level functions for identifying possible problems with source code.
  • lintr provides tools for checking source code compliance with a style guide.
  • formatR and rfmt (not on CRAN) can be used to neatly format source code.
  • FuncMap provides a graphical representation of function calls used in a package.
  • pkggraph (GitHub) and functionMap provide tools useful for understanding function dependencies within and across packages. atomize can quickly extract functions from within a package into their own package.
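
One of the low-level helpers in codetools is findGlobals(), which lists the global names a function refers to. The sketch below assumes codetools is installed (it is normally distributed with R); the function scale_by_factor and the variable scaling_factor are hypothetical.

```r
# A function that silently depends on a variable defined elsewhere.
scale_by_factor <- function(x) x * scaling_factor

# Split the global references into functions and variables.
globals <- codetools::findGlobals(scale_by_factor, merge = FALSE)
print(globals$variables)
```

An unexpected entry in globals$variables is often a sign of a typo or of a hidden dependency on the global environment.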

Profiling

  • Profiling data is provided by utils::Rprof() and can be summarized by utils::summaryRprof(). prof.tree (GitHub) provides an alternative output data structure to Rprof(). profmem (GitHub) adds a simple interface on top of this.
  • profr can visualize output from the Rprof interface for profiling.
  • proftools and aprof can also be used to analyse profiling output.
  • profvis (not on CRAN) provides an interactive, graphical interface for examining profile results.
  • lineprof (not on CRAN) provides a visualization tool for examining profiling results.
  • Rperform (not on CRAN) compares package performance across different git versions and branches.
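
A minimal profiling session with the base tools might look like the following. It assumes R was built with profiling support, and the workload is an arbitrary busy loop chosen so that the sampler records something.

```r
prof_file <- tempfile(fileext = ".out")

# Start sampling the call stack every 10 milliseconds.
utils::Rprof(prof_file, interval = 0.01)

# Arbitrary workload: keep sorting random numbers for about half a second.
started <- proc.time()[["elapsed"]]
while (proc.time()[["elapsed"]] - started < 0.5) {
  invisible(sort(runif(1e5)))
}

# Stop profiling and summarise time spent per function.
utils::Rprof(NULL)
profile_summary <- utils::summaryRprof(prof_file)
print(head(profile_summary$by.self))
```

The raw file written by Rprof() is also the input that the visualization packages listed above work from.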

Benchmarking

  • base::system.time() is a basic timing utility that calculates times based on one iteration of an expression.
  • microbenchmark and rbenchmark provide timings based on multiple iterations of an expression and potentially provide more reliable timings than system.time().
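
The difference between a single timing and repeated timings can be sketched with base R alone. The expression being timed and the number of repetitions are arbitrary choices for illustration.

```r
# One iteration: a single, possibly noisy, measurement.
one_run <- system.time(sum(sqrt(seq_len(1e6))))

# Several iterations: collect elapsed times and summarise them.
n_reps <- 20
elapsed <- vapply(
  seq_len(n_reps),
  function(i) system.time(sum(sqrt(seq_len(1e6))))[["elapsed"]],
  numeric(1)
)
typical <- median(elapsed)
```

Dedicated benchmarking packages automate this loop and use higher-resolution timers, but the idea is the same.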

Unit Testing

  • Packages should pass all basic code and documentation checks provided by the R CMD check quality assurance tools built in to R. rcmdcheck provides programmatic access to R CMD check from within R and callr (not on CRAN) provides a generic interface for calling R from within R.
  • R documentation files can contain demonstrative examples of package functionality. Complete testing of correct package performance is better reserved for the test directory. Several packages provide testing functionality, including RUnit, svUnit, testit (GitHub), testthat, testthatsomemore (not on CRAN), and pkgmaker. runittotestthat provides utilities for converting existing RUnit tests to testthat tests.
  • unitizer (GitHub) provides tools for regression testing by comparing test output against previous runs of the same tests. The package has extensive vignette-based documentation.
  • vdiffr (GitHub) can be used for graphical unit tests.
  • assertive, assertr, checkmate, ensurer, and assertthat provide test-like functions for use at run-time or in examples that will trigger messages, warnings, or errors if an R object differs from what is expected by the user or developer.
  • mockr (GitHub) provides tools to mock a function in a test context. This can be useful for standardizing the test of a function that calls other functions by fixing the output of those function dependencies.
  • covr and testCoverage (not on CRAN) offer utilities for monitoring how well tests cover a package's source code. These can be complemented by services such as Codecov or Coveralls that provide web interfaces for assessing code coverage.
  • withr (GitHub) provides functions to evaluate code within a temporarily modified global state, which may be useful for unit testing, debugging, or package development.
  • The use_revdep() and revdep_check() functions from devtools can be used to test reverse package dependencies to ensure code changes have not affected downstream package functionality. crandalf (not on CRAN) provides an alternative mechanism for testing reverse dependencies.
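
The run-time assertions mentioned above have a bare-bones counterpart in base R's stopifnot(). The following sketch shows an assertion inside a function and a test that exercises both the passing and the failing path; safe_log is a hypothetical function.

```r
# A function that validates its input before doing any work.
safe_log <- function(x) {
  stopifnot(is.numeric(x), all(x > 0))
  log(x)
}

# Passing case: valid input yields the expected value.
ok <- safe_log(exp(2))

# Failing case: invalid input raises an error, which the test captures.
err <- tryCatch(safe_log(-1), error = function(e) conditionMessage(e))
```

Testing frameworks wrap this same idea in expectation functions with clearer failure reports.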

Internationalization and Localization

  • There is no standard mechanism for translation of package documentation into languages other than English. To create non-English documentation requires manual creation of supplemental .Rd files or package vignettes. Packages supplying non-English documentation should include a Language field in the DESCRIPTION file.
  • R provides useful features for the localization of diagnostic messages, warnings, and errors from functions at both the C and R levels based on GNU gettext. "Translating R Messages" describes the process of creating and installing message translations.
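
At the R level, messages are marked for translation by routing them through gettext(), gettextf(), or ngettext(). The sketch below runs outside any package, so no translation catalog applies and the English strings are returned unchanged; the message texts are hypothetical.

```r
n_files <- 3L

# ngettext() picks the singular or plural form according to n.
msg <- sprintf(ngettext(n_files, "Found %d file", "Found %d files"), n_files)

# gettextf() combines message lookup with sprintf-style formatting.
not_found <- gettextf("Package '%s' not found", "somePackage")
```

Inside a package, the same calls look up translations in the package's own message domain when a matching catalog is installed.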

Creating Graphical Interfaces

  • For simple interactive interfaces, readline() can be used to create a simple prompt. getPass provides cross-platform mechanisms for securely requesting user input without displaying the input (e.g., for passwords). utils::menu() and utils::select.list() can provide graphical and console-based selection of items from a list, and utils::txtProgressBar() provides a simple text progress bar.
  • tcltk is an R base package that provides a large set of tools for creating interfaces using Tcl/tk (most functions are thin wrappers around corresponding Tcl and tk functions), though the documentation is sparse. tcltk2 provides additional widgets and functionality. qtbase provides bindings for Qt. RGtk (not on CRAN) provides bindings for Gtk and gnome. gWidgets2 offers a language-independent API for building graphical user interfaces in Gtk, Qt, or Tcl/tk.
  • fgui can create a Tcl/tk interface for any arbitrary function.
  • shiny provides a browser-based infrastructure for creating dashboards and interfaces for R functionality. htmlwidgets is a shiny enhancement that provides a framework for creating HTML widgets.
  • progress (GitHub) offers progress bars for the terminal, including a C++ API.
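
The text progress bar from utils mentioned in the first bullet can be driven as follows. The loop body is a stand-in for real work, and the step count is arbitrary.

```r
n_steps <- 20
pb <- utils::txtProgressBar(min = 0, max = n_steps, style = 3)

for (i in seq_len(n_steps)) {
  Sys.sleep(0.01)                  # placeholder for a unit of real work
  utils::setTxtProgressBar(pb, i)  # advance the bar
}

final_value <- utils::getTxtProgressBar(pb)
close(pb)
```

Style 3 prints a percentage next to the bar; calling close() finishes the line so later output starts cleanly.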

Command Line Argument Parsing

Using Options in Packages

  • pkgconfig (GitHub) allows developers to set package-specific options, which will not affect options set or used by other packages.
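
A common base R convention, shown here only as a point of comparison, is to prefix option names with the package name and read them with getOption() and a default. The option name mypkg.verbose and the helper is_verbose are hypothetical.

```r
# Read a package-prefixed option, falling back to FALSE when it is unset.
is_verbose <- function() {
  isTRUE(getOption("mypkg.verbose", default = FALSE))
}

before <- is_verbose()

# options() returns the previous values, which makes restoring them easy.
old <- options(mypkg.verbose = TRUE)
during <- is_verbose()

options(old)
after <- is_verbose()
```

Options set this way are global to the session, which is the limitation that package-specific option mechanisms address.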

Documentation

Writing Package Documentation

Package documentation is written in a TeX-like format as .Rd files that are stored in the man subdirectory of a package. These files are compiled to plain text, HTML, or PDF by R as needed.

  • One can write .Rd files directly. A popular alternative is to rely on roxygen2, which uses special markup in R source files to generate documentation files before a package is built. This functionality is provided by roxygen2::roxygenise() and underlies devtools::document(). roxygen2 eliminates the need to learn some of the formatting requirements of an .Rd file at the cost of adding a step to the development process (the need to roxygenise before calling R CMD build). Recent versions of roxygen2 support full markdown-based documentation without the need for any native Rd formatting.
  • Rd2roxygen can convert existing .Rd files to roxygen source documentation, facilitating the conversion of existing documentation to a roxygen workflow. roxygen2md (not on CRAN) provides tools for further converting Rd markup within roxygen comments to markdown format (supported by the latest versions of roxygen2).
  • roxyPackage (not on CRAN) provides some additional functionality for maintaining package documentation.
  • inlinedocs and documair provide further alternative documentation schemes based on source code commenting.
  • tools::parse_Rd() can be used to manipulate the contents of an .Rd file. tools::checkRd() is useful for validating an .Rd file. Duncan Murdoch's "Parsing Rd files" tutorial is a useful reference for advanced use of R documentation. Rdpack provides additional tools for manipulating documentation files.
  • packagedocs and pkgdown (not on CRAN) can be used to generate static websites from R documentation files.
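
To illustrate the special markup that roxygen2 reads from R source files, here is a small documented function. The function add_numbers is hypothetical; the comment block is what would be turned into an .Rd file when the package is roxygenised.

```r
#' Add two numbers
#'
#' @param x A numeric vector.
#' @param y A numeric vector.
#' @return The element-wise sum of `x` and `y`.
#' @examples
#' add_numbers(1, 2)
#' @export
add_numbers <- function(x, y) {
  x + y
}
```

Because the markup lives in comments, the file remains ordinary R code and can be sourced as usual.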

Writing Vignettes

Package vignettes provide additional documentation of package functionality that is not tied to a specific function (as in an .Rd file). Historically, vignettes were used to explain the statistical or computational approach taken by a package in an article-like format that would be rendered as a PDF document using Sweave. Since R 3.0.0, non-Sweave vignette engines have also been supported, including knitr, which can produce Sweave-like PDF vignettes but can also support HTML vignettes that are written in R-flavored markdown. To use a non-Sweave vignette engine, the vignette needs to start with a code block indicating the package and function to be used:

    %\VignetteEngine{knitr::knitr}
    %\VignetteIndexEntry{}

Spell Checking

Data in Packages

  • lazyData offers the ability to use data contained within packages that have not been configured using LazyData.
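
For comparison, base R's utils::data() loads a named dataset from a package explicitly, without attaching the package. The sketch uses the cars dataset from the datasets package that ships with R and loads it into a separate environment to keep the workspace clean.

```r
# Load one dataset from a package into its own environment.
data_env <- new.env()
utils::data("cars", package = "datasets", envir = data_env)

loaded <- ls(data_env)
print(head(data_env$cars))
```

Packages that do not lazy-load their data require an explicit call of this kind before the dataset can be used.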

Tools and Services

Text Editors and IDEs

Makefiles

  • GNU Make is a tool that typically builds executable programs and libraries from source code by reading files called Makefile. It can be used to manage R packages as well; maker is a Makefile completely devoted to R package development based on makeR.
  • remake (not on CRAN) provides a yaml-based, Makefile-like format that can be used in Make-like workflows from within R.

Version Control

  • R itself is maintained under version control using Subversion.
  • Many packages are maintained using git, particularly those hosted on GitHub. git2r (GitHub) provides bindings to libgit2 for programmatic use of git within R.

Hosting and Package Building Services

Many hosting services are available. Use of different hosts depends largely on what type of version control software is used to maintain a package. The most common sites are:

  • R-Forge, which relies on Subversion. Rforge.net is another popular Subversion-based system.
  • r-hub is a modern package test service funded by the R Consortium. rhub (not on CRAN) provides an R client for the site's API.
  • GitHub hosts Git repositories. Packages hosted on GitHub can be installed directly using devtools::install_github(), ghit::install_github(), or remotes::install_github(). gh (not on CRAN) is a lightweight client for the GitHub API. githubtools (not on CRAN) provides some resources for including GitHub-related links in package documentation and for analyzing packages installed from GitHub.
  • Bitbucket is an alternative host that provides no-cost private repositories, and GitLab is an open source alternative. gitlabr provides an API client for managing GitLab projects.
  • GitHub supports continuous integration for R packages. Travis CI is a popular continuous integration tool that supports Linux and OS X build environments. Travis has native R support, and can easily provide code coverage information via covr to Codecov.io or Coveralls. travisci (not on CRAN) provides an API client for Travis. Use of other CI services, such as Circle CI, may require additional code; examples are available from r-travis and/or r-builder. circleci (not on CRAN) provides an API client for Circle CI. badgecreatr (GitHub) provides a convenient way of creating standardized badges (or "shields") for package READMEs.
  • WinBuilder is a service intended for useRs who do not have Windows available for checking and building Windows binary packages. The package sources (after an R CMD check) can be uploaded via html form or passive ftp in binary mode; after checking/building a mail will be sent to the Maintainer with links to the package zip file and logs for download/inspection. Appveyor is a continuous integration service that offers a Windows build environment. r-appveyor (not on CRAN) and appveyor (not on CRAN) provide API clients for Appveyor.
  • Rocker provides containers for use with Docker. harbor can be used to control Docker containers on remote and local hosts, and dockertest provides facilities for running tests on Docker.
  • Some packages, especially some that are no longer under active development, remain hosted on Google Code. This service is closed to new projects, however, and will shut down in January 2016.
  • drat can be used to distribute pre-built packages via GitHub or another server.
  • CRAN does not provide package download statistics, but the RStudio CRAN mirror does. packagetrackr (Source) facilitates downloading and analyzing those logs.

CRAN packages:

Related links: