Welcome to the report factory!
The report factory in a nutshell
The report factory is an R package providing a lightweight infrastructure for
.Rmd reports stored in a dedicated folder, with naming
conventions enforcing the date of the report to be part of the file name. Each
compiled report is rendered using
rmarkdown::render(), and newly created files
are stored in a dedicated, time-stamped folder. Several functions facilitate
this workflow, making it possible to compile all documents, or only the most recent ones.
Some projects may require different types of analyses to be run repeatedly over time, for instance due to updates in data and inputs.
While rmarkdown::render() is becoming a standard for compiling a single analysis
document, a number of issues remain: one needs to keep track of different
versions of the data/inputs, of the analysis code itself, and of different
versions of the outputs.
The report factory aims to facilitate these tasks by:
- defining a report as an explicitly dated .Rmd file
- providing functions to compile all reports, using by default the most recent versions
- storing each report output in a separate, time-stamped folder
- maintaining compatibility with basic workflows, i.e. all reports can still be compiled directly using rmarkdown::render() for testing purposes (although one should make sure to remove the outputs afterwards)
- keeping things simple: no configuration files, no handling of potential dependencies between reports, no caching
- git-friendly: the factory is compatible with git-based workflows, with produced outputs being ignored by git
Components of the factory
A report factory is a folder with the following structure (see next section for creating new factories):
- report_sources/: folder storing .Rmd reports (possibly in sub-folders); files must be named using the convention [report_base_name]_[yyyy-mm-dd].Rmd; the date format is very important, as it is used to identify the latest version of each report; examples: incidence_curve_flu_2017-12-23; note that this folder should not contain any outputs of the reports
- report_outputs/: outputs of the reports, automatically generated by the factory, in dedicated, time-stamped folders (e.g. contacts_2017-10-29/compiled_2018-06-18_20-50-28/)
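The role of the date in the file name can be illustrated with a minimal base R sketch. It is not the package's own code: the file names are illustrative and the parsing logic is an assumption about how a "latest version" could be picked from names following the convention above.

```r
# Illustrative file names following [report_base_name]_[yyyy-mm-dd].Rmd
files <- c("contacts_2017-10-29.Rmd", "contacts_2017-11-01.Rmd",
           "epicurve_2017-10-27.Rmd", "epicurve_2017-10-30.Rmd")

# Split each file name into its base name and its date
pattern <- "^(.*)_([0-9]{4}-[0-9]{2}-[0-9]{2})\\.Rmd$"
base_names <- sub(pattern, "\\1", files)
dates <- as.Date(sub(pattern, "\\2", files))

# For each base name, keep the file with the most recent date
latest <- vapply(
  split(seq_along(files), base_names),
  function(idx) files[idx][which.max(as.numeric(dates[idx]))],
  character(1)
)
latest
```

Because the date is written as yyyy-mm-dd, it converts cleanly with as.Date() and sorts chronologically, which is why the format matters.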
reportfactory in practice
Installing the package
To install the development version of the package, use:
Note that this requires the devtools package to be installed.
What does it do?
The main features of the package include:
- new_factory(): creates a new report factory, by default adding examples of reports ready for compilation
- list_reports(): lists available reports
- list_outputs(): lists available outputs
- list_deps(): lists packages needed in the reports; use the option missing = TRUE to list only packages that are missing and need to be installed
- install_deps(): installs packages needed in the reports; by default, only missing packages are installed; use update = TRUE to force the installation of all packages
- validate_factory(): checks that the factory is valid, that all report names are unique, etc.
- compile_report(): compiles one specific report (name to be matched against the output of list_reports())
- update_reports(): compiles every report, using by default the latest version of each report; use the option all = TRUE to compile all reports (including old ones)
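To give an idea of what listing dependencies involves, here is a rough base R sketch. It is an assumption made for illustration, not how list_deps() is implemented: it scans a few hypothetical lines of report code for library() calls and pkg:: prefixes, then checks which packages cannot be loaded.

```r
# Hypothetical lines of R code taken from a report (illustration only)
rmd_lines <- c("library(incidence)",
               "library(ggplot2)",
               "x <- read.csv(here::here('data/linelist.csv'))")

# Packages attached with library()
lib_calls <- regmatches(rmd_lines,
                        regexpr("library\\([A-Za-z0-9.]+\\)", rmd_lines))
pkgs <- sub("library\\((.*)\\)", "\\1", lib_calls)

# Packages used through the pkg:: prefix
ns <- unlist(regmatches(rmd_lines,
                        gregexpr("[A-Za-z0-9.]+(?=::)", rmd_lines, perl = TRUE)))

# All packages needed, and the subset that is not installed
needed <- sort(unique(c(pkgs, ns)))
not_installed <- needed[!vapply(needed, requireNamespace, logical(1),
                                quietly = TRUE)]
needed
```

The names rmd_lines, needed and not_installed are hypothetical; in practice list_deps() and list_deps(missing = TRUE) provide this information directly.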
Note that manual compilation of each document can still be done the usual way,
using rmarkdown::render() in the source folder; if you do so, make sure you
remove all output files, as they would prevent further updates from the factory.
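Cleaning up after a manual compilation can be sketched in base R as follows. This example works on a throwaway folder created in tempdir(), and it assumes that anything other than an .Rmd file in the source folder is a leftover output; adapt this rule if your sources legitimately include other files.

```r
# Throwaway folder mimicking report_sources/ (illustration only)
src <- file.path(tempdir(), "report_sources_demo")
dir.create(src, showWarnings = FALSE)
file.create(file.path(src, c("foobar_2018-01-25.Rmd",
                             "foobar_2018-01-25.html")))

# Assumption: any file that is not an .Rmd is a leftover output
all_files <- list.files(src, recursive = TRUE, full.names = TRUE)
leftovers <- all_files[!grepl("\\.Rmd$", all_files)]
basename(leftovers)

# Remove the leftovers so that only report sources remain
file.remove(leftovers)
list.files(src)
```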
A typical workflow is:
1. create a new factory using new_factory() and move into this new folder
2. in report_sources/, write your .Rmd report, using the provided examples as inspiration; remove the example files; make sure you use the naming conventions explained above, e.g. foobar_2018-01-25.Rmd
3. check your report by compiling the .Rmd manually if needed, e.g. rmarkdown::render("foobar_2018-01-25.Rmd"); once you are happy with the results, make sure you remove all output files from the source folder
4. run update_reports() to generate all outputs, or compile_report("foobar_2018-01-25") if you just want to produce time-stamped outputs for this report; check results in the folder report_outputs/
We start by creating a new factory in the temporary folder:

    library(reportfactory)
    #>
    #> Attaching package: 'reportfactory'
    #> The following object is masked from 'package:devtools':
    #>
    #>     install_deps

    destination <- file.path(tempdir(), "new_factory")
    destination
    #> [1] "/tmp/RtmpyESa5b/new_factory"

    new_factory(destination)
    #> [1] "/tmp/RtmpyESa5b/new_factory"

    dir()
    #> [1] "data"           "README.md"      "report_sources"

By default, examples of reports (using simulated epidemiological data) are added to the factory; these can be listed by:

    list_reports()
    #> [1] "contacts_2017-10-29.Rmd" "contacts_2017-10-30.Rmd"
    #> [3] "contacts_2017-11-01.Rmd" "epicurve_2017-10-27.Rmd"
    #> [5] "epicurve_2017-10-28.Rmd" "epicurve_2017-10-30.Rmd"

    list_outputs()
    #> character(0)

    list_deps() # list all needed packages
    #> [1] "earlyR"      "epicontacts" "ggplot2"     "here"        "incidence"
    #> [6] "knitr"       "magrittr"    "projections" "readxl"

    list_deps(missing = TRUE) # list only missing ones
    #> character(0)

To compile a single report, one can use:

    compile_report("contacts_2017-10-29", quiet = TRUE)
    #>
    #> /// compiling report: 'contacts_2017-10-29'
    #>
    #> /// 'contacts_2017-10-29' done!

    list_outputs()
    #> [1] "contacts_2017-10-29/compiled_2018-06-18_20-50-28/contacts_2017-10-29.html"

To compile all reports (only most recent versions), use:

    update_reports()
    #>
    #> /// compiling report: 'contacts_2017-11-01'
    #>
    #> /// 'contacts_2017-11-01' done!
    #>
    #> /// compiling report: 'epicurve_2017-10-30'
    #>
    #> /// 'epicurve_2017-10-30' done!

    list_outputs()
    #> [1] "contacts_2017-10-29/compiled_2018-06-18_20-50-28/contacts_2017-10-29.html"
    #> [2] "contacts_2017-11-01/compiled_2018-06-18_20-50-29/contacts_2017-11-01.html"
    #> [3] "epicurve_2017-10-30/compiled_2018-06-18_20-50-32/epicurve_2017-10-30.html"

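The output paths listed above follow a [report_name]/compiled_[yyyy-mm-dd]_[hh-mm-ss]/ pattern. As a minimal sketch, such a time-stamped path could be built in base R as shown below; this only mirrors the pattern seen in the listing and is not taken from the package's source. The variable names are hypothetical.

```r
# Name of the report being compiled (taken from the example above)
report_name <- "contacts_2017-10-29"

# Time stamp in the same layout as the output folders listed above
stamp <- format(Sys.time(), "%Y-%m-%d_%H-%M-%S")

# Path of the dedicated output folder for this compilation
out_dir <- file.path("report_outputs", report_name,
                     paste0("compiled_", stamp))
out_dir
```

Since each compilation gets its own folder, outputs of earlier runs are never overwritten.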
Referring external files in reports
In .Rmd reports, all file paths should be referred to using here::here(),
assuming a path from the root directory.
Where to put your data / how to call them?
We recommend storing all data in a data/ folder in the root directory. When
loading your data in the reports, make sure you use here::here(), e.g.:

    my_data <- read.csv(here::here("data/linelist_2018-06-11.csv"))

Where to put scripts / how to call them?
The rationale is the same as for data: store your scripts in a dedicated folder
at the root of the project, e.g. scripts/, and source them from R using
source() with a path built by here::here().
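As a small sketch of this pattern, assuming the here package is installed and using a hypothetical script name:

```r
# Hypothetical script name, used only for illustration
script_name <- "clean_linelist.R"

# Build the path from the project root; requires the 'here' package
script_path <- here::here("scripts", script_name)

# Source the script only if it exists, so this sketch never errors
if (file.exists(script_path)) {
  source(script_path)
}
```

Building the path with here::here() keeps the call valid whether the report is compiled by the factory or rendered manually from report_sources/.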
Where to put other files / how to call them?
Follow the same idea as for data and scripts, as long as you do not alter the
minimum infrastructure (report_sources/ and other files).
Contributors (by alphabetic order):
See details of contributions on:
Contributions are welcome via pull requests.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
Maintainer: Thibaut Jombart (firstname.lastname@example.org)