---
title: "A Quick Introduction to liftr"
author: "Nan Xiao <<https://nanx.me>>"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    number_sections: true
    css: liftr.css
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{A Quick Introduction to liftr}
---
# Introduction
In essence, liftr aims to solve the problem of _persistent reproducible reporting_.
To achieve this goal, it extends the [R Markdown](http://rmarkdown.rstudio.com)
metadata format, and uses Docker to containerize and render R Markdown documents.
# Metadata for containerization
To containerize your R Markdown document, the first step is adding `liftr`
fields to the YAML metadata section of the document. For example:
```yaml
---
title: "The Missing Example of liftr"
author: "Author Name"
date: "`r Sys.Date()`"
output: rmarkdown::html_document
liftr:
  maintainer: "Maintainer Name"
  email: "name@example.com"
  from: "rocker/r-base:latest"
  pandoc: true
  texlive: false
  sysdeps:
    - gfortran
  cran:
    - glmnet
  bioc:
    - Gviz
  remotes:
    - "road2stat/liftr"
  include: "DockerfileSnippet"
---
```
All available metadata fields are explained below.
## Required metadata
- `maintainer`
Maintainer's name for the Dockerfile.
- `email`
Maintainer's email address for the Dockerfile.
## Optional metadata
- `from`
Base image for building the docker image. Default is
`"rocker/r-base:latest"`. For R users, the images offered
by the [rocker project](https://github.com/rocker-org)
and [Bioconductor](https://bioconductor.org/help/docker/)
can be considered first.
- `pandoc`
Whether to install pandoc in the container. Default is `true`.
If pandoc is already installed in the base image, this should be
set to `false` to avoid potential errors. For example, for
[`rocker/rstudio` images](https://registry.hub.docker.com/u/rocker/rstudio/)
and [`bioconductor/...` images](https://www.bioconductor.org/help/docker/),
this option will be automatically set to `false` since they already
have pandoc installed.
- `texlive`
Whether a TeX environment is needed to render the document. Default is `false`.
Set this to `true` when the output format is PDF.
- `sysdeps`
Debian/Ubuntu system packages that the document depends on.
Please also include the system packages required by the R packages
listed below. For example, here `gfortran` is required for compiling `glmnet`.
- `cran`
CRAN packages that the document depends on.
If only `pkgname` is provided, `liftr` will install the _latest_
version of the package on CRAN. To improve reproducibility,
we recommend giving the package name with an explicit version number:
`pkgname/pkgversion` (e.g. `ggplot2/1.0.0`), even if that version
is currently the latest. Note: `pkgversion` must be provided
to install archived versions of packages.
- `bioc`
Bioconductor packages that the document depends on.
- `remotes`
Remote R packages that are not available from CRAN or Bioconductor.
The [remote package naming specification](https://github.com/hadley/devtools/blob/master/vignettes/dependencies.Rmd)
from devtools is adopted here. Packages can be installed from GitHub,
Bitbucket, Git/SVN servers, URLs, etc.
- `include`
The path to a text file that contains a custom Dockerfile snippet.
The snippet will be included in the generated Dockerfile.
This can be used to install additional software packages
or further configure the system environment.
Note that this file should be in the same directory as the
input R Markdown file.
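
To make the `pkgname/pkgversion` convention of the `cran` field concrete, the sketch below splits such an entry into a package name and an optional version. The helper `split_cran_entry()` is a hypothetical name introduced only for illustration. It is not part of liftr and does not necessarily reflect how liftr parses the field internally.

```r
# Hypothetical helper (not part of liftr): split a `cran` entry of the
# form "pkgname" or "pkgname/pkgversion" into its two components.
split_cran_entry = function(entry) {
  parts = strsplit(entry, "/", fixed = TRUE)[[1]]
  list(
    name = parts[1],
    # NA means no version was pinned, i.e. the latest CRAN release
    version = if (length(parts) >= 2) parts[2] else NA_character_
  )
}

split_cran_entry("ggplot2/1.0.0")  # name "ggplot2", version "1.0.0"
split_cran_entry("glmnet")         # name "glmnet", version NA
```

An entry without a version is the case where the installed package can change over time, which is why pinning is recommended above.
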
# Containerize the document
After adding proper `liftr` metadata to the document YAML data block,
we can use `lift()` to parse the document and generate a Dockerfile.
We will use
[a minimal example](https://github.com/road2stat/liftr/blob/master/inst/examples/liftr-minimal.Rmd)
included in the liftr package. First, we create a new directory and copy
the R Markdown document into the directory:
```{r, eval = FALSE}
path = "~/liftr-minimal/"
dir.create(path)
file.copy(system.file("examples/liftr-minimal.Rmd", package = "liftr"), path)
```
Then, we use `lift()` to parse the document and generate the Dockerfile:
```{r, eval = FALSE}
library("liftr")
input = paste0(path, "liftr-minimal.Rmd")
lift(input)
```
After successfully running `lift()`, the Dockerfile will be in the
`~/liftr-minimal/` directory.
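
Before rendering, it can help to read the generated file and confirm that the base image and packages match the metadata. The sketch below uses only base R. It assumes that the file is named `Dockerfile` and sits in the same directory as the input document.

```r
# Assumption: lift() wrote a file named "Dockerfile" next to the input.
dockerfile = file.path(path.expand("~/liftr-minimal"), "Dockerfile")

if (file.exists(dockerfile)) {
  # Print the generated Dockerfile for review
  cat(readLines(dockerfile), sep = "\n")
} else {
  message("No Dockerfile found; run lift() first.")
}
```
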
# Render the document
Now we can use `render_docker()` to render the document into an HTML file
inside a Docker container:
```{r, eval = FALSE}
render_docker(input)
```
The function `render_docker()` will parse the Dockerfile, build a new
Docker image, and run a Docker container to render the input document.
If successfully rendered, the output `liftr-minimal.html` will be in
the `~/liftr-minimal/` directory. You can also pass additional
`rmarkdown::render()` arguments to this function.
In order to share the dockerized R Markdown document, simply share the
`.Rmd` file. Other users can use the `lift()` and `render_docker()`
functions to render the document as above.
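
For someone who receives a shared `.Rmd` file, the two steps can be wrapped in a single call. The wrapper below is a minimal sketch: `lift_and_render()` is a hypothetical name, not a liftr function, and it assumes that liftr and Docker are both installed on the recipient's machine.

```r
# Hypothetical convenience wrapper (not part of liftr).
lift_and_render = function(input) {
  stopifnot(file.exists(input))
  liftr::lift(input)           # generate the Dockerfile from the metadata
  liftr::render_docker(input)  # build the image and render in a container
}

# Usage, with a file received from a collaborator:
# lift_and_render("~/liftr-minimal/liftr-minimal.Rmd")
```
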
# Housekeeping
Normally, the argument `prune` is set to `TRUE` in `render_docker()`.
This means any dangling containers or images due to unsuccessful
builds will be automatically cleaned.
To clean up dangling containers, dangling images, or everything at once
without specifying names, use `prune_container_auto()`,
`prune_image_auto()`, and `prune_all_auto()`, respectively.
If you wish to manually remove the Docker container or
image (whose information is stored in an output YAML file)
after successful rendering, use `prune_container()` and `prune_image()`:
```{r, eval = FALSE}
prune_image(paste0(path, "liftr-minimal.docker.yml"))
```
The YAML file passed as input above contains the basic information about the
Docker container, the image, and the commands used to render the document.
It is generated by setting `prune_info = TRUE` (default) in `render_docker()`.
# Install Docker
Docker is an essential system requirement when using liftr to render
R Markdown documents. `install_docker()` will help you find the
proper guide to install and set up Docker on your system.
To check if Docker is correctly installed, use `check_docker_install()`;
to check if the Docker daemon is running, use `check_docker_running()`.
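
As a rough complement to these helpers, the base R sketch below checks whether a `docker` executable can be found on the `PATH`. It is an independent illustration, not how liftr implements its checks, and it does not verify that the daemon is running. The name `docker_on_path()` is hypothetical.

```r
# Hypothetical helper (not part of liftr): is a docker executable on the PATH?
docker_on_path = function() {
  # Sys.which() returns "" when the executable cannot be found
  nzchar(Sys.which("docker")[[1]])
}

if (!docker_on_path()) {
  message("Docker was not found on the PATH; see install_docker().")
}
```
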
In particular, Linux users should configure Docker to
[run without sudo](https://docs.docker.com/engine/installation/linux/linux-postinstall/).