-
Notifications
You must be signed in to change notification settings - Fork 10
/
z-advanced-topic-building-an-environment-for-literate-programming.Rmd
363 lines (300 loc) · 12 KB
/
z-advanced-topic-building-an-environment-for-literate-programming.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
---
title: "z - Advanced topic: Building an environment for literate programming"
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{z-advanced-topic-building-an-environment-for-literate-programming}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
```{r, include=FALSE}
library(rix)
```
## Introduction
This vignette will walk you through setting up a development environment with
`{rix}` that can be used to compile Quarto documents into PDFs. We are going to
use the [Quarto template for the JSS](https://github.com/quarto-journals/jss) to
illustrate the process. The first section will show a simple way of achieving
this, which will also be ideal for interactive development (writing the doc).
The second section will discuss a way to build the document in a completely
reproducible manner once it's done.
## Starting with the basics (simple but not entirely reproducible)
This approach will not be the most optimal, but it will be the simplest. We will
start by building a development environment with all our dependencies, and we
can then use it to compile our document interactively. But this approach is not
quite reproducible and requires manual actions. In the next section we will show
you to build a 100% reproducible document in a single command.
Since we need both the `{quarto}` R package as well as the `quarto` engine, we
add both of them to the `r_pkgs` and `system_pkgs` of arguments of `{rix}`.
Because we want to compile a PDF, we also need to have `texlive` installed, as
well as some LaTeX packages. For this, we use the `tex_pkgs` argument:
```{r parsermd-chunk-1}
path_default_nix <- tempdir()
rix(r_ver = "4.3.1",
r_pkgs = c("quarto"),
system_pkgs = "quarto",
tex_pkgs = c("amsmath"),
ide = "other",
shell_hook = "",
project_path = path_default_nix,
overwrite = TRUE,
print = TRUE)
```
(Save these lines into a script called `build_env.R` for instance, and run the
script into a new folder made for this project.)
By default, `{rix}` will install the "small" version of the `texlive`
distribution available on Nix. To see which `texlive` packages get installed
with this small version, you can click
[here](https://search.nixos.org/packages?channel=unstable&show=texlive.combined.scheme-small&from=0&size=50&sort=relevance&type=packages&query=scheme-small).
We start by adding the `amsmath` package then build the environment using:
```
nix-build
```
from a terminal, or `nix_build()` from an interactive R session.
Then, drop into the Nix shell with `nix-shell`, and run `quarto add
quarto-journals/jss`. This will install the template linked above. Then, in the
folder that contains `build_env.R`, the generated `default.nix` and `result`
download the following files from
[here](https://github.com/quarto-journals/jss/):
- article-visualization.pdf
- bibliography.bib
- template.qmd
and try to compile `template.qmd` by running:
```
quarto render template.qmd --to jss-pdf
```
You should get the following error message:
```
Quitting from lines 99-101 [unnamed-chunk-1] (template.qmd)
Error in `find.package()`:
! there is no package called 'MASS'
Backtrace:
1. utils::data("quine", package = "MASS")
2. base::find.package(package, lib.loc, verbose = verbose)
Execution halted
```
So there's an R chunk in `template.qmd` that uses the `{MASS}` package. Change
`build_env.R` to generate a new `default.nix` file that will now add `{MASS}` to
the environment when built:
```{r parsermd-chunk-2}
rix(r_ver = "4.3.1",
r_pkgs = c("quarto", "MASS"),
system_pkgs = "quarto",
tex_pkgs = c("amsmath"),
ide = "other",
shell_hook = "",
project_path = path_default_nix,
overwrite = TRUE,
print = TRUE)
```
Trying to compile the document results now in another error message:
```
compilation failed- no matching packages
LaTeX Error: File `orcidlink.sty' not found
```
This means that the LaTeX `orcidlink` package is missing, and we can solve the
problem by adding `"orcidlink"` to the list of `tex_pkgs`. Rebuild the
environment and try again to compile the template. Trying again yields a new
error:
```
compilation failed- no matching packages
LaTeX Error: File `tcolorbox.sty' not found.
```
Just as before, add the `tcolorbox` package to the list of `tex_pkgs`. You will
need to do this several times for some other packages. There is unfortunately no
easier way to list the dependencies and requirements of a LaTeX document.
This is what the final script to build the environment looks like:
```{r parsermd-chunk-3}
rix(r_ver = "4.3.1",
r_pkgs = c("quarto", "MASS"),
system_pkgs = "quarto",
tex_pkgs = c(
"amsmath",
"environ",
"fontawesome5",
"orcidlink",
"pdfcol",
"tcolorbox",
"tikzfill"
),
ide = "other",
shell_hook = "",
project_path = path_default_nix,
overwrite = TRUE,
print = TRUE)
```
The template will now compile with this environment. To look for a LaTeX
package, you can use the [search engine on
CTAN](https://ctan.org/pkg/orcidlink?lang=en).
As stated in the beginning of this section, this approach is not the most
optimal, but it has its merits, especially if you're still working on the
document. Once the environment is set up, you can simply work on the doc and
compile it as needed using `quarto render`. In the next section, we will explain
how to build a 100% reproducible document.
## 100% reproducible literate programming
Let's not forget that Nix is not just a package manager, but also a programming
language. The `default.nix` files that `{rix}` generates are written in this
language, which was made entirely for the purpose of building software. If you
are not a developer, you may not realise it but the process of compiling a
Quarto or LaTeX document is very similar to the process of building any piece of
software. So we can use Nix to compile a document in a completely reproducible
environment.
First, let's fork the repo that contains the Quarto template we need. We will
fork [this repo](https://github.com/quarto-journals/jss). This repo contains the
`template.qmd` file that we can change (which is why we fork it, in practice we
would replace this `template.qmd` by our own, finished, source `.qmd` file). Now
we need to change our `default.nix`:
```
let
pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};
rpkgs = builtins.attrValues {
inherit (pkgs.rPackages) quarto MASS;
};
tex = (pkgs.texlive.combine {
inherit (pkgs.texlive) scheme-small amsmath environ fontawesome5 orcidlink pdfcol tcolorbox tikzfill;
});
system_packages = builtins.attrValues {
inherit (pkgs) R quarto;
};
in
pkgs.mkShell {
buildInputs = [ rpkgs tex system_packages ];
}
```
to the following:
```
let
pkgs = import (fetchTarball "https://github.com/NixOS/nixpkgs/archive/976fa3369d722e76f37c77493d99829540d43845.tar.gz") {};
rpkgs = builtins.attrValues {
inherit (pkgs.rPackages) quarto MASS;
};
tex = (pkgs.texlive.combine {
inherit (pkgs.texlive) scheme-small amsmath environ fontawesome5 orcidlink pdfcol tcolorbox tikzfill;
});
system_packages = builtins.attrValues {
inherit (pkgs) R quarto;
};
in
pkgs.stdenv.mkDerivation {
name = "my-paper";
src = pkgs.fetchgit {
url = "https://github.com/b-rodrigues/my_paper/";
branchName = "main";
rev = "715e9f007d104c23763cebaf03782b8e80cb5445";
sha256 = "sha256-e8Xg7nJookKoIfiJVTGoJkvCuFNTT83YZ6SK3GT2T8g=";
};
buildInputs = [ rpkgs tex system_packages ];
buildPhase =
''
# Deno needs to add stuff to $HOME/.cache
# so we give it a home to do this
mkdir home
export HOME=$PWD/home
quarto add --no-prompt $src
quarto render $PWD/template.qmd --to jss-pdf
'';
installPhase =
''
mkdir -p $out
cp template.pdf $out/
'';
}
```
So we changed the second part of the file, we're not building a shell anymore
using `mkShell`, but a *derivation*. *Derivation* is Nix jargon for package, or
software. So what is our derivation? First, we clone the repo we forked just
before (I forked the repository and called it `my_paper`):
```
pkgs.stdenv.mkDerivation {
name = "my-paper";
src = pkgs.fetchgit {
url = "https://github.com/b-rodrigues/my_paper/";
branchName = "main";
rev = "715e9f007d104c23763cebaf03782b8e80cb5445";
sha256 = "sha256-e8Xg7nJookKoIfiJVTGoJkvCuFNTT83YZ6SK3GT2T8g=";
};
```
This repo contains our quarto template, and because we're using a specific
commit, we will always use exactly this release of the template for our
document. This is in contrast to before where we used `quarto add
quarto-journals/jss` to install the template. Doing this interactively makes our
project not reproducible because if we compile our Quarto doc today, we would be
using the template as it is today, but if we compile the document in 6 months,
then we would be using the template as it would be in 6 months (we should say
that it is possible to install specific releases of Quarto templates using
following notation: `quarto add quarto-journals/jss@v0.9.2` so this problem can
be mitigated).
The next part of the file contains following lines:
```
buildInputs = [ rpkgs tex system_packages ];
buildPhase =
''
# Deno needs to add stuff to $HOME/.cache
# so we give it a home to do this
mkdir home
export HOME=$PWD/home
quarto add --no-prompt $src
quarto render $PWD/template.qmd --to jss-pdf
'';
```
The `buildInputs` are the same as before. What's new is the `buildPhase`.
This is actually the part in which the document gets compiled. The first
step is to create a `home` directory. This is because Quarto needs to save
the template we want to use in `/home/.cache/deno`. If you're using
`quarto` interactively, that's not an issue, since your home directory
will be used. But with Nix, things are different, so we need to create
an empty directory and specify this as the home. This is what these
two lines do:
```
mkdir home
export HOME=$PWD/home
```
(`$PWD` —Print Working Directory— is a shell variable referring to the current
working directory.)
Now, we need to install the template that we cloned from Github. For this we can
use `quarto add` just as before, but instead of installing it directly from
Github, we install it from the repository that we cloned. We also add the
`--no-prompt` flag so that the template gets installed without asking us for
confirmation. This is similar to how when building a Docker image, we don't want
any interactive prompt to show up, or else the process will get stuck. `$src`
refers to the path of our downloaded Github repository. Finally we can compile
the document:
```
quarto render $PWD/template.qmd --to jss-pdf
```
This will compile the `template.qmd` (our finished paper). Finally, there's the
`installPhase`:
```
installPhase =
''
mkdir -p $out
cp template.pdf $out/
'';
```
`$out` is a shell variable defined inside the build environment and refers to
the path, so we can use it to create a directory that will contain our output
(the compiled PDF file). So we use `mkdir -p` to recursively create all the
directory structure, and then copy the compiled document to `$out/`. We can now
build our document by running `nix_build()`. Now, you may be confused by the
fact that you won't see the PDF in your working directory. But remember that
software built by Nix will always be stored in the Nix store, so our PDF is also
in the store, since this is what we built. To find it, run:
```
readlink result
```
which will show the path to the PDF. You could use this to open the
PDF in your PDF viewer application (on Linux at least):
```
xdg-open $(readlink result)/template.pdf
```
## Conclusion
This vignette showed two approaches, both have their merits: the first approach
that is more interactive is useful while writing the document. You get access to
a shell and can work on the document and compile it quickly. The second approach
is more useful once the document is ready and you want to have a way of quickly
rebuilding it for reproducibility purposes.