---
title: "Writing with Quarto"
format:
  html:
    code-overflow: wrap
---
::: callout-note
## Learning outcomes
**After finishing this chapter you will be able to:**
- Create a quarto project with RStudio
- Locally render and view a quarto website
- Add markdown content to a quarto document
- Add code chunks to a quarto document and execute them
- Add and modify code-chunk options
- Execute code chunks that create figures and create cross references to them
:::
## Initiate a website project
We'll start by creating a website project. In RStudio:
- Choose **File** \> **New Project..**
- Choose **New Directory**
- Choose **Quarto website**
- Choose a location and name for the website. Leave **Create a git repository** checked
- Finish by pushing the button **Create Project**
![](img/create_project.gif)
Now we have created a template for a quarto website 👏 . It doesn't contain much, but we can already render it and use it as a functional website. In order to view it locally in your browser you can either:
- Open `index.qmd` in RStudio and press the button **Render**
- Or you can use the **Terminal** and type:
``` bash
quarto preview
```
In both cases your computer will open a browser and locally host the page at a random port.
## Adding content
### Adding text
In this part, we will use quarto to display our project goals. In `index.qmd`, create an introduction where you will state:
*In this project we aim to visualize the trends of the most frequently used babynames from 1880 to 2017 in the United States. We do this by:*
- *Understanding the different columns of the data set*
- *Find the top 10 most frequently used baby names in the data for:*
    - *girls*
    - *boys*
- *Plot the yearly trend of the top 10 baby names*
::: callout-important
## Exercise
Take the above introduction of your `babynames` project, convert it to markdown text, and add it to `index.qmd`.
:::
::: {.callout-tip collapse="true"}
## Answer
This is what your `index.qmd` could look like:
``` markdown
---
title: "quarto-tutorial"
---
In this project we aim to visualize the trends of the most frequently used babynames from 1880 to 2017 in the United States. We do this by:
- Understanding the different columns of the data set
- Find the top 10 most frequently used baby names in the data for:
    - girls
    - boys
- Plot the yearly trend of the top 10 baby names
```
:::
### Adding an image
You can add images with the following syntax:
``` markdown
![image_alt_text](path/or/url/to/image.png)
```
::: callout-important
## Exercise
Check out the [quarto figure guide](https://quarto.org/docs/authoring/figures.html). Find an image of a baby in the public domain on [Creative Commons](https://search.creativecommons.org/). Get the full URL of the image and add it to your `index.qmd`. Make it 400 pixels wide and align it to the right side of the page.
:::
::: {.callout-tip collapse="true"}
## Answer
Your `index.qmd` could look like this:
``` markdown
---
title: "quarto-tutorial"
---
In this project we aim to visualize the trends of the most frequently used babynames from 1880 to 2017 in the United States. We do this by:
- Understanding the different columns of the data set
- Find the top 10 most frequently used baby names in the data for:
    - girls
    - boys
- Plot the yearly trend of the top 10 baby names
![](https://cdn.pixabay.com/photo/2016/10/02/06/27/baby-1709013_1280.jpg){fig-align="right" width=400}
```
:::
### Adding a page
We will now add a page for the analysis. In RStudio, click **File** \> **New File..** \> **Quarto Document..**. Choose a title, e.g. `analysis`, and press **Create**. Save the newly created file as `analysis.qmd`.
To make our new page part of the website, we have to edit `_quarto.yml` and add the page to the `navbar` item under `website`:
``` yml
website:
  title: "quarto-tutorial"
  navbar:
    left:
      - href: index.qmd
        text: Home
      - about.qmd
      - analysis.qmd
```
### Adding code chunks
We can now add some code to `analysis.qmd`. First, we add a chunk that loads the libraries:
```` markdown
```{{r}}
library(babynames)
library(knitr)
library(dplyr)
library(ggplot2)
library(tidyr)
library(pheatmap)
```
````
### Plotting and cross-referencing
To display the first few rows of the `babynames` table (`head()` returns six by default) you can add:
```` markdown
```{{r}}
head(babynames) |> kable()
```
````
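The number of rows `head()` returns can be set with its `n` argument. A minimal sketch, using R's built-in `mtcars` data as a stand-in for `babynames` (an assumption, so that the example runs without extra packages):

``` r
# mtcars ships with R; it only stands in for babynames here
default_rows <- head(mtcars)          # no n given: the first 6 rows
ten_rows     <- head(mtcars, n = 10)  # explicit n: the first 10 rows

nrow(default_rows)
nrow(ten_rows)
```

The same `n` argument works on the `babynames` table if you prefer a longer preview.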
::: callout-important
## Exercise
Add the two code chunks above as R code chunks to `analysis.qmd`, and do the following:
- add some text to let the reader know what it does
- suppress the package startup messages with `#| output: false` at the top of the chunk that loads the packages. More info about the hash-pipe [here](https://quarto.org/docs/reference/cells/cells-knitr.html)
- re-render the site.
:::
::: {.callout-tip collapse="true"}
## Answer
```` markdown
---
title: "Analysis"
---
First we load the packages:
```{{r}}
#| output: false
library(babynames)
library(knitr)
library(dplyr)
library(ggplot2)
library(tidyr)
library(pheatmap)
```
The first rows of the babynames dataset look like:
```{{r}}
head(babynames) |> kable()
```
````
:::
Here, I've created two functions that do the following:
- `get_most_frequent`: Gets the most frequent babynames over a time-period.
- `plot_top`: Takes the output of `get_most_frequent` and plots the top n most popular names.
::: callout-note
This is not a coding tutorial, so you can ignore the code itself. Just know what it does.
:::
```{r}
# Average yearly proportion per name for one sex, for years after `from`
get_most_frequent <- function(babynames, select_sex, from = 1950) {
  most_freq <- babynames |>
    filter(sex == select_sex, year > from) |>
    group_by(name) |>
    summarise(average = mean(prop)) |>
    arrange(desc(average))

  return(list(
    babynames = babynames,
    most_frequent = most_freq,
    sex = select_sex,
    from = from))
}

# Line plot of the yearly proportion of the `top` most frequent names
plot_top <- function(x, top = 10) {
  topx <- x$most_frequent$name[1:top]

  p <- x$babynames |>
    filter(name %in% topx, sex == x$sex, year > x$from) |>
    ggplot(aes(x = year, y = prop, color = name)) +
    geom_line() +
    scale_color_brewer(palette = "Paired") +
    theme_classic()

  return(p)
}
```
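To make the ranking step of `get_most_frequent` concrete, here is a minimal sketch of the same filter, average and sort logic. It uses base R on a made-up data frame (`toy` and its values are invented for illustration and are not taken from the `babynames` package), so it is an approximation of the idea rather than the function itself:

``` r
# Made-up data for illustration only (not from the babynames package)
toy <- data.frame(
  year = c(1950, 1960, 1970, 1960, 1970, 1960),
  sex  = c("F", "F", "F", "F", "F", "M"),
  name = c("Mary", "Mary", "Mary", "Lisa", "Lisa", "John"),
  prop = c(0.04, 0.02, 0.01, 0.03, 0.04, 0.05)
)

# Keep girls in years after 1950; `>` excludes the 1950 row itself
girls <- subset(toy, sex == "F" & year > 1950)

# Average proportion per name, most frequent first
avg <- aggregate(prop ~ name, data = girls, FUN = mean)
avg <- avg[order(-avg$prop), ]
names(avg)[names(avg) == "prop"] <- "average"

print(avg)  # Lisa (0.035) is ranked before Mary (0.015)
```

`plot_top` then takes the first `top` names from such a ranking and draws one line per name.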
Plot the top names for girls like this:
``` r
get_most_frequent(babynames, select_sex = "F") |>
plot_top()
```
Plot the top names for boys like this:
``` r
get_most_frequent(babynames, select_sex = "M") |>
plot_top()
```
::: callout-important
## Exercise
Add the three code chunks above to `analysis.qmd`, and do the following:
- Since the functions are a bit bulky we want to make the chunk foldable. Do this by adding `#| code-fold: true` at the top of the chunk
- Describe in text what you see in the figures. Refer to the individual figures in the text by [cross-referencing](https://quarto.org/docs/authoring/figures.html#cross-references).
:::
::: {.callout-tip collapse="true"}
## Answer
```` markdown
To create a visualization of the most popular baby names, we have created two functions. Click the 'Code' link to view:
```{{r}}
#| code-fold: true
get_most_frequent <- function(babynames, select_sex, from = 1950) {
  most_freq <- babynames |>
    filter(sex == select_sex, year > from) |>
    group_by(name) |>
    summarise(average = mean(prop)) |>
    arrange(desc(average))

  return(list(
    babynames = babynames,
    most_frequent = most_freq,
    sex = select_sex,
    from = from))
}

plot_top <- function(x, top = 10) {
  topx <- x$most_frequent$name[1:top]

  p <- x$babynames |>
    filter(name %in% topx, sex == x$sex, year > x$from) |>
    ggplot(aes(x = year, y = prop, color = name)) +
    geom_line() +
    scale_color_brewer(palette = "Paired") +
    theme_classic()

  return(p)
}
```
Here we call the code to visualize the top 10 most frequent girl names for the years after 1950:
```{{r}}
#| label: fig-line-girls
#| fig-cap: "Line plot displaying proportion of top 10 girl names by year"
get_most_frequent(babynames, select_sex = "F") |>
plot_top()
```
In @fig-line-girls you can see that there has been a peak in popularity for 'Lisa', 'Jennifer' and 'Jessica'. Interesting! Let's have a look at the boys' names:
```{{r}}
#| label: fig-line-boys
#| fig-cap: "Line plot displaying proportion of top 10 boy names by year"
get_most_frequent(babynames, select_sex = "M") |>
plot_top()
```
@fig-line-boys shows that names that were popular before 1990 are relatively infrequent after 2000.
````
:::