book/02-repro.Rmd

# Reproducible Workflows {#repro}

<div class="right meme"><img src="images/memes/repro_reports.jpg"
     alt="Top left: young spongebob; top right: Using Base R for your analysis and copy pasting your results into tables in Word; middle left: older angry spongebob in workout clothes; middle right: learning how to use dplyr visualize data with ggplot2 and report your analysis in rmarkdown documents; bottom left: muscular spongebob shirtless in a boxing ring; bottom right: wielding the entire might of the tidyverse (with 50 hex stickers)" /></div>

## Learning Objectives {#ilo-repro}

1. Organise a [project](#projects) [(video)](https://youtu.be/y-KiPueC9xw ){class="video"}
2. Create and compile an [Rmarkdown document](#rmarkdown) [(video)](https://youtu.be/EqJiAlJAl8Y ){class="video"}
3. Edit the [YAML header](#yaml) to add table of contents and other options
4. Include a [table](#repro-tables)
5. Include a [figure](#repro-figures)
6. Report the output of an analysis using [inline R](#inline-r)
7. Add a [bibliography](#bibliography) and [citations](#citations)


## Setup {#setup-repro}

You will be given instructions in Section\ \@ref(new-project) below to set up a new project where you will keep all of your class notes. Section\ \@ref(r-markdown) gives instructions to set up an R Markdown script for this chapter. 

For reference, here are the packages we will use in this chapter.

```{r setup-repro, message=FALSE}
# packages needed for this chapter
library(tidyverse)  # various data manipulation functions
library(knitr)      # for table and image display
library(kableExtra) # for styling tables
library(papaja)     # for APA-style tables
library(gt)         # for fancy tables
library(DT)         # for interactive tables
```

Download the [R Markdown Cheat Sheet](https://www.rstudio.org/links/r_markdown_cheat_sheet).

## Why use reproducible reports?

Have you ever worked on a report, creating a summary table for the demographics, making beautiful plots, getting the analysis just right, and copying all the relevant numbers into your manuscript, only to find out that you forgot to exclude a test run and have to redo everything?

A `r glossary("reproducibility", "reproducible")` report fixes this problem. Although this requires a bit of extra effort at the start, it will more than pay you back by allowing you to update your entire report with the push of a button whenever anything changes.

Studies also show that many, if not most, papers in the scientific literature have reporting errors. For example, more than half of over 250,000 psychology papers published between 1985 and 2013 have at least one value that is statistically incompatible, such as a p-value that is not possible given a t-value and degrees of freedom [@nuijten2016prevalence]. Reproducible reports help avoid transcription and rounding errors.

We will make reproducible reports following the principles of [literate programming](https://en.wikipedia.org/wiki/Literate_programming). The basic idea is to have the text of the report together in a single document along with the code needed to perform all analyses and generate the tables. The report is then "compiled" from the original format into some other, more portable format, such as HTML or PDF. This is different from traditional cutting and pasting approaches where, for instance, you create a graph in Microsoft Excel or a statistics program like SPSS and then paste it into Microsoft Word.

## Organising a project {#projects}

First, we need to get organised. `r glossary("project", "Projects")` in RStudio are a way to group all of the files you need for one project. Most projects include scripts, data files, and output files like the PDF report created by the script or images. 

### File System

Modern computers tend to hide the file system from users, but we need to understand a little bit about how files are stored on your computer in order to get a script to find your data. Your computer's file system is like a big box (or `r glossary("directory")`) that contains both files and smaller boxes, or "subdirectories". You can specify the location of a file with its name and the names of all the directories it is inside.

For example, if Lisa is looking for a file called `report.Rmd`on their Desktop, they can specify the full file `r glossary("path")` like this: `/Users/lisad/Desktop/report.Rmd`, because the `Desktop` directory is inside the `lisad` directory, which is inside the `Users` directory, which is located at the base of the whole file system. If that file was on *your* desktop, you would probably have a different path unless your user directory is also called `lisad`. You can also use the `~` shortcut to represent the user directory of the person who is currently logged in, like this: `~/Desktop/report.Rmd`.

### Working Directory

Where should you put all of your files? You usually want to have all of your scripts and data files for a single project inside one folder on your computer, the `r glossary("working directory")` for that project. You can organise files in subdirectories inside this main project directory, such as putting all raw data files in a directory called `r path("data")` and saving any image files to a directory called `r path("images")`. 

Your script should only reference files in three types of locations, using the appropriate format.

| Where | Example |
|-------|---------|
| on the web               | "https://psyteachr.github.io/reprores-v3/data/5factor.xlsx" |
| in the working directory | "5factor.xlsx" |
| in a subdirectory        | "data/5factor.xlsx" |

::: {.warning data-latex=""}
Never set or change your working directory in a script.
:::

R Markdown files will automatically use the same directory the .Rmd file is in as the working directory.

If your script needs a file in a subdirectory of your working directory, such as, `r path("data/5factor.xlsx")`, load it in using a `r glossary("relative path")` so that it is accessible if you move the working directory to another location or computer:

```{r read-csv, eval = FALSE}
dat <- read_csv("data/5factor.xlsx")  # correct
```

Do not load it in using an `r glossary("absolute path")`:

```{r abs-path, eval = FALSE}
dat <- read_csv("C:/My Files/2020-2021/data/5factor.xlsx")   # wrong
```

::: {.info data-latex=""}
Also note the convention of using forward slashes, unlike the Windows-specific convention of using backward slashes. This is to make references to files work for everyone, regardless of their operating system.
:::

### Naming Things {#naming-things}

Name files so that both people and computers can easily find things. Here are some important principles:

-   file and directory names should only contain letters, numbers, dashes, and underscores, with a full stop (`.`) between the file name and `r glossary("extension")` (that means no spaces!)
-   be consistent with capitalisation (set a rule to make it easy to remember, like always use lowercase)
-   use underscores (`_`) to separate parts of the file name, and dashes (`-`) to separate words in a section
-   name files with a pattern that alphabetises in a sensible order and makes it easy for you to find the file you're looking for
-   prefix a filename with an underscore to move it to the top of the list, or prefix all files with numbers to control their order
-   use YYYY-MM-DD format for dates so they sort in chronological order

For example, these file names are a mess:

-   `r path("Data (Participants) 11-15.xls")`
-   `r path("final report2.doc")`
-   `r path("Participants Data Nov 12.xls")`
-   `r path("project notes.txt")`
-   `r path("Questionnaire Data November 15.xls")`
-   `r path("report.doc")`
-   `r path("report final.doc")`

Here is one way to structure them so that similar files have the same structure and it's easy for a human to scan the list or to use code to find relevant files. See if you can figure out what the missing one should be.

-   `r path("_project-notes.txt")`
-   `r path("data_participants_2021-11-12.xls")`
-   `r path("data_participants_2021-11-15.xls")`
-   `r mcq(c("questionnaire-data_2021-11-15.xls", "data-questionnaire-2021_11_15.xls", answer = "data_questionnaire_2021-11-15.xls", "data_2021-11-15_questionnaire.xls"))`
-   `r path("report_v1.doc")`
-   `r path("report_v2.doc")`
-   `r path("report_v3.doc")`

::: {.try data-latex=""}
Think of other ways to name the files above. Look at some of your own project files and see what you can improve.
:::

### Start a Project {#new-project}

Now that we understand how the file system work and how to name things to make it easier for scripts to access them, we're ready to make our class project. 

First, make a new `r glossary("directory")` where you will keep all of your materials for this class (I called mine `reprores-2022`). You can set this directory to be the default working directory under the General tab of the Global Options. This means that files will be saved here by default if you aren't working in a project. 

::: {.warning data-latex=""}
It can sometimes cause problems if this directory is in OneDrive or if the full file path has special characters or is [more than 260 characters](http://handbook.datalad.org/en/latest/intro/filenaming.html){target='_blank'} on some Windows machines.
:::

Next, choose **`New Project...`** under the **`File`** menu to create a new project called `r path("reprores-class-notes")`. Make sure you save it inside the directory you just made. RStudio will restart itself and open with this new project directory as the working directory. 

```{r img-new-proj, echo = FALSE, out.width = "32%", fig.show = "hold"}
#| fig.cap="Starting a new project."
include_graphics(c("images/repro/new_proj_1.png",
                   "images/repro/new_proj_2.png",
                   "images/repro/new_proj_3.png"))
```


Click on the Files tab in the lower right pane to see the contents of the project directory. You will see a file called `reprores-class-notes.Rproj`, which is a file that contains all of the project information.You can double-click on it to open up the project. 

::: {.info data-latex=""}
Depending on your settings, you may also see a directory called `.Rproj.user`, which contains your specific user settings. You can ignore this and other "invisible" files that start with a full stop.
:::

## R Markdown

In this lesson, we will learn to make an R Markdown document with a table of contents, appropriate headers, code chunks, tables, images, inline R, and a bibliography. 

::: {.info data-latex=""}
There is a new type of reproducible report format called [quarto](https://quarto.org/){target="_blank"} that is very similar to R Markdown. We won't be using quarto in this class because it has a few small differences that get confusing if you're learning both quarto and R Markdown at the same time, but you should be able to pick up quarto very easily once you've learned R Markdown.
:::

We will use `r glossary("R Markdown")` to create reproducible reports, which enables mixing of text and code. A reproducible script will contain sections of code in code blocks. A code block starts and ends with three backtick symbols in a row, with some information about the code between curly brackets, such as `{r chunk-name, echo=FALSE}` (this runs the code, but does not show the text of the code block in the compiled document). The text outside of code blocks is written in `r glossary("markdown")`, which is a way to specify formatting, such as headers, paragraphs, lists, bolding, and links.

```{embed, file = "demos/repro.Rmd"}

```


If you open up a new R Markdown file from a template, you will see an example document with several code blocks in it. To create an HTML or PDF report from an R Markdown (Rmd) document, you compile it.  Compiling a document is called `r glossary("knit", "knitting")` in RStudio. There is a button that looks like a ball of yarn with needles through it that you click on to compile your file into a report. 

::: {.try data-latex=""}
Create a new R Markdown file from the **`File > New File > R Markdown...`** menu. Change the title and author, save the file as `02-repro.Rmd`, then click the knit button to create an html file.
:::

### YAML Header {#yaml}

The `r glossary("YAML")` header is where you can set several options. 

```
---
title: "My Demo Document"
author: "Me"
output:
  html_document:
    toc: true
    toc_float:
      collapsed: false
      smooth_scroll: false
    number_sections: false
---
```

::: {.info data-latex=""}
Try changing the values from `false` to `true` to see what the options do.
:::

The `df_print: kable` option prints data frames using `knitr::kable`. You'll learn below how to further customise tables.

The built-in bootswatch themes are: default, cerulean, cosmo, darkly, flatly, journal, lumen, paper, readable, sandstone, simplex, spacelab, united, and yeti. You can [view and download more themes](https://bootswatch.com/4/).

```{r img-bootswatch, echo=FALSE, fig.cap="Light themes in versions 3 and 4."}
knitr::include_graphics("images/repro/bootswatch.png")
```


### Setup

When you create a new R Markdown file in RStudio using the default template, a setup chunk is automatically created.

```{r knitr-setup, eval=FALSE, verbatim="r setup, include=FALSE"}
knitr::opts_chunk$set(echo = TRUE)
```

You can set more default options for code chunks here. See the [knitr options documentation](https://yihui.name/knitr/options/){target="_blank"} for explanations of the possible options.

```{r knitr-setup2, eval=FALSE, verbatim="r setup, include=FALSE"}
knitr::opts_chunk$set(
  fig.width  = 8, 
  fig.height = 5, 
  fig.path   = 'images/',
  echo       = FALSE, 
  warning    = TRUE, 
  message    = FALSE,
  cache      = FALSE
)
```

The code above sets the following options:

* `fig.width  = 8` : default figure width is 8 inches (you can change this for individual figures)
* `fig.height = 5` : default figure height is 5 inches
* `fig.path   = 'images/'` : figures are saved in the directory "images"
* `echo       = FALSE` : do not show code chunks in the rendered document
* `warning    = FALSE` : do not show any function warnings
* `message    = FALSE` : do not show any function messages
* `cache      = FALSE` : run all the code to create all of the images and objects each time you knit (set to `TRUE` if you have time-consuming code)


Find a list of the current chunk options by typing `r hl(str(knitr::opts_chunk$get()))` in the console.


You can also add the packages you need in this chunk using `r hl(library())`. Often when you are working on a script, you will realize that you need to load another add-on package. Don't bury the call to `library(package_I_need)` way down in the script. Put it in the top, so the user has an overview of what packages are needed.

::: {.try data-latex=""}
We'll be using function from the package `r pkg("tidyverse")`, so load that in your setup chunk.
:::

### Structure {#structure}

If you include a table of contents (`toc`), it is created from your document headers. Headers in `r glossary("markdown")` are created by prefacing the header title with one or more hashes (`#`). 

Use the following structure when developing your own analysis scripts: 

* load in any add-on packages you need to use
* define any custom functions
* load or simulate the data you will be working with
* work with the data
* save anything you need to save

::: {.try data-latex=""}
Delete the default text and add some structure to your document by creating headers and subheaders. We're going to load some data, create a summary table, plot the data, and analyse it.
:::

### Code Chunks

You can include `r glossary("chunk", "code chunks")` that create and display images, tables, or computations to include in your text. Let's start by loading some data.

First, create a code chunk in your document. This code loads some data from the web. 

```{r}
pets <- read_csv("https://psyteachr.github.io/reprores/data/pets.csv",
                 show_col_types = FALSE)
```

### Comments

You can add comments inside R chunks with the hash symbol (`#`). The R interpreter will ignore characters from the hash to the end of the line.

```{r}
# simulating new data

n <- nrow(pets) # the total number of pet
mu <- mean(pets$score) # the mean score for all pets
sd <- sd(pets$score) # the SD for score for all pets

simulated_scores <- rnorm(n, mu, sd)
```

It's usually good practice to start a code chunk with a comment that explains what you're doing there, especially if the code is not explained in the text of the report.

If you name your objects clearly, you often don't need to add clarifying comments. For example, if I'd named the three objects above `total_pet_n`, `mean_score` and `sd_score`, I would omit the comments.

Another use for comments is to "comment out" a section of code that you don't want to run, but also don't want to delete. For example, you might include the code used to install a package in your script, but you should always comment it out so the script doesn't force a lengthy installation every time it's run. 

```{r}
# install.packages("dplyr")
# install.packages("tidyr")
# install.packages("ggplot2")
```

::: {.info data-latex=""}
You can comment or uncomment multiple lines at once by selecting the lines and typing Cmd-shift-C (Mac) or Ctrl-shift-C (Windows).
:::

It's a bit of an art to comment your code well. The best way to develop this skill is to read a lot of other people's code and have others review your code.

### In-line R {#inline-r}

Now let's analyse the pets data to see if cats are heavier than ferrets. First we'll run the analysis code. Then we'll save any numbers we might want to use in our manuscript to variables and round them appropriately. Finally, we'll use `r hl(glue::glue())` to format a results string.

```{r}
# analysis
cat_weight <- filter(pets, pet == "cat") %>% pull(weight)
ferret_weight <- filter(pets, pet == "ferret") %>% pull(weight)
weight_test <- t.test(cat_weight, ferret_weight)

# round individual values you want to report
t <- weight_test$statistic %>% round(2)
df <- weight_test$parameter %>% round(1)
p <- weight_test$p.value %>% round(3)
# handle p-values < .001
p_symbol <- ifelse(p < .001, "<", "=")
if (p < .001) p <- .001

# format the results string
weight_result <- glue::glue("t = {t}, df = {df}, p {p_symbol} {p}")
```

You can insert the results into a paragraph with inline R code that looks like this: 

<pre><code>Cats were significantly heavier than ferrets (&#96;r weight_result&#96;).</code></pre>

**Rendered text:**  
Cats were significantly heavier than ferrets (`r weight_result`). 


### Tables {#repro-tables}

Next, create a code chunk where you want to display a table of the descriptives (e.g., Participants section of the Methods). We'll use tidyverse functions you will learn in the [data wrangling lectures](#tidyr) to create summary statistics for each group.

```{r}
summary_table <- pets %>%
  group_by(pet) %>%
  summarise(
    n = n(),
    mean_weight = mean(weight),
    mean_score = mean(score)
  )

# print
summary_table
```

::: {.warning data-latex=""}
The table above will print in tibble format in the interactive view, but will use the format from the `df_print` setting in the YAML header when you knit. 
:::


The table above is OK, but it could be more reader-friendly by changing the column labels, rounding the means, and adding a caption. You can use `r hl(knitr::kable())` for this, or more specialised functions from other packages to format your tables.

<!-- Tab links -->
<div class="tab">
  <button class="tablinks" onclick="openCity(event, 'knitr')">knitr</button>
  <button class="tablinks" onclick="openCity(event, 'kableExtra')">kableExtra</button>
  <button class="tablinks" onclick="openCity(event, 'papaja')">papaja</button>
  <button class="tablinks" onclick="openCity(event, 'gt')">gt</button>
</div>

<!-- Tab content -->
<div id="knitr" class="tabcontent">

```{r kable-demo}
newnames <- c("Pet Type", "N", "Mean Weight", "Mean Score")

knitr::kable(summary_table, 
             digits = 2, 
             col.names = newnames,
             caption = "Summary statistics for the pets dataset.")
```

</div>

<!-- Tab content -->
<div id="kableExtra" class="tabcontent">

The [kableExtra](https://haozhu233.github.io/kableExtra/awesome_table_in_html.html) package gives you a lot of flexibility with table display.

```{r kableExtra-demo, warning = FALSE}
library(kableExtra)

kable(summary_table, 
      digits = 2, 
      col.names = c("Pet Type", "N", "Weight", "Score"),
      caption = "Summary statistics for the pets dataset.") |>
  kable_classic() |>
  kable_styling(full_width = FALSE, font_size = 20) |>
  add_header_above(c(" " = 2, "Means" = 2)) |>
  kableExtra::row_spec(2, bold = TRUE, background = "lightyellow")
```

</div>

<!-- Tab content -->
<div id="papaja" class="tabcontent">

[papaja](https://crsh.github.io/papaja_man/reporting.html#tables) helps you create APA-formatted manuscripts, including tables.

```{r papaja-demo}
papaja::apa_table(summary_table, 
                  col.names = c("Pet Type", "N", "Weight", "Score"),
                  caption = "Summary statistics for the pets dataset.",
                  col_spanners = list("Means" = c(3, 4)))
```

</div>

<!-- Tab content -->
<div id="gt" class="tabcontent">

The [gt](https://gt.rstudio.com/index.html) package allows for even more customisation.

```{r gt-demo}
library(gt)

gt(summary_table, caption = "Summary statistics for the pets dataset.") |>
  fmt_number(columns = c(mean_weight, mean_score),
            decimals = 2) |>
  cols_label(pet = "Pet Type", 
             n = "N", 
             mean_weight = "Weight", 
             mean_score = "Score") |>
  tab_spanner(label = "Means",
              columns = c(mean_weight, mean_score)) |>
 opt_stylize(style = 6, color = "blue")
```

</div>


### Images {#repro-figures}

Next, create a code chunk where you want to display an image in your document. Let's put it in the Results section. We'll use some code that you'll learn more about  in the [data visualisation lecture](#ggplot) to show violin-boxplots for the groups.

Notice how the figure caption is formatted in the chunk options.


```{r pet-plot, eval = FALSE, verbatim = 'r pet-plot, fig.cap="Figure 1. Scores by pet type and country."'}
#| fig.cap: "Figure 1. Scores by pet type and country."
ggplot(pets, aes(pet, score, fill = country)) +
  geom_violin(alpha = 0.5) +
  geom_boxplot(width = 0.25, 
               position = position_dodge(width = 0.9),
               show.legend = FALSE) +
  scale_fill_manual(values = c("orange", "dodgerblue")) +
  labs(x = "", y = "Score") +
  theme(text = element_text(size = 20, family = "Times"))
```


```{r pet-plot-out, echo = FALSE}
#| fig.cap: "Figure 1. Scores by pet type and country."
ggplot(pets, aes(pet, score, fill = country)) +
  geom_violin(alpha = 0.5) +
  geom_boxplot(width = 0.25, 
               position = position_dodge(width = 0.9),
               show.legend = FALSE) +
  scale_fill_manual(values = c("orange", "dodgerblue")) +
  labs(x = "", y = "Score") +
  theme(text = element_text(size = 20, family = "Times"))
```


::: {.info data-latex=""}
The last line changes the default text size and font, which can be useful for generating figures that meet a journal's requirements.
:::


You can also include images that you did not create in R using the typical markdown syntax for images: 

``` md
![All the Things by [Hyperbole and a Half](http://hyperboleandahalf.blogspot.com/)](images/memes/x-all-the-things.png){style="width: 50%"}
```

![All the Things by [Hyperbole and a Half](http://hyperboleandahalf.blogspot.com/)](images/memes/x-all-the-things.png){style="width: 50%"}


### Linked documents

If you need to create longer reports with links between sections, you can edit the YAML to use an output format from the `r pkg("bookdown")` package. `bookdown::html_document2` is a useful one that adds figure and table numbers automatically to any figures or tables with a caption and allows you to link to these by reference.

To create links to tables and figures, you need to name the code chunk that created your figures or tables, and then call those names in your inline coding:

```{r, eval = FALSE, verbatim="r my-table"}
# table code here
```

```{r, eval = FALSE, verbatim="r my-figure"}
# figure code here
```

```
See Table\ \@ref(tab:my-table) or Figure\ \@ref(fig:my-figure).
```

::: {.warning data-latex=""}
The code chunk names can only contain letters, numbers and dashes. If they contain other characters like spaces or underscores, the referencing will not work.
:::

You can also link to different sections of your report by naming your headings with `{#}`:

```
# My first heading {#heading-1}

## My second heading {#heading-2}

See Section\ \@ref(heading-1) and Section\ \@ref(heading-2)

```
The code below shows how to link text to figures or tables in a full report using the built-in `diamonds` dataset - use your `reports.Rmd` to create this document now. You can see the [HTML output here](demos/html_document2.html).

`r hide("Linked document code")`

```{embed, file = "demos/html_document2.Rmd"}
# readLines("demos/html_document2.Rmd") |> 
#   paste(collapse = "\n") |>
#   paste0("<code style='font-size: smaller;'><pre>\n", ., "\n</pre></code>") |>
#   cat()
```

`r unhide()`

This format defaults to numbered sections, so set `number_sections: false` in the `r glossary("YAML")` header if you don't want this. If you remove numbered sections, links like `\@ref(conclusion)` will show "??", so you need to use URL link syntax instead, like this:

```
See the [last section](#conclusion) for concluding remarks.
```


## Bibliography

There are several ways to do in-text references and automatically generate a [bibliography](https://bookdown.org/yihui/rmarkdown-cookbook/bibliography.html){target="_blank"} in R Markdown. Markdown files need to link to a BibTex or JSON file (a plain text file with references in a specific format) that contains the references you need to cite. You specify the name of this file in the YAML header, like `bibliography: refs.bib` and cite references in text using an at symbol and a shortname, like `[@tidyverse]`. You can also include a Citation Style Language (.csl) file to format your references in, for example, APA style.

```
---
title: "My Paper"
author: "Me"
output: 
  html_document:
    toc: true
bibliography: refs.bib
csl: apa.csl
```

### Converting from reference software

Most reference software like EndNote or Zotero has exporting options that can export to BibTeX format. You just need to check the shortnames in the resulting file.

::: {.warning data-latex=""}
Please start using a reference manager consistently through your research career. It will make your life so much easier. Zotero is probably the best one.
:::

1. If you don't already have one, set up a [Zotero](https://www.zotero.org/){target="_blank"} account  
2. Add the [connector for your web browser](https://www.zotero.org/download/){target="_blank"} (if you're on a computer you can add browser extensions to)  
3. Navigate to [Easing Into Open Science](https://doi.org/10.1525/collabra.18684){target="_blank"} and add this reference to your library with the browser connector  
4. Go to your library and make a new collection called "Open Research" (click on the + icon after **`My Library`**)  
5. Drag the reference to Easing Into Open Science into this collection  
6. Export this collection as BibTex  

```{r zotero, echo = FALSE}
#| fig.cap: Export a bibliography file from Zotero
knitr::include_graphics("images/repro/zotero.png")
```

The exported file should look like this:

```{embed, file = "demos/export-data.bib"}

```


### Creating a BibTeX File

You can also add references manually. In RStudio, go to **`File`** > **`New File...`** > **`Text File`** and save the file as "refs.bib".

Next, add the line `bibliography: refs.bib` to your YAML header.

### Adding references {#references}

You can add references to a journal article in the following format:

```
@article{shortname,
  author = {Author One and Author Two and Author Three},
  title = {Paper Title},
  journal = {Journal Title},
  volume = {vol},
  number = {issue},
  pages = {startpage--endpage},
  year = {year},
  doi = {doi}
}
```

See [A complete guide to the BibTeX format](https://www.bibtex.com/g/bibtex-format/){target="_blank"} for instructions on citing books, technical reports, and more.

You can get the reference for an R package using the functions `citation()` and `toBibtex()`. You can paste the bibtex entry into your bibliography.bib file. Make sure to add a short name (e.g., "ggplot2") before the first comma to refer to the reference.

```{r}
citation(package="ggplot2") %>% toBibtex()
```


[Google Scholar](https://scholar.google.com/) entries have a BibTeX citation option. This is usually the easiest way to get the relevant values if you can't add a citation through the Zotero browser connector, although you have to add the DOI yourself. You can keep the suggested shortname or change it to something that makes more sense to you.

```{r google-scholar, echo = FALSE, fig.cap = "Get BibTex citations from Google Scholar."}
knitr::include_graphics("images/present/google-scholar.png")
```


### Citing references {#citations}

You can cite references in text like this: 

```
This tutorial uses several R packages [@tidyverse;@rmarkdown].
```

This tutorial uses several R packages [@tidyverse;@rmarkdown].

Put a minus in front of the @ if you just want the year:

```
Kathawalla and colleagues [-@kathawalla_easing_2021] explain how to introduce open research practices into your postgraduate studies.
```

Kathawalla and colleagues [-@kathawalla_easing_2021] explain how to introduce open research practices into your postgraduate studies.

### Uncited references

If you want to add an item to the reference section without citing, it, add it to the YAML header like this:

```
nocite: |
  @kathawalla_easing_2021, @broman2018data, @nordmann2022data
```

Or add all of the items in the .bib file like this:

```
nocite: '@*'
```

### Citation Styles

You can search a [list of style files](https://www.zotero.org/styles){target="_blank"} for various journals and download a file that will format your bibliography for a specific journal's style. You'll need to add the line `csl: filename.csl` to your YAML header. 

::: {.try data-latex=""}
Add some citations to your refs.bib file, reference them in your text, and render your manuscript to see the automatically generated reference section. Try a few different citation style files.
:::

### Reference Section

By default, the reference section is added to the end of the document. If you want to change the position (e.g., to add figures and tables after the references), include `<div id="refs"></div>` where you want the references. 


::: {.try data-latex=""}
Add in-text citations and a reference list to your report.
:::

## Custom Templates {#custom-templates}

Some packages provide custom R Markdown templates. `reprores` has a Report template that shows all of the possible options in the YAML header, has bibliography and style files, and explains how to set up linked figures and tables. Because it contains multiple files, RStudio will ask you to create a new folder to keep all of the files in.

```{r img-custom-rmd, echo=FALSE, fig.cap="The custom R markdown template from reprores.", out.width = '75%'}

knitr::include_graphics("images/custom-rmd.png")
```

::: {.try data-latex=""}
Start a report with the Report template and knit it. Try changing or deleting options.
:::

## Glossary {#glossary-repro}

`r glossary_table()`


## Further Resources {#resources-repro}

* [Chapter 27: R Markdown](http://r4ds.had.co.nz/r-markdown.html) in *R for Data Science*
* [R Markdown Cheat Sheet](https://github.com/rstudio/cheatsheets/raw/master/rmarkdown.pdf)
* [R Markdown reference Guide](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf)
* [R Markdown Tutorial](https://rmarkdown.rstudio.com/lesson-1.html)
* [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/) by Yihui Xie, J. J. Allaire, & Garrett Grolemund
* [Project Structure](https://slides.djnavarro.net/project-structure/) by Danielle Navarro
* [How to name files](https://speakerdeck.com/jennybc/how-to-name-files) by Jenny Bryan
* [Papaja](https://crsh.github.io/papaja_man/) Reproducible APA Manuscripts
* [The Turing Way](https://the-turing-way.netlify.app/)