# Reproducible reports

## Topic learning objectives

By the end of this topic, students should be able to:

1. Discuss the advantages and disadvantages of using literate code documents (e.g.,
Quarto, Jupyter, R Markdown) for writing analytic reports compared to What You See Is
What You Get (WYSIWYG) editors (e.g., Word, Pages)
2. Convert `.ipynb` and `.Rmd` files to Quarto `.qmd` files
3. Execute and render literate code documents
4. Generate tables of contents, label and number figures and tables, and format
bibliographies in a reproducible and automated manner

## Reproducible reports vs What You See Is What You Get (WYSIWYG) editors

Reproducible reports are reports where automation is used to update a report when changes to data or data analysis methods lead to changes in data analysis artifacts (e.g., figures, tables, inline text values). This automation is usually specified and controlled by code. In the field of data science, the most common implementations of this are: 

- Quarto
- R Markdown
- Jupyter
- LaTeX 

> Note: Quarto, R Markdown, and Jupyter are not completely separable from LaTeX, as all three typically wrap LaTeX when the desired output is PDF.

Most implementations of reproducible reports involve a process called rendering. In this process, software converts a document containing code and/or markup (which specifies the text formatting, document layout, figure and table placement and numbering, bibliography formatting, *etc.*) into an output more suitable for consumption (e.g., a beautifully formatted PDF or HTML document).

<img src="img/render-report/render-report.png" width=700>
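The idea behind rendering can be illustrated with a toy sketch in plain Python. This is not how Quarto works internally; it is a minimal stand-in, assuming a hypothetical report whose text contains placeholders that are filled with values computed from the data each time the report is rendered. The names `SOURCE` and `render` are made up for this illustration.

```python
from statistics import mean
from string import Template

# Source document: narrative text with placeholders instead of hard-coded values.
SOURCE = Template(
    "# Hypothetical report\n\n"
    "We analysed $n observations; the mean was $avg.\n"
)


def render(data):
    """Fill the placeholders with values computed from the data."""
    return SOURCE.substitute(n=len(data), avg=f"{mean(data):.2f}")


# Rendering with the original data.
print(render([2.0, 4.0, 6.0]))

# The data changed: re-rendering updates the text without manual editing.
print(render([2.0, 4.0, 6.0, 8.0]))
```

Because the numbers in the text are computed rather than typed, they cannot drift out of sync with the data, which is the property that WYSIWYG editors generally lack.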

This contrasts with What You See Is What You Get (WYSIWYG) software or editors (e.g., Microsoft Word, Google Docs, etc.), where the document being edited looks exactly like the final document, so there is no rendering process. WYSIWYG reports are typically easier to get started with and use. However, they are usually quite limited in their ability to automatically update a report when changes to data or data analysis methods lead to changes in data analysis artifacts. This makes them less reproducible, and can lead to errors when data analysis artifacts have to be updated manually each time the analysis is iterated on during development.

<img src="img/word.png" width=300>

## Introduction to Quarto

Quarto is one implementation of a reproducible reporting tool.
It is very user friendly and has become very powerful,
allowing a great deal of control over formatting
with the ability to render to many different outputs,
including:

- PDF
- HTML
- Word
- PowerPoint presentation
- many more!

A Quarto document is a mix of Markdown, primarily used for narrative text,
and code chunks, whose code is executed by an engine.
It works very well with either R or Python as the coding engine.
It has many more advanced features for customizing document 
outputs compared to Jupyter notebooks on their own, 
which is why we recommend shifting to this tool when the audience
of your analysis moves beyond you and your data science team.

Quarto also can very easily convert between 
different kinds of reproducible report documents,
making it easy to shift from working in a Jupyter notebook
to this different reproducible report tool.
For example, 

```
quarto convert your_notebook.ipynb
```

There is a wonderful online guide for getting to know Quarto,
which we link to below.
In these notes, we will generally introduce this tool,
and demonstrate how to: 

- Create document sections and a table of contents.
- Add citations and a bibliography.
- Format figures and figure captions, as well as automatically number them and cross reference them in the narrative text.
- Format tables and table descriptions, as well as automatically number them and cross reference them in the narrative text.
- Execute code inline in the report narrative, so that the text will be automatically updated with the correct value when the report is rendered.
- Set the global and local code chunk options so that no code is viewable in the rendered report, just the code outputs where needed (e.g., figures and tables).
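To make the features listed above concrete, here is a minimal sketch of a Python script that writes a small Quarto source file using them. Everything named here is an assumption for illustration: the file names `example_report.qmd` and `references.bib`, the citation key `example2024`, and the labels `fig-trend` and `tbl-values`. The inline `{python}` expression assumes Quarto 1.4 or later with the Jupyter engine, and rendering the file would also require `matplotlib`, `pandas`, and a real `references.bib` containing that key.

```python
from pathlib import Path

# Code-chunk delimiter, built here so this example does not nest fences.
FENCE = "`" * 3

QMD_LINES = [
    # YAML header: table of contents, numbered sections, bibliography,
    # and a global option that hides code in the rendered output.
    "---",
    'title: "Hypothetical report"',
    "format:",
    "  html:",
    "    toc: true",
    "    number-sections: true",
    "bibliography: references.bib",
    "execute:",
    "  echo: false",
    "jupyter: python3",
    "---",
    "",
    "## Results",
    "",
    # A citation plus cross references to the figure and table below.
    "Following [@example2024], the values are plotted in @fig-trend",
    "and listed in @tbl-values.",
    "",
    # A figure chunk: the label must start with fig- to be numbered.
    FENCE + "{python}",
    "#| label: fig-trend",
    '#| fig-cap: "A hypothetical trend."',
    "import matplotlib.pyplot as plt",
    "values = [1, 2, 4, 8]",
    "plt.plot(values)",
    "plt.show()",
    FENCE,
    "",
    # A table chunk: the local echo option overrides the global one.
    FENCE + "{python}",
    "#| label: tbl-values",
    '#| tbl-cap: "The hypothetical values."',
    "#| echo: true",
    "import pandas as pd",
    'pd.DataFrame({"value": values})',
    FENCE,
    "",
    # Inline code: the value is recomputed every time the report is rendered.
    "The largest value is `{python} max(values)`.",
    "",
]

qmd_text = "\n".join(QMD_LINES)

# Write the source file; it can then be opened in RStudio or rendered.
output_path = Path("example_report.qmd")
output_path.write_text(qmd_text, encoding="utf-8")
print(f"Wrote {output_path} ({len(QMD_LINES)} lines)")
```

Opening the generated file in an editor shows how the narrative text, chunk options (the `#|` lines), and the YAML header fit together in one document.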

Quarto can do all this and so much more, so if you are interested in learning more,
be sure to refer to the [Quarto Guide](https://quarto.org/docs/guide/).

#### Exercise - get to know Quarto

Let's get to know Quarto! Open RStudio and create a new Quarto document, choosing HTML as the output format. 
Look at the source document that is created. Where is the narrative text written? Where is the code written? How does this differ from Jupyter?

#### Exercise - render your first document

There are two ways to render a document from the `.qmd` source to the desired output. One is the **"Render"** button in RStudio, which looks like this:

<img src="img/render.png" width=200>

Try clicking that button and see what happens!

Another way you can do this is through code! Try running this in the terminal (replacing `your_report.qmd` with the file path to where you saved this Quarto document):

```
quarto render your_report.qmd --to html
```
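Rendering from the command line also means rendering can be scripted, for example as one step of an analysis pipeline. The following is a minimal sketch that wraps the same `quarto render` command in Python. The helper names `build_render_command` and `render_report` are hypothetical, and the sketch assumes the `quarto` executable is on your `PATH`; if it is not, the function only prints the command it would have run.

```python
import shutil
import subprocess


def build_render_command(qmd_path, output_format="html"):
    """Return the argument list for `quarto render <file> --to <format>`."""
    return ["quarto", "render", qmd_path, "--to", output_format]


def render_report(qmd_path, output_format="html"):
    """Render a Quarto document, or report the command if Quarto is missing."""
    command = build_render_command(qmd_path, output_format)
    if shutil.which("quarto") is None:
        print("Quarto not found on PATH; would run:", " ".join(command))
        return None
    # check=True raises CalledProcessError if rendering fails.
    return subprocess.run(command, check=True)


# Show the command that would be run for the report from the exercise.
print(build_render_command("your_report.qmd"))

# To actually render, call for example:
# render_report("your_report.qmd", "html")
```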

#### Exercise - check out the new visual markdown editor

RStudio has implemented a new feature for working with Quarto to make it more similar to working with Jupyter: the visual markdown editor. Check out this feature by clicking the visual markdown editor button when you have a Quarto document open in the editor. The button looks like this:

<img src="img/viz-md-button.png" width=250>