<h1 style="text-align: center;">Dynamic Documents and Reproducible Research Activities</h1>

<p style="text-align: center;">July, 2024</p>

# Dynamic Documents Activities

## Activity 1 - Formatting with Markdown

Let's first play with various ways of formatting text using Markdown.
A useful reference to keep on hand is the [Markdown cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet).

1. Create a Jupyter notebook and save the file under some name (e.g., `my_notebook.ipynb`). 

2. Create three level-three (medium large) section headings: 
- Tables
- Videos, Images, and Links
- Other

The second section (videos, images, and links) should have three subsections (use a slightly smaller section heading): 
- Videos 
- Images 
- Links

3. Create a table with three columns: bird name, colour, coolness (out of 10). Fill in the table with some birds (e.g., pigeons would have a 10/10 coolness factor in case that wasn't clear).

4. In the Videos, Images, and Links section, insert a YouTube video, an image from the internet, and a sentence containing a hyperlink in the corresponding subsections.

5. In the "Other" section, write some R code that produces a plot and then offer some plaintext commentary about the output. 

## Activity 2 - Converting to Different File Formats

Now that we have created some basic content, let's try to export our Jupyter notebook into different file formats. We will begin with some simpler file formats and then try to create a slideshow from our Jupyter notebook. If at any time you encounter difficulties in exporting your Jupyter notebook, consider temporarily removing any R plots from your output. (R plots are sometimes displayed in the `svg` file format and `nbconvert` can have difficulties with this image format.) Alternatively, save your figure as a `png` or `jpg` and then display the figure using Markdown. E.g., use `png()`, `dev.off()`, and the Markdown syntax for inserting images.
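
The PNG workaround mentioned above can be sketched as follows. This is only a minimal illustration, not a required approach: it uses a base R plot of the built-in `cars` dataset, and the file name `my_plot.png` is a hypothetical choice.

```r
# Open a PNG graphics device; everything drawn next is written to this file.
# "my_plot.png" is a hypothetical file name used only for this illustration.
png("my_plot.png", width = 480, height = 480)

# Draw the plot (base R graphics, built-in cars dataset).
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")

# Close the device so that the file is actually written to disk.
dev.off()
```

The saved file can then be shown in a Markdown cell with `![My plot](my_plot.png)`. If the plot is a ggplot2 object, wrapping it in `print()` before calling `dev.off()` is a safe habit.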

1. Export your Jupyter notebook as an HTML webpage.

2. Export your Jupyter notebook as a PDF.

3. Export your notebook as a slideshow ("Reveal.js slides"). You will notice that all of your content is displayed on one slide. To fix this, go to `View > Show Right Sidebar`. After clicking on a cell, you will see a section in the right sidebar titled `Slide Type`. Try experimenting with different slide types and see how they change the output of your slideshow. You will need to re-export your slideshow each time you make changes.

# Reproducible Research Activities

## Activity 1 - Code Organization and Saving Plots

Let's assume that we're conducting an analysis on the `mpg` dataset (found in the `ggplot2` R package). We have the following code:

In [None]:
library(ggplot2)
library(dplyr)
data(mpg)
mpg$cyl = as.factor(mpg$cyl)
mpg = mpg %>% filter(class != "2seater")
dim(mpg)
summary(mpg)
table(mpg$class)
ggplot(data=mpg, aes(x=cty, y=hwy))+geom_point()+theme_bw()
ggplot(data=mpg, aes(x=cyl, y=displ))+geom_boxplot()+theme_bw()

1. Organize the above code into separate R scripts, based on logical tasks (loading and cleaning data, etc.).


2. Create a "driver" script that calls on each of the R scripts that you made in the right order.

3. Update the data cleaning steps such that you also remove any observations belonging to the Jeep manufacturer. Document the code for all the data cleaning steps, to make it clear what is happening.

4. Update the data visualization steps so that the plots are saved using the `pdf()` and `dev.off()` functions. Make sure the plots have a width of 10 and a height of 5 (both are measured in inches; the default for each is 7).

## Activity 2 - Advanced R Coding for Automation (BONUS CHALLENGE)

We're going to write code that creates a plot for each pairing of the `cty` variable with one of the other variables in the `mpg` dataset (10 plots total: cty vs. hwy, cty vs. class, etc.). To do this, we'll make use of `for` loops and `if`/`else` conditionals. Note that `cty` is a numerical/continuous variable. If the other variable in the pair is also continuous, we should make a scatterplot. If the other variable in the pair is categorical (a "factor"), we should make a boxplot. Here is some initial code to work with:

In [None]:
mpg = as.data.frame(mpg)
Var_Names = names(mpg)
Var_Names = Var_Names[which(Var_Names != "cty")]

## Start with "for" loop here:




**HINT 1:** Iterations in the `for` loop will be over the different values of `Var_Names`

**HINT 2:** You'll need a way to check if the current variable in the current iteration of the `for` loop is a `character` class variable or not. Think about how you can use `is.character()` here.

**HINT 3:** You'll need to use `print` to actually display the plot. R only auto-prints the result of an expression at the top level, so a ggplot2 plot created inside a `for` loop is not drawn unless you print it explicitly. Be sure to print the right plot at each iteration.

**HINT 4:** You'll probably need to use `aes_string` instead of `aes`. Recall that `aes` takes the variable names as R objects, but `aes_string` will take them as a string/character like `"cty"` (instead of `cty`). HOWEVER, you can feed `aes_string` an object that contains a character. For example:

`Curr_Var = "hwy"`

`aes_string(x=Curr_Var, y="cty")`
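
The printing behaviour behind HINT 3 can be seen without any plotting at all. The following is a small base R sketch (the object names are illustrative) that captures what a loop displays with and without `print`; a ggplot2 object behaves the same way, because it is only drawn when it is printed.

```r
# At the top level, the value of an expression is printed automatically.
# Inside a for loop it is not, which is why an explicit print() is needed.
no_print <- capture.output(for (i in 1:3) i)
with_print <- capture.output(for (i in 1:3) print(i))

length(no_print)    # 0: nothing was displayed
length(with_print)  # 3: one line of output per iteration
```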

# Dynamic Documents And Reproducible Research Activity Solutions

## Dynamic Documents Activity 1 Solution

To view one possible Markdown solution to Activity 1, double click on the following output to see the source code.

### Tables 
| Bird name     | Colour        | Coolness  |
| ------------- |:-------------:| -----:|
| Parrot           | Many colours  | 9/10  |
| Chicken          | Brown         | 3/10  |
| Golden retriever | Golden        | Not a bird, but still cool! |
    
### Videos, Images, and Links
#### Videos 
<a href="http://www.youtube.com/watch?feature=player_embedded&v=vGazyH6fQQ4" target="_blank"><img src="http://img.youtube.com/vi/vGazyH6fQQ4/0.jpg" alt="YouTube video thumbnail" width="240" height="180" border="10" /></a>

#### Images 
<img src="https://upload.wikimedia.org/wikipedia/commons/e/e3/Close-up_picture_of_a_common_wood_pigeon.jpg" alt="Pigeon" width="200"/>
    
#### Links 
[This is a hyperlink](https://www.google.com/)

### Other

In [9]:
set.seed(1)
x <- runif(100) 
y <- rnorm(100, 10*x, 1.0)
df <- data.frame(x = x, y = y)
library(ggplot2)
png("test.png")
print(ggplot(df, aes(x = x, y = y)) + geom_point())
dev.off()

![image](test.png)

"This is some commentary about the preceding code."

## Reproducible Research Activity 1 Solution

<details>
<summary>Click here for Solution Part I</summary>

The following is an example of one **possible** solution (you may have approached it differently).

Put the following into a "load_and_clean.R" script (you can also load the packages in this script, but this is not required):

`library(ggplot2)`

`library(dplyr)`

`data(mpg)`

`mpg$cyl = as.factor(mpg$cyl)`

`mpg = mpg %>% filter(class != "2seater")`

Put the following into a "data_summaries.R" script:

`dim(mpg)`

`summary(mpg)`

`table(mpg$class)`

Put the following into a "data_visualizations.R" script:

`ggplot(data=mpg, aes(x=cty, y=hwy))+geom_point()+theme_bw()`

`ggplot(data=mpg, aes(x=cyl, y=displ))+geom_boxplot()+theme_bw()`

</details>

<details>
<summary>Click here for Solution Part II</summary>

The following is an example of one **possible** solution (you may have approached it differently).

The driver script will contain the following (based on Part I solution above):

`source("load_and_clean.R")`

`source("data_summaries.R")`

`source("data_visualizations.R")`
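
One thing to keep in mind with a driver script: by default, `source()` does not print the values of top-level expressions, so calls such as `dim(mpg)` or `summary(mpg)` in "data_summaries.R" run silently. The sketch below illustrates this with a throwaway script written to a temporary file (the script and its contents are purely illustrative stand-ins).

```r
# Write a tiny throwaway script to a temporary file (a stand-in for
# "data_summaries.R"; the file and its contents are purely illustrative).
script <- tempfile(fileext = ".R")
writeLines(c("x <- c(2, 4, 6)", "mean(x)"), script)

# Default behaviour: the script runs, but the value of mean(x) is not printed.
silent <- capture.output(source(script))

# With print.eval = TRUE, top-level results are printed as they would be
# at the console.
shown <- capture.output(source(script, print.eval = TRUE))

length(silent)  # 0 lines of output
shown           # the printed value of mean(x)
```

In the driver script, this would correspond to something like `source("data_summaries.R", print.eval = TRUE)`, or `echo = TRUE` if you also want the commands themselves to be shown.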

</details>

<details>
<summary>Click here for Solution Part III</summary>

The following is an example of one **possible** solution (you may have approached it differently).

To remove observations from the Jeep manufacturer, we can add the following line to our "load_and_clean.R" script:

`mpg = mpg %>% filter(manufacturer != "jeep")`

We then document each data cleaning step with a comment:

`# Convert the number of cylinders variable into a factor`

`mpg$cyl = as.factor(mpg$cyl)`
<br>
<br>

`# Remove vehicles that are classified as "2-Seater"`

`mpg = mpg %>% filter(class != "2seater")`
<br>
<br>

`# Remove vehicles from the Jeep manufacturer`

`mpg = mpg %>% filter(manufacturer != "jeep")`
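
As a possible variation, the two filtering steps can be combined into a single `filter()` call, since conditions separated by commas are combined with AND. This is only a sketch; `mpg_clean` is a hypothetical name chosen so that the original dataset is left untouched.

```r
library(ggplot2)  # provides the mpg dataset
library(dplyr)    # provides filter() and %>%

# Drop 2-seater vehicles and vehicles made by Jeep in one step.
# (mpg_clean is a hypothetical name used only in this sketch.)
mpg_clean <- mpg %>%
  filter(class != "2seater", manufacturer != "jeep")

# Quick checks that the filtering did what we intended.
table(mpg_clean$class)
any(mpg_clean$manufacturer == "jeep")
```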

</details>

<details>
<summary>Click here for Solution Part IV</summary>

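We update the "data_visualizations.R" script so that the plotting code sits between `pdf()` and `dev.off()`. Because the driver script runs this file with `source()`, which does not auto-print results, each plot is wrapped in `print()` so that it is actually written to the PDF:

`pdf("mpg_data_plots.pdf", width=10, height=5)`

`print(ggplot(data=mpg, aes(x=cty, y=hwy))+geom_point()+theme_bw())`

`print(ggplot(data=mpg, aes(x=cyl, y=displ))+geom_boxplot()+theme_bw())`

`dev.off()`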
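
For ggplot2 plots specifically, `ggsave()` is an alternative to the `pdf()` and `dev.off()` pair that the activity asks for. The following is a minimal sketch under that assumption; the object name `scatter` and the output file name are hypothetical.

```r
library(ggplot2)

# Build the plot and keep it in an object instead of drawing it immediately.
scatter <- ggplot(data = mpg, aes(x = cty, y = hwy)) +
  geom_point() +
  theme_bw()

# ggsave() opens the device, draws the plot, and closes the device for us.
# Width and height are in inches by default. The file name is hypothetical.
ggsave("cty_vs_hwy.pdf", plot = scatter, width = 10, height = 5)
```

Note that `ggsave()` writes one plot per file, whereas a single `pdf()` device can collect several plots as separate pages.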

</details>

## Reproducible Research Activity 2 Solution

<details>
<summary>Click here for Solution</summary>

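<pre>
    <code>
library(ggplot2)

mpg = as.data.frame(mpg)

# Every variable except cty will be plotted against cty
Var_Names = names(mpg)
Var_Names = Var_Names[Var_Names != "cty"]

for(Curr_Var in Var_Names){

    # TRUE if the current variable is stored as a character column
    Factor_Check = is.character(mpg[,Curr_Var])

    if(Factor_Check){
        # Categorical variable: boxplot of cty within each category
        Plot = ggplot(data=mpg, aes_string(x=Curr_Var, y="cty"))+
                geom_boxplot()+
                theme_bw()
    }else{
        # Continuous variable: scatterplot of cty against it
        Plot = ggplot(data=mpg, aes_string(x=Curr_Var, y="cty"))+
                geom_point()+
                theme_bw()
    }

    # Plots are not auto-printed inside a for loop, so print explicitly
    print(Plot)
}
    </code>
</pre>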
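
In recent ggplot2 releases `aes_string()` is deprecated in favour of tidy evaluation, so it may emit a warning. The sketch below shows one possible alternative using the `.data` pronoun inside `aes()`. The helper `plot_against_cty()` and the other object names are hypothetical, and treating factor columns as categorical (in addition to character columns) is an extension of the hints rather than part of the original solution.

```r
library(ggplot2)

mpg_df <- as.data.frame(mpg)

# Hypothetical helper: plot one variable of `data` against cty.
plot_against_cty <- function(data, var) {
  # Treat both character and factor columns as categorical.
  is_categorical <- is.character(data[[var]]) || is.factor(data[[var]])

  # .data[[var]] looks up the column whose name is stored in `var`.
  base_plot <- ggplot(data, aes(x = .data[[var]], y = cty)) +
    labs(x = var) +
    theme_bw()

  if (is_categorical) base_plot + geom_boxplot() else base_plot + geom_point()
}

# One plot per variable other than cty, stored in a named list.
other_vars <- setdiff(names(mpg_df), "cty")
plots <- lapply(other_vars, function(v) plot_against_cty(mpg_df, v))
names(plots) <- other_vars

# Draw each plot; print() is needed because we are inside a for loop.
for (p in plots) print(p)
```

Building each plot inside a function means every plot keeps its own copy of `var`, which avoids all stored plots ending up with the last variable when they are printed later.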

</details>