<a href="https://colab.research.google.com/github/yardsale8/probability_simulations_in_R/blob/main/1_4_storing_and_transforming_simple_outcomes.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
library(dplyr)
library(tidyr)
library(purrr)
library(devtools)
install_github('yardsale8/purrrfect', force = TRUE)
library(purrrfect)


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


Loading required package: usethis

Attaching package: ‘purrrfect’


The following objects are masked from ‘package:base’:

    replicate, tabulate




## Saving Simple Outcomes
An experiment with only one outcome per trial is said to have simple outcomes.  In this case, we should be able to store the outcomes in a column that isn't a list column, but instead holds raw integers, doubles, characters, or Booleans.  To do this, we will need to use an alternative, typed form of `replicate`.

### Example 1 - Flip a fair coin once

Suppose we flip a fair coin and want to know the probability of a head.



### Incorrect Approach - Using `replicate`

Note that if we use `replicate` we get a list column.

In [None]:
coin <- c('H', 'T')
(trials <- replicate(10, sample(coin, 1, replace = TRUE)))

.trial,.outcome
<dbl>,<list>
1,T
2,T
3,H
4,T
5,H
6,H
7,T
8,H
9,T
10,H


In [None]:
trials %>% str

tibble [10 × 2] (S3: tbl_df/tbl/data.frame)
 $ .trial  : num [1:10] 1 2 3 4 5 6 7 8 9 10
 $ .outcome:List of 10
  ..$ : chr "T"
  ..$ : chr "T"
  ..$ : chr "H"
  ..$ : chr "T"
  ..$ : chr "H"
  ..$ : chr "H"
  ..$ : chr "T"
  ..$ : chr "H"
  ..$ : chr "T"
  ..$ : chr "H"



<img src="https://github.com/yardsale8/probability_simulations_in_R/blob/main/img/1_3_simple_outcomes_incorrect.png?raw=true" width="600">

In this case we have an extra, unneeded level of abstraction: each entry is a list holding one character string.  The entry could simply be the string itself!


### Correct Approach - Use `replicate_chr` to specify the output column type
We can simplify the output by using `replicate_chr` to force the output column to be a column of characters.

In [None]:
(trials <- replicate_chr(10, sample(coin, 1, replace = TRUE)))

.trial,.outcome
<dbl>,<chr>
1,T
2,H
3,H
4,T
5,T
6,T
7,H
8,H
9,H
10,H


In [None]:
trials %>% str

tibble [10 × 2] (S3: tbl_df/tbl/data.frame)
 $ .trial  : num [1:10] 1 2 3 4 5 6 7 8 9 10
 $ .outcome: chr [1:10] "T" "H" "H" "T" ...


<img src="https://github.com/yardsale8/probability_simulations_in_R/blob/main/img/1_3_simple_outcomes_correct.png?raw=true" width="600">

Inspecting the `str` verifies that we have eliminated the extra level of abstraction.

### Comparing `replicate_chr` to `replicate` for simple outcomes

<img src="https://github.com/yardsale8/probability_simulations_in_R/blob/main/img/1_3_simple_outcomes_comparison.png?raw=true" width="600">

In summary, when generating simple outcomes--i.e. a single outcome per trial--use a typed version of replicate like `replicate_chr` instead of `replicate`.  This is because

* `replicate_chr` returns a chr column of individual strings.
* `replicate` returns a list column of singletons, where the lists are an unnecessary extra container.
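
The same contrast can be reproduced with purrr alone. The sketch below is an analogy, not purrrfect's implementation: it assumes that a typed `replicate_chr` relates to `replicate` roughly the way purrr's `map_chr` relates to `map`. The names `as_list` and `as_chr` are made up for illustration.

```r
library(purrr)
library(tibble)

set.seed(1)
coin <- c('H', 'T')

# map() always returns a list, so .outcome becomes a list column of singletons
as_list <- tibble(.trial = 1:5,
                  .outcome = map(1:5, \(i) sample(coin, 1)))

# map_chr() returns an atomic character vector, so .outcome is a plain chr column
as_chr <- tibble(.trial = 1:5,
                 .outcome = map_chr(1:5, \(i) sample(coin, 1)))

is.list(as_list$.outcome)      # TRUE: one extra container per trial
is.character(as_chr$.outcome)  # TRUE: the strings are stored directly
```

Calling `str` on each tibble should show the same difference in nesting as the two `str` outputs above.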

### <font color='red'> Exercise 1.3.1 - Simple Dice Rolls</font>

Set up an experiment that involves rolling a fair 6-sided die once.  Be sure to
1. Make the outcome column have the integer type, and
2. Use `str` to verify the structure.

In [None]:
# Your code here

## Accessing levels of abstraction with `mutate` and `map`

In previous notebooks, we performed simulations that resulted in a table such that
1. There was one row per simulated trial, and
2. The outcomes of each trial were stored in a list column.

In this notebook, we will explore techniques for turning an outcome column into a random variable, that is, a number.

### Using `mutate` and `map` on simple outcome columns

<img src="https://github.com/yardsale8/probability_simulations_in_R/blob/main/img/1_3_mutate_map_and_levels_scalar_column.png?raw=true" width="600">

When working with simple outcomes, we can

* Use `mutate` with a vectorized function to process the whole column.<br>
  - Examples: `ifelse`, `mean`, `sd`, etc.<br>
* Use `mutate` + `map` to apply scalar functions to the individual elements.<br>
  - Examples: `list`, functions with complicated conditional logic
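
As a minimal sketch of the two routes, the code below recodes the same column twice. The helper `code_flip` and the column names `X_vec` and `X_map` are hypothetical, introduced only for this illustration. Because `if` expects a single `TRUE`/`FALSE`, `code_flip` is a scalar function and has to be applied element by element; here that is done with purrr's `map_dbl`, the typed variant of `map` that returns a double vector.

```r
library(dplyr)
library(purrr)
library(tibble)

trials <- tibble(.trial = 1:4, .outcome = c('H', 'T', 'T', 'H'))

# Scalar function: `if` handles exactly one value at a time
code_flip <- function(x) if (x == 'H') 1 else 0

result <- trials %>%
  mutate(X_vec = ifelse(.outcome == 'H', 1, 0),  # vectorized: whole column at once
         X_map = map_dbl(.outcome, code_flip))   # scalar: one element at a time

result
```

Both new columns hold the same 1/0 coding; the vectorized route is usually the simpler choice whenever a vectorized function is available.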

### Example 1 - `ifelse` is a vectorized function

**Task.** Suppose that we consider a head a success and wish to recode the heads and tails as 1 and 0, respectively.

In [None]:
coin <- c('H', 'T')
(trials <- replicate_chr(10, sample(coin, 1, replace = TRUE)))


.trial,.outcome
<dbl>,<chr>
1,H
2,H
3,H
4,T
5,T
6,H
7,H
8,T
9,T
10,H


We can verify that `ifelse` is a vectorized function by applying it to the whole `.outcome` column.

In [None]:
ifelse(trials$.outcome == 'H', 1, 0)

Consequently, we can use `mutate` to apply this vectorized function to the whole column.

In [None]:
(trials
 %>% mutate(X = ifelse(.outcome == 'H', 1, 0)))

.trial,.outcome,X
<dbl>,<chr>,<dbl>
1,H,1
2,H,1
3,H,1
4,T,0
5,T,0
6,H,1
7,H,1
8,T,0
9,T,0
10,H,1


### Example 2 - Boolean logic is vectorized

**Same Task.** Suppose that we consider a head a success and wish to recode the heads and tails as 1 and 0, respectively.

An alternative to the last approach is to convert the heads and tails to Boolean values.  It turns out that comparison operators such as `==`, like the arithmetic operators, are vectorized in R.


In [None]:
trials$.outcome == 'H'

In [None]:
(trials
 %>% mutate(X = .outcome == 'H'))

.trial,.outcome,X
<dbl>,<chr>,<lgl>
1,H,True
2,H,True
3,H,True
4,T,False
5,T,False
6,H,True
7,H,True
8,T,False
9,T,False
10,H,True
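
One reason the Boolean coding is convenient: R coerces `TRUE`/`FALSE` to 1/0 in arithmetic, so a logical vector can be summed or averaged directly. The sketch below hard-codes the ten flips from the table above; the variable names `outcome` and `is_head` are illustrative.

```r
# The ten flips shown in the table above
outcome <- c('H', 'H', 'H', 'T', 'T', 'H', 'H', 'T', 'T', 'H')

is_head <- outcome == 'H'   # vectorized comparison gives a logical vector

sum(is_head)         # 6, the number of heads
mean(is_head)        # 0.6, the relative frequency of heads
as.integer(is_head)  # 1 1 1 0 0 1 1 0 0 1, the explicit 1/0 coding
```

The mean of the logical column is the simulated estimate of the probability of a head.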


### Example 3 - Packaging the strings in a list

**Contrived Task.** Convert each string into a list containing the string, the coded 1/0, and the Boolean.

**Notes.**

1. We can't use `mutate` alone, as we need to apply the `list` function element by element.
2. Because the three components have different types (chr, int, lgl), we can't store the output in an atomic vector, which has a single fixed type in R, and must use a list instead.
3. I will use a named list to provide context.

In [None]:
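# Helper function: package one outcome as a named list (string, integer code, Boolean)
# The L suffix makes the codes true integers rather than doubles
list.output <- \(x) list(str = x, int = ifelse(x == 'H', 1L, 0L), bool = x == 'H')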
list.output('H')

In [None]:
(trials
 %>% mutate(X = map(.outcome, list.output))
 )

.trial,.outcome,X
<dbl>,<chr>,<list>
1,H,"H , 1 , TRUE"
2,H,"H , 1 , TRUE"
3,H,"H , 1 , TRUE"
4,T,"T , 0 , FALSE"
5,T,"T , 0 , FALSE"
6,H,"H , 1 , TRUE"
7,H,"H , 1 , TRUE"
8,T,"T , 0 , FALSE"
9,T,"T , 0 , FALSE"
10,H,"H , 1 , TRUE"
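
Once the values are packaged in a list column, individual components can be pulled back out with a typed `map`. The following is a sketch under stated assumptions: it uses a version of the helper with integer literals (`1L`/`0L`) so that `map_int` applies, a small hand-written `trials` table, and the hypothetical column names `X_bool` and `X_int`.

```r
library(dplyr)
library(purrr)
library(tibble)

# Version of the helper with integer codes (1L/0L)
list.output <- \(x) list(str = x, int = ifelse(x == 'H', 1L, 0L), bool = x == 'H')

trials <- tibble(.trial = 1:3, .outcome = c('H', 'T', 'H'))

unpacked <- trials %>%
  mutate(X      = map(.outcome, list.output),  # list column of named lists
         X_bool = map_lgl(X, \(l) l$bool),     # extract the Boolean from each list
         X_int  = map_int(X, \(l) l$int))      # extract the integer code from each list

unpacked
```

The typed variants (`map_lgl`, `map_int`) return atomic columns, so each extraction removes the list container again.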


### <font color='red'> Exercise 1.3.2 - Explore the levels of abstraction for the last example</font>

Add a `walk(str)` to the last bit of code, then describe the levels of abstraction.

In [None]:
# Your code here

<font color="orange">
Your description here
</font>

### <font color='red'> Exercise 1.3.3 - Transforming Simple Dice Rolls</font>

Suppose that we are rolling a single fair 6-sided die and consider any roll of 4 or more a success.

Set up an experiment that involves rolling a fair 6-sided die once, then
1. Use `mutate` and `ifelse` to recode the values into 1 for success and 0 for failure.
2. Use `mutate` and a Boolean comparison to recode the values into `TRUE` for success and `FALSE` for failure.
3. Use `mutate` + `map` to store the values, coded integers, and Boolean values in a list.

In [None]:
# Your code here