# Read these instructions completely in order to receive full credit

- Before you submit the problem set, make sure everything runs as expected. Go to the menu bar at the top of Jupyter Notebook and click `Kernel > Restart & Run All`. Your code should run from top to bottom with no errors. Failure to do this will result in loss of points.

- You should not use `install.packages()` anywhere. You may assume that we have already installed all the packages needed to run your code.

- Make sure you fill in any place that says `YOUR CODE HERE` or `YOUR ANSWER HERE` and delete the `stop()` functions. The `stop()` functions produce an error and are there to remind you of cells that need an answer.

- If you are working in a group, make sure you and your collaborators have been added to a group on Canvas as described at the beginning of lecture 2.
- As a backup, *also* fill in your uniqid as well as those of your collaborators below:

Your uniqid: `<replace with your uniqid>`

Uniqids of your collaborator(s): `<replace with their uniqids>`

- **Carefully proofread the PDF that you upload to Canvas. PDFs that have missing or truncated code cannot be graded and will not receive credit.**

---

In [None]:
library(tidyverse)
library(stringr)
library(lubridate)

# STATS 306
## Problem Set 7: Functions

Each problem is worth one point, for a total of ten.

The *Monty Hall Problem* is a famous statistical paradox modeled after the TV game show [Let's Make a Deal](https://www.youtube.com/watch?v=hQpbsD5IueA). The problem goes as follows:
    
    You are a contestant on a game show, and are shown three doors. Behind one of the doors is a new car, and behind the other two doors are Ohio State hoodies. Your goal (obviously) is to pick the door with the new car. You pick a door, say number 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which is shown to have a hoodie. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?
    
The correct answer is to switch, always. This caused something of an uproar when it was first noted by a newspaper columnist in the early 1990s.

You will demonstrate that this answer is correct by writing code that simulates the Monty Hall problem.

#### Problem 1
Write a function `place_car()` which randomly places the car behind one of doors 1-3. (In other words, `place_car()` returns a uniformly distributed random integer between 1 and 3.) This models how the producers set up the show before taping.

In [None]:
# YOUR CODE HERE
stop()

#### Problem 2
Write a function `pick_door()` which selects one of doors 1-3, according to whatever strategy you like. This models how the contestant initially picks a door.

In [None]:
# YOUR CODE HERE
stop()

#### Problem 3
Write a function `reveal_other_door(car_door, chosen_door)` which, given the door hiding the car, as well as the contestant's chosen door (both pieces of information which are known to Monty), names one of the other doors which does not contain the car. For example, if the car is hiding behind door one, and the contestant chooses door two, then `reveal_other_door(1, 2)` would have to return `3`.
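
One R detail is worth keeping in mind when choosing at random among candidate doors: when `sample()` is given a single number `n` as its first argument, it samples from `1:n` rather than returning `n`. The sketch below illustrates this with a hypothetical helper `pick_one()` (not part of the assignment) that indexes into the vector instead. It is one possible workaround, not the required approach.

```r
# pick_one() is a hypothetical helper, introduced only for illustration.
# It returns one element of x uniformly at random, even when length(x) == 1.
pick_one <- function(x) {
    x[sample.int(length(x), 1)]
}

candidates <- c(3)      # a candidate set containing a single door
pick_one(candidates)    # always returns 3
sample(candidates, 1)   # samples from 1:3, so it may return 1, 2, or 3
```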

In [None]:
# YOUR CODE HERE
stop()

#### Problem 4
Write a function `chooses_to_switch(first_choice, revealed_door)` which returns `TRUE` if the contestant decides to switch her choice after Monty has revealed the contents of one door. For example, `chooses_to_switch(1, 2)` should return `TRUE` if the contestant decides to switch after choosing door one and being shown the contents of door two.

In [None]:
# YOUR CODE HERE
stop()

#### Problem 5
Finally, write a function `simulate_game()` which uses the functions you defined in the preceding exercises to simulate the entire process: 
- first the car is placed behind a randomly chosen door using `place_car()`, then 
- the contestant chooses a door according to `pick_door()`, 
- then Monty reveals the contents of another door by calling `reveal_other_door()`; and finally,
- the contestant decides whether or not to switch by calling `chooses_to_switch()`.

By experimenting with various choices for `chooses_to_switch()` and repeatedly running `simulate_game()`, show that it is always better to switch when offered the choice. How much better is it?
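
Repeating a random experiment many times and averaging the outcomes is a common way to estimate a probability. As a minimal sketch on an unrelated example (a fair die rather than the game show), `replicate()` collects the results of repeated calls, and `mean()` of a logical vector gives the proportion of `TRUE` values. The function name `roll_is_six()`, the seed, and the number of repetitions are assumptions made for illustration.

```r
set.seed(306)  # arbitrary seed, chosen only to make the sketch reproducible

# roll_is_six() is a hypothetical function: TRUE if a fair die shows a six.
roll_is_six <- function() {
    sample(1:6, 1) == 6
}

# Run the experiment 10,000 times and record each logical outcome.
results <- replicate(10000, roll_is_six())

# The proportion of TRUE values estimates the probability (1/6 here).
mean(results)
```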

In [None]:
# YOUR CODE HERE
stop()

#### Problem 6
Thanksgiving occurs on the fourth Thursday in November. Write a function `thanksgiving_day(year)` which, given a year, returns the date of Thanksgiving for that year:

    > thanksgiving_day(2019)
    2019-11-28
    > thanksgiving_day(2018)
    2018-11-22
    > thanksgiving_day(3000)
    3000-11-27
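
Date arithmetic of this kind can be sketched with `lubridate` on a different holiday. The example below finds the first Monday in September using `make_date()` and `wday()`. The helper name `first_monday_sept()` is hypothetical, and this is only one way to organize the calculation.

```r
library(lubridate)

# first_monday_sept() is a hypothetical helper, introduced only for illustration.
first_monday_sept <- function(year) {
    first <- make_date(year, 9, 1)
    # With week_start = 1, wday() numbers Monday as 1 through Sunday as 7.
    # offset is the number of days from September 1 to the next Monday (0 to 6).
    offset <- (1 - wday(first, week_start = 1)) %% 7
    first + days(offset)
}

first_monday_sept(2019)
```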

In [None]:
library(lubridate)
thanksgiving_day <- function(year) {
    # YOUR CODE HERE
    stop()
}

#### Problem 7
A *recursive* function is a function that calls itself. For example, the following recursive function computes the factorial of $k$, $$k! := \prod_{i=1}^k i,\quad k \ge 0.$$

In [None]:
factorial = function(k) {
    if (k <= 0) {
        return(1)
    }
    k * factorial(k - 1)
}
factorial(5)

Write a recursive function `fib(k)` that computes $F_k$, the $k$th Fibonacci number. (The Fibonacci sequence is defined by the recurrence $F_{k+2} = F_{k+1} + F_{k}$ for $k\ge 1$, with $F_1 = F_2 = 1$.)
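
Unrolling the recurrence from the two starting values gives the first few terms, which can serve as a quick check on `fib()`:
$$F_3 = F_2 + F_1 = 2,\quad F_4 = F_3 + F_2 = 3,\quad F_5 = F_4 + F_3 = 5,\quad F_6 = F_5 + F_4 = 8.$$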

In [None]:
fib = function(k) {
    # YOUR CODE HERE
    stop()
}

#### Problem 8
In Calculus II you learned / are learning that $$\int_0^1 f(x)\,\mathrm{d} x = \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^n f(i/n).$$ The summation on the right-hand side is called a *Riemann sum*. Setting $n$ to a large (but finite) number gives an algorithm for numerically computing the definite integral on the left-hand side.

Write a function `int01(f)` that takes a single argument `f`, itself a function, and uses this formula to approximate the integral of `f` over the interval $[0,1]$. (Choose $n=1000$ subdivisions in order to compute the sum.)
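
As a worked illustration of the formula, take $f(x) = x^2$. The Riemann sum can then be evaluated in closed form using $\sum_{i=1}^n i^2 = n(n+1)(2n+1)/6$:
$$\frac{1}{n}\sum_{i=1}^n \left(\frac{i}{n}\right)^2 = \frac{1}{n^3}\cdot\frac{n(n+1)(2n+1)}{6} = \frac{(n+1)(2n+1)}{6n^2} = \frac{1}{3} + \frac{1}{2n} + \frac{1}{6n^2}.$$
As $n\to\infty$ this tends to $1/3$, the exact value of the integral. For $n=1000$ the sum overshoots by roughly $1/(2n) = 0.0005$.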

In [None]:
int01 = function(f) {
    # YOUR CODE HERE
    stop()
}

#### Problem 9
You may have noticed that `int01` is fairly inaccurate. For example, while $\int_0^1 x^2\,\mathrm{d}x = 1/3$, we see that 
```{r}
> int01(function(x) x^2)
[1] 0.3338335
```
is only correct to three decimal places. This is because `int01` does not take the shape of the function `f` into account. To obtain more accuracy, we can use [Simpson's rule](https://en.wikipedia.org/wiki/Simpson%27s_rule):
$$\int_0^1 f(x)\,\mathrm{d}x \approx \frac{1}{6n} \sum_{i=1}^n \left[ f\left(\frac{i-1}{n}\right) + 4f\left(\frac{i - 1/2}{n}\right)
+ f\left(\frac{i}{n}\right) \right],$$ for large $n$.

Implement Simpson's rule (again using $n=1000$) in a function called `simps01(f)`, and use it to define a function `rel_acc(f, truth)` which compares the accuracy of Simpson's rule against that of `int01`. The function should take two arguments, the function to be integrated `f` and $$\texttt{truth} = \int_0^1 f(x)\,\mathrm{d}x,$$
the true value of the integral. It should return the relative accuracy `rel_acc(f, truth)`, defined as $$\frac{|\texttt{simps01(f)} - \texttt{truth}|}{|\texttt{int01(f)}-\texttt{truth}|}.$$ (Hence, small values of `rel_acc` indicate that Simpson's rule is more accurate.)
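
As a worked check (an illustration, not part of the assignment), Simpson's rule is exact for quadratics. On a single subinterval $[a,b]$ with $b - a = 1/n$ and $f(x) = x^2$, one term of the sum is
$$\frac{b-a}{6}\left[a^2 + 4\left(\frac{a+b}{2}\right)^2 + b^2\right] = \frac{b-a}{6}\left[2a^2 + 2ab + 2b^2\right] = \frac{b^3 - a^3}{3} = \int_a^b x^2\,\mathrm{d}x.$$
Summing over all $n$ subintervals gives exactly $1/3$, so for this particular $f$ any nonzero error in `simps01` comes only from floating-point rounding.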

In [None]:
simps01 = function(f) {
    # YOUR CODE HERE
    stop()
}

rel_acc = function(f, truth) {
    # YOUR CODE HERE
    stop()
}

#### Problem 10
It's 2019 for a few more weeks. How many prime numbers are less than or equal to 2019? Solve this by creating a function `is_prime(x)` which returns `TRUE` if `x` is prime and `FALSE` otherwise. This function can then be passed to `map_lgl` to obtain the answer:

In [None]:
is_prime <- function(x) {
    # YOUR CODE HERE
    stop()
}

map_lgl(2:2019, is_prime) %>% sum
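
The counting pattern used with `map_lgl` can be sketched on a simpler predicate. In the example below, `is_even()` is a hypothetical function introduced only for illustration: `map_lgl()` applies it to each element and returns a logical vector, and `sum()` counts the `TRUE` values.

```r
library(purrr)

# is_even() is a hypothetical predicate, not part of the assignment.
is_even <- function(x) {
    x %% 2 == 0
}

# One logical value per input: FALSE TRUE FALSE TRUE ...
flags <- map_lgl(1:10, is_even)

# TRUE counts as 1 and FALSE as 0, so the sum is the number of even values.
sum(flags)
```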