
In [2]:
library(dplyr)
library(tidyr)
library(purrr)
library(devtools)
install_github('yardsale8/purrrfect', force = TRUE)
library(purrrfect)



# Simulating the Uniform Distribution

There are two main ways to simulate a continuous uniform random variable.

1. Convert standard uniform random numbers (e.g., from an algorithm for pseudo-random numbers), which range from 0 to 1, into the desired range.  This technique will be covered in Chapter 5.
2. Simulate values directly using `runif`, which we cover here.

Finally, we will illustrate how to use simulations to estimate the expected value.

### Review - Uniform Random Numbers

A continuous random variable $X$ has the [continuous uniform distribution](https://en.wikipedia.org/wiki/Continuous_uniform_distribution), denoted $Uniform(a, b)$, if

1. The feasible values $x$ of $X$ satisfy $a < x < b$, and
2. The probability of getting a number $c < x < d$ is proportional to the width of the interval, provided $a \le c < d \le b$.
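
The second property can be checked numerically. The sketch below uses base R's `punif` with illustrative endpoints $a = 0$ and $b = 5$ (an assumption made for this check only); `interval_prob` is a hypothetical helper name. Two intervals of the same width should have the same probability, and doubling the width should double the probability.

```r
# Illustrative endpoints (assumed for this check only)
a <- 0
b <- 5

# Hypothetical helper: P(lo < X < hi) computed from the uniform CDF
interval_prob <- function(lo, hi) punif(hi, a, b) - punif(lo, a, b)

p1 <- interval_prob(1, 2)       # width 1
p2 <- interval_prob(3.5, 4.5)   # width 1, different location
p3 <- interval_prob(1, 3)       # width 2

# Equal widths give equal probabilities; the ratio p3 / p1 matches the ratio of widths
c(p1 = p1, p2 = p2, p3 = p3, ratio = p3 / p1)
```

The constant of proportionality is $1/(b - a)$, so each probability equals the interval width divided by $b - a$.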


## Strategies for simulating the uniform distribution

1. Convert standard uniform random numbers into the desired range (see Chapter 5).
2. Simulate values directly using `runif`.
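
As a brief preview of the first strategy, here is a minimal sketch that assumes the usual linear rescaling $x = a + (b - a)u$ of a standard uniform value $u$. The endpoints and seed are arbitrary choices for illustration, and Chapter 5 may present the technique differently.

```r
set.seed(1)            # arbitrary seed, used only for reproducibility
a <- 2                 # illustrative lower endpoint (assumed)
b <- 7                 # illustrative upper endpoint (assumed)

u <- runif(5)          # standard uniform values between 0 and 1
x <- a + (b - a) * u   # shift and stretch into the interval (a, b)

range(x)               # both values should fall between a and b
```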

### Example - Waiting for the Bus

To commute to work, a professor must first get on a bus near her house.  Suppose the waiting time (in minutes) at the stop follows a $Uniform(0, 5)$ distribution.

1. What is the probability that she has to wait at most 3 minutes?
2. What is the probability that she has to wait more than 2 minutes?
3. What is the probability that she has to wait between 2 and 3 minutes?
4. What is the cut-off for the largest 20% of wait times?
5. What are the mean and variance of the wait times?

#### Setting up the simulations

We need to simulate a uniform distribution that ranges from 0 to 5, which we can do with `runif` (think Random UNIF, not RUN IF).  Note that `runif` returns floating point numbers, so we will use `replicate_dbl`.

In [1]:
?runif

In [4]:
replicate_dbl(10, runif(1, 0, 5))

.trial,.outcome
<dbl>,<dbl>
1,4.0307461
2,0.2196763
3,1.3868595
4,1.9456184
5,4.4782355
6,2.9401042
7,2.41591
8,3.3325786
9,0.4281534
10,1.8185569
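
For comparison, base R's `runif(n, min, max)` can also produce all of the draws in a single call, since its first argument is the number of values to generate. The sketch below is an alternative to the `replicate_dbl` approach above, not a replacement for it; it returns a plain numeric vector rather than a table with `.trial` and `.outcome` columns. The seed is arbitrary.

```r
set.seed(42)                           # arbitrary seed, used only for reproducibility
waits <- runif(10, min = 0, max = 5)   # ten simulated wait times in one call

length(waits)                          # number of draws
all(waits > 0 & waits < 5)             # every draw lies inside the interval
```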


#### Problems 1-3

1. What is the probability that she has to wait at most 3 minutes?
2. What is the probability that she has to wait more than 2 minutes?
3. What is the probability that she has to wait between 2 and 3 minutes?

In [10]:
a <- 0
b <- 5
num.trials <- 100000
(replicate_dbl(num.trials, runif(1, a, b))
 # flag, for each simulated wait time, whether each event occurred
 %>% mutate(at.most.three = .outcome <= 3,
            more.than.two = .outcome > 2,
            between.two.and.three = (.outcome > 2) & (.outcome < 3))
 # estimate each probability from the simulated trials
 %>% estimate_all_prob
 # exact probabilities from the uniform CDF, for comparison
 %>% mutate(exact.at.most.three = punif(3, a, b),
            exact.more.than.two = 1 - punif(2, a, b),
            exact.between.two.and.three = punif(3, a, b) - punif(2, a, b))
 %>% relocate(exact.at.most.three, .after = at.most.three)
 %>% relocate(exact.more.than.two, .after = more.than.two)
 )

at.most.three,exact.at.most.three,more.than.two,exact.more.than.two,between.two.and.three,exact.between.two.and.three
<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
0.6017,0.6,0.59868,0.6,0.20038,0.2
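
The exact values above come from the uniform CDF, which for $a \le q \le b$ is $(q - a)/(b - a)$. The following base R sketch recomputes the three answers by hand and compares them with simple proportions from a vectorized simulation. `unif_cdf` is a hypothetical helper written for this illustration, and the seed is arbitrary.

```r
set.seed(2023)                 # arbitrary seed, used only for reproducibility
a <- 0
b <- 5
x <- runif(100000, a, b)       # simulated wait times

# Hypothetical helper: closed-form uniform CDF, valid for a <= q <= b
unif_cdf <- function(q) (q - a) / (b - a)

# Proportion of simulated wait times falling in each event
estimates <- c(at.most.three = mean(x <= 3),
               more.than.two = mean(x > 2),
               between.two.and.three = mean(x > 2 & x < 3))

# Exact probabilities: 3/5, 1 - 2/5, and 3/5 - 2/5
exact <- c(unif_cdf(3), 1 - unif_cdf(2), unif_cdf(3) - unif_cdf(2))

round(rbind(estimates, exact), 3)
```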


#### 4. What is the cut-off for the largest 20% of wait times?

In [13]:
a <- 0
b <- 5
left.tail <- 1 - 0.20
num.trials <- 100000
(replicate_dbl(num.trials, runif(1, a, b))
 %>% summarise(top.20.percent = quantile(.outcome, left.tail)) # outcomes are continuous, so no discreteness adjustment is needed
 %>% mutate(exact.top.20.percent = qunif(left.tail, a, b))
 )

top.20.percent,exact.top.20.percent
<dbl>,<dbl>
4.007949,4
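
The exact cut-off can also be found by hand. Inverting the uniform CDF gives the quantile $a + p(b - a)$ for a left-tail probability $p$. The sketch below checks that formula against `qunif` and confirms that `punif` maps the cut-off back to $p$.

```r
a <- 0
b <- 5
p <- 1 - 0.20                      # left-tail probability for the top 20%

cutoff <- a + p * (b - a)          # quantile formula obtained by inverting the CDF

c(by.hand = cutoff,
  from.qunif = qunif(p, a, b),     # built-in quantile function
  back.to.p = punif(cutoff, a, b)) # CDF at the cut-off recovers p
```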


#### 5. What are the mean and variance of the wait times?

In [16]:
a <- 0
b <- 5
num.trials <- 100000
(replicate_dbl(num.trials, runif(1, a, b))
 %>% summarise(mu = mean(.outcome),
               sigma.sqr = var(.outcome),
               sigma = sqrt(sigma.sqr))
 %>% mutate(exact.mu = (a + b)/2,
            exact.var = (b - a)^2/12,
            exact.sd = sqrt(exact.var))
 %>% relocate(exact.mu, .after = mu)
 %>% relocate(exact.var, .after = sigma.sqr)
 )

mu,exact.mu,sigma.sqr,exact.var,sigma,exact.sd
<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
2.502847,2.5,2.075955,2.083333,1.440817,1.443376
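
The exact formulas used in the code, $(a + b)/2$ and $(b - a)^2/12$, follow from the density of the $Uniform(a, b)$ distribution, which is $f(x) = 1/(b - a)$ for $a < x < b$. A short derivation:

$$
\begin{aligned}
E[X] &= \int_a^b \frac{x}{b - a}\,dx = \frac{b^2 - a^2}{2(b - a)} = \frac{a + b}{2}, \\
E[X^2] &= \int_a^b \frac{x^2}{b - a}\,dx = \frac{b^3 - a^3}{3(b - a)} = \frac{a^2 + ab + b^2}{3}, \\
Var(X) &= E[X^2] - (E[X])^2 = \frac{a^2 + ab + b^2}{3} - \frac{(a + b)^2}{4} = \frac{(b - a)^2}{12}.
\end{aligned}
$$

With $a = 0$ and $b = 5$ this gives a mean of $2.5$ and a variance of $25/12 \approx 2.083$, matching the exact columns above.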


### <font color="red"> Exercise 4.1.1 - Earthquake when? </font>

Given that one earthquake occurred within a 7-day week, the time (in days) before the earthquake occurs follows a $Uniform(0, 7)$ distribution.  Answer each of the following questions.

1. What is the probability the earthquake happens before the end of the first day?
2. What is the probability the earthquake happens after the start of the 5th day of the week?
3. What is the median waiting time?
4. What are the mean and SD of the waiting times?


In [None]:
# Your code here