# Module 3 Assessment
In this notebook you will use the rules of conditional probability to solve various problems. Please read each statement and answer all questions in the code blocks that contain "your code here". Feel free to add as many blocks as you need; only the original code blocks will be graded.

In [None]:
# Hidden cell that loads the testthat library
library(testthat)

### Problem 1

The planning department for the local county requires a contractor to submit one, two,
three, four, or five forms (depending on the nature of the project) when applying
for a building permit. Let $Y$ be the number of forms required for the next applicant.
The probability that $y$ forms are required is known to be proportional to $y$; that
is, $P(Y = y) = ky$ for $y = 1, ..., 5$.

#### Part (a)
What is the value of $k$? Save your answer as `p1.1`.

In [1]:
p1.1 = 1/15

p1.1
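
As a quick check of the normalization argument, the sketch below solves $k \sum_{y=1}^{5} y = 1$ numerically. The names `y_vals`, `k`, and `pmf_Y` are illustrative only and are not part of the graded cells.

```r
# Possible numbers of forms.
y_vals <- 1:5

# P(Y = y) = k * y must sum to 1, so k = 1 / (1 + 2 + 3 + 4 + 5).
k <- 1 / sum(y_vals)

# Resulting pmf; it should sum to 1.
pmf_Y <- k * y_vals
k
sum(pmf_Y)
```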


In [None]:
# Hidden test cell

#### Part (b)

What is the probability that at most three forms are required? Save your answer as `p1.2`.

In [2]:
p1.2 = 1/15+2/15+3/15
p1.2


In [None]:
# Hidden test cell

#### Part (c)

What is the probability that between two and four forms (inclusive) are
required? Save your answer as `p1.3`.


In [3]:
p1.3 = 2/15+3/15+4/15
p1.3
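
Parts (b) and (c) can also be read off a single pmf vector. This is a minimal sketch assuming $k = 1/15$ from Part (a); `pmf_Y`, `p_at_most_3`, and `p_2_to_4` are illustrative names.

```r
# pmf of Y for y = 1, ..., 5; element y of the vector holds P(Y = y).
pmf_Y <- (1:5) / 15

# P(Y <= 3): sum the first three entries.
p_at_most_3 <- sum(pmf_Y[1:3])

# P(2 <= Y <= 4): sum entries two through four.
p_2_to_4 <- sum(pmf_Y[2:4])

p_at_most_3
p_2_to_4
```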


In [None]:
# Hidden test cell

#### Part (d)

Could $P(Y = y) = y^2/50$ for $y = 1, ..., 5$ be the pmf of $Y$? Answer TRUE or FALSE and save your answer as `TF_1.4`.

In [4]:
TF_1.4 = FALSE
TF_1.4
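
The answer follows from checking the two defining properties of a pmf: every probability is non-negative and the probabilities sum to 1. The helper `is_valid_pmf` below is a hypothetical function written for this illustration, not something supplied by the course.

```r
# Hypothetical helper: TRUE when the vector is a valid pmf (up to a tolerance).
is_valid_pmf <- function(probs, tol = 1e-9) {
  all(probs >= 0) && abs(sum(probs) - 1) < tol
}

# Candidate pmf from Part (d): the values sum to 55/50, not 1.
candidate <- (1:5)^2 / 50
candidate_total <- sum(candidate)
candidate_total

is_valid_pmf(candidate)     # the candidate from Part (d)
is_valid_pmf((1:5) / 15)    # the pmf from Part (a), for comparison
```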


In [None]:
# Hidden test cell

### Problem 2

Individuals A and B play a sequence of chess games until one player wins 9 games. A wins an individual game with probability p, and B wins a game with probability 1 − p (i.e., there are no draws). Let X denote the number of games played.

#### Part (a) 

What are the possible values of X? Store your answers in the vector `Vals_2.1`. Don't worry about the order.

In [2]:
Vals_2.1 = c(9:17)
Vals_2.1


In [None]:
# Hidden test cell

#### Part (b)

The expression for $P(X = x)$ can be shown to include what known distribution? It will end up being the sum of two of these distributions, each multiplied by a different factor. Store your answer as a string with the first letter capitalized. Save your answer as `Dist_2.2`. (Hint: Split the probability into the probability that player A wins the series in x games plus the probability of the disjoint event that player B wins the series in x games.)


In [3]:
Dist_2.2 = "Binomial"
Dist_2.2
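
One way to write out the hint, assuming the series ends as soon as either player records a ninth win: the winner must take exactly 8 of the first $x - 1$ games and then win game $x$. Under that reading,

$$ P(X = x) = \binom{x-1}{8} p^{8}(1-p)^{x-9} \cdot p \; + \; \binom{x-1}{8} (1-p)^{8} p^{x-9} \cdot (1-p), \ \ x = 9, \ldots, 17. $$

The first term is a Binomial$(x-1, p)$ probability of 8 successes multiplied by $p$, and the second is a Binomial$(x-1, 1-p)$ probability of 8 successes multiplied by $1-p$.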

In [None]:
# Hidden test cell

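#### Part (c) 

Let $p = 0.5$. Find $P(X = 12)$. Round your answer to four decimal places. Save your answer as `p2.3`.

In [14]:
# The series winner takes exactly 8 of the first 11 games and then wins game 12;
# the factor 2 covers either player being the winner (valid because p = 0.5).
p2.3 = round(2 * dbinom(8, 11, 0.5) * 0.5, 4)
p2.3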
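
For a cross-check, the sketch below evaluates $P(X = x)$ for every possible series length and confirms that the probabilities sum to 1. It assumes the series stops at a player's ninth win; `pmf_X`, `wins_needed`, `total_prob`, and `p_12` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: P(X = x) when the series stops at a player's 9th win.
pmf_X <- function(x, p, wins_needed = 9) {
  # A wins the series: 8 wins for A in the first x - 1 games, then A wins game x.
  a_wins <- dbinom(wins_needed - 1, x - 1, p) * p
  # B wins the series: same argument with success probability 1 - p.
  b_wins <- dbinom(wins_needed - 1, x - 1, 1 - p) * (1 - p)
  a_wins + b_wins
}

x_vals <- 9:17
total_prob <- sum(pmf_X(x_vals, 0.5))   # should be 1
p_12 <- round(pmf_X(12, 0.5), 4)        # P(X = 12) with p = 0.5

total_prob
p_12
```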

In [None]:
# Hidden test cell

### Problem 3

Consider the following distribution: let $X$ be the leading digit of a randomly selected number from a large accounting ledger. The pmf of this random variable has been found to be:
$$ P(X=x) = f(x) = \log_{10}\Big(\frac{x+1}{x}\Big), \ \ x=1,2,\ldots,9. $$

#### Part (a)

Find E(X). Round your answer to two decimal places. Save your answer as `e3.1`.

In [23]:
# E(X) = sum over x = 1, ..., 9 of x * log10((x + 1) / x)
x = 1:9
e3.1 = sum(x * log10((x + 1) / x))

e3.1 = round(e3.1,2)
e3.1

#### Benford’s Law: A Simple Explanation (“The Law of Anomalous Numbers”)

Benford’s law is the observation that in many datasets, both man-made and natural, more numbers have a leading digit of 1 than any other leading digit: about 30% of all numbers.

*Datasets composed of numbers that are products of multiple independent factors tend to follow Benford’s law.*
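
The "about 30%" figure can be read directly from the pmf given in Problem 3. The sketch below tabulates $f(x) = \log_{10}((x+1)/x)$ for each leading digit; the variable names `digits`, `benford`, and `mean_digit` are illustrative.

```r
# Leading digits and their Benford probabilities.
digits <- 1:9
benford <- log10((digits + 1) / digits)
names(benford) <- digits

# Probability of each leading digit, to three decimals.
round(benford, 3)

# The probabilities telescope to log10(10) = 1.
sum(benford)

# Expected leading digit, E(X).
mean_digit <- sum(digits * benford)
mean_digit
```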

![image.png](attachment:image.png)





In [None]:
# Hidden test cell


### Problem 4

Suppose we have an unfair coin with a probability of $0.6$ of obtaining a heads on any given toss. As is typical for coin toss problems, assume each coin toss is independent.

#### Part (a) 

What is the expected number of flips **before** we get a heads? Round your answer to two decimal places. Save your answer as `e4.1`.

Let $X$ be the number of coin flips required until a head is obtained, so we first need $E(X)$, the expected value of $X$.

Let $E(X|H)$ denote the expected number of remaining flips given a head on the first flip. Similarly, let $E(X|T)$ denote the expected number of remaining flips given a tail on the first flip.

A head on the first flip ends the experiment, so $E(X|H) = 0$.

And $E(X|T)=E(X)$, since a tail makes no progress towards the first head.

Conditioning on the first flip, which always costs one flip:

$E(X)=0.6 \cdot (1+0)+0.4 \cdot (1+E(X))$

Solving gives $0.6\,E(X) = 1$, so $E(X) = 1/0.6 \approx 1.6667$.

The number of flips before the first head (all of them tails) is always one less than the total number of flips, so its expected value is also one less than $E(X)$. Therefore the expected number of flips before we get a heads is `0.67`.





In [36]:
e4.1 = round(1/0.6 - 1, 2)   # E(X) - 1, rounded to two decimal places
e4.1
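
The same value can be cross-checked against the geometric distribution, since R's `dgeom(k, prob)` gives the probability of `k` failures before the first success. This is a sketch only: the truncation at 200 terms is an assumption (the remaining tail is negligible for $p = 0.6$), and `p_heads`, `expected_tails`, and `closed_form` are illustrative names.

```r
p_heads <- 0.6

# Number of tails before the first head; truncated at 200 (assumed sufficient).
k <- 0:200

# E(number of tails) = sum of k * P(k tails before the first head).
expected_tails <- sum(k * dgeom(k, prob = p_heads))

# Closed form for the mean of this geometric distribution: (1 - p) / p.
closed_form <- (1 - p_heads) / p_heads

round(expected_tails, 2)
round(closed_form, 2)
```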


In [39]:
# Simulation 
set.seed(1)

p <- 0.6

reps <- 10000                         # Total number of simulations.

tosses_to_HEAD <- numeric(reps)       # Preallocate a vector to hold the results.

for (i in 1:reps) {
  head <- 0                           # Reset 'head' to 0 on every iteration.
  counter <- 0                        # Same for the local variable 'counter'.
  while (head == 0) {
    head <- head + rbinom(1, 1, p)    # Toss a coin and add to 'head'
    counter <- counter + 1            # Add 1 to 'counter'
  }
  tosses_to_HEAD[i] <- counter        # Append number in counter after getting heads.
}

round((mean(tosses_to_HEAD)-1),2)

In [None]:
# Hidden test cell

#### Part (b) 
How many tails do you expect to see before getting three heads? Save your answer as `e4.2`. 

For example, imagine you see the third heads on the tenth trial. This means you saw 2 heads in the first 9 trials, followed by a heads on the tenth, so you observed $7$ tails. Generalize this to the $i^{th}$ trial.

In [None]:
e4.2 = 2
e4.2
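
The number of tails before the third head follows a negative binomial distribution, which R exposes as `dnbinom(k, size, prob)` (the probability of `k` failures before `size` successes). The sketch below compares the closed-form mean $r(1-p)/p$ with a truncated numerical sum and reproduces the worked example of 7 tails; the truncation at 500 terms is an assumption and all variable names are illustrative.

```r
p_heads <- 0.6
n_heads <- 3

# Closed-form mean number of tails before the n_heads-th head: r * (1 - p) / p.
closed_form <- n_heads * (1 - p_heads) / p_heads

# Numerical mean from the pmf; truncated at 500 tails (assumed sufficient).
k <- 0:500
numeric_mean <- sum(k * dnbinom(k, size = n_heads, prob = p_heads))

# Worked example: third head on trial 10, i.e. 7 tails.
# Two heads among the first 9 trials, then a head on the tenth.
p_seven_tails <- dnbinom(7, size = n_heads, prob = p_heads)
by_hand <- choose(9, 2) * p_heads^3 * (1 - p_heads)^7

closed_form
numeric_mean
```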

In [None]:
# Hidden test cell