### Questions 1 and 2: SAT testing

The SAT is a standardized college admissions test used in the United States. The following two multi-part questions cover SAT testing.

This 6-part question asks you to determine probabilities for a student who guesses on every question of the SAT. Use the information below to answer the following questions.

----

An old version of the SAT college entrance exam had a -0.25 point penalty for every incorrect answer and awarded 1 point for a correct answer. The quantitative test consisted of 44 multiple-choice questions each with 5 answer choices. Suppose a student chooses answers by guessing for all questions on the test.

#### Question 1a

What is the probability of guessing correctly for one question?

In [5]:
p_correct <- 1/5
p_correct

#### Question 1b

What is the expected value of points for guessing on one question?

In [3]:
1*p_correct -0.25*(1-p_correct)

#### Question 1c

What is the expected score of guessing on all 44 questions?

In [6]:
44 * (1*p_correct - 0.25*(1-p_correct))

#### Question 1d

What is the standard error of guessing on all 44 questions?

In [7]:
n <- 44
sqrt(n) * abs(1 - (-0.25)) * sqrt(0.2 * 0.8)
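
As a cross-check on the formula above, the same standard error can be recovered from the definition of the standard deviation of a two-outcome draw. This is only a minimal sketch; the names `outcomes`, `probs`, `mu`, `sigma`, `se_sum` and `se_shortcut` are introduced here for illustration and do not appear in the original cells.

```r
# Two possible outcomes per question and their probabilities when guessing
outcomes <- c(1, -0.25)
probs <- c(0.2, 0.8)

# Mean and standard deviation of a single draw, computed from the definitions
mu <- sum(outcomes * probs)
sigma <- sqrt(sum((outcomes - mu)^2 * probs))

# Standard error of the sum of 44 independent draws
se_sum <- sqrt(44) * sigma

# Shortcut formula |a - b| * sqrt(p * (1 - p)) scaled by sqrt(n)
se_shortcut <- sqrt(44) * abs(1 - (-0.25)) * sqrt(0.2 * 0.8)

# TRUE if both routes agree up to floating-point tolerance
all.equal(se_sum, se_shortcut)
```

Both routes should agree because, for a draw with only two outcomes a and b, the standard deviation simplifies to |a - b| * sqrt(p * (1 - p)).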

#### Question 1e

Use the Central Limit Theorem to determine the probability that a guessing student scores 8 points or higher on the test.


The Central Limit Theorem says that the sum of many independent random draws is approximately normally distributed. We therefore model the total score as a normal distribution with mean 0 and standard deviation 3.316, use `pnorm` to get the probability of scoring below 8, and subtract that from 1.

In [9]:
1-pnorm(8, 0, 3.316)

#### Question 1f

Set the seed to 21, then run a Monte Carlo simulation of 10,000 students guessing on the test.

(IMPORTANT! If you use R 3.6 or later, you will need to use the command set.seed(x, sample.kind = "Rounding") instead of set.seed(x). Your R version will be printed at the top of the Console window when you start RStudio.)

What is the probability that a guessing student scores 8 points or higher?

In [15]:
set.seed(21, sample.kind="Rounding")

B <- 10000

S <- replicate( B, {
    X <- sample(x = c(1,-0.25), size = 44, replace=TRUE, prob=c(0.2,0.8))
    sum(X)    
})

mean(S >= 8)

“non-uniform 'Rounding' sampler used”
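
The simulated probability can be compared against an exact binomial calculation, because the score depends only on the number of correct guesses. The sketch below assumes the scoring rule stated above (1 point per correct answer, -0.25 per incorrect answer); the names `k_needed`, `p_exact` and `p_clt` are illustrative.

```r
# With k correct answers out of 44, the score is k - 0.25 * (44 - k) = 1.25 * k - 11.
# Solving 1.25 * k - 11 >= 8 gives the minimum number of correct answers.
k_needed <- ceiling((8 + 0.25 * 44) / 1.25)

# Exact probability of at least k_needed correct guesses
p_exact <- 1 - pbinom(k_needed - 1, size = 44, prob = 0.2)

# Normal approximation from Question 1e, for comparison
p_clt <- 1 - pnorm(8, 0, sqrt(44) * abs(1 - (-0.25)) * sqrt(0.2 * 0.8))

c(exact = p_exact, clt = p_clt)
```

Because a score of exactly 8 is not attainable under this scoring rule, "8 or higher" and "more than 8" describe the same event here.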

The SAT was recently changed to reduce the number of multiple choice options from 5 to 4 and also to eliminate the penalty for guessing.

In this three-part question, you'll explore how that affected the expected values for the test.

-----

#### Question 2a

Suppose that the number of multiple choice options is 4 and that there is no penalty for guessing - that is, an incorrect question gives a score of 0.

What is the expected value of the score when guessing on this new test?

In [17]:
p_correct <- 0.25
p_incorrect <- 0.75

44 * (1*p_correct + 0*p_incorrect)

#### Question 2b

Using the normal approximation, what is the estimated probability of scoring over 30 when guessing?
Report your answer using scientific notation with 3 significant digits in the format x.xx*10^y. Do not round the values passed to pnorm or you will lose precision and have an incorrect answer.

In [18]:
se <- sqrt(44) * abs(1-0) * sqrt(0.25*0.75)
1-pnorm(30,11,se)
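
One possible way to report this value in the requested scientific notation is sketched below. It uses the `lower.tail = FALSE` argument of `pnorm` to compute the upper tail directly, which avoids subtracting a number very close to 1 from 1. The name `p_over_30` is illustrative.

```r
# Standard error of the sum of 44 draws with outcomes 1 and 0
se <- sqrt(44) * abs(1 - 0) * sqrt(0.25 * 0.75)

# Upper-tail probability Pr(score > 30) under the normal approximation
p_over_30 <- pnorm(30, 11, se, lower.tail = FALSE)

# Scientific notation with three significant digits (one before the decimal point, two after)
formatC(p_over_30, format = "e", digits = 2)
```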

#### Question 2c

Consider a range of correct answer probabilities p <- seq(0.25, 0.95, 0.05) representing a range of student skills.

What is the lowest p such that the probability of scoring over 35 exceeds 80%?

In [24]:
p <- seq(0.25,0.95, 0.05)

get_prob <- function(p){
    se <- sqrt(44) * abs(1-0) * sqrt(p*(1-p))
    avg <- 44 * (1*p+0*(1-p))
    (1-pnorm(35,avg,se))*100
}

data.frame(p, sapply(p, get_prob))
# sapply(p, get_prob)

| p | Probability of scoring over 35 (%) |
|------|--------------|
| 0.25 | 0.0 |
| 0.3 | 3.704814e-11 |
| 0.35 | 2.914853e-08 |
| 0.4 | 4.290144e-06 |
| 0.45 | 0.0002051844 |
| 0.5 | 0.004433929 |
| 0.55 | 0.0532576 |
| 0.6 | 0.4066871 |
| 0.65 | 2.154449 |
| 0.7 | 8.353214 |
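
Rather than reading the answer off a table, the lowest qualifying `p` can be selected programmatically. This is a minimal sketch under the same normal approximation; the names `prob_over_35` and `lowest_p` are illustrative, and probabilities are kept as fractions rather than percentages.

```r
p <- seq(0.25, 0.95, 0.05)

# Probability (as a fraction) of scoring over 35 for each skill level
prob_over_35 <- sapply(p, function(p_i) {
  avg <- 44 * p_i
  se <- sqrt(44) * sqrt(p_i * (1 - p_i))
  1 - pnorm(35, avg, se)
})

# Smallest p whose probability of scoring over 35 exceeds 80%
lowest_p <- min(p[prob_over_35 > 0.8])
lowest_p
```

On this grid the result should be 0.85: at p = 0.8 the expected score of 35.2 sits almost exactly at the threshold, so the probability is only slightly above one half.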


### Question 3: Betting on Roulette

A casino offers a House Special bet on roulette, which is a bet on five pockets (00, 0, 1, 2, 3) out of 38 total pockets. The bet pays out 6 to 1. In other words, a losing bet yields `-$1` and a successful bet yields `$6`. A gambler wants to know the chance of losing money if he places 500 bets on the roulette House Special.

The following 7-part question asks you to do some calculations related to this scenario.


#### Question 3a

What is the expected value of the payout for one bet?

In [26]:
p_success <- 5/38
p_not_success <- 1- p_success

6*p_success - 1*p_not_success

#### Question 3b

What is the standard error of the payout for one bet?

In [27]:
abs(6 - (-1)) * sqrt(p_success * p_not_success)

#### Question 3c

What is the expected value of the average payout over 500 bets?
Remember there is a difference between expected value of the average and expected value of the sum.


The expected value of the average of multiple draws from an urn equals the expected value of a single draw from the urn (μ).

In [29]:
6*p_success - 1*p_not_success

#### Question 3d

What is the standard error of the average payout over 500 bets?
Remember there is a difference between the standard error of the average and standard error of the sum.


----
The standard error of the average of n draws from an urn is σ / sqrt(n), where σ is the standard deviation of a single draw.

In [30]:
abs(6 - (-1)) * sqrt(p_success * p_not_success) / sqrt(500)
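
The difference between the average and the sum can be seen side by side in the following sketch. It recomputes the single-bet quantities so that it is self-contained; `mu`, `sigma` and `n` are names chosen here for illustration.

```r
p_success <- 5 / 38

# Expected value and standard deviation of a single bet
mu <- 6 * p_success - 1 * (1 - p_success)
sigma <- abs(6 - (-1)) * sqrt(p_success * (1 - p_success))
n <- 500

# Average of n bets: same expected value, spread shrinks by sqrt(n)
c(ev_avg = mu, se_avg = sigma / sqrt(n))

# Sum of n bets: expected value grows by n, spread grows by sqrt(n)
c(ev_sum = n * mu, se_sum = sigma * sqrt(n))
```

Working from unrounded `mu` and `sigma` avoids carrying rounding error from a single bet into the 500-bet totals.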

#### Question 3e

What is the expected value of the sum of 500 bets?


In [31]:
500 * (6*p_success - 1*p_not_success)

#### Question 3f

What is the standard error of the sum of 500 bets?


In [32]:
sqrt(500) * abs(6 - (-1)) * sqrt(p_success * p_not_success)

#### Question 3g

Use pnorm with the expected value of the sum and standard error of the sum to calculate the probability of losing money over 500 bets, Pr(X ≤ 0).


In [33]:
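se_sum <- sqrt(500) * abs(6 - (-1)) * sqrt(p_success * p_not_success)
pnorm(0, 500 * (6*p_success - 1*p_not_success), se_sum)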
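
The normal approximation can be sanity-checked with a Monte Carlo simulation in the same style as Question 1f. This is a sketch only: the seed of 1 is arbitrary and was not specified by the exercise, and the names `B`, `winnings` and `p_lose_sim` are illustrative.

```r
set.seed(1)  # arbitrary seed, used only to make this sketch reproducible

B <- 10000

# Total winnings over 500 House Special bets, repeated B times
winnings <- replicate(B, {
  X <- sample(c(6, -1), size = 500, replace = TRUE, prob = c(5 / 38, 33 / 38))
  sum(X)
})

# Proportion of simulated gamblers who did not come out ahead
p_lose_sim <- mean(winnings <= 0)
p_lose_sim
```

The simulated proportion should land close to the `pnorm` result, with small differences expected from simulation noise and from approximating a discrete distribution with a continuous one.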