# Tutorial 8: Percentiles and Sampling Distribution for Proportions  #


## Objectives: ##
To become familiar with the R commands used to calculate a percentile for a normal random variable.<br>
To practice using R commands to calculate statistics and probabilities for a sampling distribution of a proportion.

## Instructions: ##
* Do NOT round any of the values.
* For percentile questions:
  * After calculating a percentile, in the markdown block, draw the normal distribution with the appropriate section shaded.
* For probability questions:
  * Before calculating a probability, in the markdown block, draw the normal distribution with the appropriate section shaded.
  * After calculating a probability, in the markdown block, indicate what you have found using the appropriate probability format AND a sentence. <br> Examples of proper format:
    * P(X > 4) = 0.52687495122
    * P(Y < 3) = 0.25447898
    * P(20% < p-hat < 80%) = 0.999587411563
  * After calculating a value that is not a probability, indicate what you have found with a sentence.

## Formulae: ##
* If $X$ is a normal random variable then the $P$th percentile is the value $v$ such that $P(X < v) = P/100$.
* The sampling distribution of $\hat{p}$ is the distribution of all the possible sample proportions.
* For the random variable $\hat{p}$:
  * $\mu_{\hat{p}} = p$ where $p$ is the population proportion.
  * $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$ where $q = 1 - p$.
  * If $np \ge 10$ and $nq \ge 10$ then we say $\hat{p}$ is approximately normal with a mean of $p$ and a standard deviation of $\sqrt{\frac{pq}{n}}$, or in short form $\hat{p} \sim N(p,\sqrt{\frac{pq}{n}})$.
* Recall:
  * $P(Y > value) = 1 - P(Y < value)$
  * $P(val1 < Y < val2) = P(Y < val2) - P(Y < val1)$
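
The formulae above can be sketched in R as follows. This is only an illustration under assumed values: the proportion `p = 0.40`, the sample size `n = 100`, and the cutoffs `0.35` and `0.45` are made up for the example and are not part of this tutorial's data.

```r
# Hypothetical values chosen for illustration only (not the tutorial's data).
p <- 0.40        # assumed population proportion
n <- 100         # assumed sample size
q <- 1 - p

mu_phat    <- p                 # mean of the sampling distribution of p-hat
sigma_phat <- sqrt(p * q / n)   # standard deviation of p-hat

# The normal approximation is used only when np >= 10 and nq >= 10.
conditions_met <- (n * p >= 10) && (n * q >= 10)

# P(p-hat > 0.45) = 1 - P(p-hat < 0.45)
p_upper <- 1 - pnorm(0.45, mu_phat, sigma_phat)

# P(0.35 < p-hat < 0.45) = P(p-hat < 0.45) - P(p-hat < 0.35)
p_between <- pnorm(0.45, mu_phat, sigma_phat) - pnorm(0.35, mu_phat, sigma_phat)
```

Here `pnorm(value, mean, sd)` returns $P(Y < value)$, so the last two lines apply the "Recall" rules directly.
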

## Tools: ##

In [1]:
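# Plot a normal curve with mean m and standard deviation sd,
# shading the area between region[1] and region[2].
normalplot <- function(m, sd, region = NULL) {
  x <- seq(m - 3.5 * sd, m + 3.5 * sd, length = 1000)
  y <- dnorm(x, m, sd)
  plot(x, y, type = "l", xlab = "", ylab = "", bty = "n", yaxt = "n")
  # Shade only when both endpoints of the region are supplied.
  if (length(region) == 2) {
    z <- x[x > region[1] & x < region[2]]
    polygon(c(region[1], z, region[2]),
            c(0, dnorm(z, m, sd), 0), col = "gray")
  }
  abline(v = m)   # vertical line at the mean
  abline(h = 0)   # horizontal axis
}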
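
The `region` argument of `normalplot` is a two-element vector giving the left and right endpoints of the shaded area. A minimal sketch of how a tail might be shaded, using made-up values (mean 50, standard deviation 10, cutoff 60) that are not part of this tutorial's data: because the curve is drawn from 3.5 standard deviations below the mean to 3.5 above it, the edge of the plotted range can stand in for the missing endpoint.

```r
# Hypothetical values chosen for illustration only (not the tutorial's data).
m      <- 50
s      <- 10
cutoff <- 60

# normalplot draws the curve from m - 3.5*sd to m + 3.5*sd.
left_edge  <- m - 3.5 * s
right_edge <- m + 3.5 * s

upper_tail <- c(cutoff, right_edge)   # region for P(Y > cutoff)
lower_tail <- c(left_edge, cutoff)    # region for P(Y < cutoff)

# Draw the plot only if the normalplot cell above has already been run.
if (exists("normalplot")) {
  normalplot(m, s, upper_tail)
}
```
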

## Data Information: ##
* Suppose a large population of children has a mean IQ score of 99.5 points and a standard deviation of 21.7 points. Let $X$ represent the IQ of a child randomly selected from this population.
* According to the "RESEARCH NOTE Caring Canadians, Involved Canadians: 2010" bulletin from Statistics Canada (http://sectorsource.ca/sites/default/files/resources/ic-research/research_note_csgvp_tables_en_2012.pdf), 55% of Albertans aged 15 and over volunteered in 2010. Let $\hat{p}$ represent the proportion of a sample of 25 Albertans, aged 15 or older in 2010, who volunteered.

## Question 1. Percentile ##

The R command `qnorm(` &lt;prob&gt; `,` &lt;mean&gt; `,` &lt;sd&gt; `)` returns the value for which $P( X \lt$ value $) =$ &lt;prob&gt;.<br>
Use `qnorm` to determine the required value, then write a sentence indicating the meaning of the value.<br>
Suppose a large population of children has a mean IQ score of 99.5 points and a standard deviation of 21.7 points.<br>
Let $X$ represent the IQ of a child randomly selected from this population.
* a. Fill in the blank. Sixty percent of children in this population have an IQ under ________________.
* b. What is the lower quartile for this random variable?
* c. If you want to start a high-IQ club that admits only the 15% of children with the highest IQs, what should you set as the cutoff for admission?
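
As an illustration of how `qnorm` relates to `pnorm`, here is a short sketch using made-up values (a normal variable with mean 50 and standard deviation 10) rather than the IQ data above. The variable names are hypothetical.

```r
# Hypothetical normal variable Y with mean 50 and standard deviation 10.
m <- 50
s <- 10

# 90th percentile: the value v90 such that P(Y < v90) = 0.90
v90 <- qnorm(0.90, m, s)

# pnorm undoes qnorm: feeding v90 back in recovers the probability.
check <- pnorm(v90, m, s)

# Cutoff for the top 5% of values: 95% of values lie below it.
top5 <- qnorm(0.95, m, s)

# The same cutoff, asking qnorm for the upper tail directly.
top5_alt <- qnorm(0.05, m, s, lower.tail = FALSE)
```

A percentile is always stated in terms of the area to the left, so a "top" percentage is first converted to the area below the cutoff before calling `qnorm` with its default arguments.
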

### Answer 1.a. ###

In [2]:
This is a code cell.

In [None]:
Draw the distribution with the area of interest shaded.

This is a markdown cell, answer with a sentence.

### Answer 1.b. ###

In [None]:
This is a code cell.

In [None]:
Draw the distribution with the area of interest shaded.

This is a markdown cell, answer with a sentence.

### Answer 1.c. ###

In [None]:
This is a code cell.

In [None]:
Draw the distribution with the area of interest shaded.

This is a markdown cell, answer with a sentence.

## Question 2. Sampling Distribution of the Proportion ##

Use `normalplot` and `pnorm` to answer the following questions, following the instructions given above.<br>It is believed that in 2010, 55% of Albertans aged 15 and over volunteered.<br>
Let $\hat{p}$ represent the proportion of a sample of 25 Albertans, aged 15 or older in 2010, who volunteered.
* a. Determine $\mu_{\hat{p}}$.
* b. Determine $\sigma_{\hat{p}}$.
* c. Explain how we know that $\hat{p}$ is approximately normally distributed.
* d. Determine the probability that the proportion of the sample is less than 54%. Draw the normal distribution showing the appropriate section shaded before calculating the probability.
* e. Determine the probability that the proportion of the sample is more than 55.6%. Draw the normal distribution showing the appropriate section shaded before calculating the probability.
* f. Determine the probability that the proportion of the sample is between 53.5% and 56.7%. Draw the normal distribution showing the appropriate section shaded before calculating the probability.

### Answer 2.a. ###

In [None]:
This is a code cell.

Write the answer.

### Answer 2.b. ###

In [None]:
This is a code cell.

Write the answer.

### Answer 2.c. ###

Enter your explanation here.

### Answer 2.d. ###

In [None]:
Draw the distribution with the area of interest shaded.

In [None]:
Calculate the probability.

Write the probability in the proper format and write a sentence describing what you have found.

### Answer 2.e. ###

In [None]:
Draw the distribution with the area of interest shaded.

In [None]:
Calculate the probability.

Write the probability in the proper format and write a sentence describing what you have found.

### Answer 2.f. ###

In [None]:
Draw the distribution with the area of interest shaded.

In [None]:
Calculate the probability.

Write the probability in the proper format and write a sentence describing what you have found.

---
---
#### This tutorial is released under a Creative Commons Attribution-ShareAlike 3.0 Unported license.

This tutorial has been adapted from a lab that was adapted for OpenIntro by Andrew Bray and Mine Çetinkaya-Rundel from a lab written by Mark Hansen of UCLA Statistics.

---
---