Updated week 8 lecture materials.
jdstorey committed Mar 30, 2016
1 parent c7f70a1 commit ceb7601
Showing 5 changed files with 64 additions and 28 deletions.
10 changes: 8 additions & 2 deletions week8/week8.R
@@ -32,6 +32,9 @@ ggplot(data=df) +
library(BSDA)
str(z.test)

## ---- display=FALSE------------------------------------------------------
set.seed(210)

## ------------------------------------------------------------------------
n <- 40
lam <- 14
@@ -42,6 +45,7 @@ z.test(x=x, sigma.x=stddev, mu=lam)

## ------------------------------------------------------------------------
lam.hat <- mean(x)
lam.hat
stderr <- sqrt(lam.hat)/sqrt(n)
lam.hat - abs(qnorm(0.025)) * stderr # lower bound
lam.hat + abs(qnorm(0.025)) * stderr # upper bound
@@ -135,14 +139,16 @@ htwt %>% group_by(sex) %>% summarize(sd(height))
t.test(x = m_ht$height, y = f_ht$height, var.equal = TRUE)

## ------------------------------------------------------------------------
htwt <- htwt %>% mutate(diffwt = (weight - repwt), diffht = (height - repht))
htwt <- htwt %>% mutate(diffwt = (weight - repwt),
diffht = (height - repht))
t.test(x = htwt$diffwt) %>% tidy()
t.test(x = htwt$diffht) %>% tidy()

## ------------------------------------------------------------------------
t.test(x=htwt$weight, y=htwt$repwt, paired=TRUE) %>% tidy()
t.test(x=htwt$height, y=htwt$repht, paired=TRUE) %>% tidy()
htwt %>% select(height, repht) %>% na.omit() %>% summarize(mean(height), mean(repht))
htwt %>% select(height, repht) %>% na.omit() %>%
summarize(mean(height), mean(repht))

## ------------------------------------------------------------------------
str(binom.test)
23 changes: 17 additions & 6 deletions week8/week8.Rmd
@@ -158,9 +158,9 @@ $$z = \frac{\hat{\lambda} - \lambda_0}{\sqrt{\frac{\hat{\lambda}}{n}}} \mbox{ an

where $Z^*$ is a Normal$(0,1)$ random variable.

## Two-Sided CIs and HTs
## One-Sided CIs and HTs

The two-sided versions of these approximate confidence intervals and hypothesis tests work analogously.
The one-sided versions of these approximate confidence intervals and hypothesis tests work analogously.

The procedures shown for the $\mbox{Normal}(\mu, \sigma^2)$ case with known $\sigma^2$ from last week are utilized with the appropriate substitutions as in the above examples.
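
As a rough illustration of the one-sided case for Poisson data, the sketch below uses base R only. The simulated data, the seed, and the names `lam0`, `se`, `p.greater`, `p.less`, `lower.bound`, and `upper.bound` are assumptions made for this example and do not come from the lecture code.

```r
# One-sided large-sample test and confidence bounds for a Poisson rate.
# Simulated data; n, lam0, and the seed are illustrative assumptions.
set.seed(1)
n <- 40
lam0 <- 14
x <- rpois(n = n, lambda = lam0)

lam.hat <- mean(x)
se <- sqrt(lam.hat / n)            # plug-in standard error
z <- (lam.hat - lam0) / se         # same statistic as the two-sided test

# H1: lambda > lam0 uses the upper tail; H1: lambda < lam0 uses the lower tail
p.greater <- pnorm(z, lower.tail = FALSE)
p.less <- pnorm(z)

# One-sided 95% bounds: all of alpha = 0.05 is placed in a single tail
lower.bound <- lam.hat - qnorm(0.95) * se   # lower confidence bound
upper.bound <- lam.hat + qnorm(0.95) * se   # upper confidence bound
```

The only changes from the two-sided procedure are that the p-value uses one tail and the bound uses `qnorm(0.95)` in place of `qnorm(0.975)`.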

@@ -215,7 +215,7 @@ Let $X_1, X_2, \ldots, X_{n_1}$ be iid $\mbox{Poisson}(\lambda_1)$ and $Y_1, Y_2
We have $\hat{\lambda}_1 = \overline{X}$ and $\hat{\lambda}_2 = \overline{Y}$. For large $n_1$ and $n_2$, it approximately holds that:

$$
\frac{\hat{\lambda}_1 - \hat{\lambda}_2 - (\lambda_1 - \lambda_2)}{\sqrt{\frac{\lambda_1}{n_1} + \frac{\lambda_2}{n_2}}} \sim \mbox{Normal}(0,1).
\frac{\hat{\lambda}_1 - \hat{\lambda}_2 - (\lambda_1 - \lambda_2)}{\sqrt{\frac{\hat{\lambda}_1}{n_1} + \frac{\hat{\lambda}_2}{n_2}}} \sim \mbox{Normal}(0,1).
$$
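
A minimal sketch of how this two-sample Poisson pivot might be computed by hand in base R, with the estimated rates plugged into the standard error. The sample sizes, rates, seed, and variable names are assumptions chosen for illustration, not values from the lecture.

```r
# Two-sample large-sample inference for a difference of Poisson rates.
# All numeric settings below are illustrative assumptions.
set.seed(2)
n1 <- 50
n2 <- 60
x <- rpois(n1, lambda = 10)
y <- rpois(n2, lambda = 12)

lam1.hat <- mean(x)
lam2.hat <- mean(y)

# Standard error of the difference, using the estimated rates
se.diff <- sqrt(lam1.hat / n1 + lam2.hat / n2)

# Test H0: lambda1 - lambda2 = 0 against a two-sided alternative
z <- (lam1.hat - lam2.hat - 0) / se.diff
p.value <- 2 * pnorm(-abs(z))

# Approximate 95% confidence interval for lambda1 - lambda2
ci <- (lam1.hat - lam2.hat) + c(-1, 1) * qnorm(0.975) * se.diff
```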

## Normal (Unequal Variances)
@@ -297,6 +297,10 @@ str(z.test)

Apply `z.test()`:

```{r, display=FALSE}
set.seed(210)
```

```{r}
n <- 40
lam <- 14
@@ -312,6 +316,7 @@ Confidence interval:

```{r}
lam.hat <- mean(x)
lam.hat
stderr <- sqrt(lam.hat)/sqrt(n)
lam.hat - abs(qnorm(0.025)) * stderr # lower bound
lam.hat + abs(qnorm(0.025)) * stderr # upper bound
@@ -568,18 +573,24 @@ t.test(x = m_ht$height, y = f_ht$height, var.equal = TRUE)

## Paired Sample Test (v. 1)

First take the difference between the paired observations. Then apply the one-sample t-test.

```{r}
htwt <- htwt %>% mutate(diffwt = (weight - repwt), diffht = (height - repht))
htwt <- htwt %>% mutate(diffwt = (weight - repwt),
diffht = (height - repht))
t.test(x = htwt$diffwt) %>% tidy()
t.test(x = htwt$diffht) %>% tidy()
```

## Paired Sample Test (v. 2)

Enter each sample into the `t.test()` function, but use the `paired=TRUE` argument. This is operationally equivalent to the previous version.

```{r}
t.test(x=htwt$weight, y=htwt$repwt, paired=TRUE) %>% tidy()
t.test(x=htwt$height, y=htwt$repht, paired=TRUE) %>% tidy()
htwt %>% select(height, repht) %>% na.omit() %>% summarize(mean(height), mean(repht))
htwt %>% select(height, repht) %>% na.omit() %>%
summarize(mean(height), mean(repht))
```
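
The equivalence of the two paired-test versions can be checked on simulated data. The sketch below is self-contained and does not use the `htwt` data; the vectors `before` and `after`, the seed, and the noise settings are assumptions made for illustration.

```r
# Check that a one-sample t-test on differences matches t.test(paired = TRUE).
# Simulated paired measurements; all settings are illustrative assumptions.
set.seed(3)
n <- 30
before <- rnorm(n, mean = 70, sd = 10)
after <- before + rnorm(n, mean = 0.5, sd = 2)

v1 <- t.test(x = before - after)                    # version 1: differences
v2 <- t.test(x = before, y = after, paired = TRUE)  # version 2: paired=TRUE

# The statistic, p-value, and confidence interval should agree
all.equal(unname(v1$statistic), unname(v2$statistic))
all.equal(v1$p.value, v2$p.value)
all.equal(as.numeric(v1$conf.int), as.numeric(v2$conf.int))
```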

# Inference on Binomial Data in R
@@ -637,7 +648,7 @@ Exercise: Figure out what happened here.

## *OIS* Exercise 6.10

The way a question is phrased can influence a persons response. For example, Pew Research Center conducted a survey with the following question:
The way a question is phrased can influence a person's response. For example, Pew Research Center conducted a survey with the following question:

"As you may know, by 2014 nearly all Americans will be required to have health insurance. [People who do not buy insurance will pay a penalty] while [People who cannot afford it will receive financial help from the government]. Do you approve or disapprove of this policy?"

36 changes: 22 additions & 14 deletions week8/week8.html
@@ -218,9 +218,9 @@ <h1>Poisson</h1>
<p>Hypothesis test, <span class="math inline">\(H_0: \lambda=\lambda_0\)</span> vs <span class="math inline">\(H_1: \lambda \not= \lambda_0\)</span>:</p>
<p><span class="math display">\[z = \frac{\hat{\lambda} - \lambda_0}{\sqrt{\frac{\hat{\lambda}}{n}}} \mbox{ and } \mbox{p-value} = {\rm Pr}(|Z^*| \geq |z|)\]</span></p>
<p>where <span class="math inline">\(Z^*\)</span> is a Normal<span class="math inline">\((0,1)\)</span> random variable.</p>
</section><section id="two-sided-cis-and-hts" class="slide level2">
<h1>Two-Sided CIs and HTs</h1>
<p>The two-sided versions of these approximate confidence intervals and hypothesis tests work analogously.</p>
</section><section id="one-sided-cis-and-hts" class="slide level2">
<h1>One-Sided CIs and HTs</h1>
<p>The one-sided versions of these approximate confidence intervals and hypothesis tests work analogously.</p>
<p>The procedures shown for the <span class="math inline">\(\mbox{Normal}(\mu, \sigma^2)\)</span> case with known <span class="math inline">\(\sigma^2\)</span> from last week are utilized with the appropriate substitutions as in the above examples.</p>
</section><section id="comment" class="slide level2">
<h1>Comment</h1>
@@ -254,7 +254,7 @@ <h1>Poisson</h1>
<p>Let <span class="math inline">\(X_1, X_2, \ldots, X_{n_1}\)</span> be iid <span class="math inline">\(\mbox{Poisson}(\lambda_1)\)</span> and <span class="math inline">\(Y_1, Y_2, \ldots, Y_{n_2}\)</span> be iid <span class="math inline">\(\mbox{Poisson}(\lambda_2)\)</span>.</p>
<p>We have <span class="math inline">\(\hat{\lambda}_1 = \overline{X}\)</span> and <span class="math inline">\(\hat{\lambda}_2 = \overline{Y}\)</span>. For large <span class="math inline">\(n_1\)</span> and <span class="math inline">\(n_2\)</span>, it approximately holds that:</p>
<p><span class="math display">\[
\frac{\hat{\lambda}_1 - \hat{\lambda}_2 - (\lambda_1 - \lambda_2)}{\sqrt{\frac{\lambda_1}{n_1} + \frac{\lambda_2}{n_2}}} \sim \mbox{Normal}(0,1).
\frac{\hat{\lambda}_1 - \hat{\lambda}_2 - (\lambda_1 - \lambda_2)}{\sqrt{\frac{\hat{\lambda}_1}{n_1} + \frac{\hat{\lambda}_2}{n_2}}} \sim \mbox{Normal}(0,1).
\]</span></p>
</section><section id="normal-unequal-variances" class="slide level2">
<h1>Normal (Unequal Variances)</h1>
@@ -309,6 +309,7 @@ <h1><code>BSDA</code> Package</h1>
</section><section id="example-poisson" class="slide level2">
<h1>Example: Poisson</h1>
<p>Apply <code>z.test()</code>:</p>
<pre class="r"><code>&gt; set.seed(210)</code></pre>
<pre class="r"><code>&gt; n &lt;- 40
&gt; lam &lt;- 14
&gt; x &lt;- rpois(n=n, lambda=lam)
@@ -319,28 +320,30 @@ <h1>Example: Poisson</h1>
One-sample z-Test

data: x
z = 0.95256, p-value = 0.3408
z = 0.41885, p-value = 0.6753
alternative hypothesis: true mean is not equal to 14
95 percent confidence interval:
13.3919 15.7581
13.08016 15.41984
sample estimates:
mean of x
14.575 </code></pre>
14.25 </code></pre>
</section><section id="by-hand-calculations" class="slide level2">
<h1>By Hand Calculations</h1>
<p>Confidence interval:</p>
<pre class="r"><code>&gt; lam.hat &lt;- mean(x)
&gt; lam.hat
[1] 14.25
&gt; stderr &lt;- sqrt(lam.hat)/sqrt(n)
&gt; lam.hat - abs(qnorm(0.025)) * stderr # lower bound
[1] 13.3919
[1] 13.08016
&gt; lam.hat + abs(qnorm(0.025)) * stderr # upper bound
[1] 15.7581</code></pre>
[1] 15.41984</code></pre>
<p>Hypothesis test:</p>
<pre class="r"><code>&gt; z &lt;- (lam.hat - lam)/stderr
&gt; z # test statistic
[1] 0.9525627
[1] 0.4188539
&gt; 2 * pnorm(-abs(z)) # two-sided p-value
[1] 0.3408117</code></pre>
[1] 0.6753229</code></pre>
</section><section id="exercise" class="slide level2">
<h1>Exercise</h1>
<p>Figure out how to get the <code>z.test()</code> function to work on Binomial data.</p>
@@ -590,7 +593,9 @@ <h1>Test with Equal Variances</h1>
178.0114 164.7143 </code></pre>
</section><section id="paired-sample-test-v.-1" class="slide level2">
<h1>Paired Sample Test (v. 1)</h1>
<pre class="r"><code>&gt; htwt &lt;- htwt %&gt;% mutate(diffwt = (weight - repwt), diffht = (height - repht))
<p>First take the difference between the paired observations. Then apply the one-sample t-test.</p>
<pre class="r"><code>&gt; htwt &lt;- htwt %&gt;% mutate(diffwt = (weight - repwt),
+ diffht = (height - repht))
&gt; t.test(x = htwt$diffwt) %&gt;% tidy()
estimate statistic p.value parameter conf.low
1 0.005464481 0.0319381 0.9745564 182 -0.3321223
@@ -601,6 +606,7 @@ <h1>Paired Sample Test (v. 1)</h1>
1 2.076503 13.52629 2.636736e-29 182 1.773603 2.379403</code></pre>
</section><section id="paired-sample-test-v.-2" class="slide level2">
<h1>Paired Sample Test (v. 2)</h1>
<p>Enter each sample into the <code>t.test()</code> function, but use the <code>paired=TRUE</code> argument. This is operationally equivalent to the previous version.</p>
<pre class="r"><code>&gt; t.test(x=htwt$weight, y=htwt$repwt, paired=TRUE) %&gt;% tidy()
estimate statistic p.value parameter conf.low
1 0.005464481 0.0319381 0.9745564 182 -0.3321223
@@ -609,7 +615,8 @@ <h1>Paired Sample Test (v. 2)</h1>
&gt; t.test(x=htwt$height, y=htwt$repht, paired=TRUE) %&gt;% tidy()
estimate statistic p.value parameter conf.low conf.high
1 2.076503 13.52629 2.636736e-29 182 1.773603 2.379403
&gt; htwt %&gt;% select(height, repht) %&gt;% na.omit() %&gt;% summarize(mean(height), mean(repht))
&gt; htwt %&gt;% select(height, repht) %&gt;% na.omit() %&gt;%
+ summarize(mean(height), mean(repht))
Source: local data frame [1 x 2]

mean(height) mean(repht)
@@ -797,7 +804,8 @@ <h1><code>poisson.test()</code></h1>

r hypothesized rate or rate ratio

alternative indicates the alternative hypothesis and must be one of &quot;two.sided&quot;, &quot;greater&quot; or &quot;less&quot;. You can specify just the initial letter.
alternative indicates the alternative hypothesis and must be one of
&quot;two.sided&quot;, &quot;greater&quot; or &quot;less&quot;. You can specify just the initial letter.

conf.level confidence level for the returned confidence interval.</code></pre>
</section><section id="example-rna-seq" class="slide level2">
23 changes: 17 additions & 6 deletions week8/week8_notes.Rmd
@@ -154,9 +154,9 @@ $$z = \frac{\hat{\lambda} - \lambda_0}{\sqrt{\frac{\hat{\lambda}}{n}}} \mbox{ an

where $Z^*$ is a Normal$(0,1)$ random variable.

## Two-Sided CIs and HTs
## One-Sided CIs and HTs

The two-sided versions of these approximate confidence intervals and hypothesis tests work analogously.
The one-sided versions of these approximate confidence intervals and hypothesis tests work analogously.

The procedures shown for the $\mbox{Normal}(\mu, \sigma^2)$ case with known $\sigma^2$ from last week are utilized with the appropriate substitutions as in the above examples.

@@ -211,7 +211,7 @@ Let $X_1, X_2, \ldots, X_{n_1}$ be iid $\mbox{Poisson}(\lambda_1)$ and $Y_1, Y_2
We have $\hat{\lambda}_1 = \overline{X}$ and $\hat{\lambda}_2 = \overline{Y}$. For large $n_1$ and $n_2$, it approximately holds that:

$$
\frac{\hat{\lambda}_1 - \hat{\lambda}_2 - (\lambda_1 - \lambda_2)}{\sqrt{\frac{\lambda_1}{n_1} + \frac{\lambda_2}{n_2}}} \sim \mbox{Normal}(0,1).
\frac{\hat{\lambda}_1 - \hat{\lambda}_2 - (\lambda_1 - \lambda_2)}{\sqrt{\frac{\hat{\lambda}_1}{n_1} + \frac{\hat{\lambda}_2}{n_2}}} \sim \mbox{Normal}(0,1).
$$

## Normal (Unequal Variances)
@@ -293,6 +293,10 @@ str(z.test)

Apply `z.test()`:

```{r, display=FALSE}
set.seed(210)
```

```{r}
n <- 40
lam <- 14
@@ -308,6 +312,7 @@ Confidence interval:

```{r}
lam.hat <- mean(x)
lam.hat
stderr <- sqrt(lam.hat)/sqrt(n)
lam.hat - abs(qnorm(0.025)) * stderr # lower bound
lam.hat + abs(qnorm(0.025)) * stderr # upper bound
@@ -564,18 +569,24 @@ t.test(x = m_ht$height, y = f_ht$height, var.equal = TRUE)

## Paired Sample Test (v. 1)

First take the difference between the paired observations. Then apply the one-sample t-test.

```{r}
htwt <- htwt %>% mutate(diffwt = (weight - repwt), diffht = (height - repht))
htwt <- htwt %>% mutate(diffwt = (weight - repwt),
diffht = (height - repht))
t.test(x = htwt$diffwt) %>% tidy()
t.test(x = htwt$diffht) %>% tidy()
```

## Paired Sample Test (v. 2)

Enter each sample into the `t.test()` function, but use the `paired=TRUE` argument. This is operationally equivalent to the previous version.

```{r}
t.test(x=htwt$weight, y=htwt$repwt, paired=TRUE) %>% tidy()
t.test(x=htwt$height, y=htwt$repht, paired=TRUE) %>% tidy()
htwt %>% select(height, repht) %>% na.omit() %>% summarize(mean(height), mean(repht))
htwt %>% select(height, repht) %>% na.omit() %>%
summarize(mean(height), mean(repht))
```

# Inference on Binomial Data in R
@@ -633,7 +644,7 @@ Exercise: Figure out what happened here.

## *OIS* Exercise 6.10

The way a question is phrased can influence a persons response. For example, Pew Research Center conducted a survey with the following question:
The way a question is phrased can influence a person's response. For example, Pew Research Center conducted a survey with the following question:

"As you may know, by 2014 nearly all Americans will be required to have health insurance. [People who do not buy insurance will pay a penalty] while [People who cannot afford it will receive financial help from the government]. Do you approve or disapprove of this policy?"

Binary file modified week8/week8_notes.pdf
Binary file not shown.
