# 3.2.10a

We wish to integrate $f(x)$ from 48 to 72. Shifting the variable by 48, this is the same as integrating $0.02e^{-0.02x}$ from 0 to 24, and that integrand is the PDF of an exponential distribution with parameter $\lambda = 0.02$. Such a distribution has the CDF $F(x) = 1 - e^{-0.02x}$ for $x > 0$. Therefore, the answer is
\begin{align}
F(24) - F(0) &= (1 - e^{-0.02 \cdot 24}) - (1 - e^{-0.02 \cdot 0}) \\
&= 1 - e^{-0.48}.
\end{align}

We can confirm this with R:

In [1]:
1 - exp(-0.48)

In [2]:
integrate(f = function (x) 0.02*exp(-0.02*(x - 48)), lower = 48, upper = 72)

0.3812166 with absolute error < 4.2e-15
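
As a further cross-check, the same probability can be read off R's built-in exponential CDF, `pexp`. This is only a sketch; the variable names `rate`, `p_cdf`, and `p_closed` are illustrative and not part of the exercise.

```r
# Rate of the shifted exponential distribution
rate <- 0.02

# P(0 <= X <= 24) from the built-in exponential CDF
p_cdf <- pexp(24, rate) - pexp(0, rate)

# Closed form 1 - e^{-0.48}
p_closed <- 1 - exp(-0.48)

print(c(p_cdf, p_closed))
```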

# 3.2.10b

We integrate $f(x)$ from 96 to 168, which is the same as integrating $0.02e^{-0.02x}$ from 48 to 120. Therefore, the answer is $(1 - e^{-0.02 \cdot 120}) - (1 - e^{-0.02 \cdot 48}) = e^{-0.96} - e^{-2.40}$. We can confirm this with R:

In [1]:
exp(-0.96) - exp(-2.40)

In [2]:
integrate(f = function (x) 0.02*exp(-0.02*(x - 48)), lower = 96, upper = 168)

0.2921749 with absolute error < 3.2e-15

# 3.3.7a

The median of a probability distribution with CDF $F$ is the value $x$ such that $F(x) = 1/2$. Therefore, we need only solve
\begin{align}
F(x) &= 1/2 \\
\frac{x^2}{4} &= 1/2 \\
x &= \sqrt{2}
\end{align}

To find the interquartile range, we need to find the first and third quartiles, which are the values $x_1, x_3$ such that $F(x_1) = 1/4, F(x_3) = 3/4$. By the same procedure above, we get $x_1 = 1, x_3 = \sqrt{3}$. Therefore, the interquartile range is $\sqrt{3} - 1$.
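
Written out, the quartile computations follow the same steps as the median, taking the positive root because $x$ lies between 0 and 2. The decimal value is rounded.

\begin{align}
\frac{x_1^2}{4} &= \frac{1}{4} \implies x_1 = 1 \\
\frac{x_3^2}{4} &= \frac{3}{4} \implies x_3 = \sqrt{3} \\
x_3 - x_1 &= \sqrt{3} - 1 \approx 0.732
\end{align}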

# 3.3.7b

To find the mean and standard deviation, we need to find the PDF $f$ first. We just have to differentiate the CDF, so $f(x) = F'(x) = x/2$ for $x$ between 0 and 2.

\begin{align}
E[X] &= \int_{-\infty}^{\infty} xf(x) dx \\
&= \int_0^2 \frac{x^2}{2} dx \\
&= \frac{x^3}{6}|_0^2 = \frac{4}{3}
\end{align}

\begin{align}
E[X^2] &= \int_{-\infty}^{\infty} x^2f(x) dx \\
&= \int_0^2 \frac{x^3}{2} dx \\
&= \frac{x^4}{8}|_0^2 = \frac{16}{8} = 2
\end{align}

\begin{align}
V[X] &= E[X^2] - E[X]^2 \\
&= 2 - (\frac{4}{3})^2 = \frac{18}{9} - \frac{16}{9} = \frac{2}{9}
\end{align}

Therefore, $\sigma_X = \sqrt{V[X]} = \sqrt{\frac{2}{9}} = \frac{\sqrt{2}}{3} \approx 0.471$.

We can confirm this with R:

In [20]:
integrate(function (x) x^2 / 2, lower = 0, upper = 2)

1.333333 with absolute error < 1.5e-14

In [21]:
integrate(function(x) x^3 / 2, lower = 0, upper = 2)

2 with absolute error < 2.2e-14
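
The two integrals can also be combined in R to get the variance and standard deviation numerically. This is a minimal sketch, assuming the density $x/2$ on $[0, 2]$; the names `dens`, `m1`, `m2`, `v`, and `s` are illustrative.

```r
# Density on [0, 2], obtained by differentiating F(x) = x^2 / 4
dens <- function(x) x / 2

# First and second moments by numerical integration
m1 <- integrate(function(x) x * dens(x), lower = 0, upper = 2)$value
m2 <- integrate(function(x) x^2 * dens(x), lower = 0, upper = 2)$value

# Variance and standard deviation
v <- m2 - m1^2
s <- sqrt(v)

print(c(mean = m1, variance = v, sd = s))
```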

# 3.4.2

Since we are concerned with whether cars stop at the intersection when no other cars are visible, we can assume that the probability that any given car stops is independent of the probability that any other car stops. Therefore, the event that each car stops can be considered a Bernoulli trial with probability $p = 0.3$ of success. Since $X$ is the number of cars out of 15 that stop, we can consider $X$ to be a binomial random variable with parameters $n = 15$ and $p = 0.3$. The expected value $E[X]$ is $np = 4.5$, and the variance $V[X] = np(1-p) = 3.15$.

\begin{equation}P(X = 6) = \binom{15}{6} \cdot 0.3^6 \cdot 0.7^9\end{equation}
\begin{equation}P(X \ge 6) = \sum_{i = 6}^{15} \binom{15}{i} \cdot 0.3^i \cdot 0.7^{15-i}\end{equation}

For more useful numbers, we can use R commands:

In [7]:
dbinom(6, 15, 0.3)

In [9]:
pbinom(5, 15, 0.3, lower.tail = F) # set lower.tail = F to get P(X > x)
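
To see that the upper-tail call agrees with the sum for $P(X \ge 6)$, one can add up the individual binomial probabilities. This is a sketch with illustrative variable names.

```r
n <- 15
p <- 0.3

# P(X >= 6) as an explicit sum of PMF values for i = 6, ..., 15
tail_sum <- sum(dbinom(6:n, n, p))

# Same probability from the CDF: P(X >= 6) = P(X > 5)
tail_cdf <- pbinom(5, n, p, lower.tail = FALSE)

print(c(tail_sum, tail_cdf))
```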

# 3.4.12

The buses belong to two categories, defective and non-defective, and we select a random subset without replacement from them. Since $X$ is equal to the number of buses in the subset belonging to the category of defective buses, $X$ is a hypergeometric variable with $M_1 = 6, M_2 = 9, n = 5$, and total population size $N = M_1 + M_2 = 15$.

The sample space is the set of possible numbers of defective buses in our subset. Since $M_1 > n$ and $M_2 > n$, it is possible for all of the selected buses to be defective, or for none of them to be. The sample space is therefore the set of all integers from 0 to $n$. The probability of selecting $k$ defective buses is equal to the number of ways to select $k$ defective buses and $n - k$ non-defective buses, divided by the number of ways to select $n$ buses in general. This is

\begin{equation}
\frac{\binom{M_1}{k}\binom{M_2}{n-k}}{\binom{N}{n}} = \frac{\binom{6}{k}\binom{9}{5-k}}{\binom{15}{5}}
\end{equation}

The mean $E[X]$ is $n\frac{M_1}{N} = 5\frac{6}{6+9} = \frac{30}{15} = 2$, and the variance $V[X]$ is
\begin{align}
n\frac{M_1}{N}(1 - \frac{M_1}{N})\frac{N - n}{N- 1} &= 5\frac{6}{15}\frac{9}{15}\frac{10}{14} \\
&= \frac{6}{7}
\end{align}

Finally, we can use R to find $P(2 \le X \le 4)$:

In [12]:
phyper(4, 6, 9, 5) - phyper(1, 6, 9, 5)
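
The mean, variance, and interval probability can also be recomputed directly from the PMF with `dhyper`. This is a sketch; the variable names are illustrative.

```r
m1 <- 6    # defective buses
m2 <- 9    # non-defective buses
n  <- 5    # sample size
k  <- 0:n  # sample space

pmf <- dhyper(k, m1, m2, n)

# Mean and variance computed directly from the PMF
mean_x <- sum(k * pmf)
var_x  <- sum(k^2 * pmf) - mean_x^2

# P(2 <= X <= 4) as a sum of PMF values
p_2_to_4 <- sum(dhyper(2:4, m1, m2, n))

print(c(mean = mean_x, variance = var_x, p = p_2_to_4))
```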

# 3.4.18

We will assume that potholes on I80 occur uniformly at random with an average rate of 1.6 potholes per ten miles. (This is probably the least justified assumption we will be making today, but it's all we have to go on.) This means that the number of potholes in a 30-mile stretch of I80 is a Poisson random variable with parameter $\lambda = 3 \cdot 1.6 = 4.8$. Both the expected value and the variance are equal to $\lambda = 4.8$.

We have $Y = 5000X$. By linearity of expectation, $E[Y] = E[5000X] = 5000E[X] = 5000 \cdot 4.8 = \$24,000$. The variance $V[Y]$ is $V[5000X] = 5000^2V[X] = 2.5 \cdot 10^7 \cdot 4.8 = 1.2 \cdot 10^8$.

\begin{equation}P(4 < X \le 9) = \sum_{k=5}^9 e^{-4.8}\frac{4.8^k}{k!}\end{equation}

For a more useful number, we use R:

In [11]:
ppois(9, 4.8) - ppois(4, 4.8)
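
The CDF difference can be compared with the explicit sum over $k = 5, \dots, 9$, and the scaling rules for $Y = 5000X$ can be evaluated alongside it. This is a sketch with illustrative variable names.

```r
lambda <- 4.8

# P(4 < X <= 9) as an explicit sum of Poisson PMF values
p_sum <- sum(dpois(5:9, lambda))

# Same probability as a difference of CDF values
p_cdf <- ppois(9, lambda) - ppois(4, lambda)

# Scaling by a constant a: E[aX] = a E[X], V[aX] = a^2 V[X]
a <- 5000
mean_y <- a * lambda
var_y  <- a^2 * lambda

print(c(p_sum, p_cdf, mean_y, var_y))
```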

# 3.4.19

The variance of a Poisson random variable is equal to the mean, so $V[X_1] = 2.6$ and $V[X_2] = 3.8$.

Let $A$ be the event that an article was handled by typesetter 1, and $B$ the event that an article has no errors. We have the following:
\begin{align}
P(A) &= 0.6 \\
P(B|A) &= P(X_1 = 0) = e^{-2.6} \\
P(B|A^c) &= P(X_2 = 0) = e^{-3.8}
\end{align}

We therefore have:
\begin{align}
P(B) &= P((B \cap A) \cup (B \cap A^c)) \\
&= P(B \cap A) + P(B \cap A^c) \\
&= P(B|A)P(A) + P(B|A^c)P(A^c) \\
&= 0.6e^{-2.6} + 0.4e^{-3.8}
\end{align}

In [13]:
0.6*exp(-2.6) + 0.4*exp(-3.8)

We can use Bayes's rule to solve part (c).

\begin{align}
P(A^c|B) &= \frac{P(B|A^c)P(A^c)}{P(B)} \\
&= \frac{0.4e^{-3.8}}{0.6e^{-2.6} + 0.4e^{-3.8}}
\end{align}

In [16]:
(0.4*exp(-3.8))/(0.6*exp(-2.6) + 0.4*exp(-3.8))

# 3.4.21

$Y$ is the number of defective components in the sample, so $M_1$ is equal to the number of defective components in the population, which is 300. $M_2$ is the number of non-defective components in the population, which is $10000 - 300 = 9700$. $n$ is the size of the sample, which is 200.

A hypergeometric distribution is a model of a sample drawn without replacement. The main difference between samples with replacement and samples without replacement is that in samples with replacement, it is possible to choose the same object more than once. However, in this case, because $n \ll N$, the expected number of components that we draw more than once in a sample with replacement is small. Therefore, the properties of a sample with replacement are quite similar to the properties of a sample without replacement. Taking a sample with replacement is equivalent to drawing each component independently at random, so we can approximate $Y$ with a binomial distribution with parameters $n = 200$ and $p = 300/10000 = 0.03$.

A Poisson distribution models the number of events that occur in a continuous space or time, where the expected number of events that occur in any fraction of the space or time is proportional to the area of that space or length of that time. If a binomial distribution has a large value for $n$, and a value for $p$ small enough that it is very unlikely that a Poisson process with mean $np$ would have more than one event occur in a region whose area is a $1/n$ fraction of the total, then the binomial distribution can be approximated with a Poisson distribution with parameter $\lambda = np$. Since in this case $np = 200 \cdot 0.03 = 6$ and $n = 200$, the binomial distribution satisfies these properties, and we can approximate $Y$ with a Poisson distribution with $\lambda = 6$.

In [17]:
phyper(10, 300, 9700, 200)

In [18]:
pbinom(10, 200, 0.03)

In [19]:
ppois(10, 6)
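
One way to see how close the three models are is to compare their PMFs point by point rather than at a single CDF value. The sketch below does this over every possible count; the variable names are illustrative.

```r
k <- 0:200  # possible numbers of defective components in the sample

d_hyper <- dhyper(k, 300, 9700, 200)  # exact model
d_binom <- dbinom(k, 200, 0.03)       # binomial approximation
d_pois  <- dpois(k, 6)                # Poisson approximation

# Largest pointwise gap between each approximation and the exact PMF
gap_binom <- max(abs(d_hyper - d_binom))
gap_pois  <- max(abs(d_hyper - d_pois))

print(c(binomial = gap_binom, poisson = gap_pois))
```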

# 3.5.1

The mean of the exponential distribution is $6 = 1 / \lambda$, so $\lambda = 1/6$. The CDF of this distribution is $F(x) = 1 - e^{-1/6 \cdot x}$, so $P(X > 4) = 1 - P(X \le 4) = 1 - (1 - e^{-1/6 \cdot 4}) = e^{-2/3}$.

In [22]:
exp(-2/3)

In [23]:
pexp(4, 1/6, lower.tail = F)

The variance of an exponential distribution is $1/\lambda^2$, so the variance is 36. To find the 95th percentile, we set the CDF equal to .95:

\begin{align}
1 - e^{-1/6 \cdot x} &= .95 \\
e^{-1/6 \cdot x} &= .05 \\
x &= -6 \log .05
\end{align}

In [24]:
-6*log(.05)

In [25]:
qexp(.95, 1/6)

Since exponential distributions are memoryless, it does not matter how long the battery has been running. The conditional probability that the battery lasts an additional five years is equal to the probability that it lasts at least five years starting from the beginning of its life. This is simply $1 - (1 - e^{-1/6 \cdot 5}) = e^{-5/6}$.

The additional expected lifetime of the battery is equal to the expected lifetime of the battery unconditionally, which is just $1/\lambda = 6$ years.
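
The memoryless property used here follows directly from the survival function $P(X > x) = e^{-\lambda x}$. In the sketch below, $s$ denotes the time the battery has already run and $t$ the additional time; both symbols are introduced only for this derivation.

\begin{align}
P(X > s + t \mid X > s) &= \frac{P(X > s + t)}{P(X > s)} \\
&= \frac{e^{-\lambda(s + t)}}{e^{-\lambda s}} \\
&= e^{-\lambda t}
\end{align}

With $\lambda = 1/6$ and $t = 5$, this gives $e^{-5/6}$ for any value of $s$.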

In [26]:
exp(-5/6)

In [27]:
pexp(5, 1/6, lower.tail = F)

In [28]:
exp(-8/6) / exp(-3/6)

In [29]:
pexp(8, 1/6, lower.tail = F) / pexp(3, 1/6, lower.tail = F)

# 3.5.2

In a Poisson process with rate $r$, the "interarrival time" (the time between successive events) is an exponential random variable with parameter $\lambda = r$ and mean $1/r$. Let $X$ be the interarrival time in months. Since wrong numbers arrive at a rate of one per month, $X$ is an exponential random variable with parameter $\lambda = 1$.

Therefore, measuring time in months and counting four weeks to a month, the probability that it will take between two and three weeks to get the first wrongly dialed phone call is $P(1/2 \le X \le 3/4) = (1 - e^{-3/4}) - (1 - e^{-1/2}) = e^{-1/2} - e^{-3/4}$.

In [30]:
exp(-1/2) - exp(-3/4)
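
The same value can be obtained from `pexp`. This sketch assumes, as the calculation above does, that time is measured in months with four weeks to a month; the variable names are illustrative.

```r
rate  <- 1      # wrong numbers per month
lower <- 2 / 4  # two weeks, in months
upper <- 3 / 4  # three weeks, in months

# P(lower <= X <= upper) from the exponential CDF
p_cdf <- pexp(upper, rate) - pexp(lower, rate)

# Closed form e^{-1/2} - e^{-3/4}
p_closed <- exp(-1/2) - exp(-3/4)

print(c(p_cdf, p_closed))
```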

Since $X$ is an exponential random variable, it is memoryless, so it does not matter that we have not received a wrong number in two weeks. The expected value and variance of the time until the next wrong number are simply $1/\lambda = 1$ month and $1/\lambda^2 = 1$ month$^2$, respectively.

# 3.5.5

Let $X$ be the yield strength of the steel. To find the 25th percentile, we can simply use the qnorm() function.

In [31]:
qnorm(.25, 43, 4.5)

We can also find the 25th percentile of the standard normal random variable and then transform the result.

In [32]:
qnorm(.25)*4.5 + 43

"Separates the strongest 10% from the others" means "90th percentile".

In [33]:
qnorm(.9, 43, 4.5)

In [34]:
qnorm(.9)*4.5 + 43

We want to include 99% of all strength values in our interval. Equivalently, we wish to exclude 1% of all strength values from our interval. Since the interval is symmetric about the mean, we want .5% of strength values to be greater than our interval, and .5% to be less than our interval. If we find the 99.5th percentile and then subtract the mean, we can find the radius of our interval.

In [35]:
qnorm(.995, 43, 4.5) - 43

Therefore, $c \approx 11.59$, and the interval containing 99% of strength values is approximately $(31.41, 54.59)$.
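
As a sanity check, the probability mass inside $(43 - c, 43 + c)$ can be computed with `pnorm` and compared with 0.99. This is a sketch; `mu`, `sigma`, `c_radius`, and `coverage` are illustrative names.

```r
mu    <- 43
sigma <- 4.5

# Radius of the symmetric interval: 99.5th percentile minus the mean
c_radius <- qnorm(0.995, mu, sigma) - mu

# Probability that a strength value falls inside (mu - c, mu + c)
coverage <- pnorm(mu + c_radius, mu, sigma) - pnorm(mu - c_radius, mu, sigma)

print(c(radius = c_radius, coverage = coverage))
```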

Let $Y$ be the number of A36 steels with strength less than 43. For each steel in the sample of 15, the probability that it has strength less than 43 is independent and identical to all the others, so $Y$ is a binomial random variable. We know the parameter $n$ is 15, but what is $p$? Since the strength of a steel is normally distributed, its probability distribution is symmetric about the mean 43. Therefore, the probability of the strength being less than 43 is exactly $0.5$. Thus, $p = 0.5$. We therefore want the probability that a binomial random variable with $n = 15, p = 0.5$ is less than or equal to 3.

In [36]:
pbinom(3, 15, 0.5)

# 3.5.7

We want $P(8.6 \le X \le 9.8)$. We can find this directly with pnorm().

In [37]:
pnorm(9.8, 9, 0.4) - pnorm(8.6, 9, 0.4)

We can also transform the standard normal random variable.

In [38]:
pnorm((9.8 - 9)/0.4) - pnorm((8.6 - 9)/0.4)

If we randomly and independently select four resistors, the number that are acceptable is a binomial random variable with $n = 4, p \approx 0.82$.

In [39]:
dbinom(2, 4, pnorm(9.8, 9, 0.4) - pnorm(8.6, 9, 0.4))

# 4.2.1a

\begin{align}
P(X > 1, Y > 2) &= P(X = 2, Y = 3) + P(X = 3, Y = 3) \\
&= 0.11 + 0.09 = 0.20
\end{align}

We could solve for $P(X > 1 \text{ or } Y > 2)$ directly, but it is easier to do complementary counting here.
\begin{align}
P(X > 1 \text{ or } Y > 2) &= 1 - P(X \le 1, Y \le 2) \\
&= 1 - (P(X = 1, Y = 1) + P(X = 1, Y = 2)) \\
&= 1 - (0.09 + 0.12) \\
&= 1 - 0.21 = 0.79
\end{align}

\begin{equation}
P(X > 2, Y > 2) = P(X = 3, Y = 3) = 0.09
\end{equation}

# 4.2.1b

To find the marginal PMFs of each variable, we sum over all possible values of the other variable.
$P(X = 1) = P(X = 1, Y = 1) + P(X = 1, Y = 2) + P(X = 1, Y = 3) = 0.09 + 0.12 + 0.13 = 0.34$
$P(X = 2) = P(X = 2, Y = 1) + P(X = 2, Y = 2) + P(X = 2, Y = 3) = 0.12 + 0.11 + 0.11 = 0.34$
$P(X = 3) = P(X = 3, Y = 1) + P(X = 3, Y = 2) + P(X = 3, Y = 3) = 0.13 + 0.10 + 0.09 = 0.32$

$P(Y = 1) = P(X = 1, Y = 1) + P(X = 2, Y = 1) + P(X = 3, Y = 1) = 0.09 + 0.12 + 0.13 = 0.34$
$P(Y = 2) = P(X = 1, Y = 2) + P(X = 2, Y = 2) + P(X = 3, Y = 2) = 0.12 + 0.11 + 0.10 = 0.33$
$P(Y = 3) = P(X = 1, Y = 3) + P(X = 2, Y = 3) + P(X = 3, Y = 3) = 0.13 + 0.11 + 0.09 = 0.33$
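
The same sums can be carried out in R by storing the joint PMF as a matrix and taking row and column sums. This is a sketch: the matrix entries are the joint probabilities used in the sums above, and the names `joint`, `p_x`, and `p_y` are illustrative.

```r
# Joint PMF: rows are x = 1, 2, 3 and columns are y = 1, 2, 3
joint <- matrix(c(0.09, 0.12, 0.13,
                  0.12, 0.11, 0.11,
                  0.13, 0.10, 0.09),
                nrow = 3, byrow = TRUE,
                dimnames = list(x = as.character(1:3),
                                y = as.character(1:3)))

# Marginal PMF of X: sum each row over all values of y
p_x <- rowSums(joint)

# Marginal PMF of Y: sum each column over all values of x
p_y <- colSums(joint)

print(p_x)
print(p_y)
```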

# 4.2.3

\begin{align}
P(X \le 10, Y \le 2) &= P(X = 8, Y = 1.5) + P(X = 8, Y = 2) + P(X = 10, Y = 1.5) + P(X = 10, Y = 2) \\
&= 0.3 + 0.12 + 0.15 + 0.135 = 0.705
\end{align}

\begin{align}
P(X \le 10, Y = 2) &= P(X = 8, Y = 2) + P(X = 10, Y = 2) \\
&= 0.12 + 0.135 = 0.255
\end{align}

$P(X = 8) = 0.3 + 0.12 + 0 = 0.42$
$P(X = 10) = 0.15 + 0.135 + 0.025 = 0.31$
$P(X = 12) = 0.03 + 0.15 + 0.09 = 0.27$

$P(Y = 1.5) = 0.3 + 0.15 + 0.03 = 0.48$
$P(Y = 2) = 0.12 + 0.135 + 0.15 = 0.405$
$P(Y = 2.5) = 0 + 0.025 + 0.09 = 0.115$

$P(X \le 10 | Y = 2) = P(X \le 10, Y = 2) / P(Y = 2) = 0.255 / 0.405 \approx 0.63$
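
These quantities can be recomputed from a matrix of the joint PMF. The sketch below reconstructs the matrix from the terms in the marginal sums above (including the zero entry for $X = 8, Y = 2.5$), so the layout is an assumption rather than a copy of the textbook table; all names are illustrative.

```r
# Joint PMF: rows are x = 8, 10, 12 and columns are y = 1.5, 2, 2.5
joint <- matrix(c(0.30, 0.120, 0.000,
                  0.15, 0.135, 0.025,
                  0.03, 0.150, 0.090),
                nrow = 3, byrow = TRUE,
                dimnames = list(x = c("8", "10", "12"),
                                y = c("1.5", "2", "2.5")))

# Marginal PMFs
p_x <- rowSums(joint)
p_y <- colSums(joint)

# P(X <= 10, Y <= 2): sum the four cells with x in {8, 10} and y in {1.5, 2}
p_joint <- sum(joint[c("8", "10"), c("1.5", "2")])

# P(X <= 10 | Y = 2) = P(X <= 10, Y = 2) / P(Y = 2)
p_cond <- unname(sum(joint[c("8", "10"), "2"]) / p_y["2"])

print(c(p_joint, p_cond))
```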

# 4.2.4

| $F(x, y)$ | $y = 1$ | $y = 2$ | $y = 3$ |
|-----------|---------|---------|---------|
| $x = 1$   | 0.09    | 0.21    | 0.34    |
| $x = 2$   | 0.21    | 0.44    | 0.68    |
| $x = 3$   | 0.34    | 0.67    | 1.00    |

$F_X(1) = F(1, \infty) = F(1, 3) = 0.34$  
$F_X(2) = F(2, \infty) = F(2, 3) = 0.68$  
$F_X(3) = F(3, \infty) = F(3, 3) = 1.00$  

$F_Y(1) = F(\infty, 1) = F(3, 1) = 0.34$  
$F_Y(2) = F(\infty, 2) = F(3, 2) = 0.67$  
$F_Y(3) = F(\infty, 3) = F(3, 3) = 1.00$  

$P(X = 2, Y = 2) = F(2,2) - F(2,1) - F(1,2) + F(1,1) = 0.44 - 0.21 - 0.21 + 0.09 = 0.11$
This matches the PMF in Exercise 1.
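
The joint CDF table can be generated from the joint PMF by taking cumulative sums in both directions. This is a sketch, assuming the joint PMF from Exercise 1 is stored with rows for $x$ and columns for $y$; the names `joint`, `cdf`, and `p22` are illustrative.

```r
# Joint PMF: rows are x = 1, 2, 3 and columns are y = 1, 2, 3
joint <- matrix(c(0.09, 0.12, 0.13,
                  0.12, 0.11, 0.11,
                  0.13, 0.10, 0.09),
                nrow = 3, byrow = TRUE)

# Accumulate down each column (over x), then along each row (over y).
# apply(..., 1, cumsum) returns its results as columns, hence the t().
cdf <- t(apply(apply(joint, 2, cumsum), 1, cumsum))

# Recover P(X = 2, Y = 2) from the CDF by inclusion-exclusion
p22 <- cdf[2, 2] - cdf[2, 1] - cdf[1, 2] + cdf[1, 1]

print(cdf)
print(p22)
```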