Exercise 3 (Sampling with and without replacement)

There is a finite set of $n_0$ objects, of which $d_0 < n_0$ are defective.

1. A sample of size $n < \min\{d_0, n_0 - d_0\}$ is chosen at random without replacement. For $d = 0, \dots, n$, what is the probability that exactly $d$ members of the sample are defective?

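All ${n_0\choose n}$ samples of size $n$ are equally likely. A sample with exactly $d$ defective members is formed by choosing $d$ of the $d_0$ defective objects and $n-d$ of the $n_0-d_0$ non-defective ones, so the probability is given by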
\begin{equation}
\mathbb{P}(\text{$d$ defective})=\frac{{d_0\choose d}{n_0-d_0\choose n-d}}{{n_0\choose n}}. \square
\end{equation}

In [1]:
from scipy.special import binom
import random

In [2]:
# empirical check
# total number of objects
n0 = 100
# total number of defective objects
d0 = n0 // 5
# choose sample size n<min{d0, n0-d0}
n = 10
objects = list(range(1, n0+1))
num_experiments = int(10**5)
counter = 0
# choose number in 0,...,n
d = 5
for experiment in range(num_experiments):
    num_defective = 0
    # sample without replacement
    sample = random.sample(objects, n)
    for i in sample:
        if i<=d0:
            num_defective+=1
    if num_defective==d:
        counter+=1
empirical_p = counter/num_experiments
print("Empirical p = {0}".format(empirical_p))
theoretical_p = binom(d0, d)*binom(n0-d0, n-d)/binom(n0, n)
print("Theoretical p = {0}".format(theoretical_p))

Empirical p = 0.02186
Theoretical p = 0.021531469960251768
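The check above fixes $d = 5$. As a minimal sketch, the same formula can be evaluated for every $d = 0, \dots, n$ with exact rational arithmetic to confirm that the probabilities sum to one. The helper name `hypergeom_pmf` is an assumption introduced for illustration, and `math.comb` (Python 3.8+) is swapped in for `scipy.special.binom` so that the arithmetic stays exact.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(d, n0, d0, n):
    """Exact P(exactly d defective) when n of n0 objects are drawn without replacement.

    Hypothetical helper, not part of the original notebook.
    """
    return Fraction(comb(d0, d) * comb(n0 - d0, n - d), comb(n0, n))


n0, d0, n = 100, 20, 10
pmf = [hypergeom_pmf(d, n0, d0, n) for d in range(n + 1)]
total = sum(pmf)

for d, p in enumerate(pmf):
    print(f"d = {d:2d}: {float(p):.6f}")
print("sum =", total)
```

The entry for `d = 5` should agree with the theoretical value printed above, up to floating-point rounding.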


2. What if the sample is chosen at random with replacement?

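If the sample is chosen at random with replacement, then each draw is defective with probability $d_0/n_0$, independently of the other draws, and the probability is given by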
\begin{equation}
p=\underbrace{{n\choose d}}_{p_1}\underbrace{\left(\frac{n_0-d_0}{n_0}\right)^{n-d}\left(\frac{d_0}{n_0}\right)^d}_{p_2}.
\end{equation}
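$p_2$ is the probability of any one particular pattern of $d$ defective and $n-d$ non-defective draws (for example, the first $d$ draws defective and the rest not), and $p_1$ is the number of such patterns, that is, the number of ways to choose which $d$ of the $n$ draws are defective. $\square$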
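The formula for sampling with replacement can also be checked by brute force on a small instance, where all $n_0^n$ equally likely ordered samples can be listed. The values $n_0 = 5$, $d_0 = 2$, $n = 3$ below are hypothetical and chosen only to keep the enumeration small; they are a sketch, not part of the exercise.

```python
from fractions import Fraction
from itertools import product
from math import comb

# Hypothetical small instance: objects 1..n0, of which 1..d0 are defective.
n0, d0, n = 5, 2, 3

# counts[d] = number of ordered samples (with replacement) with exactly d defectives
counts = [0] * (n + 1)
for draws in product(range(1, n0 + 1), repeat=n):
    num_defective = sum(1 for i in draws if i <= d0)
    counts[num_defective] += 1

for d in range(n + 1):
    brute = Fraction(counts[d], n0 ** n)
    formula = comb(n, d) * Fraction(n0 - d0, n0) ** (n - d) * Fraction(d0, n0) ** d
    print(f"d = {d}: enumeration = {brute}, formula = {formula}")
    assert brute == formula
```

Because every ordered sample is equally likely, dividing each count by $n_0^n$ gives the exact probability, which the loop compares against the formula.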

In [3]:
# empirical check
# total number of objects
n0 = 100
# total number of defective objects
d0 = n0 // 5
# choose sample size n<min{d0, n0-d0}
n = 10
objects = list(range(1, n0+1))
num_experiments = int(10**5)
counter = 0
# choose number in 0,...,n
d = 5
for experiment in range(num_experiments):
    num_defective = 0
    # sample with replacement
    sample = random.choices(objects, k=n)
    for i in sample:
        if i<=d0:
            num_defective+=1
    if num_defective==d:
        counter+=1
empirical_p = counter/num_experiments
print("Empirical p = {0}".format(empirical_p))
theoretical_p = binom(n, d)*((n0-d0)/n0)**(n-d)*(d0/n0)**d
print("Theoretical p = {0}".format(theoretical_p))

Empirical p = 0.02688
Theoretical p = 0.02642411520000001


3. Compare these probabilities for the case $n_0 = 100, d_0 = 20, n = 10$.

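See the output of the code above. For $d = 5$, the theoretical probability is about $0.0215$ without replacement and about $0.0264$ with replacement, and the empirical estimates ($0.02186$ and $0.02688$) agree with these values to within sampling error. $\square$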
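To compare the two sampling schemes for every $d$ rather than only $d = 5$, the following sketch tabulates both formulas for $n_0 = 100$, $d_0 = 20$, $n = 10$ and computes the mean and variance of each distribution numerically. The names `without`, `with_repl` and `mean_var` are assumptions introduced for illustration.

```python
from math import comb

n0, d0, n = 100, 20, 10
p = d0 / n0  # probability that a single draw is defective

# P(exactly d defective), d = 0..n, under each sampling scheme
without = [comb(d0, d) * comb(n0 - d0, n - d) / comb(n0, n) for d in range(n + 1)]
with_repl = [comb(n, d) * (1 - p) ** (n - d) * p ** d for d in range(n + 1)]


def mean_var(pmf):
    """Mean and variance of a distribution on 0..len(pmf)-1 (hypothetical helper)."""
    mean = sum(d * q for d, q in enumerate(pmf))
    var = sum((d - mean) ** 2 * q for d, q in enumerate(pmf))
    return mean, var


print(" d   without replacement   with replacement")
for d in range(n + 1):
    print(f"{d:2d}   {without[d]:19.6f}   {with_repl[d]:16.6f}")

mean_without, var_without = mean_var(without)
mean_with, var_with = mean_var(with_repl)
print(f"mean: {mean_without:.4f} (without), {mean_with:.4f} (with)")
print(f"variance: {var_without:.4f} (without), {var_with:.4f} (with)")
```

Both distributions have mean $n d_0 / n_0 = 2$, but the variance without replacement is smaller by the factor $(n_0 - n)/(n_0 - 1)$, so sampling with replacement puts more weight on values far from the mean such as $d = 5$.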