# HYPERGEOMETRIC DISTRIBUTION

The hypergeometric distribution, like the binomial distribution, describes trials with two possible outcomes: success and failure. Unlike the binomial distribution, however, it applies when sampling is done without replacement, so the probability of a success changes from one trial to the next. To recompute that probability at each successive trial, the user must know the size of the population and the number of successes and failures in it.
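To make the changing probability concrete, the sketch below lists the chance of a success on each draw, assuming every earlier draw was also a success. The helper name `success_probabilities` and the numbers (18 items, 12 successes, 3 draws) are illustrative assumptions, not part of the original notebook.

```python
from fractions import Fraction

def success_probabilities(population, successes, draws):
    """Chance of a success on each draw, assuming all earlier draws were successes."""
    probs = []
    for i in range(draws):
        # After i successes are removed, both counts shrink by i.
        probs.append(Fraction(successes - i, population - i))
    return probs

# Illustrative values (assumed): 18 items, 12 successes, 3 draws.
without_replacement = success_probabilities(18, 12, 3)
# With replacement the probability would stay fixed at 12/18 on every draw.
with_replacement = [Fraction(12, 18)] * 3

print(without_replacement)
print(with_replacement)
```

With replacement, the probability stays constant, which is the binomial setting. Without replacement it shifts on every draw, which is why the population makeup has to be known.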

<h1>The hypergeometric distribution has the following characteristics:</h1>

■ It is a discrete distribution.

■ Each outcome consists of either a success or a failure.

■ Sampling is done without replacement.

■ The population, N, is finite and known.

■ The number of successes in the population, A, is known.

<h2>Formula</h2>
$$P(x) = \frac{{}_{A}C_{x} \cdot {}_{N-A}C_{n-x}}{{}_{N}C_{n}}$$

$$ \textrm{where}$$

$$ N = \textrm{size of the population}$$

$$n = \textrm{sample size}$$

$$A = \textrm{number of successes in the population}$$

$$x = \textrm{number of successes in the sample; sampling is done without replacement}$$
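The formula can be evaluated directly with the standard library as a cross-check on library results. This is a minimal sketch: the function name `hypergeom_pmf` and the sample values (N = 10, n = 4, A = 3) are assumptions chosen for illustration.

```python
from math import comb

def hypergeom_pmf(x, N, n, A):
    """P(x) = C(A, x) * C(N - A, n - x) / C(N, n)."""
    # Outcomes that cannot occur have probability zero.
    if x < 0 or x > n or x > A or n - x > N - A:
        return 0.0
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

# Illustrative values (assumed): N = 10, n = 4, A = 3.
probs = [hypergeom_pmf(x, 10, 4, 3) for x in range(5)]
print(probs)
print(sum(probs))  # the probabilities over all x should sum to 1 (up to rounding)
```

The argument order here follows the formula's symbols (x, N, n, A), which is not the same as the order SciPy documents for `hypergeom`.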

In [1]:
import numpy as np
from scipy.stats import hypergeom as hy

**Suppose 18 major computer companies operate in the United States and that 12 are
located in California’s Silicon Valley. If three computer companies are selected randomly
from the entire list, what is the probability that one or more of the selected
companies are located in the Silicon Valley?**

$$N=18$$
$$n=3$$
$$A=12$$
$$x \geq 1$$

In [3]:
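# P(X >= 1) = sf(0), where sf(k) = 1 - cdf(k) = P(X > k).
# Passed as hy.sf(x-1, N, n, A); SciPy documents (k, M, n, N) = (x, N, A, n), and swapping n and A leaves the result unchanged.
pval=hy.sf(0,18,3,12)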
print(pval)

0.9754901960784306


The probability that one or more of the selected companies are located in Silicon Valley is about 97.55%.
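The same figure can be checked by hand through the complement: "one or more" is everything except "none". The sketch below computes it with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

N, n, A = 18, 3, 12  # population, sample size, Silicon Valley companies

# P(X = 0): all three selected companies come from the N - A = 6 others.
p_none = Fraction(comb(N - A, n), comb(N, n))

# P(X >= 1) = 1 - P(X = 0)
p_at_least_one = 1 - p_none

print(p_none, p_at_least_one, float(p_at_least_one))
```

Here `p_none` is 20/816, so the complement is 796/816, matching the value printed above.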

**A western city has 18 police officers eligible for promotion. Eleven of the 18 are Hispanic. Suppose only 5 of the police officers are chosen for promotion and that 1 is Hispanic. If the officers chosen for promotion had been selected by chance alone, what is the probability that 1 or fewer of the 5 promoted officers would have been Hispanic? What might this result indicate?**

$$N=18$$
$$n=5$$
$$A=11$$
$$x \leq 1$$

In [4]:
# P(X <= 1) = cdf(1), passing (x, N, n, A)
pval=hy.cdf(1,18,5,11)
print(pval)

0.04738562091503275


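The probability that 1 or fewer of the 5 promoted officers would have been Hispanic is about 4.74%. An outcome this unlikely under purely random selection may indicate that the promotions were not made by chance alone.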
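The cumulative probability is the sum of the individual terms for x = 0 and x = 1, which can be verified with the formula directly. This is a sketch using only the standard library; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

N, n, A = 18, 5, 11  # eligible officers, promotions, Hispanic officers

# P(X = x) for x = 0 and x = 1, kept as exact fractions.
terms = [Fraction(comb(A, x) * comb(N - A, n - x), comb(N, n)) for x in (0, 1)]

p_one_or_fewer = sum(terms)
print(terms, p_one_or_fewer, float(p_one_or_fewer))
```

The two terms are 21/8568 and 385/8568, which add up to 406/8568, or roughly 0.0474.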

**Catalog Age lists the top 17 U.S. firms in annual catalog sales. Dell Computer is number one followed by IBM and W.W. Grainger. Of the 17 firms on the list, 8 are in some type of computer-related business. Suppose four firms are randomly selected.**

**a. What is the probability that none of the firms is in some type of computer-related business?**

**b. What is the probability that all four firms are in some type of computer-related business?**

**c. What is the probability that exactly two are in non-computer-related business?**

In [5]:
#a. What is the probability that none of the firms is in some type of computer-related business?
pval=hy.pmf(0,17,4,8)
print(pval)

0.05294117647058814


In [7]:
#b. What is the probability that all four firms are in some type of computer-related business?
prob=hy.pmf(4,17,4,8)
print(round(prob,5))

0.02941


In [11]:
#c. What is the probability that exactly two are in non-computer-related business?
# "Success" here is a non-computer-related firm, so A = 17 - 8 = 9.
prob=hy.pmf(2,17,4,9)
print(round(prob,4))

0.4235
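All three parts can be cross-checked against the formula with exact arithmetic. The sketch below assumes a hypothetical helper `pmf` that follows the formula's argument order (x, N, n, A). It also confirms that "exactly two non-computer-related" is the same event as "exactly two computer-related" when four firms are drawn.

```python
from fractions import Fraction
from math import comb

def pmf(x, N, n, A):
    """Exact hypergeometric probability C(A, x) * C(N - A, n - x) / C(N, n)."""
    return Fraction(comb(A, x) * comb(N - A, n - x), comb(N, n))

N, n = 17, 4
computer, other = 8, 9  # computer-related firms and the remaining firms

part_a = pmf(0, N, n, computer)   # none computer-related
part_b = pmf(4, N, n, computer)   # all four computer-related
part_c = pmf(2, N, n, other)      # exactly two non-computer-related
part_c_alt = pmf(2, N, n, computer)  # same event counted from the other side

print(float(part_a), float(part_b), float(part_c))
print(part_c == part_c_alt)
```

Counting the event from either group gives the same probability because two non-computer-related firms out of four implies two computer-related firms.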
