In [4]:
import numpy as np
import pandas as pd
from scipy.stats import binom

![](http://www.stat.yale.edu/Courses/1997-98/101/binpdf.gif)

Q.1) A survey found that 65% of all financial consumers were very satisfied with their primary financial institution. Suppose that 25 financial consumers are sampled. If the survey result still holds true today, what is the probability that exactly 19 are very satisfied with their primary financial institution?

In [5]:
# Probability mass function: P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
print(binom.pmf(k=19, n=25, p=0.65))

0.09077799859322791


The value of p is .65 (very satisfied), the value of q = 1 - p = 1 - .65 = .35 (not very
satisfied), n = 25, and x = 19. The binomial formula yields the final answer.
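
The same figure can be reproduced from the binomial formula directly. This is a minimal sketch, assuming Python 3.8+ for `math.comb`; the variable names are illustrative only.

```python
from math import comb
from scipy.stats import binom

n, x, p = 25, 19, 0.65
q = 1 - p  # probability of "not very satisfied"

# Binomial formula: C(n, x) * p^x * q^(n - x)
manual = comb(n, x) * p**x * q**(n - x)

print(manual)
print(binom.pmf(x, n, p))  # should agree with the hand computation
```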

If 65% of all financial consumers are very satisfied, about 9.08% of the time the
researcher would get exactly 19 out of 25 financial consumers who are very satisfied
with their financial institution. How many very satisfied consumers would one expect to
get in 25 randomly selected financial consumers? If 65% of the financial consumers are
very satisfied with their primary financial institution, one would expect to get about 65%
of 25 or (.65)(25) = 16.25 very satisfied financial consumers. While in any individual sample
of 25 the number of financial consumers who are very satisfied cannot be 16.25, business
researchers understand the x values near 16.25 are the most likely occurrences.

Q.2) According to the U.S. Census Bureau, approximately 6% of all workers in Jackson,
Mississippi, are unemployed. In conducting a random telephone survey in Jackson,
what is the probability of getting two or fewer unemployed workers in a sample of 20?

In [6]:
# Cumulative distribution function: P(X <= 2) with n = 20 and p = 0.06
print(binom.cdf(2, 20, 0.06))

0.8850275957378545


This problem must be worked as the union of three problems: (1) zero unemployed,
x = 0; (2) one unemployed, x = 1; and (3) two unemployed, x = 2. In each problem,
p = .06, q = .94, and n = 20. The binomial formula gives each of the three terms, and
their sum is the cumulative probability computed above.
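
The three-term sum described above can be checked against `binom.cdf`. A small sketch, with illustrative variable names:

```python
from scipy.stats import binom

n, p = 20, 0.06

# P(X = 0), P(X = 1), P(X = 2) computed separately
terms = [binom.pmf(x, n, p) for x in (0, 1, 2)]
total = sum(terms)

print(terms)
print(total)                # sum of the three cases
print(binom.cdf(2, n, p))   # same probability from the CDF
```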

Solve the binomial probability for n = 20, p = 0.40, and x = 10.

In [7]:
print(binom.pmf(k=10, n=20, p=0.40))

0.11714155053639011


# Poisson Distribution

In [8]:
from scipy.stats import poisson

In [9]:
poisson.pmf(3,2) # x=3 , mean = 2

0.18044704431548356

Q.1) Suppose bank customers arrive randomly on weekday afternoons at an average of 3.2
customers every 4 minutes. What is the probability of exactly 5 customers arriving in a
4-minute interval on a weekday afternoon?

In [10]:
poisson.pmf(5,3.2)

0.11397938346351824
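
For comparison, the Poisson formula P(X = x) = λ^x e^(-λ) / x! can be evaluated by hand. This is a sketch only; `lam` is an illustrative name for the mean λ.

```python
from math import exp, factorial
from scipy.stats import poisson

lam, x = 3.2, 5  # mean arrivals per 4 minutes, target count

# Poisson formula: lam^x * e^(-lam) / x!
manual = lam**x * exp(-lam) / factorial(x)

print(manual)
print(poisson.pmf(x, lam))  # should agree with the hand computation
```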

Q.2) Bank customers arrive randomly on weekday afternoons at an average of 3.2 customers
every 4 minutes. What is the probability of having more than 7 customers in
a 4-minute interval on a weekday afternoon?

In [11]:
prob = poisson.cdf(7, 3.2)  # P(X <= 7)
1 - prob  # P(X > 7)

0.01682984174895752
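
SciPy also exposes the upper tail directly through the survival function `sf`, which is defined as `1 - cdf`. A short sketch showing that both routes give the same tail probability:

```python
from scipy.stats import poisson

lam = 3.2  # mean arrivals per 4 minutes

upper_tail = poisson.sf(7, lam)           # P(X > 7)
complement = 1 - poisson.cdf(7, lam)      # same quantity via the CDF

print(upper_tail)
print(complement)
```

For very small tail probabilities, `sf` can be more accurate than subtracting the CDF from 1.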

Q.3) A bank has an average random arrival rate of 3.2 customers every 4 minutes. What
is the probability of getting exactly 10 customers during an 8-minute interval?

In [12]:
poisson.pmf(10, 6.4)  # mean for 8 minutes = 2 * 3.2 = 6.4

0.052790043854115495
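
The mean of 6.4 used above comes from rescaling the rate to the new interval, since a Poisson mean is proportional to the interval length. A sketch of that step, with illustrative variable names:

```python
from scipy.stats import poisson

rate_per_4_min = 3.2     # given arrival rate
interval_minutes = 8     # interval asked about in the question

# Scale the mean to the 8-minute interval
lam = rate_per_4_min * interval_minutes / 4

print(lam)
print(poisson.pmf(10, lam))  # P(exactly 10 customers in 8 minutes)
```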

# Uniform Distribution

Q.1) Suppose the amount of time it takes to assemble a plastic module ranges from 27 to
39 seconds and that assembly times are uniformly distributed. Describe the distribution.
What is the probability that a given assembly will take between 30 and 35 seconds?
Fewer than 30 seconds?

In [13]:
mean = (27+39)/2
mean

33.0

In [14]:
from scipy.stats import uniform

In [15]:
# CDF at 30, 31, ..., 35 seconds; loc = a = 27, scale = b - a = 12
uniform.cdf(np.arange(30, 36, 1), loc=27, scale=12)

array([0.25      , 0.33333333, 0.41666667, 0.5       , 0.58333333,
       0.66666667])

In [16]:
0.6666667 - 0.25  # P(30 < x < 35) = F(35) - F(30); P(x < 30) = F(30) = 0.25

0.41666669999999995
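
Both answers can also be computed without copying rounded values by hand. This is a minimal sketch using a frozen `uniform` distribution; `dist`, `p_between`, and `p_below_30` are illustrative names.

```python
from scipy.stats import uniform

a, b = 27, 39                        # assembly time range in seconds
dist = uniform(loc=a, scale=b - a)   # scipy uses loc = a, scale = b - a

p_between = dist.cdf(35) - dist.cdf(30)   # P(30 < x < 35) = (35 - 30) / (b - a)
p_below_30 = dist.cdf(30)                 # P(x < 30) = (30 - 27) / (b - a)

print(p_between)
print(p_below_30)
```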

Q.2) According to the National Association of Insurance Commissioners, the average
annual cost for automobile insurance in the United States in a recent year was 691 dollars.
Suppose automobile insurance costs are uniformly distributed in the United States
with a range from 200 dollars to 1,182 dollars. What is the standard deviation of this
uniform distribution?

In [19]:
# mean = (a + b) / 2 and sigma = (b - a) / sqrt(12), with loc = a = 200, scale = b - a = 982
uniform.mean(loc=200, scale=982)

691.0

In [20]:
uniform.std(loc=200, scale=982)

283.4789821721062
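
The same mean and standard deviation follow from the closed-form expressions for a uniform distribution on [a, b]. A sketch with illustrative variable names:

```python
from math import sqrt

a, b = 200, 1182  # range of insurance costs in dollars

mean = (a + b) / 2          # midpoint of the range
sigma = (b - a) / sqrt(12)  # standard deviation of a uniform distribution

print(mean)
print(sigma)
```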

# Normal Distribution

![](https://miro.medium.com/max/24000/1*IdGgdrY_n_9_YfkaCh-dag.png)

In [21]:
from scipy.stats import norm

In [22]:
val, mean, stand_dev = 68, 65.5, 2.5

# P(x <= val) for a normal distribution with the given mean and standard deviation
norm.cdf(val, mean, stand_dev)

0.8413447460685429

P(x > val)

In [23]:
1 - norm.cdf(val, mean, stand_dev)

0.15865525393145707

P(val1 < x < val2)

In [24]:
norm.cdf(val, mean, stand_dev) - norm.cdf(63, mean, stand_dev)

0.6826894921370859
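
These probabilities can equivalently be obtained by standardizing to a z score, z = (x - mean) / standard deviation, and using the standard normal CDF. The sketch below assumes the same values as the cells above; `z_hi` and `z_lo` are illustrative names.

```python
from scipy.stats import norm

val, mean, stand_dev = 68, 65.5, 2.5

z_hi = (val - mean) / stand_dev   # 68 is one standard deviation above the mean
z_lo = (63 - mean) / stand_dev    # 63 is one standard deviation below the mean

p_below = norm.cdf(z_hi)                      # P(x <= 68)
p_within = norm.cdf(z_hi) - norm.cdf(z_lo)    # P(63 < x < 68)

print(z_lo, z_hi)
print(p_below, p_within)
```

Since the two bounds sit one standard deviation either side of the mean, the interval probability is the familiar "about 68%" figure.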

Q.1) What is the probability of obtaining a score greater than 700 on a GMAT test that has
a mean of 494 and a standard deviation of 100? Assume GMAT scores are normally
distributed.

P(x > 700 | mu = 494 and sigma = 100) = ?

In [25]:
val, mean, stand_dev = 700, 494, 100
1 - norm.cdf(val, mean, stand_dev)

0.019699270409376912

Q.2) For the same GMAT examination, what is the probability of randomly drawing a
score that is 550 or less?

In [26]:
norm.cdf(550, mean, stand_dev)

0.712260281150973

Q.3) What is the probability of randomly obtaining a score between 300 and 600 on the
GMAT exam?

In [28]:
norm.cdf(600, mean, stand_dev) - norm.cdf(300, mean, stand_dev)

0.8292378553956377

Q.4) What is the probability of getting a score between 350 and 450 on the same GMAT
exam?

In [29]:
norm.cdf(450, mean, stand_dev) - norm.cdf(350, mean, stand_dev)

0.2550348541262666

In [33]:
norm.ppf(0.95)

1.6448536269514722

In [34]:
norm.ppf(1-0.95)

-1.6448536269514722
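
`norm.ppf` is the inverse of `norm.cdf`: given a probability, it returns the value below which that fraction of the distribution lies. As a sketch, the same call can be applied to the GMAT parameters used above (mean 494, standard deviation 100); `score95` is an illustrative name, not part of the original notebook.

```python
from scipy.stats import norm

mean, stand_dev = 494, 100

z95 = norm.ppf(0.95)                        # standard normal 95th percentile
score95 = norm.ppf(0.95, mean, stand_dev)   # same percentile on the GMAT scale

print(z95)
print(score95)
print(norm.cdf(score95, mean, stand_dev))   # round trip back to 0.95
```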

# Hypergeometric Distribution

Q.1) Suppose 18 major computer companies operate in the United States and that 12 are
located in California’s Silicon Valley. If three computer companies are selected randomly
from the entire list, what is the probability that one or more of the selected
companies are located in the Silicon Valley?

In [36]:
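from scipy.stats import hypergeom

# scipy's argument order is (k, M, n, N): M = population size, n = successes in the
# population, N = sample size. The distribution is symmetric in n and N, so
# (0, 18, 3, 12) gives the same value as (0, 18, 12, 3).
1-hypergeom.cdf(0,18,3,12)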

0.9754901960784313
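
The result can be cross-checked with the counting form of the hypergeometric formula. This is a sketch assuming Python 3.8+ for `math.comb`; the variable names are illustrative.

```python
from math import comb
from scipy.stats import hypergeom

population, in_valley, draws = 18, 12, 3

# P(no Silicon Valley company): choose all 3 from the 6 companies outside the Valley
p_none = comb(in_valley, 0) * comb(population - in_valley, draws) / comb(population, draws)
p_at_least_one = 1 - p_none

print(p_at_least_one)
# Same probability from scipy, with arguments in the order (k, M, n, N)
print(1 - hypergeom.cdf(0, population, in_valley, draws))
```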

Q.2) A western city has 18 police officers eligible for promotion. Eleven of the 18 are
Hispanic. Suppose only five of the police officers are chosen for promotion. If the officers
chosen for promotion had been selected by chance alone, what is the probability that one
or fewer of the five promoted officers would have been Hispanic?

In [37]:
hypergeom.cdf(1,18,5,11)  # P(X <= 1); arguments are (k, M, n, N), symmetric in n and N

0.04738562091503275

# Exponential Probability Distribution

Q.1) A manufacturing firm has been involved in statistical quality control for several years.
As part of the production process, parts are randomly selected and tested. From the
records of these tests, it has been established that a defective part occurs in a pattern
that is Poisson distributed on the average of 1.38 defects every 20 minutes during
production runs. Use this information to determine the probability that less than
15 minutes will elapse between any two defects.

In [38]:
mu = 1/1.38  # scale = 1 / lambda: mean time between defects, in 20-minute intervals

In [39]:
from scipy.stats import expon

In [40]:
expon.cdf(0.75, 0, mu) # 15/20 = 0.75, loc=0

0.6447736190750485
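
The exponential CDF has the closed form P(X <= x) = 1 - e^(-λx), which gives a quick check on the scipy call. A sketch with illustrative names, measuring time in units of 20 minutes as in the cells above:

```python
from math import exp
from scipy.stats import expon

lam = 1.38    # defects per 20-minute interval
x = 15 / 20   # 15 minutes expressed in 20-minute units

manual = 1 - exp(-lam * x)                      # closed-form exponential CDF
from_scipy = expon.cdf(x, loc=0, scale=1 / lam)  # scipy takes scale = 1 / lambda

print(manual)
print(from_scipy)
```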