In [1]:
import numpy as np
import scipy
from scipy.stats import binom

![](http://www.stat.yale.edu/Courses/1997-98/101/binpdf.gif)

### Binomial Distribution

A survey found that 65% of all financial consumers were very satisfied with their primary financial institution. Suppose 25 financial consumers are sampled. If the survey result still holds true today, what is the probability that exactly 19 are very satisfied with their primary financial institution?

In [2]:
print(binom.pmf(k=19,n=25,p=0.65))  #Probability Mass Function(PMF)

0.09077799859322791
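
As a cross-check, the same value should follow from the binomial formula $P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}$. The sketch below uses only the standard library; the helper name `binomial_pmf` is illustrative and not part of SciPy.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed directly from the formula."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Same inputs as the survey example: exactly 19 of 25 consumers, p = 0.65
manual = binomial_pmf(19, 25, 0.65)
print(manual)
```

The printed value should agree with `binom.pmf(k=19,n=25,p=0.65)` up to floating-point rounding.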


According to the U.S. Census Bureau, approximately 6% of all workers in Jackson, Mississippi are unemployed. In conducting a random telephone survey in Jackson, what is the probability of getting two or fewer unemployed workers in a sample of 20?

In [3]:
binom.cdf(2,20,0.06) # Cumulative Distribution Function: P(X <= 2)

0.8850275957378545
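
The cumulative probability is the sum of the individual probabilities for 0, 1 and 2 unemployed workers. A minimal sketch of that sum, written with the standard library only (the variable names are illustrative):

```python
from math import comb

n, p = 20, 0.06  # sample size and unemployment rate from the example

# P(X <= 2) = P(X = 0) + P(X = 1) + P(X = 2)
terms = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3)]
cumulative = sum(terms)
print(terms)
print(cumulative)
```

Summing the probability mass function up to `k` is what `binom.cdf(k, n, p)` reports, so the two results should match up to rounding.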

Solve the binomial probability for n=20, p=0.40 and x=10.

In [4]:
print(binom.pmf(k=10,n=20,p=0.4))

0.11714155053639011


### Poisson Distribution

In [5]:
from scipy.stats import poisson

Suppose bank customers arrive randomly on weekday afternoons at an average of 3.2 customers every 4 minutes. What is the probability of exactly 5 customers arriving in a 4-minute interval on a weekday afternoon?

In [6]:
poisson.pmf(5,3.2)  #Probability Mass Function

0.11397938346351824

Suppose bank customers arrive randomly on weekday afternoons at an average of 3.2 customers every 4 minutes. What is the probability of more than 7 customers arriving in a 4-minute interval on a weekday afternoon?

In [7]:
a=poisson.cdf(7,3.2)
a

0.9831701582510425

In [8]:
b=1-a # b = P(X > 7), the complement of P(X <= 7)
b

0.01682984174895752
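
The upper tail can also be obtained in one step with the survival function, which SciPy defines as `1 - cdf`. A short sketch using the same rate of 3.2 customers per 4 minutes:

```python
from scipy.stats import poisson

# sf(k, mu) is the survival function: P(X > k) = 1 - cdf(k, mu)
upper_tail = poisson.sf(7, 3.2)
print(upper_tail)
```

For very small tail probabilities, `sf` is often more accurate than computing `1 - cdf` by hand, because it avoids subtracting two nearly equal numbers.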

A bank has an average random arrival rate of 3.2 customers every 4 minutes. What is the probability of getting exactly 10 customers during an 8-minute interval? Because the interval is twice as long, the expected count doubles to 6.4.

In [9]:
poisson.pmf(10,6.4)

0.052790043854115495
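
The rescaling of the rate can be made explicit. The sketch below derives the expected count for the 8-minute interval and evaluates the Poisson formula $P(X=k)=e^{-\mu}\mu^k/k!$ with the standard library; the variable names are illustrative.

```python
from math import exp, factorial

rate_per_4_min = 3.2   # average arrivals per 4-minute interval
interval_minutes = 8   # interval the question asks about

# The expected count scales linearly with the length of the interval
mu = rate_per_4_min * interval_minutes / 4

k = 10
prob = exp(-mu) * mu**k / factorial(k)
print(mu, prob)
```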

### Uniform Distribution

Suppose the amount of time it takes to assemble a plastic module ranges from 27 to 39 seconds and that assembly times are uniformly distributed. Describe the distribution. What is the probability that a given assembly will take between 30 and 35 seconds?

In [10]:
a=np.arange(27,40,1)
a

array([27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39])

In [11]:
from scipy.stats import uniform

In [12]:
uniform.mean(loc=27,scale=12)

33.0

In [13]:
uniform.cdf(np.arange(30,36,1),loc=27,scale=12) # Cumulative Distribution Function

array([0.25      , 0.33333333, 0.41666667, 0.5       , 0.58333333,
       0.66666667])

In [14]:
probability=0.66666667-0.25 # P(30 <= X <= 35) = cdf(35) - cdf(30), using the rounded values printed above
probability

0.41666667
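
To avoid copying rounded numbers by hand, the two cumulative values can be subtracted directly. A sketch using a frozen distribution object, where `loc` is the lower bound and `scale` is the width of the interval:

```python
from scipy.stats import uniform

low, high = 27, 39                         # assembly time range in seconds
dist = uniform(loc=low, scale=high - low)  # frozen Uniform(27, 39)

prob = dist.cdf(35) - dist.cdf(30)  # P(30 <= X <= 35)
height = dist.pdf(30)               # constant density, 1 / (high - low)
print(prob, height)
```

The result corresponds to the ratio of interval lengths, (35 - 30) / (39 - 27) = 5/12.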

According to the National Association of Insurance Commissioners, the average annual cost for automobile insurance in the United States in a recent year was $691. Suppose automobile insurance costs are uniformly distributed in the United States with a range from $200 to $1,182. What is the standard deviation of this uniform distribution?

In [15]:
uniform.mean(loc=200,scale=982) #MEAN

691.0

In [16]:
uniform.std(loc=200,scale=982) #STANDARD DEVIATION

283.4789821721062
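
Both results follow from the closed-form expressions for a uniform distribution on $[a, b]$: the mean is $(a+b)/2$ and the standard deviation is $(b-a)/\sqrt{12}$. A quick check with the standard library:

```python
from math import sqrt

low, high = 200, 1182  # range of insurance costs in dollars

mean_cost = (low + high) / 2       # midpoint of the range
std_cost = (high - low) / sqrt(12)  # standard deviation of Uniform(low, high)
print(mean_cost, std_cost)
```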

### Normal Distribution

![](https://ds055uzetaobb.cloudfront.net/brioche/uploads/enBFdMBLyU-basic-normal-distribution.png?width=1200)

In [17]:
from scipy.stats import norm

In [18]:
value,mean,sd=68,65.5,2.5

In [19]:
norm.cdf(value,mean,sd) # Cumulative Distribution Function up to 68: P(x <= 68)

0.8413447460685429

P(x > value)

In [20]:
1-norm.cdf(value,mean,sd) # Probability above 68

0.15865525393145707

P(value1 < x < value2)

In [21]:
norm.cdf(value,mean,sd)-norm.cdf(63,mean,sd) # Probability between 63 and 68

0.6826894921370859

What is the probability of obtaining a score greater than 700 on a GMAT test that has a mean of 494 and a standard deviation of 100? Assume GMAT scores are normally distributed.

P(x > 700 | mean = 494, sd = 100) = ?

In [22]:
1-norm.cdf(700,494,100)

0.019699270409376912
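
The same tail probability can be read off the standard normal distribution after converting the score to a z-value, $z = (x - \mu)/\sigma$. A sketch using `norm.sf`, the survival function (`1 - cdf`); the names `mu` and `sigma` are illustrative:

```python
from scipy.stats import norm

score, mu, sigma = 700, 494, 100

z = (score - mu) / sigma  # number of standard deviations above the mean

upper_tail = norm.sf(z)                           # P(Z > z) on the standard normal
same_tail = norm.sf(score, loc=mu, scale=sigma)   # P(X > 700) on the original scale
print(z, upper_tail, same_tail)
```

Both calls describe the same area, so they should agree up to floating-point rounding.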

For the same GMAT examination, what is the probability of randomly drawing a score that is 550 or less?

In [23]:
norm.cdf(550,494,100)

0.712260281150973

For the same GMAT examination, what is the probability of randomly drawing a score between 300 and 600?

In [24]:
norm.cdf(600,494,100)-norm.cdf(300,494,100)

0.8292378553956377

What is the probability of getting a score between 350 and 450 on the same GMAT exam?

In [25]:
norm.cdf(450,494,100)-norm.cdf(350,494,100)

0.2550348541262666

If the probability is given and the corresponding x (or z) value is needed, use the percent point function `ppf`, the inverse of the CDF:

In [26]:
norm.ppf(0.95) # z-value with 95% of the area to its left

1.6448536269514722

In [27]:
norm.ppf(1-0.6772) # z-value whose left-tail area is 1 - 0.6772 = 0.3228

-0.45988328292440145
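
Called without `loc` and `scale`, `norm.ppf` returns a z-value on the standard normal. Passing the mean and standard deviation returns the value on the original scale instead. The sketch below reuses the GMAT parameters from the earlier examples (mean 494, standard deviation 100) purely as an illustration:

```python
from scipy.stats import norm

mu, sigma = 494, 100  # GMAT parameters from the earlier examples

z95 = norm.ppf(0.95)                           # z-value with 95% of the area below it
score95 = norm.ppf(0.95, loc=mu, scale=sigma)  # score with 95% of the area below it

# ppf is the inverse of cdf, so feeding the score back in should recover 0.95
roundtrip = norm.cdf(score95, loc=mu, scale=sigma)
print(z95, score95, roundtrip)
```

The score is the z-value mapped back to the original scale, `mu + z95 * sigma`.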

### Hypergeometric Distribution 

Suppose 18 major computer companies operate in the United States and that 12 are located in California's Silicon Valley. If three computer companies are selected randomly from the entire list, what is the probability that one or more of the selected companies are located in Silicon Valley?

In [28]:
from scipy.stats import hypergeom
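# SciPy's argument order is (k, M, n, N): M = population size, n = successes in the population, N = number drawn.
# Here n and N are passed as 3 and 12; the distribution is symmetric in n and N, so the result is the same as for n=12, N=3.
pval=hypergeom.sf(0,18,3,12) # Survival function sf = 1 - cdf, i.e. P(X >= 1)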
pval

0.9754901960784306

A western city has 18 police officers eligible for promotion. Eleven of the 18 are Hispanic. Suppose only five of the police officers are chosen for promotion. If the officers chosen for promotion had been selected by chance alone, what is the probability that one or fewer of the five promoted officers would have been Hispanic?

In [29]:
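# P(X <= 1) with M=18 officers; n and N are passed as 5 and 11, which by symmetry matches n=11 Hispanic officers, N=5 promoted
p=hypergeom.cdf(1,18,5,11)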
p

0.04738562091503275
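
Both hypergeometric results can be checked by counting combinations directly: $P(X=k)=\binom{K}{k}\binom{N-K}{n-k}/\binom{N}{n}$ for a population of $N$ items containing $K$ successes and a sample of $n$. The helper `hypergeom_pmf` below is an illustrative function, not a SciPy call, and its argument order is chosen for readability.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes) when drawing without replacement."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

# 18 companies, 12 in Silicon Valley, 3 selected: P(at least one) = 1 - P(none)
p_at_least_one = 1 - hypergeom_pmf(0, 18, 12, 3)

# 18 officers, 11 Hispanic, 5 promoted: P(one or fewer) = P(0) + P(1)
p_one_or_fewer = hypergeom_pmf(0, 18, 11, 5) + hypergeom_pmf(1, 18, 11, 5)
print(p_at_least_one, p_one_or_fewer)
```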

### Exponential Distribution

A manufacturing firm has been involved in statistical quality control for several years. As part of the production process, parts are randomly selected and tested. From the records of these tests, it has been established that defective parts occur in a pattern that is Poisson distributed, with an average of 1.38 defects every 20 minutes during production runs. Use this information to determine the probability that less than 15 minutes will elapse between any two defects.

In [30]:
mean=1/1.38  # mean time between defects, in units of 20 minutes; the exponential mean is the reciprocal of the Poisson mean
mean

0.7246376811594204

In [31]:
from scipy.stats import expon

In [32]:
expon.cdf(0.75,0,(1/1.38))  # x = 15/20 = 0.75 intervals, loc = 0, scale = 1/1.38

0.6447736190750485
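
The exponential CDF has the closed form $P(T \le t) = 1 - e^{-\lambda t}$, so the result can be checked by hand. The sketch below works in units of 20-minute intervals and then repeats the calculation in minutes to show that the choice of unit does not matter, provided the rate and the time use the same one. The variable names are illustrative.

```python
from math import exp

rate_per_interval = 1.38  # defects per 20-minute interval
t_intervals = 15 / 20     # 15 minutes expressed in 20-minute intervals

prob = 1 - exp(-rate_per_interval * t_intervals)

# Same calculation with everything expressed per minute
rate_per_minute = 1.38 / 20
prob_minutes = 1 - exp(-rate_per_minute * 15)
print(prob, prob_minutes)
```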