# Computing probabilities in Python with "scipy"

# Discrete distributions

## PMF

The probability mass function (PMF) answers the question: what is the probability of observing exactly one particular value?

### Binomial distribution

In [1]:
from scipy.stats import binom

# binom.pmf(successful_trials, trials, probability_of_success)
# pmf = probability mass function
# P(x > 1) = P(x = 2) + P(x = 3) + ... + P(x = 5)
P_more_than_1 = sum(binom.pmf(x, 5, 0.5) for x in range(2, 6))
P_more_than_1
# Equivalently: P(x > 1) = 1 - (P(x = 0) + P(x = 1))



0.8125
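
The complement rule noted in the last comment of the cell can be checked directly. This is a minimal sketch assuming the same setup (5 trials, success probability 0.5); `binom.cdf` and `binom.sf` (the survival function, `1 - cdf`) are not used in the cell above and are brought in only for the comparison.

```python
from scipy.stats import binom

n, p = 5, 0.5

# Direct sum: P(x > 1) = P(x = 2) + ... + P(x = 5)
direct = sum(binom.pmf(k, n, p) for k in range(2, n + 1))

# Complement rule: P(x > 1) = 1 - P(x <= 1)
complement = 1 - binom.cdf(1, n, p)

# Survival function: sf(k) = P(x > k)
survival = binom.sf(1, n, p)

print(direct, complement, survival)
```

All three values should agree up to floating-point rounding. The complement form is usually preferable when the number of trials is large, because it avoids summing many terms.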

### Poisson distribution

In [19]:
from scipy.stats import poisson

# cdf = cumulative distribution function: it sums the pmf up to and including the given count
# P(x <= value) = P(x = 0) + P(x = 1) + ... + P(x = value)
# poisson.cdf(counts, mean)
# poisson.cdf(10, 7) = P(x = 0) + P(x = 1) + ... + P(x = 10), each term computed with mean = 7
poisson.cdf(10, 7)

0.9014792058890873
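
To see what `poisson.cdf` adds up, the sketch below rebuilds the same number from `poisson.pmf`. It assumes the same mean of 7, and it also shows how to get the strict inequality P(x < 10), which for integer counts is P(x <= 9).

```python
from scipy.stats import poisson

mean = 7

# P(x <= 10) from the cumulative distribution function
cdf_value = poisson.cdf(10, mean)

# The same probability as an explicit sum of pmf terms for counts 0..10
pmf_sum = sum(poisson.pmf(k, mean) for k in range(0, 11))

# Strict inequality: P(x < 10) = P(x <= 9)
strictly_less = poisson.cdf(9, mean)

print(cdf_value, pmf_sum, strictly_less)
```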

# Continuous distributions

## CDF

The cumulative distribution function (CDF) answers the question: how much probability do I get if I add up the probabilities of all values up to a given one?

For discrete:
$$P(x \le value) = \sum_{k \le value} P(x = k)$$

For continuous:

$$P(x \le value) = \int_{min}^{value} f(x)\,dx$$

where $f(x)$ is the probability density function and $min$ is the smallest value $x$ can take.

If our distribution is continuous, it doesn't make sense to ask about the probability of getting one particular value, because that probability is zero. Instead, we can ask ourselves what the probability of getting values below a given one is: $P(x \le value)$, which is the area under the density curve up to that value.
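
The "area under the curve" idea can be checked numerically. The sketch below is only an illustration: it picks a standard normal distribution as an example (an assumption, any continuous distribution would do) and integrates its density with `scipy.integrate.quad`, then compares the result with the built-in `cdf`.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

value = 1.0

# Area under the density from -infinity up to `value`
area, _ = quad(norm.pdf, -np.inf, value)

# The same probability from the cumulative distribution function
cdf_value = norm.cdf(value)

# Probability of a tiny interval around `value`: it shrinks towards 0 with the width
eps = 1e-6
tiny_interval = norm.cdf(value + eps) - norm.cdf(value - eps)

print(area, cdf_value, tiny_interval)
```

The last number illustrates why a single point carries no probability in the continuous case: the narrower the interval, the smaller the area.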

### Uniform distribution

In [10]:
from scipy.stats import uniform

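# P(x <= value)?
# P(x <= value) = \int_{min_value}^{value} f(x) dx, with f the probability density
# uniform.cdf(value, loc, scale): uniform on [loc, loc + scale],
# i.e. loc = min_value and scale = max_value - min_value
uniform.cdf(360/20, 0, 360)  # cdf = cumulative distribution function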

0.05
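
In `scipy.stats.uniform`, the two parameters after the value are `loc` and `scale`, and the distribution lives on `[loc, loc + scale]`. When the minimum is 0, as in the cell above, `scale` happens to equal the maximum. The sketch below wraps the conversion in a small helper; `uniform_cdf` is a hypothetical name introduced here for illustration, not part of scipy.

```python
from scipy.stats import uniform

def uniform_cdf(value, min_value, max_value):
    """P(x <= value) for a uniform distribution on [min_value, max_value]."""
    return uniform.cdf(value, loc=min_value, scale=max_value - min_value)

# Same result as the cell above, since min_value = 0
p_a = uniform_cdf(360 / 20, 0, 360)

# Halfway through the interval [2, 7]
p_b = uniform_cdf(4.5, 2, 7)

print(p_a, p_b)
```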

### Normal distribution

In [11]:
from scipy.stats import norm

# P(x <= value)     
#norm.cdf(value, mean, sd)
norm.cdf(110.05, 112, 9)

0.4142340635304751
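
The same probability can be obtained by standardizing first. This sketch assumes the values from the cell above (mean 112, standard deviation 9) and converts the value into a z-score, so that the standard normal (mean 0, standard deviation 1) can be used.

```python
from scipy.stats import norm

value, mean, sd = 110.05, 112, 9

# z-score: how many standard deviations `value` lies from the mean
z = (value - mean) / sd

# Direct call with mean and sd, and the equivalent standard normal call
p_direct = norm.cdf(value, mean, sd)
p_standard = norm.cdf(z)

print(z, p_direct, p_standard)
```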

### Exponential distribution

In [12]:
from scipy.stats import expon

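# P(x <= value)
# expon.cdf(value, loc, scale), where scale = 1/lambda (not lambda itself)
# The second positional argument is loc (a shift), not lambda, so the call below
# evaluates a unit-rate exponential shifted to start at 1: 1 - exp(-(2 - 1))
expon.cdf(2, 1)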

0.6321205588285577
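
In `scipy.stats.expon` the signature is `cdf(x, loc=0, scale=1)`, and the rate lambda enters through `scale = 1/lambda`. The sketch below, with an illustrative rate chosen here as an assumption, passes the scale by keyword and compares the result with the closed form `1 - exp(-lambda * x)`. It also reproduces the positional call from the cell above to show what it computes.

```python
import math
from scipy.stats import expon

rate = 1.0   # lambda (illustrative value)
value = 2.0

# P(x <= value) for an exponential with the given rate
p_scipy = expon.cdf(value, scale=1 / rate)
p_formula = 1 - math.exp(-rate * value)

# Positional form: the second argument is loc, so this is a shifted distribution
p_shifted = expon.cdf(2, 1)
p_shifted_formula = 1 - math.exp(-(2 - 1))

print(p_scipy, p_formula, p_shifted, p_shifted_formula)
```

Passing `scale` by keyword avoids confusing it with `loc`.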

### Chi2 distribution

In [13]:
from scipy.stats import chi2

# P(x <= value)
# chi2.cdf(value,degrees_of_freedom)
chi2.cdf(3,5)

0.3000141641213724

### Student's t distribution

In [16]:
from scipy.stats import t

# P(x <= value)
# t.cdf(value, degrees_of_freedom)
t.cdf(0,40)

0.5

# Activity

* Can you guess why the previous cdf is 0.5?

* Increase the degrees of freedom by 3 and compute the new cdf.

* Set the number of degrees of freedom to 40 and re-compute the t.cdf

* Compare the previous value against the norm.cdf(0, 0, 1). 

# PPF

Now the question is reversed: up to which value do I need to add probabilities to reach a given total probability?

In other words, solve the following equation for "value":

$$P(x \le value) = probability$$

For example, given the normal distribution, which value do I need to plug in to obtain a total probability of 0.5?

* P( x <= value ) = 0.5


# Activity

* What is the "value" for the previous example?

**PPFs (percent point functions) are the inverse functions of CDFs!**

## Binomial distribution

In [17]:
# binom.ppf(total_probability, number_of_trials, probability_of_success)
binom.ppf(0.9,10,0.5)

7.0
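
For a discrete distribution the CDF jumps from one value to the next, so it usually never hits the requested probability exactly. In that case `ppf` returns the smallest count whose CDF reaches or exceeds it. The sketch below checks this for the call above (10 trials, success probability 0.5, total probability 0.9).

```python
from scipy.stats import binom

n, p, q = 10, 0.5, 0.9

# Smallest k such that P(x <= k) >= q
k = int(binom.ppf(q, n, p))

# CDF just below and at k: the target probability lies between them
below = binom.cdf(k - 1, n, p)
at = binom.cdf(k, n, p)

print(k, below, at)
```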

## Poisson

In [20]:
# poisson.ppf(total_probability, mean)
poisson.ppf(0.7, 5)

6.0

## Uniform distribution

In [21]:
## uniform.ppf(total_probability, loc, scale): uniform on [loc, loc + scale]
## Here the interval is [2, 2 + 7] = [2, 9], so the result is 2 + 0.6 * 7

uniform.ppf(0.6, 2, 7)

6.2
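
Because `scipy.stats.uniform` takes `loc` and `scale` (the interval is `[loc, loc + scale]`), the result 6.2 corresponds to the interval [2, 9]. The sketch below uses a hypothetical helper, `uniform_ppf`, introduced here only for illustration, to work with a minimum and a maximum instead.

```python
from scipy.stats import uniform

def uniform_ppf(total_probability, min_value, max_value):
    """Value v with P(x <= v) = total_probability on [min_value, max_value]."""
    return uniform.ppf(total_probability, loc=min_value, scale=max_value - min_value)

# Interval [2, 7]: 60% of the way from 2 to 7
v_2_to_7 = uniform_ppf(0.6, 2, 7)

# Interval [2, 9]: matches uniform.ppf(0.6, 2, 7)
v_2_to_9 = uniform_ppf(0.6, 2, 9)

print(v_2_to_7, v_2_to_9)
```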

## Normal distribution

In [22]:
## norm.ppf(total_probability, mean, sd)

norm.ppf(0.8,0,1)

0.8416212335729143

In [None]:
norm.cdf(0.8416212335729143,0,1)

0.8

## Exponential distribution

In [23]:
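## expon.ppf(total_probability, loc, scale), where scale = 1/lambda
## The second positional argument is loc (a shift), not lambda: this returns 2 - ln(1 - 0.7)

expon.ppf(0.7, 2)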

3.203972804325936

In [24]:
expon.cdf(3.203972804325936, 2)

0.7
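
To work with a rate lambda rather than a shift, the scale can be passed by keyword as `scale = 1/lambda`. The sketch below assumes an illustrative rate of 2 and compares `expon.ppf` with the closed form `-ln(1 - probability) / lambda`, then round-trips the result through `expon.cdf`.

```python
import math
from scipy.stats import expon

rate = 2.0          # lambda (illustrative value)
probability = 0.7

# Value v such that P(x <= v) = probability for the given rate
v_scipy = expon.ppf(probability, scale=1 / rate)
v_formula = -math.log(1 - probability) / rate

# Round trip: the cdf at that value recovers the probability
round_trip = expon.cdf(v_scipy, scale=1 / rate)

print(v_scipy, v_formula, round_trip)
```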

## Chi2 distribution

In [25]:
## chi2.ppf(total_probability, degrees_of_freedom)
chi2.ppf(0.8, 10)

13.441957574973113

In [26]:
chi2.cdf(13.441957574973113,10)

0.8

## Student's t distribution

In [27]:
## t.ppf(total_probability, degrees_of_freedom)
t.ppf(0.95, 5)

2.015048372669157

In [28]:
t.cdf(2.015048372669157,5)

0.9499999999576474