### Day 4 - Binomial distribution. Geometric distribution
________________________________________________

  <br/>

- [Background](#Background)
- [Task 1](#Task)
- [Task 2](#task2)
- [Task 3](#task3)
- [Task 4](#task4)

  <br/>

#### Background 

A binomial experiment (a sequence of Bernoulli trials) is a statistical experiment that has the following properties:


- The experiment consists of $ n $ repeated trials.
- The trials are independent.
- The outcome of each trial is either success ($ s $) or failure ($ f $).
- The probability of success, $ P(s) $, is the same for every trial.

We define a binomial process to be a binomial experiment meeting the following conditions:

- The number of successes is $ x $ .
- The total number of trials is $ n $.
- The probability of success of $ 1 $ trial is $ p $.
- The probability of failure of $ 1 $ trial is $ q $, where $ q = 1 - p $.
- $ b(x,n,p)$ is the binomial probability, meaning the probability of having exactly $ x $ successes out of $ n $ trials.

The binomial random variable is the number of successes, $ x $, out of $ n $ trials.


The binomial distribution is the probability distribution for the binomial random variable, given by the following probability mass function:

$$
b(x,n,p) = {n\choose x} \cdot p^x \cdot q^{n-x}
$$

Recall that $ {n\choose x} = \frac{n!}{x!(n-x)!} $
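
As a minimal sketch, the probability mass function above can be written with `math.comb` (available from Python 3.8). The name `binomial_pmf` and the coin-toss example are illustrative assumptions, not part of the tutorial.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure in one trial
    return math.comb(n, x) * p**x * q**(n - x)

# Example: exactly 2 heads in 4 tosses of a fair coin
print(binomial_pmf(2, 4, 0.5))  # 0.375
```

Summing `binomial_pmf(x, n, p)` over every $ x $ from $ 0 $ to $ n $ gives $ 1 $, which is a quick sanity check for any implementation.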

__Cumulative Probability__

We consider the distribution function for some real-valued random variable, $ X $, to be $ F_X(x) = P(X\leq x) $. Because this is a non-decreasing function that accumulates all the probabilities for the values of $ X $ up to (and including) $ x $, we call it the cumulative distribution function (CDF) of $ X $. Because the CDF accumulates probability over a range of values, the probability that $ X $ falls in the half-open interval $ (a,b] $ is the difference of two CDF values:

$$
P(a < X \leq b) = F_X(b) - F_X(a)
$$
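
The identity above can be sketched for a binomial random variable as follows. The helper names `binomial_cdf` and `prob_between` are hypothetical, chosen only for this illustration.

```python
import math

def binomial_cdf(x, n, p):
    """F_X(x) = P(X <= x) for a binomial random variable."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def prob_between(a, b, n, p):
    """P(a < X <= b) = F_X(b) - F_X(a)."""
    return binomial_cdf(b, n, p) - binomial_cdf(a, n, p)

# P(1 < X <= 3) for 4 tosses of a fair coin, i.e. P(X=2) + P(X=3)
print(prob_between(1, 3, 4, 0.5))  # 0.625
```

Note that the lower bound $ a $ is excluded: subtracting $ F_X(a) $ removes the probability of $ X = a $ as well.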

[Full tutorial link on HackerRank](https://www.hackerrank.com/challenges/s10-binomial-distribution-1/tutorial)

A negative binomial experiment is a statistical experiment that has the following properties:

- The experiment consists of $ n $ repeated trials.
- The trials are independent.
- The outcome of each trial is either success ($ s $) or failure ($ f $).
- $ P(s) $ is the same for every trial.
- The experiment continues until $ x $ successes are observed.

The geometric distribution is a special case of the negative binomial distribution, where the number of successes is $ 1 $. We express this with the following formula:

$$
g(n,p) = q^{n-1} \cdot p
$$
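
A minimal sketch of this formula, assuming $ n $ is the trial on which the first success occurs. The name `geometric_pmf` is illustrative.

```python
def geometric_pmf(n, p):
    """Probability that the first success occurs on trial n."""
    q = 1 - p  # probability of failure in one trial
    return q**(n - 1) * p

# Example: first head on the third toss of a fair coin
print(geometric_pmf(3, 0.5))  # 0.125
```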

[Full tutorial link on HackerRank](https://www.hackerrank.com/challenges/s10-geometric-distribution-1/tutorial)


#### Task 1 <a name="Task" />
The ratio of boys to girls for babies born in Russia is $ 1.09:1 $. If there is $ 1 $ child born per birth, what proportion of Russian families with exactly $ 6 $ children will have at least $ 3 $ boys?

Write a program that reads the above parameters from input and computes the answer. Then print the result, rounded to 3 decimal places.

##### Input format
The boy and girl parts of the ratio, space-separated on one line

##### Output format
Proportion of Russian families with exactly 6 children that have at least 3 boys rounded to a scale of 3 decimal places

In [1]:
import math

# Input: the boy and girl parts of the ratio, e.g. "1.09 1"
rb, rg = map(float, input().split())
p = rb / (rg + rb)  # probability that a child is a boy

# n - total number of trials (6 children)
# x - number of successes (boys)
# p - probability of success of one trial
# q - probability of failure of one trial, q = 1 - p
# number of combinations = n! / (x! * (n - x)!)
# b(x,n,p) = [number of combinations] * p^x * q^(n-x)

def nCr(n, r):
    # Integer division keeps the binomial coefficient an exact int
    f = math.factorial
    return f(n) // (f(r) * f(n - r))

def binom_distr(n, x, p):
    return nCr(n, x) * p**x * (1 - p)**(n - x)

# "At least 3 boys" means x = 3, 4, 5 or 6
print(f"{sum(binom_distr(6, x, p) for x in range(3, 7)):0.3f}")

1.09 1
0.696


#### Task 2 <a name="task2" />

A manufacturer of metal pistons finds that, on average, __12%__ of the pistons they manufacture are rejected because they are incorrectly sized. What is the probability that a batch of __10__ pistons will contain:

- No more than 2 rejects? 
- At least 2 rejects?

#####  Input format

The percentage of rejected pistons and the batch size, space-separated on one line

##### Output format

Answers to questions 1 and 2, on separate lines

In [2]:
import math

# Input: rejection rate in percent and batch size, e.g. "12 10"
r, n = map(int, input().split())
p = r / 100  # probability that a single piston is rejected

def nCr(n, r):
    # Integer division keeps the binomial coefficient an exact int
    f = math.factorial
    return f(n) // (f(r) * f(n - r))

def binom_distr(n, x, p):
    return nCr(n, x) * p**x * (1 - p)**(n - x)

# No more than 2 rejects: x = 0, 1, 2
print(f"{sum(binom_distr(n, x, p) for x in range(0, 3)):0.3f}")
# At least 2 rejects: x = 2, ..., n
print(f"{sum(binom_distr(n, x, p) for x in range(2, n + 1)):0.3f}")

12 10
0.891
0.342
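
The second answer can be cross-checked through the complement, since "at least 2 rejects" is everything except "0 or 1 rejects". The sketch below hard-codes the values from the task statement instead of reading them from input, and the name `binomial_pmf` is illustrative.

```python
import math

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.12  # batch size and rejection probability from the task

# P(X >= 2) = 1 - P(X = 0) - P(X = 1)
at_least_two = 1 - sum(binomial_pmf(x, n, p) for x in range(2))
print(f"{at_least_two:0.3f}")  # 0.342
```

The complement sums only two terms instead of nine, which matters more as the batch size grows.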


#### Task <a name="task3" />
The probability that a machine produces a defective product is $ \frac{1}{3}$. What is the probability that the first defect occurs on the fifth item produced?

##### Input Format

The first line contains the space-separated numerator and denominator of the probability of a defect. The second line contains the number of the inspection on which the first defect should occur:

    1 3
    5

If you do not wish to read this information from stdin, you can hard-code it into your program.

##### Output Format

Print a single line denoting the answer, rounded to a scale of $ 3 $ decimal places

In [3]:
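nom, denom = map(int, input().split())
n = int(input())
p = nom / denom  # probability of a defect
# Geometric distribution: n-1 good items followed by the first defect
print(f"{(1 - p)**(n - 1) * p:0.3f}")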

1 3
5
0.066
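
For the sample input, the geometric formula can be worked out by hand, with $ p = \frac{1}{3} $, $ q = \frac{2}{3} $ and $ n = 5 $:

$$
g\left(5, \tfrac{1}{3}\right) = \left(\frac{2}{3}\right)^{4} \cdot \frac{1}{3} = \frac{16}{243} \approx 0.066
$$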


#### Task 4 <a name="task4" />
The probability that a machine produces a defective product is $ \frac{1}{3}$. What is the probability that the first defect is found during the first 5 inspections?

##### Input Format

The first line contains the space-separated numerator and denominator of the probability of a defect. The second line contains the number of inspections within which the first defect should be found:

    1 3
    5

If you do not wish to read this information from stdin, you can hard-code it into your program.

##### Output Format

Print a single line denoting the answer, rounded to a scale of $ 3 $ decimal places

In [5]:
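nom, denom = map(int, input().split())
n = int(input())
p = nom / denom  # probability of a defect
# Sum the geometric probabilities of the first defect on inspection 1..n
res = sum((1 - p)**(i - 1) * p for i in range(1, n + 1))
print(f"{res:0.3f}")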

1 3
5
0.868
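
The sum computed here is a finite geometric series, so it also has a closed form: the first defect falls within the first $ n $ inspections unless all $ n $ items are non-defective.

$$
\sum_{i=1}^{n} q^{i-1} \cdot p = p \cdot \frac{1 - q^{n}}{1 - q} = 1 - q^{n}
$$

For the sample input this gives $ 1 - \left(\frac{2}{3}\right)^{5} = \frac{211}{243} \approx 0.868 $, matching the output above.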
