# <font color=green>Binomial distribution </font>
***

## <font color = red> Problem </font>
***

In a selection process to fill a data scientist vacancy, the test has a total of **10 multiple-choice questions** with **3 possible alternatives** each. **Each question has the same value.** Suppose a candidate decides to take the test without having studied at all and guesses every answer blindly. Assuming that the test **is worth 10 points and the cut score is 5**, obtain the probability of this candidate **getting exactly 5 questions right** and also the probability of this candidate **advancing to the next stage of the selection process**.

## <font color = green> 2.1 Binomial Distribution </font>
***

A **binomial** event is characterized by having only two possible outcomes. Together these outcomes make up the entire sample space, and they are mutually exclusive: the occurrence of one implies the non-occurrence of the other.

In statistical analysis, the most common use of the binomial distribution is in solving problems that involve **success** and **failure** situations.

# $$ P (k) = \binom{n}{k} p^k q^{n-k}$$

Where:

$ p $ = probability of success

$ q = (1 - p) $ = probability of failure

$ n $ = number of trials

$ k $ = number of successes of interest
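
The formula above can be sketched directly in Python. This is a minimal illustration using only the standard library; the function name `binomial_pmf` and the example values are assumptions made for this sketch, not part of the original notebook.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    q = 1 - p  # probability of failure
    return comb(n, k) * p**k * q**(n - k)

# Illustrative values (assumed): 3 trials with success probability 0.5
example = binomial_pmf(2, 3, 0.5)
print(example)  # 3 * 0.25 * 0.5 = 0.375
```

Summing `binomial_pmf(k, n, p)` over every `k` from 0 to `n` should give 1, which is a quick sanity check for any implementation.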

### Binomial Experiment

The experiment consists of $n$ identical trials.

The tests are independent.

Only two results are possible, for example: True or false; Heads or tails; Success or failure.

The probability of success is represented by $p$ and failure by $1-p=q$. These probabilities do not change from trial to trial.
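
A single binomial experiment with these properties can be simulated with the standard library. This is only a sketch: the seed, the variable names and the values of `n_trials` and `p_success` are assumptions chosen for illustration.

```python
import random

rng = random.Random(0)   # fixed seed so the run is reproducible
n_trials = 10            # number of identical, independent trials (assumed)
p_success = 1 / 3        # same success probability in every trial (assumed)

# Each trial has only two outcomes: True (success) or False (failure)
trials = [rng.random() < p_success for _ in range(n_trials)]
successes = sum(trials)

print(trials)
print(successes)
```

Each call to `rng.random()` is independent of the previous ones and `p_success` never changes, which mirrors the conditions listed above.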

### Mean of the binomial distribution

The expected value, or mean, of the binomial distribution is equal to the number of trials multiplied by the probability of success.

# $$\mu = n\times p $$

### Standard deviation of the binomial distribution

The standard deviation is the square root of the product of the number of trials, the probability of success and the probability of failure.

# $$\sigma = \sqrt{n \times p \times q}$$
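
The closed-form expressions for $\mu$ and $\sigma$ can be checked numerically against their definitions. The sketch below uses illustrative values (`n_trials = 10`, `p_success = 1/3`) and variable names that are assumptions of this example.

```python
from math import comb, sqrt

n_trials, p_success = 10, 1 / 3  # illustrative values, assumed for this sketch
q_failure = 1 - p_success

# P(k) for every possible number of successes k = 0, 1, ..., n
pmf = [comb(n_trials, k) * p_success**k * q_failure**(n_trials - k)
       for k in range(n_trials + 1)]

# Mean and standard deviation computed directly from their definitions
mean_from_definition = sum(k * pk for k, pk in enumerate(pmf))
variance_from_definition = sum((k - mean_from_definition) ** 2 * pk
                               for k, pk in enumerate(pmf))
std_from_definition = sqrt(variance_from_definition)

# The closed-form expressions
mean_from_formula = n_trials * p_success
std_from_formula = sqrt(n_trials * p_success * q_failure)

print(mean_from_definition, mean_from_formula)
print(std_from_definition, std_from_formula)
```

Both pairs should agree up to floating-point rounding.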

### Importing libraries
https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.comb.html

In [2]:
from scipy.special import comb

### Combinations

The number of combinations of $n$ objects, taken $k$ at a time, is:

# $$C_{k}^{n} = \binom{n}{k} = \frac{n!}{k!(n - k)!}$$

Where

## $$n! = n\times(n-1)\times(n-2)\times...\times(2)\times(1)$$
## $$k! = k\times(k-1)\times(k-2)\times...\times(2)\times(1)$$

By definition

## $$0! = 1$$
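
As a small check of the definition, the combination count can be computed from factorials and compared with Python's built-in `math.comb`. The helper name `combinations_from_factorials` is hypothetical and exists only for this sketch.

```python
from math import comb, factorial

def combinations_from_factorials(n, k):
    """n! / (k! * (n - k)!), using exact integer arithmetic."""
    return factorial(n) // (factorial(k) * factorial(n - k))

print(combinations_from_factorials(5, 2))  # 120 / (2 * 6) = 10
print(comb(5, 2))                          # same value from the standard library
print(factorial(0))                        # 1, by definition
```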

## <font color = 'blue'> Example: Mega Sena (Lottery) </font>

A Mega Sena lottery ticket has a total of **60 numbers** to choose from, and the minimum bet is **six numbers**. Out of curiosity, you decide to calculate the probability of winning the Mega Sena with just **one game**. For this we need to know how many **combinations of six numbers can be formed from the 60 available numbers**.

### $$ C_ {6} ^ {60} = \binom{60} {6} = \frac {60!}{6! (60 - 6)!}$$

In [3]:
combinations = comb(60, 6)
combinations

50063860.0

In [7]:
probability = 1 / combinations
print('%0.15f' % probability)

0.000000019974489
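
`scipy.special.comb` returns a float by default (it also accepts `exact=True` for an integer result). As an alternative sketch, the same count can be obtained as an exact integer with the standard library, and the probability kept as an exact fraction. The variable names here are assumptions of this example.

```python
from fractions import Fraction
from math import comb

total_games = comb(60, 6)         # exact integer number of six-number games
chance = Fraction(1, total_games)  # exact probability of winning with one game

print(total_games)    # 50063860
print(chance)         # 1/50063860
print(float(chance))  # the same probability as a float
```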


## <font color = 'blue'> Example: Data scientist contest </font>

In a selection process to fill a data scientist vacancy, the test has a total of **10 multiple-choice questions** with **3 possible alternatives** each. **Each question has the same value.** Suppose a candidate decides to take the test without having studied at all and guesses every answer blindly. Assuming that the test **is worth 10 points and the cut score is 5**, obtain the probability of this candidate **getting exactly 5 questions right** and also the probability of this candidate **advancing to the next stage of the selection process**.

### What is the number of trials ($ n $)?

In [10]:
n = 10
n

10

### Are the trials independent?

Yes. The option chosen in one question has no influence on the option chosen in another question.

### Are only two results possible in each trial?

Yes. For each question the candidate has only two possibilities: getting it RIGHT or getting it WRONG.

### What is the probability of success ($ p $)?

In [11]:
number_of_alternatives_per_question = 3
p = 1 / number_of_alternatives_per_question
p

0.3333333333333333

### What is the probability of failure ($q$)?

In [13]:
q = 1 - p
q

0.6666666666666667

### What is the number of successes of interest ($ k $)?

In [14]:
k  = 5
k

5

### Solution 1

In [15]:
probability = (comb(n, k) * (p ** k) * (q ** (n-k)))
print('%0.8f' % probability)

0.13656455


### Importing libraries
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.binom.html

In [16]:
from scipy.stats import binom

### Solution 2

In [17]:
probability = binom.pmf(k, n, p)
print('%0.8f' % probability)

0.13656455


### Obtain the candidate's probability of passing

### $$P(hit \geq 5) = P(5) + P(6) + P(7) + P(8) + P(9) + P(10)$$

In [22]:
binom.pmf([5,6,7,8,9,10], n, p).sum() # Probability mass function


0.21312808006909525

In [23]:
1 - binom.cdf(4, n, p) # Cumulative distribution function

0.21312808006909512

In [24]:
binom.sf(4, n, p) # 1 - binom.cdf(), survival function

0.21312808006909517
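
The three results above differ only in the last digits because of floating-point rounding. As a cross-check that does not depend on SciPy, the same probability can be sketched with the standard library alone; the names `n_questions`, `p_guess` and `pmf` are assumptions of this example.

```python
from math import comb

n_questions = 10   # questions in the test
p_guess = 1 / 3    # chance of guessing one question correctly

def pmf(k):
    """Probability of exactly k correct answers when guessing."""
    return comb(n_questions, k) * p_guess**k * (1 - p_guess)**(n_questions - k)

p_pass = sum(pmf(k) for k in range(5, 11))  # 5 or more correct answers
p_fail = sum(pmf(k) for k in range(0, 5))   # 4 or fewer correct answers

print(p_pass)           # approximately 0.2131
print(p_pass + p_fail)  # the two cases together cover every outcome
```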

### A perfectly balanced coin is tossed four times. Using the binomial distribution, obtain the probability that the coin lands tails up exactly twice.

In [37]:
p = 1 / 2 # Probability of tails
n = 4 # Number of tosses
k = 2 # Number of successes

binom.pmf(k, n, p)

0.3750000000000001

### A perfectly balanced die is thrown ten times. Using the binomial distribution, obtain the probability that the die lands with the number five facing up at least three times.

In [60]:
p = 1 / 6 # Probability of rolling a five
n = 10 # Number of throws
k = 3 # Minimum number of successes

(binom.sf(k - 1, n, p) * 100).round(2) # P(X >= 3), as a percentage

22.48
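
The "at least three" probability can also be sketched with the complement rule, subtracting the probabilities of zero, one and two fives from 1. This version uses only the standard library, and its variable names are assumptions of this example.

```python
from math import comb

p_five = 1 / 6   # probability of rolling a five
throws = 10      # number of throws

# P(X <= 2) = P(0) + P(1) + P(2)
p_at_most_two = sum(
    comb(throws, k) * p_five**k * (1 - p_five)**(throws - k)
    for k in range(3)
)
p_at_least_three = 1 - p_at_most_two

print(round(p_at_least_three * 100, 2))  # 22.48
```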

## <font color = 'blue'> Example: Gymkhana </font>

A city holds a gymkhana every year to raise funds for the city's hospital. In the last competition, it is known that the **proportion of female participants was 60%**. **The total number of teams registered in this year's gymkhana, with 12 members each, is 30**. With the information above, answer: how many teams are expected to have exactly **8 women**?

Solution

In [68]:
p = 0.6
n = 12
k = 8

probability = binom.pmf(k, n, p)

In [69]:
teams = 30 * probability
teams

6.385228185599988
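
The same reasoning extends to every possible number of women per team: multiplying each probability by the 30 teams gives the expected number of teams of each composition. The sketch below assumes, as the solution above does, that each member is independently a woman with probability 0.6; the variable names are hypothetical.

```python
from math import comb

p_woman = 0.6      # proportion of female participants (from the problem)
team_size = 12     # members per team
total_teams = 30   # teams registered

# Expected number of teams with exactly k women, for k = 0, 1, ..., 12
expected_teams = [
    total_teams * comb(team_size, k) * p_woman**k * (1 - p_woman)**(team_size - k)
    for k in range(team_size + 1)
]

for k, teams in enumerate(expected_teams):
    print(f"{k:2d} women: {teams:6.3f} teams")
```

The expected counts add up to the 30 registered teams, and the entry for 8 women matches the value computed above.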

Suppose the probability that a couple's child has blue eyes is 22%. Among 50 families with 3 children each, how many can we expect to have exactly two children with blue eyes?

In [74]:
p = 0.22
n = 3
k = 2

probability = binom.pmf(k, n, p)

m = 50 * probability
m

5.662799999999999
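
An expected count like this one can be sanity-checked with a small Monte Carlo simulation. The sketch below repeats the "50 families with 3 children" scenario many times and averages the number of families with exactly two blue-eyed children. The seed, the number of repetitions and all names are assumptions of this example.

```python
import random

rng = random.Random(42)  # fixed seed so the run is reproducible
p_blue = 0.22            # probability that a child has blue eyes
children = 3             # children per family
families = 50            # families per simulated sample
target = 2               # blue-eyed children we are counting
repetitions = 2_000      # number of simulated samples (assumed)

def families_with_target(rng):
    """Count families with exactly `target` blue-eyed children in one sample."""
    count = 0
    for _ in range(families):
        blue_eyed = sum(1 for _ in range(children) if rng.random() < p_blue)
        if blue_eyed == target:
            count += 1
    return count

simulated_average = sum(families_with_target(rng) for _ in range(repetitions)) / repetitions
print(simulated_average)  # should land near the analytical value of about 5.66
```

The simulated average will not match the analytical result exactly, but it should get closer as `repetitions` grows.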