# Function to compute probability
Let us start by defining a function that computes the probability of an event, given a sample space of equally probable outcomes.

In [12]:
# Compute the probability of an event, given a sample space
# of equiprobable outcomes
from fractions import Fraction
def P(event,space):
    return Fraction(len(event & space),len(space))

This function simply computes the ratio of the number of favorable cases to the number of all possible cases. The intersection in the numerator discards any items of the set `event` that do not belong to the sample space.
The result is represented as a `Fraction`, that is, as an exact rational number.

# Warm-up Problem: Die Roll

What's the probability of rolling an even number with a single fair six-sided die?

We can define the sample space `Omega` and the event `even`, and compute the probability:

In [2]:
Omega = {1,2,3,4,5,6}
even = {2, 4, 6}
P(even,Omega)

Fraction(1, 2)

Let us demonstrate why the intersection in the numerator of the definition is useful. We redefine `even` so that it also contains numbers that lie outside the sample space:

In [3]:
even = {2, 4, 6, 8, 10, 12}
P(even,Omega)

Fraction(1, 2)
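
The result is unchanged because 8, 10 and 12 are not in `Omega`. As a minimal sketch of what the intersection protects against, the block below compares `P` with a hypothetical variant, here called `P_naive` (a name introduced only for this illustration), that omits the intersection:

```python
from fractions import Fraction

def P(event, space):
    """Probability of event, counting only outcomes that lie in space."""
    return Fraction(len(event & space), len(space))

def P_naive(event, space):
    """Hypothetical variant without the intersection, for comparison only."""
    return Fraction(len(event), len(space))

Omega = {1, 2, 3, 4, 5, 6}
even = {2, 4, 6, 8, 10, 12}

print(P(even, Omega))        # 1/2
print(P_naive(even, Omega))  # 1, because 8, 10 and 12 are wrongly counted
```

Without the intersection, the three impossible outcomes inflate the numerator and the "probability" becomes 1.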

# Urn Problems
Around 1700, the mathematician Jacob Bernoulli wrote about removing colored balls from an urn in a landmark treatise. Ever since, explanations of basic concepts in probability have relied on urn problems. Here is an example, which we will solve using Python.


## Problem 
> An urn contains 23 balls, 8 white, 6 blue, and 9 red. We select 6 balls at random (with each possible selection equally likely). What is the probability of each of these possible events? 
> 1. All balls are red
> 2. 3 are blue, 2 are white, and 1 is red
> 3. Exactly 4 of the balls are white

## Solution

We can solve this problem using the P function and some counting, which is a bit trickier than before because

- we have multiple balls of the same color
- an outcome is a set of balls, where order does not matter, rather than a sequence, where order matters

To count in the right way, we will label the white balls as 'W1', 'W2', ..., 'W8'. Similarly, we will label the blue balls as 'B1', ..., 'B6', and the red balls as 'R1', ..., 'R9'.

Let us define the contents of the urn. We start by defining an auxiliary function that builds the set of all strings obtained by concatenating one item from collection A with one item from collection B:

In [1]:
def cross(A,B):
    return {a+b for a in A for b in B}
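
As a quick illustration of what `cross` produces, here is a small sketch on inputs that are shorter than the ones used for the urn. The variable name `small` is introduced only for this example:

```python
def cross(A, B):
    """Set of all strings a+b with a taken from A and b taken from B."""
    return {a + b for a in A for b in B}

small = cross('W', '123')
print(sorted(small))           # ['W1', 'W2', 'W3']

# Each of the 2 letters is paired with each of the 2 digits
print(len(cross('WB', '12')))  # 4
```

Because a string is iterated character by character, passing `'W'` and `'123'` is enough to generate the three labels.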

In [2]:
urn = cross('W','12345678') | cross('B','123456') | cross('R','123456789')
urn

{'B1',
 'B2',
 'B3',
 'B4',
 'B5',
 'B6',
 'R1',
 'R2',
 'R3',
 'R4',
 'R5',
 'R6',
 'R7',
 'R8',
 'R9',
 'W1',
 'W2',
 'W3',
 'W4',
 'W5',
 'W6',
 'W7',
 'W8'}

In [3]:
len(urn)

23

We now define the sample space, which in our case is the set of all 6-ball combinations. Let us call this sample space U6. To generate U6 we use the function `itertools.combinations`, which yields every selection of a given length as a tuple, with no element repeated and no selection listed twice in a different order. We then join each combination into a string:

In [10]:
import itertools

# Compute all combinations of n items; each combination is returned
# as a single string of space-separated labels
def combos(items, n):
    return {' '.join(combo)
            for combo in itertools.combinations(items, n)}

U6 = combos(urn,6)

# The commented code below inspects the raw output of itertools.combinations:
# it yields tuples of labels, which is why we convert each one to a string

# altlist = list(itertools.combinations(urn, 2))
# display(altlist)
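
To make the tuple-to-string conversion concrete, here is a minimal sketch on a three-ball urn. The names `items`, `pairs` and `joined` are introduced only for this example, and a list is used instead of a set so that the output order is predictable:

```python
import itertools

items = ['B1', 'R1', 'W1']

# Raw output: one tuple per 2-ball selection
pairs = list(itertools.combinations(items, 2))
print(pairs)           # [('B1', 'R1'), ('B1', 'W1'), ('R1', 'W1')]

# Same selections, each joined into a space-separated string
joined = {' '.join(p) for p in pairs}
print(sorted(joined))  # ['B1 R1', 'B1 W1', 'R1 W1']
```

Storing each outcome as a string lets us later count colors with `str.count`, since every label contains exactly one color letter.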


The sample space is quite large, so let us just peek at a few random samples:

In [6]:
import random 
# random.choices indexes into its argument, so convert the set to a list first
random.choices([*U6],k=10)

['R3 W8 B5 W3 R7 B2',
 'R3 W2 R4 W8 W6 W5',
 'R3 B3 W3 R7 B2 W5',
 'R3 R9 R4 W8 B1 W5',
 'W1 R9 R4 B3 B6 R5',
 'W1 R4 W8 W6 W4 W5',
 'W1 R2 W2 B3 B5 B4',
 'R2 W2 W8 R8 B4 R1',
 'W8 B6 W7 W6 R7 W5',
 'W1 R3 B6 B5 W3 W6']

Is the size of the sample space correct? In how many ways can we select 6 out of 23 items? 


In [7]:
from math import comb  # comb(n, k) is the binomial coefficient
comb(23,6)

100947
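
As a cross-check, the same number follows from the factorial formula for the binomial coefficient, n! / (k! (n-k)!). The sketch below uses the illustrative names `n`, `k` and `by_formula`:

```python
from math import comb, factorial

n, k = 23, 6

# Binomial coefficient from its factorial definition
by_formula = factorial(n) // (factorial(k) * factorial(n - k))

print(by_formula)                # 100947
print(by_formula == comb(n, k))  # True
```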

We are now ready to solve the three problems.

## Urn problem 1: what is the probability of selecting 6 red balls?


In [33]:
red6 = {s for s in U6 if s.count('R')==6}
P(red6,U6)

Fraction(4, 4807)

Let us check that this solution is correct. How many ways of getting 6 red balls are there?

In [34]:
len(red6)

84

Why are there exactly this many ways? Because we have 9 red balls, and we are asking in how many ways we can draw 6 of them:

In [35]:
comb(9,6)

84

So the probability of getting 6 red balls is just (9 choose 6) divided by the size of the sample space:

In [36]:
P(red6,U6)==Fraction(comb(9,6), len(U6))

True

## Urn problem 2: what is the probability of 3 blue, 2 white, and 1 red?

In [13]:
b3w2r1 = {s for s in U6 if s.count('B') == 3 and s.count('W') == 2 and s.count('R') == 1}
P(b3w2r1,U6)

Fraction(240, 4807)

We can get the same answer by counting in how many ways we can choose 3 out of 6 blues, 2 out of 8 whites, and 1 out of 9 reds, and then dividing by the size of the sample space:

In [38]:
P(b3w2r1,U6) == Fraction( comb(6,3) * comb(8,2) * comb(9,1), len(U6))

True
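
The counting argument above can be wrapped in a small helper that avoids enumerating the sample space. This is only a sketch: the names `URN` and `prob_by_color` are hypothetical, and the helper assumes that the number of balls of every color in the draw is specified (colors left out are taken to be 0).

```python
from fractions import Fraction
from math import comb

# Number of balls of each color in the urn (from the problem statement)
URN = {'W': 8, 'B': 6, 'R': 9}

def prob_by_color(wanted, urn=URN):
    """Probability of drawing exactly wanted[color] balls of each color.

    Colors missing from wanted are assumed to contribute 0 balls.
    """
    draws = sum(wanted.values())
    total = sum(urn.values())
    favorable = 1
    for color, available in urn.items():
        # Ways to choose the requested number of balls of this color
        favorable *= comb(available, wanted.get(color, 0))
    return Fraction(favorable, comb(total, draws))

print(prob_by_color({'B': 3, 'W': 2, 'R': 1}))  # 240/4807
print(prob_by_color({'R': 6}))                  # 4/4807
```

Both values agree with the results obtained by filtering `U6`.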

## Urn problem 3: what is the probability of exactly 4 white balls?

In [39]:
w4 = {s for s in U6 if s.count('W') == 4}
P(w4,U6)

Fraction(350, 4807)

We can get the same answer by counting in how many ways we can select 4 out of 8 white balls and 2 out of the 15 nonwhite balls.

In [40]:
P(w4,U6) == Fraction(comb(8,4) * comb(15,2), len(U6))

True
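
As a final sanity check, the same counting formula gives the probability of drawing exactly k white balls for every k from 0 to 6, and these probabilities should sum to 1. The sketch below uses the illustrative names `total` and `white_dist` and works directly with the counts, without building U6:

```python
from fractions import Fraction
from math import comb

total = comb(23, 6)  # size of the sample space

# Probability of exactly k white balls: choose k of the 8 whites
# and the remaining 6 - k from the 15 non-white balls
white_dist = {k: Fraction(comb(8, k) * comb(15, 6 - k), total)
              for k in range(7)}

print(white_dist[4])             # 350/4807
print(sum(white_dist.values()))  # 1
```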