## The Urn Model

The urn model was developed by Jacob Bernoulli to model the process of selecting items from a population.

To set up an urn model, we first need to decide on: 
    
- The number of marbles in the urn
- The color (or label) on each marble
- The number of marbles to draw from the urn
- The drawing / sampling process (with replacement or without replacement)

We can simulate the draw of two marbles from the urn without replacement.

In [8]:
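import numpy as np

# Urn holding three "b" marbles and two "w" marbles
urn = ["b", "b", "b", "w", "w"]

# Draw two marbles without replacement, twice
print("Sample 1: ", np.random.choice(urn, size=2, replace=False))
print("Sample 2: ", np.random.choice(urn, size=2, replace=False))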

Sample 1:  ['w' 'w']
Sample 2:  ['b' 'b']


#### Questions: 

- What is the chance that our sample contains marbles of only one color?
- Does the chance change if we return each marble after selecting it?
- What if we change the number of marbles in the urn?
- What if we draw more marbles from the urn?
- What if we repeat the process many times?
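
One way to explore these questions is to wrap the simulation in a small function whose arguments control the urn contents, the number of draws, and the replacement rule. The sketch below is illustrative only: the function name `estimate_single_color`, its parameters, and the fixed seed are assumptions, not part of the original notebook.

```python
import numpy as np


def estimate_single_color(urn, size=2, replace=False, n_trials=10_000, seed=0):
    """Estimate the chance that a sample contains marbles of only one color.

    Hypothetical helper: the name, defaults, and seed are illustrative choices.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        sample = rng.choice(urn, size=size, replace=replace)
        # A sample is single-colored when it holds exactly one distinct label
        if len(set(sample)) == 1:
            hits += 1
    return hits / n_trials


urn = ["b", "b", "b", "w", "w"]
without_replacement = estimate_single_color(urn, size=2, replace=False)
with_replacement = estimate_single_color(urn, size=2, replace=True)
print(f"Without replacement: {without_replacement:.3f}")
print(f"With replacement:    {with_replacement:.3f}")
```

Changing `urn`, `size`, or `n_trials` covers the remaining questions. For this urn the exact values are 0.4 without replacement and 0.52 with replacement, so the estimates should land close to those numbers.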

This kind of simulation can be easily applied to real-world problems.

For example, we can use simulation to estimate the fraction of samples in which both marbles we draw have the same color.

In [12]:
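n = 10000  # number of simulated samples
# Each sample is two marbles drawn without replacement
samples = [np.random.choice(urn, size=2, replace=False) for _ in range(n)]
# True when both marbles in a sample have the same color
is_matching = [marble1 == marble2 for marble1, marble2 in samples]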

print(f"Proportion of samples with matching marbles : {np.mean(is_matching)}")

Proportion of samples with matching marbles : 0.4004
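
Because the urn is small, the estimate can be checked against an exact count. The sketch below labels the marbles by position so that same-colored marbles remain distinct, lists every unordered pair with `itertools.combinations`, and counts the pairs whose colors match. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

urn = ["b", "b", "b", "w", "w"]

# Every unordered pair of distinct marble positions is equally likely
pairs = list(combinations(range(len(urn)), 2))
matching = [(i, j) for i, j in pairs if urn[i] == urn[j]]

exact = Fraction(len(matching), len(pairs))
print(f"{len(matching)} matching pairs out of {len(pairs)}: {exact} = {float(exact)}")
```

This gives 4 matching pairs out of 10, that is 0.4, which agrees with the simulated proportion.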


The urn model in which we draw without replacement is known as simple random sampling: every possible sample of the given size is equally likely to be drawn.

In [15]:
from itertools import combinations, permutations

all_samples = ["".join(sample) for sample in combinations("ABCDEFG",3)]
print(all_samples)

print("Number of samples:", len(all_samples))

['ABC', 'ABD', 'ABE', 'ABF', 'ABG', 'ACD', 'ACE', 'ACF', 'ACG', 'ADE', 'ADF', 'ADG', 'AEF', 'AEG', 'AFG', 'BCD', 'BCE', 'BCF', 'BCG', 'BDE', 'BDF', 'BDG', 'BEF', 'BEG', 'BFG', 'CDE', 'CDF', 'CDG', 'CEF', 'CEG', 'CFG', 'DEF', 'DEG', 'DFG', 'EFG']
Number of samples: 35
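
The count can be cross-checked with the binomial coefficient. The sketch below, with illustrative variable names, uses `math.comb` (Python 3.8+) and also counts how many of the samples contain one particular individual, here `"A"`.

```python
from itertools import combinations
from math import comb

population = "ABCDEFG"
sample_size = 3

all_samples = ["".join(s) for s in combinations(population, sample_size)]

# Number of ways to choose 3 of 7 individuals, ignoring order
assert len(all_samples) == comb(len(population), sample_size)

# Under simple random sampling each sample is equally likely, so the chance
# that "A" is selected is the share of samples that contain it
with_a = [s for s in all_samples if "A" in s]
print(len(with_a), "of", len(all_samples), "samples contain A")
```

Here 15 of the 35 samples contain `A`, so each individual has a 3/7 chance of being selected.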


In [16]:
print(["".join(sample) for sample in permutations("ABC")])

['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
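
Each unordered sample corresponds to several ordered draws, one for every order in which its members can come out of the urn. As a minimal sketch (variable names are illustrative), the ordered and unordered counts for drawing 3 of 7 individuals can be compared directly:

```python
from itertools import combinations, permutations
from math import factorial

population = "ABCDEFG"

unordered = list(combinations(population, 3))
ordered = list(permutations(population, 3))

# Each unordered sample of 3 can be drawn in 3! = 6 different orders
assert len(ordered) == len(unordered) * factorial(3)
print(len(unordered), "unordered samples,", len(ordered), "ordered draws")
```

Since every unordered sample accounts for the same number of ordered draws, treating the draw as ordered or unordered gives the same sample probabilities.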
