<a href="https://colab.research.google.com/github/tombackert/ml-stuff/blob/main/intro_probability.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Intro Probability Theory

In [32]:
from fractions import Fraction

def P(event, space):
    "The probability of an event, given a sample space."
    return Fraction(cases(favorable(event, space)),
                    cases(space))

favorable = set.intersection # Outcomes that are in the event and in the sample space
cases     = len              # The number of cases is the length, or size, of a set

## Die Roll

In [33]:
D = {1, 2, 3, 4, 5, 6} # a sample space
even = {2, 4, 6}       # an event

P(even, D)

Fraction(1, 2)

In [34]:
prime = {2, 3, 5, 7, 11, 13}
odd   = {1, 3, 5, 7, 9, 11, 13}

In [35]:
P(odd, D)

Fraction(1, 2)

In [37]:
P((even | prime), D) # The probability of an even or prime die roll

Fraction(5, 6)

In [38]:
P((odd & prime), D) # The probability of an odd prime die roll

Fraction(1, 3)
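
The union and intersection results for the die are tied together by the inclusion-exclusion rule, P(A or B) = P(A) + P(B) - P(A and B). The cell below is a minimal, self-contained check of that rule. It restates P in a compact form that is assumed to be equivalent to the definition at the top of the notebook, and the names lhs and rhs are introduced here purely for illustration.

```python
from fractions import Fraction

def P(event, space):
    "The probability of an event, given a sample space of equiprobable outcomes."
    return Fraction(len(event & space), len(space))

D     = {1, 2, 3, 4, 5, 6}      # sample space: one die roll
even  = {2, 4, 6}
prime = {2, 3, 5, 7, 11, 13}    # outcomes outside D are dropped by the intersection in P

# Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B)
lhs = P(even | prime, D)
rhs = P(even, D) + P(prime, D) - P(even & prime, D)
print(lhs, rhs)
```

Both sides come out as 5/6: the single outcome 2 is both even and prime, so it would be counted twice without the subtraction.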

## Card Problems

In [39]:
suits = u'♥♠♦♣'
ranks = u'AKQJT98765432'
deck  = [r + s for r in ranks for s in suits]
len(deck)

52

In [40]:
import itertools

def combos(items, n):
    "All combinations of n items; each combo as a space-separated str."
    return set(map(' '.join, itertools.combinations(items, n)))

Hands = combos(deck, 5)
len(Hands)

2598960

In [41]:
import random
random.sample(list(Hands), 7)  # pass a list: sampling from a set was deprecated in Python 3.9 and fails in 3.11+

['K♠ K♦ T♥ 7♦ 7♣',
 'K♣ 8♠ 8♣ 7♦ 4♣',
 'Q♥ Q♦ 8♣ 6♥ 5♣',
 'K♠ T♣ 9♥ 4♦ 4♣',
 '8♠ 7♥ 7♦ 3♦ 2♥',
 'T♥ T♠ 7♠ 6♦ 4♦',
 'A♣ Q♠ J♦ T♥ 2♠']

In [42]:
random.sample(deck, 7)

['T♥', '6♣', 'K♥', 'K♦', '5♥', '4♣', 'T♦']

In [43]:
flush = {hand for hand in Hands if any(hand.count(suit) == 5 for suit in suits)}

P(flush, Hands)

Fraction(33, 16660)

In [44]:
33/16660

0.0019807923169267707

In [46]:
four_kind = {hand for hand in Hands if any(hand.count(rank) == 4 for rank in ranks)}

P(four_kind, Hands)

Fraction(1, 4165)

In [47]:
1/4165

0.00024009603841536616
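
Both card results can be cross-checked by counting instead of enumerating all 2,598,960 hands. The sketch below assumes Python 3.8+ for math.comb; the variable names are illustrative and not part of the notebook. Note that flush here means any five cards of one suit, straight flushes included, which matches the set comprehension used above.

```python
from fractions import Fraction
from math import comb  # available since Python 3.8

n_hands     = comb(52, 5)        # all 5-card hands
n_flush     = 4 * comb(13, 5)    # pick a suit, then 5 of its 13 ranks
n_four_kind = 13 * 48            # pick the rank, then any one of the 48 other cards

p_flush     = Fraction(n_flush, n_hands)
p_four_kind = Fraction(n_four_kind, n_hands)
print(p_flush, p_four_kind)
```

The fractions reduce to 33/16660 and 1/4165, the same values obtained by enumeration.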

## Urn Problems


In [48]:
def balls(color, n):
    "A set of n numbered balls of the given color."
    return {color + str(i)
            for i in range(1, n + 1)}

urn = balls('B', 6) | balls('R', 9) | balls('W', 8)
urn

{'B1',
 'B2',
 'B3',
 'B4',
 'B5',
 'B6',
 'R1',
 'R2',
 'R3',
 'R4',
 'R5',
 'R6',
 'R7',
 'R8',
 'R9',
 'W1',
 'W2',
 'W3',
 'W4',
 'W5',
 'W6',
 'W7',
 'W8'}

In [49]:
len(urn)

23

In [50]:
U6 = combos(urn, 6)

random.sample(list(U6), 5)  # pass a list: sampling from a set was deprecated in Python 3.9 and fails in 3.11+

['R1 W4 R5 B3 B1 B4',
 'R1 R3 R7 R9 W2 R8',
 'W8 W1 R2 B1 B4 B6',
 'R6 W8 W4 R5 R9 B3',
 'R3 W5 R5 R2 B6 R8']

In [51]:
def select(color, n, space=U6):
    "The subset of the sample space with exactly `n` balls of given `color`."
    return {s for s in space if s.count(color) == n}

In [53]:
P(select('R', 6), U6)

Fraction(4, 4807)

In [57]:
4/4807

0.0008321198252548367

In [58]:
P(select('B', 3) & select('R', 1) & select('W', 2), U6)

Fraction(240, 4807)

In [59]:
P(select('W', 4), U6)

Fraction(350, 4807)

## Urn Problems via arithmetic

In general, the number of ways of choosing c out of n items is (n choose c) = n! / ((n - c)! * c!).

In [60]:
from math import factorial

def choose(n, c):
    "Number of ways to choose c items from a list of n items."
    return factorial(n) // (factorial(n - c) * factorial(c))

In [61]:
choose(9, 6)

84
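
As a sanity check on the formula, the factorial-based count can be compared with the standard library's math.comb (Python 3.8+) and with a brute-force enumeration. This is only a sketch: choose is restated so the cell runs on its own, and enumerated is a name introduced here for illustration.

```python
from itertools import combinations
from math import comb, factorial

def choose(n, c):
    "Number of ways to choose c items from n items: n! / ((n - c)! * c!)."
    return factorial(n) // (factorial(n - c) * factorial(c))

# Brute force: count every 6-element subset of 9 distinct items
enumerated = sum(1 for _ in combinations(range(9), 6))

print(choose(9, 6), comb(9, 6), enumerated)
```

All three counts agree at 84.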

P computes a ratio, whereas choose computes a count. Multiplying the left-hand side by N, the size of the sample space, turns it into a count as well, so the two sides can be compared directly.

In [62]:
N = len(U6)

N * P(select('R', 6), U6) == choose(9, 6)

True

In [63]:
N * P(select('B', 3) & select('W', 2) & select('R', 1), U6) == choose(6, 3) * choose(8, 2) * choose(9, 1)

True

In [64]:
N * P(select('W', 4), U6) == choose(8, 4) * choose(6 + 9, 2)
# (6 + 9 non-white balls)

True
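
The three checks above follow one pattern: multiply a choose term for each constrained color by a choose term for the unconstrained remainder, then divide by the number of 6-ball draws. The function below is a hedged sketch of that pattern. The names URN and p_draw are hypothetical (they do not appear in the notebook), and math.comb (Python 3.8+) stands in for choose.

```python
from fractions import Fraction
from math import comb

URN = {'B': 6, 'R': 9, 'W': 8}   # ball counts taken from the urn defined earlier

def p_draw(wanted, urn=URN, draws=6):
    """Probability that `draws` balls taken without replacement contain exactly
    wanted[color] balls of each listed color; unlisted colors are unconstrained."""
    rest_balls = sum(n for color, n in urn.items() if color not in wanted)
    rest_draws = draws - sum(wanted.values())
    if rest_draws < 0:               # asked for more balls than are drawn
        return Fraction(0)
    ways = comb(rest_balls, rest_draws)   # fill the remaining slots from the other colors
    for color, k in wanted.items():
        ways *= comb(urn[color], k)       # choose k of this color
    return Fraction(ways, comb(sum(urn.values()), draws))

print(p_draw({'R': 6}))
print(p_draw({'B': 3, 'R': 1, 'W': 2}))
print(p_draw({'W': 4}))
```

The three printed values match the enumerated results 4/4807, 240/4807 and 350/4807, and summing p_draw({'W': k}) over k = 0..6 gives exactly 1.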

## Non-Equiprobable Outcomes