# Section 1: Computing Probabilities using Python

This notebook demonstrates fundamental concepts in probability theory using Python. We'll explore:

1. Sample spaces and events
2. Computing probabilities for simple and complex events
3. Working with weighted probabilities
4. Solving real-world probability problems

We'll use Python's built-in data structures together with the standard-library `itertools` and `collections` modules.

## Sample Space and Events

A `sample space` of an experiment is the set of all possible outcomes the experiment could produce. Consider the sample space of a coin flip.

In [1]:
sample_space = {'Heads','Tails'}

In [2]:
probability_heads = 1/len(sample_space)
print(f'Probability of landing heads is {probability_heads}')

Probability of landing heads is 0.5


`Events` are subsets of the sample space whose outcomes satisfy some given condition.
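
Because an event is just a subset of the sample space, Python's set operations map directly onto combinations of events. The following is a minimal sketch of that idea; the names `coin_space`, `heads_event`, and `tails_event` are illustrative and are not used elsewhere in this notebook.

```python
# Illustrative names; not defined elsewhere in this notebook.
coin_space = {'Heads', 'Tails'}

heads_event = {outcome for outcome in coin_space if outcome == 'Heads'}
tails_event = coin_space - heads_event      # complement of the heads event

# Every event is a subset of the sample space.
assert heads_event <= coin_space and tails_event <= coin_space

either_event = heads_event | tails_event    # union: heads or tails
both_event = heads_event & tails_event      # intersection: the impossible event

print(either_event == coin_space)  # True
print(both_event)                  # set()
```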

In [3]:
def is_heads_or_tails(outcome): return outcome in sample_space
def is_neither(outcome): return not is_heads_or_tails(outcome)

In [4]:
def is_heads(outcome): return outcome == 'Heads'
def is_tails(outcome): return outcome == 'Tails'

In [5]:
def get_matching_events(event_condition, sample_space):
    return {outcome for outcome in sample_space if event_condition(outcome)}

In [6]:
print(get_matching_events(is_heads, sample_space))

{'Heads'}


In [7]:
print(get_matching_events(is_neither, sample_space))

set()


In [8]:
def compute_probabilities(event_condition, sample_space):
    events = get_matching_events(event_condition, sample_space)
    return len(events)/len(sample_space)

In [18]:
event_conditions = [is_heads_or_tails, is_neither, is_heads, is_tails]
for event_condition in event_conditions:
    condition_name = event_condition.__name__
    prob = compute_probabilities(event_condition, sample_space)
    print(f"The probability of event arising from '{condition_name}' is {prob}!")

The probability of event arising from 'is_heads_or_tails' is 1.0!
The probability of event arising from 'is_neither' is 0.0!
The probability of event arising from 'is_heads' is 0.5!
The probability of event arising from 'is_tails' is 0.5!


In [19]:
# Biased coin
weighted_sample_space = {'Heads': 4, 'Tails': 1}

sample_space_size = sum(weighted_sample_space.values())

## Working with Biased Probabilities

Sometimes events don't have equal probabilities. For example, a biased coin might be more likely to land on heads than tails. Let's see how to handle such cases using weighted sample spaces.
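
As a rough sketch of the idea, the probability of an event in a weighted sample space is the summed weight of its outcomes divided by the total weight. The helper `weighted_probability` and the dictionary `biased_coin` below are hypothetical names chosen for illustration, and `Fraction` is used only to keep the result exact.

```python
from fractions import Fraction

# Hypothetical weights: heads is four times as likely as tails.
biased_coin = {'Heads': 4, 'Tails': 1}

def weighted_probability(outcomes, weights):
    """Return the summed weight of `outcomes` divided by the total weight."""
    total_weight = sum(weights.values())
    event_weight = sum(weights[outcome] for outcome in outcomes)
    return Fraction(event_weight, total_weight)

p_heads = weighted_probability({'Heads'}, biased_coin)
p_tails = weighted_probability({'Tails'}, biased_coin)
print(p_heads, p_tails)   # 4/5 1/5
```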

In [20]:
event = get_matching_events(is_heads_or_tails, weighted_sample_space)
event_size = sum(weighted_sample_space[outcome] for outcome in event)
assert event_size == 5

In [21]:
def compute_event_probability(event_condition, generic_sample_space):
    event = get_matching_events(event_condition, generic_sample_space)
    event_size = sum(generic_sample_space[outcome] for outcome in event)
    tot_sample_size = sum(generic_sample_space.values())
    return event_size/tot_sample_size

In [23]:
for event_condition in event_conditions:
    condition_name = event_condition.__name__
    prob = compute_event_probability(event_condition, weighted_sample_space)
    print(f"The probability of event arising from '{condition_name}' is {prob}!")

The probability of event arising from 'is_heads_or_tails' is 1.0!
The probability of event arising from 'is_neither' is 0.0!
The probability of event arising from 'is_heads' is 0.8!
The probability of event arising from 'is_tails' is 0.2!


In [27]:
## Problem 1:
## Analyzing a family of four children ##

# constructing the (unweighted) sample space
possible_children = {'Boy', 'Girl'}
sample_space = set()
for child1 in possible_children:
    for child2 in possible_children:
        for child3 in possible_children:
            for child4 in possible_children:
                outcome = (child1, child2, child3, child4)
                # If we only cared about which genders appear, a frozenset would do,
                # but ordered tuples let us also ask questions about the
                # birth order of the children.
                sample_space.add(outcome)


## Real-World Problem: Family Composition Analysis

Let's analyze the probability distribution of different gender combinations in a family with four children. This demonstrates how to:
1. Create complex sample spaces using nested loops
2. Work with tuples as outcomes
3. Calculate specific probabilities for family compositions

In [29]:
sample_space

{('Boy', 'Boy', 'Boy', 'Boy'),
 ('Boy', 'Boy', 'Boy', 'Girl'),
 ('Boy', 'Boy', 'Girl', 'Boy'),
 ('Boy', 'Boy', 'Girl', 'Girl'),
 ('Boy', 'Girl', 'Boy', 'Boy'),
 ('Boy', 'Girl', 'Boy', 'Girl'),
 ('Boy', 'Girl', 'Girl', 'Boy'),
 ('Boy', 'Girl', 'Girl', 'Girl'),
 ('Girl', 'Boy', 'Boy', 'Boy'),
 ('Girl', 'Boy', 'Boy', 'Girl'),
 ('Girl', 'Boy', 'Girl', 'Boy'),
 ('Girl', 'Boy', 'Girl', 'Girl'),
 ('Girl', 'Girl', 'Boy', 'Boy'),
 ('Girl', 'Girl', 'Boy', 'Girl'),
 ('Girl', 'Girl', 'Girl', 'Boy'),
 ('Girl', 'Girl', 'Girl', 'Girl')}

Nested for loops like these are a cumbersome way to build combinatorial sets. Python's `itertools.product` produces the same result far more concisely.
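
To see that the two approaches agree, here is a small sketch on a two-child family, which keeps the hand-written loops short. The names `options`, `nested`, and `via_product` are illustrative only.

```python
from itertools import product

options = ('Boy', 'Girl')

# Two nested loops, written out by hand.
nested = set()
for first in options:
    for second in options:
        nested.add((first, second))

# The same Cartesian product in a single call.
via_product = set(product(options, repeat=2))

print(nested == via_product)   # True
print(len(via_product))        # 4
```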

In [38]:
from itertools import product
all_combinations = product(*(4* [possible_children]))
for combo in all_combinations:
    print(combo)


('Boy', 'Boy', 'Boy', 'Boy')
('Boy', 'Boy', 'Boy', 'Girl')
('Boy', 'Boy', 'Girl', 'Boy')
('Boy', 'Boy', 'Girl', 'Girl')
('Boy', 'Girl', 'Boy', 'Boy')
('Boy', 'Girl', 'Boy', 'Girl')
('Boy', 'Girl', 'Girl', 'Boy')
('Boy', 'Girl', 'Girl', 'Girl')
('Girl', 'Boy', 'Boy', 'Boy')
('Girl', 'Boy', 'Boy', 'Girl')
('Girl', 'Boy', 'Girl', 'Boy')
('Girl', 'Boy', 'Girl', 'Girl')
('Girl', 'Girl', 'Boy', 'Boy')
('Girl', 'Girl', 'Boy', 'Girl')
('Girl', 'Girl', 'Girl', 'Boy')
('Girl', 'Girl', 'Girl', 'Girl')


In [39]:
sample_space_efficient = set(product(possible_children, repeat=4))
sample_space_efficient

{('Boy', 'Boy', 'Boy', 'Boy'),
 ('Boy', 'Boy', 'Boy', 'Girl'),
 ('Boy', 'Boy', 'Girl', 'Boy'),
 ('Boy', 'Boy', 'Girl', 'Girl'),
 ('Boy', 'Girl', 'Boy', 'Boy'),
 ('Boy', 'Girl', 'Boy', 'Girl'),
 ('Boy', 'Girl', 'Girl', 'Boy'),
 ('Boy', 'Girl', 'Girl', 'Girl'),
 ('Girl', 'Boy', 'Boy', 'Boy'),
 ('Girl', 'Boy', 'Boy', 'Girl'),
 ('Girl', 'Boy', 'Girl', 'Boy'),
 ('Girl', 'Boy', 'Girl', 'Girl'),
 ('Girl', 'Girl', 'Boy', 'Boy'),
 ('Girl', 'Girl', 'Boy', 'Girl'),
 ('Girl', 'Girl', 'Girl', 'Boy'),
 ('Girl', 'Girl', 'Girl', 'Girl')}

In [41]:
eg_combo = ('Boy', 'Boy', 'Boy', 'Boy')
eg_combo.count('Boy')

4

In [42]:
# computing the probability of exactly two boys

# event condition
def has_two_boys(outcome): return outcome.count('Boy') == 2

event = get_matching_events(has_two_boys, sample_space_efficient)
prob = len(event)/len(sample_space_efficient)
print(f'Probability of two boys is {prob}')

Probability of two boys is 0.375
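
The same counting idea extends to the full distribution of the number of boys. The sketch below is self-contained and assumes, as above, that each child is independently a boy or a girl with equal probability; `families` and `boys_per_family` are illustrative names.

```python
from collections import Counter
from itertools import product

# All 2**4 = 16 ordered outcomes for four children.
families = list(product(('Boy', 'Girl'), repeat=4))

# Count how many outcomes contain 0, 1, 2, 3, or 4 boys.
boys_per_family = Counter(family.count('Boy') for family in families)
distribution = {n_boys: count / len(families)
                for n_boys, count in sorted(boys_per_family.items())}
print(distribution)
# {0: 0.0625, 1: 0.25, 2: 0.375, 3: 0.25, 4: 0.0625}
```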


In [44]:
# Problem 2:
## Analyzing multiple die rolls

possible_die_rolls = set(range(1, 7))  
possible_die_rolls

{1, 2, 3, 4, 5, 6}

## Multiple Die Roll Analysis

Let's explore probabilities associated with rolling multiple dice. This section demonstrates:
1. Creating sample spaces for multiple die rolls
2. Computing probabilities of sums
3. Working with larger sample spaces efficiently
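
Before scaling up, it may help to see the same steps on a sample space small enough to check by hand. This sketch uses two fair dice; `two_rolls` and `sum_counts` are illustrative names.

```python
from collections import Counter
from itertools import product

faces = range(1, 7)
two_rolls = list(product(faces, repeat=2))   # 6**2 = 36 ordered outcomes

# Group the outcomes by their sum.
sum_counts = Counter(sum(roll) for roll in two_rolls)
p_seven = sum_counts[7] / len(two_rolls)

print(sum_counts[7], len(two_rolls))   # 6 36
print(round(p_seven, 4))               # 0.1667
```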

In [45]:
# constructing the sample space of 6 die rolls

sample_space_6_rolls = set(product(possible_die_rolls, repeat = 6))
sample_space_6_rolls

{(3, 1, 4, 6, 4, 3),
 (6, 1, 3, 6, 3, 3),
 (2, 4, 2, 6, 4, 3),
 (2, 6, 4, 6, 3, 6),
 (3, 1, 1, 2, 2, 3),
 ...}

In [53]:
print(len(sample_space_6_rolls))

46656


In [46]:
# Event condition: the rolls in an outcome add up to a given number.
# A generic factory function builds the specific condition for any target sum.
def adds_to_n(n):
    def adds_to(outcome):
        return sum(outcome) == n
    return adds_to

adds_to_21 = adds_to_n(21)

In [49]:
event = get_matching_events(adds_to_21, sample_space_6_rolls)
prob = len(event) / len(sample_space_6_rolls)
print(f'Probability of 6 die rolls adding to 21 is {prob}')

Probability of 6 die rolls adding to 21 is 0.09284979423868313


In [56]:
# Problem 3:
# Computing die roll probabilities using weighted sample spaces

from collections import defaultdict
weighted_sample_space = defaultdict(int)  # missing keys default to 0
for outcome in sample_space_6_rolls:
    total = sum(outcome)
    weighted_sample_space[total] += 1

weighted_sample_space


defaultdict(int,
            {21: 4332,
             22: 4221,
             27: 1666,
             12: 456,
             24: 3431,
             19: 3906,
             18: 3431,
             20: 4221,
             23: 3906,
             26: 2247,
             28: 1161,
             15: 1666,
             17: 2856,
             13: 756,
             14: 1161,
             30: 456,
             16: 2247,
             25: 2856,
             9: 56,
             32: 126,
             31: 252,
             29: 756,
             11: 252,
             33: 56,
             10: 126,
             7: 6,
             35: 6,
             8: 21,
             34: 21,
             6: 1,
             36: 1})

### Optimizing Die Roll Calculations

Instead of working with individual outcomes, we can optimize our calculations by:
1. Using a weighted sample space approach
2. Grouping outcomes by their sums
3. Using defaultdict for efficient counting
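
The grouping can also be done without enumerating all 46656 outcomes first. The sketch below is an alternative technique, not the notebook's method: it adds one die at a time and tracks how many ways each running total can occur. The function name `sum_distribution` is hypothetical.

```python
from collections import Counter

def sum_distribution(n_dice, faces=range(1, 7)):
    """Count the ways each total can occur when rolling `n_dice` dice."""
    counts = Counter({0: 1})          # one way to reach total 0 with no dice
    for _ in range(n_dice):
        updated = Counter()
        for total, ways in counts.items():
            for face in faces:
                updated[total + face] += ways
        counts = updated
    return counts

six_dice = sum_distribution(6)
print(six_dice[21])             # 4332, matching the enumerated count above
print(sum(six_dice.values()))   # 46656
```

This touches only a few hundred (total, face) pairs rather than every six-roll tuple, which matters more as the number of dice grows.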

In [57]:
def is_in_interval(number, minimum, maximum):
    return minimum <= number <= maximum

In [58]:
prob = compute_event_probability(lambda x: is_in_interval(x, 10, 21), weighted_sample_space)

print(f'Probability of a total in the interval [10, 21] is {prob}')


Probability of a total in the interval [10, 21] is 0.5446244855967078


In [60]:
# How likely is it to flip 8 heads in 10 coin flips if the coin is fair?
possible_flips = {'Heads', 'Tails'}
sample_space_10_flips = set(product(possible_flips, repeat=10))
sample_space_10_flips

{('Tails',
  'Heads',
  'Tails',
  'Tails',
  'Heads',
  'Tails',
  'Heads',
  'Heads',
  'Heads',
  'Heads'),
 ...}

## Coin Flip Sequence Analysis

In this section, we'll analyze sequences of coin flips to:
1. Calculate the probability of specific sequences
2. Work with multiple flips efficiently
3. Demonstrate how to handle complex event conditions
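
As a cross-check on the enumeration approach, the count of sequences with exactly 8 heads can be obtained from the binomial coefficient. This sketch assumes a fair coin and Python 3.8 or later, where `math.comb` is available; the variable names are illustrative.

```python
from math import comb

n_flips, n_heads = 10, 8

# comb(n, k) counts the flip sequences with exactly k heads;
# a fair coin makes all 2**n sequences equally likely.
p_eight_heads = comb(n_flips, n_heads) / 2 ** n_flips
print(p_eight_heads)   # 0.0439453125
```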

In [62]:
# event with exactly 8 heads in 10 flips

def has_n_heads(n): return lambda x: x.count('Heads') == n

event = get_matching_events(has_n_heads(8), sample_space_10_flips)
prob = len(event)/len(sample_space_10_flips)
print(f'Probability of landing 8 heads out of 10 coin flips is {prob}')

Probability of landing 8 heads out of 10 coin flips is 0.0439453125


In [63]:
# A more efficient implementation:
# build a weighted sample space keyed by the number of heads

weighted_sample_space_10flips = defaultdict(int)
for outcome in sample_space_10_flips:
    n_heads = outcome.count('Heads')
    weighted_sample_space_10flips[n_heads] += 1
print(weighted_sample_space_10flips)

defaultdict(<class 'int'>, {6: 210, 4: 210, 7: 120, 5: 252, 8: 45, 3: 120, 1: 10, 2: 45, 0: 1, 9: 10, 10: 1})


### Optimization Using Weighted Sample Spaces

To make our coin flip calculations more efficient, we'll:
1. Use a weighted sample space approach
2. Count outcomes by number of heads
3. Demonstrate how this improves computational efficiency
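
One possible shortcut, sketched here under the assumption of a fair coin, is to build the weighted sample space directly from binomial coefficients instead of enumerating all 1024 sequences. The dictionary name `heads_weights` is hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

n_flips = 10

# Weight of k heads = number of flip sequences with exactly k heads.
heads_weights = {k: comb(n_flips, k) for k in range(n_flips + 1)}

total = sum(heads_weights.values())   # 2**10 sequences in all
# Extreme results: fewer than 3 heads or more than 7 heads.
extreme = sum(w for k, w in heads_weights.items() if k < 3 or k > 7)

print(total)             # 1024
print(extreme / total)   # 0.109375
```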

In [65]:
event_size = weighted_sample_space_10flips[8]
prob = event_size/sum(weighted_sample_space_10flips.values())
print(f'Probability of landing 8 heads out of 10 flips is {prob}')

Probability of landing 8 heads out of 10 flips is 0.0439453125


In [67]:
prob = compute_event_probability(lambda x: not is_in_interval(x, 3, 7), weighted_sample_space_10flips)
print(f"Probability of observing more than 7 heads or more than 7 tails is {prob}")

Probability of observing more than 7 heads or more than 7 tails is 0.109375
