# Birthday Paradox

The Birthday Paradox, also called the Birthday Problem, is the surprisingly high probability that two people in even a small group share a birthday. In a group of 70 people, there is a 99.9 percent chance that at least two of them have a matching birthday. But even in a group as small as 23 people, the chance of a match is slightly better than 50 percent. This program runs many randomized trials to estimate the percentages for groups of different sizes. Experiments of this kind, which repeat random trials to learn the likely outcomes, are called Monte Carlo experiments.
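
The figures above can also be checked without simulation. As a minimal sketch, assuming a 365-day year in which every day is equally likely (leap days and seasonal birth patterns are ignored), the chance that `n` birthdays are all different is the product of `(365 - k) / 365` for `k` from 0 to `n - 1`, and the chance of a match is one minus that product. The function name `match_probability` is hypothetical.

```python
def match_probability(n, days=365):
    """Exact chance that at least two of n people share a birthday,
    assuming `days` equally likely birthdays."""
    all_different = 1.0
    for k in range(n):
        # Person k + 1 must avoid the k birthdays already taken.
        all_different *= (days - k) / days
    return 1.0 - all_different


for group_size in (23, 50, 70):
    print(group_size, round(match_probability(group_size) * 100, 1))
```

Under these assumptions the loop reports about 50.7 percent for 23 people, 97.0 percent for 50, and 99.9 percent for 70, which gives a reference point for the simulated results.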

You can find out more about the Birthday Paradox at https://en.wikipedia.org/wiki/Birthday_problem.

## The Program in Action

When you run `birthdayparadox.py`, the output will look like this:
```
Birthday Paradox, by Al Sweigart al@inventwithpython.com
--snip--
How many birthdays shall I generate? (Max 100)
> 23
Here are 23 birthdays:
Oct 9, Sep 1, May 28, Jul 29, Feb 17, Jan 8, Aug 18, Feb 19, Dec 1, Jan 22,
May 16, Sep 25, Oct 6, May 6, May 26, Oct 11, Dec 19, Jun 28, Jul 29, Dec 6,
Nov 26, Aug 18, Mar 18
In this simulation, multiple people have a birthday on Jul 29
Generating 23 random birthdays 100,000 times...
Press Enter to begin...
Let's run another 100,000 simulations.
0 simulations run...
10000 simulations run...
--snip--
90000 simulations run...
100000 simulations run.
Out of 100,000 simulations of 23 people, there was a
matching birthday in that group 50955 times. This means
that 23 people have a 50.95 % chance of
having a matching birthday in their group.
That's probably more than you would think!
```

## Example

In [28]:
import datetime
from random import randint

ordinal = 737430
d = datetime.date.fromordinal(ordinal)
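# '%#d' (day of month without a leading zero) is a Windows-specific flag;
# the equivalent on Linux and macOS is '%-d'.
d_str = d.strftime('%b %#d')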
print(d_str)
print(ordinal)

today = datetime.date.today()
today_ordinal = today.toordinal()
today_str = today.strftime('%b %#d')
print(today_str)
print(today_ordinal)

num = randint(1, datetime.date.today().toordinal())
print(num)


Jan 6
737430
Jul 29
739096
138344
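
The `'%#d'` directive in this cell is platform-specific: it strips the leading zero on Windows, while Linux and macOS use `'%-d'` for the same purpose. A minimal portable sketch, using a hypothetical helper named `format_birthday`, takes the day from the `day` attribute instead of a format flag:

```python
import datetime


def format_birthday(date):
    """Return the abbreviated month and the day with no leading zero."""
    # date.day is a plain int, so no platform-specific flag is needed.
    return f"{date.strftime('%b')} {date.day}"


print(format_birthday(datetime.date.fromordinal(737430)))
```

With Python's default locale this prints `Jan 6`, matching the cell's output for the same ordinal.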


In [39]:
import datetime
from random import choices

print('How many birthdays shall I generate? (Max 100)')
num = int(input('> '))

test = choices(range(1, datetime.date.today().toordinal()), k=num)

print(test)

How many birthdays shall I generate? (Max 100)
[406386, 362618, 385715, 422003, 677069]


In [47]:
import datetime
from random import randint

print('How many birthdays shall I generate? (Max 100)')
num = int(input('> '))

today = datetime.date.today().toordinal()

l = []
for i in range(num):
    birthday_ord = randint(1, today)
    birthday_str = datetime.date.fromordinal(birthday_ord).strftime('%b %#d')
    l.append(birthday_str)

print(", ".join(l))

How many birthdays shall I generate? (Max 100)
Aug 16, Apr 18, Jan 1, Mar 18, May 8


In [2]:
import datetime
from random import choices

print('How many birthdays shall I generate? (Max 100)')
num = int(input('> '))

test = choices(range(1, datetime.date.today().toordinal()), k=num)

print(', '.join([datetime.date.fromordinal(i).strftime('%b %#d') for i in test]))

How many birthdays shall I generate? (Max 100)
Feb 3, Feb 8, Aug 21, Nov 6, Jul 16, Nov 17, Dec 20, Dec 25, Sep 9, May 9, Jul 17, Feb 2, Apr 16, Dec 9, Mar 13, Oct 21, Jul 4, Jan 29, Mar 15, Oct 29, Dec 11, Apr 24, Feb 24, Jul 31, Aug 22


In [99]:
import datetime
from random import randint, seed

seed(153453)

print('How many birthdays shall I generate? (Max 100)')
num = int(input('> '))

today = datetime.date.today().toordinal()

l = []
for i in range(num):
    birthday_ord = randint(1, today)
    birthday_str = datetime.date.fromordinal(birthday_ord).strftime('%b %#d')
    l.append(birthday_str)

print(", ".join(l))
seen = set()
# set.add() returns None, so the condition is truthy only for values already in `seen`.
dupes = [x for x in l if x in seen or seen.add(x)]
print(dupes)

result = 0

print(f'{0} simulations run...')
for j in range(100_000):
    l = []
    for i in range(num):
        birthday_ord = randint(1, today)
        birthday_str = datetime.date.fromordinal(birthday_ord).strftime('%b %#d')
        l.append(birthday_str)

    seen = set()
    dupes = [x for x in l if x in seen or seen.add(x)]
    
    if len(dupes) > 0:
        result += 1
    
    if (j+1) % 10_000 == 0:
        print(f'{j+1} simulations run...')

print(result)

How many birthdays shall I generate? (Max 100)
Jun 13, Nov 5, Sep 5, Aug 10, Apr 12, Jun 13, Sep 10, Dec 7, Jan 21, Jan 7, Jun 12, Oct 30, Apr 21, Apr 25, Oct 1, Feb 24, Jan 20, May 3, Oct 21, Feb 12, May 5, Dec 24, Apr 26
['Jun 13']
0 simulations run...
10000 simulations run...
20000 simulations run...
30000 simulations run...
40000 simulations run...
50000 simulations run...
60000 simulations run...
70000 simulations run...
80000 simulations run...
90000 simulations run...
100000 simulations run...
50924
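
The `seen`/`dupes` comprehension in this cell is compact, but it depends on a side effect inside the condition. As an alternative sketch (not the approach the cell uses), `collections.Counter` can report each repeated birthday once. The name `repeated_birthdays` is hypothetical.

```python
from collections import Counter


def repeated_birthdays(birthdays):
    """Return each birthday that appears more than once, in first-seen order."""
    counts = Counter(birthdays)
    return [day for day, count in counts.items() if count > 1]


print(repeated_birthdays(['Jun 13', 'Nov 5', 'Sep 5', 'Jun 13']))  # ['Jun 13']
```

Unlike the comprehension, this lists a date only once even when three or more people share it.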


In [16]:
import datetime
from random import randint

print('How many birthdays shall I generate? (Max 100)')
num = int(input('> '))

today = datetime.date.today().toordinal()

birthdays = []
for i in range(num):
    birthday_ord = randint(1, today)
    birthday_str = datetime.date.fromordinal(birthday_ord).strftime('%b %#d')
    birthdays.append(birthday_str)

print(", ".join(birthdays))
seen = set()
dupes = [x for x in birthdays if x in seen or seen.add(x)] 
print(dupes)

result = 0

print(f'{0} simulations run...')
for j in range(100_000):
    birthdays = []
    for i in range(num):
        # randint() includes both endpoints, so this draws from 366 day numbers.
        birthday = randint(1, 366)
        birthdays.append(birthday)
        
    if len(birthdays) != len(set(birthdays)):
        result += 1

    if (j+1) % 10_000 == 0:
        print(f'{j+1} simulations run...')

print(result)

How many birthdays shall I generate? (Max 100)
Mar 26, Aug 9, Dec 15, Jul 1, Jun 20, Apr 18, Jan 13, Jun 27, Dec 15, Jan 7, Apr 5, Sep 15, Jun 10, May 5, Jul 15, Oct 26, Jan 14, Apr 16, Nov 18, Dec 23, Oct 2, Feb 25, May 1
['Dec 15']
0 simulations run...
10000 simulations run...
20000 simulations run...
30000 simulations run...
40000 simulations run...
50000 simulations run...
60000 simulations run...
70000 simulations run...
80000 simulations run...
90000 simulations run...
100000 simulations run...
50683


In [24]:
import datetime
from random import choices

print('How many birthdays shall I generate? (Max 100)')
num = int(input('> '))

today = datetime.date.today().toordinal()
birthdays = [datetime.date.fromordinal(i).strftime('%b %#d') for i in choices(range(1, today), k=num)]
print(', '.join(birthdays))

seen = set()
dupes = [x for x in birthdays if x in seen or seen.add(x)] 
print(dupes)

result = 0
print(f'{0} simulations run...')
for j in range(100_000):
    # range(1, 366) stops before 366, so this draws from day numbers 1 through 365.
    birthdays = choices(range(1, 366), k=num)
        
    if len(birthdays) != len(set(birthdays)):
        result += 1

    if (j+1) % 10_000 == 0:
        print(f'{j+1} simulations run...')

print(result)

How many birthdays shall I generate? (Max 100)
Feb 10, May 4, Mar 30, Jan 16, Jan 23, Dec 24, Dec 16, May 3, Sep 7, Jul 2, Jun 15, Jun 22, Mar 28, Aug 24, Mar 11, Jun 24, May 31, Feb 11, May 4, Apr 8, Jul 13, Mar 29, Dec 1, Jul 23, Nov 29, Mar 20, Jun 17, Jan 1, Jan 7, Jan 14, Sep 17, Sep 1, Sep 15, Feb 13, Feb 18, Jul 19, Jan 4, Jun 11, Apr 1, Apr 1, Apr 1, Nov 25, Dec 20, Sep 20, Mar 28, May 25, Feb 21, Jan 23, Nov 19, Mar 15
['May 4', 'Apr 1', 'Apr 1', 'Mar 28', 'Jan 23']
0 simulations run...
10000 simulations run...
20000 simulations run...
30000 simulations run...
40000 simulations run...
50000 simulations run...
60000 simulations run...
70000 simulations run...
80000 simulations run...
90000 simulations run...
100000 simulations run...
96951
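
The last few cells repeat the same trial loop with small changes. Collecting it into one function may make the variants easier to compare. This is only a sketch: `estimate_match_rate` and its parameters are hypothetical names, and it assumes 365 equally likely days. A dedicated `random.Random` instance with a fixed seed keeps runs repeatable.

```python
import random


def estimate_match_rate(group_size, trials=100_000, days=365, seed=None):
    """Estimate the chance of a shared birthday by repeated random trials."""
    rng = random.Random(seed)  # Separate generator, so the global state is untouched.
    day_numbers = range(1, days + 1)
    matches = 0
    for _ in range(trials):
        birthdays = rng.choices(day_numbers, k=group_size)
        # A set drops repeats, so a shorter set means at least one match.
        if len(set(birthdays)) != group_size:
            matches += 1
    return matches / trials


print(estimate_match_rate(23, trials=20_000, seed=1))
```

Calling it twice with the same `seed` returns the same estimate, which helps when comparing changes to the code rather than to the random draws.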


In [111]:
import datetime, random


def getBirthdays(numberOfBirthdays):
    """Returns a list of numberOfBirthdays random date objects for birthdays."""
    birthdays = []
    for i in range(numberOfBirthdays):
        # The year is unimportant for our simulation, as long as all
        # birthdays have the same year.
        startOfYear = datetime.date(2001, 1, 1)

        # Get a random day into the year:
        randomNumberOfDays = datetime.timedelta(random.randint(0, 364))
        birthday = startOfYear + randomNumberOfDays
        birthdays.append(birthday)
    return birthdays


def getMatch(birthdays):
    """Returns the date object of a birthday that occurs more than once
    in the birthdays list."""
    if len(birthdays) == len(set(birthdays)):
        return None  # All birthdays are unique, so return None.

    # Compare each birthday to every other birthday:
    for a, birthdayA in enumerate(birthdays):
        for b, birthdayB in enumerate(birthdays[a + 1 :]):
            if birthdayA == birthdayB:
                return birthdayA  # Return the matching birthday.


# Display the intro:
print('''Birthday Paradox, by Al Sweigart al@inventwithpython.com

The birthday paradox shows us that in a group of N people, the odds
that two of them have matching birthdays is surprisingly large.
This program does a Monte Carlo simulation (that is, repeated random
simulations) to explore this concept.

(It's not actually a paradox, it's just a surprising result.)
''')

# Set up a tuple of month names in order:
MONTHS = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec')

while True:  # Keep asking until the user enters a valid amount.
    print('How many birthdays shall I generate? (Max 100)')
    response = input('> ')
    if response.isdecimal() and (0 < int(response) <= 100):
        numBDays = int(response)
        break  # User has entered a valid amount.
print()

# Generate and display the birthdays:
print('Here are', numBDays, 'birthdays:')
birthdays = getBirthdays(numBDays)
for i, birthday in enumerate(birthdays):
    if i != 0:
        # Display a comma for each birthday after the first birthday.
        print(', ', end='')
    monthName = MONTHS[birthday.month - 1]
    dateText = '{} {}'.format(monthName, birthday.day)
    print(dateText, end='')
print()
print()

# Determine if there are two birthdays that match.
match = getMatch(birthdays)

# Display the results:
print('In this simulation, ', end='')
if match is not None:
    monthName = MONTHS[match.month - 1]
    dateText = '{} {}'.format(monthName, match.day)
    print('multiple people have a birthday on', dateText)
else:
    print('there are no matching birthdays.')
print()

# Run through 100,000 simulations:
print('Generating', numBDays, 'random birthdays 100,000 times...')
input('Press Enter to begin...')

print('Let\'s run another 100,000 simulations.')
simMatch = 0  # How many simulations had matching birthdays in them.
for i in range(100000):
    # Report on the progress every 10,000 simulations:
    if i % 10000 == 0:
        print(i, 'simulations run...')
    birthdays = getBirthdays(numBDays)
    if getMatch(birthdays) is not None:
        simMatch = simMatch + 1
print('100,000 simulations run.')

# Display simulation results:
probability = round(simMatch / 100000 * 100, 2)
print('Out of 100,000 simulations of', numBDays, 'people, there was a')
print('matching birthday in that group', simMatch, 'times. This means')
print('that', numBDays, 'people have a', probability, '% chance of')
print('having a matching birthday in their group.')
print('That\'s probably more than you would think!')

Birthday Paradox, by Al Sweigart al@inventwithpython.com

The birthday paradox shows us that in a group of N people, the odds
that two of them have matching birthdays is surprisingly large.
This program does a Monte Carlo simulation (that is, repeated random
simulations) to explore this concept.

(It's not actually a paradox, it's just a surprising result.)

How many birthdays shall I generate? (Max 100)



Here are 50 birthdays:
Apr 26, Feb 8, Feb 24, Oct 18, Oct 24, Aug 19, Oct 8, Dec 22, Nov 3, Jul 11, Jun 13, Nov 12, Feb 7, Dec 1, May 1, Dec 29, Oct 10, Feb 23, Aug 12, Nov 21, May 11, Jan 19, Jul 5, Sep 12, Apr 6, Oct 8, May 6, Aug 14, Feb 22, Dec 19, May 26, Feb 11, Aug 4, Oct 3, Nov 16, Nov 3, May 7, Oct 9, Jan 6, May 20, Aug 27, Apr 12, Jun 3, Oct 17, Apr 16, Nov 19, May 4, Nov 4, Dec 13, Aug 23

In this simulation, multiple people have a birthday on Oct 8

Generating 50 random birthdays 100,000 times...
Let's run another 100,000 simulations.
0 simulations run...
10000 simulations run...
20000 simulations run...
30000 simulations run...
40000 simulations run...
50000 simulations run...
60000 simulations run...
70000 simulations run...
80000 simulations run...
90000 simulations run...
100,000 simulations run.
Out of 100,000 simulations of 50 people, there was a
matching birthday in that group 97083 times. This means
that 50 people have a 97.08 % chance of
having a matching birthday
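
`getMatch()` in this cell falls back to comparing every pair of birthdays once it knows a match exists. A single pass with a set is a common alternative. The sketch below uses a hypothetical name, `first_repeated`, and is not part of the original program. It returns the birthday whose second occurrence comes earliest, which can differ from the date `getMatch()` reports when a list holds several matches.

```python
import datetime


def first_repeated(birthdays):
    """Return the first birthday seen a second time, or None if all are unique."""
    seen = set()
    for birthday in birthdays:
        if birthday in seen:
            return birthday
        seen.add(birthday)
    return None


sample = [datetime.date(2001, 10, 8), datetime.date(2001, 11, 3),
          datetime.date(2001, 10, 8)]
print(first_repeated(sample))  # 2001-10-08
```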

In [25]:
import datetime
from random import choices

print('''Birthday Paradox, by Al Sweigart al@inventwithpython.com

The birthday paradox shows us that in a group of N people, the odds
that two of them have matching birthdays is surprisingly large.
This program does a Monte Carlo simulation (that is, repeated random
simulations) to explore this concept.

(It's not actually a paradox, it's just a surprising result.)
''')

print('How many birthdays shall I generate? (Max 100)')
print()

num = int(input('> '))
print()

today = datetime.date.today().toordinal()
birthdays = [datetime.date.fromordinal(i).strftime('%b %#d') for i in choices(range(1, today), k=num)]
print(f'Here are {num} birthdays:')
print(', '.join(birthdays))
print()

seen = set()
dupes = [x for x in birthdays if x in seen or seen.add(x)]
if dupes:
    print('In this simulation, multiple people have a birthday on the following dates:', ', '.join(dupes))
else:
    print('In this simulation, there are no matching birthdays.')
print()

print('Generating', num, 'random birthdays 100,000 times...')
input('Press Enter to begin...')
print('Let\'s run another 100,000 simulations.')

result = 0
print(f'{0} simulations run...')
for j in range(100_000):
    birthdays = choices(range(1, 366), k=num)
        
    if len(birthdays) != len(set(birthdays)):
        result += 1

    if (j+1) % 10_000 == 0:
        print(f'{j+1} simulations run...')

probability = round(result / 100_000 * 100, 2)
print('Out of 100,000 simulations of', num, 'people, there was a')
print('matching birthday in that group', result, 'times. This means')
print('that', num, 'people have a', probability, '% chance of')
print('having a matching birthday in their group.')
print('That\'s probably more than you would think!')

Birthday Paradox, by Al Sweigart al@inventwithpython.com

The birthday paradox shows us that in a group of N people, the odds
that two of them have matching birthdays is surprisingly large.
This program does a Monte Carlo simulation (that is, repeated random
simulations) to explore this concept.

(It's not actually a paradox, it's just a surprising result.)

How many birthdays shall I generate? (Max 100)


Here are 50 birthdays:
May 2, Jan 28, Dec 21, Dec 11, Sep 19, Jul 29, Oct 31, Dec 16, Oct 6, Sep 28, Sep 10, Jul 29, Apr 29, Oct 6, Aug 3, May 5, Dec 31, Jan 3, Sep 23, May 27, Feb 5, Sep 12, May 21, May 17, May 25, Jul 17, Feb 8, Jul 19, Feb 6, Jun 19, Jun 13, Mar 13, Sep 8, Dec 21, Jul 1, Jun 24, Mar 10, Nov 16, Jun 21, Jul 19, May 9, Aug 18, Oct 28, Jun 3, Feb 1, Aug 31, Jun 17, May 19, May 4, Jul 12

In this simulation, multiple people have a birthday on the following dates: Jul 29, Oct 6, Dec 21, Jul 19

Generating 50 random birthdays 100,000 times...
Let's run another 100,000 s