## The birthday problem


This is a classic problem in probability theory. David Morin's book <i>Probability: For the Enthusiastic Beginner</i> gives a nice overview and solution of the problem, although one should always try to solve the problem before checking the solution.

This notebook takes his advice: it simulates the scenario many times and calculates the percentage of rounds in which at least one birthday is repeated.

#### Problem: 
How many people need to be in a room in order for there to be a greater than 1/2 probability that at least two of them have the same birthday? By "same birthday" we mean the same day of the year; the year may differ.  Ignore leap years.

In [None]:
import numpy as np

In [None]:
# set the number of iterations
n_rounds = 100000

# set the number of people in the room that get surveyed
n_people = 23

### Single iteration example

In [None]:
# draw n_people birthdays uniformly at random from days 1..365
# (the upper bound 366 is exclusive)
birthdays = np.random.randint(1, 366, n_people)

# get all the unique days
birthdays_unique = np.unique(birthdays)

In [None]:
# compare the number of unique birthdays with the number of people
print(f'Total people: {n_people}')
print(f'Total unique birthdays: {len(birthdays_unique)}\n')
if len(birthdays_unique) < n_people:
    print('>> There is a repeated birthday <<')
else:
    print('>> There is NOT a repeated birthday <<')


### Perform many rounds of experiment

In [None]:
results = []
for _ in range(n_rounds):
    # draw a fresh sample of birthdays for n_people
    birthdays = np.random.randint(1, 366, n_people)

    # fewer unique birthdays than people means at least one birthday is repeated
    if len(np.unique(birthdays)) < n_people:
        results.append(1)
    else:
        results.append(0)
        
# fraction of rounds with at least one repeated birthday
repeated_portion = sum(results) / n_rounds
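
The loop above can also be written without an explicit Python loop. The following is a minimal sketch, not part of the original notebook: the function name `simulate_repeat_fraction` and the `seed` parameter are hypothetical, and it uses NumPy's `default_rng` generator instead of `np.random.randint`. It draws all rounds at once, sorts each round, and checks whether any two neighbouring birthdays are equal.

```python
import numpy as np


def simulate_repeat_fraction(n_people, n_rounds, seed=None):
    """Estimate the fraction of rounds with at least one repeated birthday."""
    rng = np.random.default_rng(seed)
    # one row per round, one column per person; days 1..365 (366 is exclusive)
    birthdays = rng.integers(1, 366, size=(n_rounds, n_people))
    # sorting each row puts equal birthdays next to each other
    birthdays.sort(axis=1)
    # a zero difference between neighbours marks a repeated birthday
    has_repeat = (np.diff(birthdays, axis=1) == 0).any(axis=1)
    return has_repeat.mean()


fraction = simulate_repeat_fraction(n_people=23, n_rounds=100_000, seed=0)
print(f"Estimated fraction of rounds with a repeat: {fraction:.3f}")
```

Passing a fixed `seed` makes the estimate reproducible; leaving it as `None` gives a different sample on each run, as in the loop version.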

In [None]:
print(f'After {n_rounds} rounds, each with {n_people} people:')
print(f'at least one birthday was repeated in {repeated_portion*100:.2f}% of the rounds')
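
As a check on the simulation, the exact probability can be computed under the same assumptions (365 equally likely days, no leap years): the chance that all birthdays differ is the product of (365 - k)/365 for k = 0, ..., n_people - 1, and the chance of at least one repeat is one minus that product. The sketch below is an illustration rather than part of the original notebook, and the function name `prob_shared_birthday` is hypothetical.

```python
def prob_shared_birthday(n_people, n_days=365):
    """Exact probability that at least two of n_people share a birthday."""
    if n_people > n_days:
        # more people than days: a repeat is guaranteed
        return 1.0
    p_all_distinct = 1.0
    for k in range(n_people):
        # person k must avoid the k birthdays already taken
        p_all_distinct *= (n_days - k) / n_days
    return 1.0 - p_all_distinct


# smallest group size for which the probability exceeds 1/2
n = 1
while prob_shared_birthday(n) <= 0.5:
    n += 1

print(f'Smallest group size with probability > 1/2: {n}')
print(f'Exact probability for {n} people: {prob_shared_birthday(n):.4f}')
```

The simulated percentage should land close to this exact value, with the gap shrinking as `n_rounds` grows.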