# 2. Birthday paradox
The Birthday Paradox, also known as the Birthday Problem, is the surpisingly high probability that two people will have the same birthday even in a small group of people.<br>
In a group of 70 people, there's a 99.9% chance of two people having a matching birthday.<br>
But even in a group as small as 23 people, there's a 50% chanmce of a matching birthday.<br>
This program performs several probability experiments to detemine the percentages for groups of different sizes.<br>
We call these types of experiments, in which we conduct multiple random trials to understand the likely outcomes, `Monte Carlo` experiments.<br>

You can find out more about the Birthday Paradox at https://en.wikipedia.org/wiki/Birthday_problem.


## The Program in Action
When you run *birthdayparadox.py*, the output will look like the following:

```
How many birthdays shall I generate? (Max 100)
> 23

Here are 23 birthdays:
Oct 9, Sep 1, May 28, Jul, 29, Feb 17, Jan 8, Aug 18, Feb 19, Dec 1, Jan 22, May 16, Sep 25, Oct 6, May 6, May 26, Oct 11, Dec 19, Jun 28, Jul 29, Dec 6, Nov 26, Aug 18, Mar 18

In this simulation, multiple people have a birthday on Jl 29

Generating 23 random birthdays 100,000 times...
Press Enter to begin...
Let's run another 100,000 simulationms.
0 simulations run...
10000 simulations run...
---snip---
90000 simulations run...
100000 simulations run.
Out of 100,000 simulations of 23 people, there was a 
matching birthday in that group 50955 times. This means 
that 23 people have a 50.95% chance of 
having a matching birthday in their group.
That's probably more than you would think!
```

## How it Works
Running 100,000 simulations can take a while, which is why lines 95 and 96 report that another 10,000 simulations have finished.<br>
This feedback can assure the user that the program has not frozen.<br>
Notice that some of the integers, like 10_000 on line 95 nad 100_000 on lines 93 and 103, have underscores.<br>
These underscores have no special meaning, but Python allows them so that programmers can make integer values easier to read.<br>
In other words, it's easier to read "one hundred thousand" from 100_000 than from 100000.

In [2]:
import datetime, random

def getBirthdays(numberOfBirthdays):
    # Returns a list of number random date objects for birthdays.

    birthdays = []
    for i in range(numberOfBirthdays):
        # The year is unimportant for the simulation, as long as all 
        # birthdays have the same year.
        startOfYear = datetime.date(2001, 1, 1)

        # Get a random dat into the year:
        randomNumberOfDays = datetime.timedelta(random.randint(0,364))
        birthday = startOfYear + randomNumberOfDays
        birthdays.append(birthday)
    return birthdays

def getMatch(birthdays):
    # Returns the date object of a birthday that occurs more than once 
    # in the birthdays list.

    if len(birthdays) == len(set(birthdays)):
        return None # All birthdays are unique, so return None.

    # Compare each birthday to every other birthday:
    for a, birthdayA in enumerate(birthdays):
        for b, birthdayB in enumerate(birthdays[a + 1 :]):
            if birthdayA == birthdayB:
                return birthdayA # Return the matching birthday.

# Display the intro:
print(f'''
The Birthday Paradox shoes is that in a group of N people, the odds 
that two of them have matching birthdays is surprisingly large.
This program does a Monte Carlo simulation (that is, repeated random
simulations) to explore this concept.

(It's not actually a paradox, it's just a surprising result.)
''')

# Set up a tuple of month names in order:
MONTHS = ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 
          'Jul','Aug', 'Sep', 'Oct', 'Nov', 'Dec')

while True: # Keep asking until the user enters a valid amount.
    response = input('How many birthdays shall I generate? (Max 100)')
    if response.isdecimal() and (0 < int(response) <= 100):
        numBDays = int(response)
        break # user has entered a valid amount.
print()

# Generate and display the birthdays:
print(f'Here are {numBDays} birthdays')
birthdays = getBirthdays(numBDays)
for i, birthday in enumerate(birthdays):
    if i != 0:
        # Display a comma for each birthday after the first birthday.
        monthName = MONTHS[birthday.month - 1]
        print(f'{monthName} {birthday.day}', end = '')
        print(', ', end = '')
print()
print()

# Determine if there are two birthdays that match.
match = getMatch(birthdays)

# Display the results:
print('In this simulation, ', end='')
if match != None:
    monthName = MONTHS[match.month - 1]
    dateText = f'{monthName} {match.day}'
    print('multiple people have a birthday on', dateText)
else:
    print('there are no matching birthdays.')
print()

# Run through 100,000 simulations:
print(f'Generating {numBDays} random birthdays 100,000 times...')
input('Press ENTER to begin...')

print('Let\'s run another 100,000 simulations.')
simMatch = 0 # How many simulations had matching birthdays in them.
for i in range (100_000):
    # Report on the progress every 10,000 simulations:
    if i % 10_000 == 0:
        print (f' {i} simulations run...')
    birthdays = getBirthdays(numBDays)
    if getMatch(birthdays) != None:
        simMatch = simMatch + 1
print('100,000 simulations run.')

# Display simulation results:
probability = round(simMatch / 100_000 * 100, 2)
print(f'''Out of 100,000 simulations of {numBDays} people, there was a
matching birthday in that group {simMatch} times. This means
that {numBDays} people have a {probability}% chance of
having a matching birthday in their group.
That's probably more than you would think!'
''')


The Birthday Paradox shoes is that in a group of N people, the odds 
that two of them have matching birthdays is surprisingly large.
This program does a Monte Carlo simulation (that is, repeated random
simulations) to explore this concept.

(It's not actually a paradox, it's just a surprising result.)


Here are 15 birthdays
Nov 11, Nov 25, Apr 4, Dec 26, Dec 25, May 9, Jul 26, Oct 17, May 10, Oct 19, Jan 25, Dec 26, Sep 15, Apr 11, 

In this simulation, multiple people have a birthday on Dec 26

Generating 15 random birthdays 100,000 times...
Let's run another 100,000 simulations.
0 simulations run...
10000 simulations run...
20000 simulations run...
30000 simulations run...
40000 simulations run...
50000 simulations run...
60000 simulations run...
70000 simulations run...
80000 simulations run...
90000 simulations run...
100,000 simulations run.
Out of 100,000 simulations of 15 people, there was a
matching birthday in that group 25373 times. This means
that 15 people have a 25.37% c

## Exploring the Program
Try to find the answers to the following questions. <br>
Experiment with some modifications to the code and rerun the program to see what effect the changes have.

### 1. How are the birthdays represented in this program? (Hint: look at line 16.)
The birthdays are represented by a variable in the `datetime` format.


### 2. How could you remove the maximum limit of 100 birthdays the program generates?
By changing the end of the line of:
```python
if response.isdecimal() and (0 < int(response) <= 100):
```
from 100 to a greater number.


### 3. What error message do you get if you delete/comment out `numBDays = int(response)` on line 57?
By removing `numBDays = int(response)`, we get a `NameError`.<br>
The program cannot move past line 62 as the variable is referenced by there is no valid number used to generate the number of birthdays.

### 4. How can you make the program display full month names such as `'January'` instead of `'Jan'`?
We can display the full month names by changing the `MONTHS` variable from a tuple of `'Jan, 'Feb', 'Mar', etc.` to `'January', 'February', 'March', etc.`


### 5. How could you make `'X simulations run'` appear every 1000 simulations instead of every 10,000?
By changing the line:
```python
if i % 10_000 == 0:
```

to:
```python
if i % 1_000 == 0:
```

We can make the appear every 1000 simulations instead.