# Birthday Paradox Exercise
This exercise comes from the [DSFM code-bank](https://github.com/dsfm-org/code-bank/blob/master/exercises/simulation-birthdays/simulation-birthdays.ipynb) repository. It has been reproduced here, with slight adaptations, for ease of use in the Python workshop.

Eventually you will learn many advanced Python tools -- such as NumPy, SciPy, pandas, scikit-learn, and other specialized libraries. But even beginners can use pure Python to answer some very interesting questions. In this exercise you will write a pure Python program to answer the following question.

## Question
```
On average, what is the maximum number of randomly-selected people that can go into a room, 
such that no one in the room will share the same day of the year for their birthday with anyone else in the room?
```

## Solution

In [None]:
# Hint: You need the following module to generate random integers
from random import randint   

# Collect results
results = []

# Run experiments
for experiment in range(10000):
    
    room = set()
    
    # CODE HERE
    # Create a while loop that keeps adding people to the room as long as no two of them share a birthday

    results.append(len(room)) 
    
print('Average number of people in the room: ', round(sum(results)/len(results), 4))
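
One possible way to fill in the loop above is sketched below. It is a minimal sketch, not the reference solution: it assumes that the person whose birthday is already taken stays outside the room, and it uses a seeded `Random` instance (an addition that is not in the original cell) only so that the run is reproducible.

```python
from random import Random

NR_DAYS = 365            # year length assumed by the exercise
NR_EXPERIMENTS = 10000   # same number of experiments as in the cell above

rng = Random(42)         # seeded only for reproducibility (an assumption)
results = []

for experiment in range(NR_EXPERIMENTS):
    room = set()
    while True:
        # randint is inclusive on both ends, so this draws a day from 1..365
        birthday = rng.randint(1, NR_DAYS)
        if birthday in room:
            break        # the newcomer shares a birthday, so they stay out
        room.add(birthday)
    results.append(len(room))

average = sum(results) / len(results)
print('Average number of people in the room: ', round(average, 4))
```

Storing birthdays in a `set` keeps the membership check fast, and `len(room)` is then the number of people with pairwise distinct birthdays.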

## Plotting Results
#### External Libraries
Here, we will use an *external library* called matplotlib. External libraries can be used just like the Python standard library modules, except that they need to be installed separately. Some Python distributions, such as Anaconda, already ship with many external libraries for scientific computing. Libraries can also be installed with the `pip` command, for example `pip install matplotlib`.

#### Example: Plotting a Histogram
The question asks for an answer "on average", so you should simulate many experiments. Any single experiment gives just one sample, which may be higher or lower than the average. Plot a histogram of your experimental results.

In [None]:
# Hint: You need the following module to plot histograms
import matplotlib.pyplot as plt

### Your solution

plt.show()
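
As a cross-check for the simulated histogram, the exact distribution of the room size can be computed directly. The sketch below assumes the same stopping rule as the simulation (the first person who repeats a birthday stays out) and a 365-day year; the function name `room_size_distribution` is hypothetical.

```python
NR_DAYS = 365  # assumed year length

def room_size_distribution(nr_days):
    """Return {k: probability that the room ends up with exactly k people}."""
    distribution = {}
    p_all_distinct = 1.0  # probability that the first k people are all distinct
    for k in range(1, nr_days + 1):
        p_all_distinct *= (nr_days - (k - 1)) / nr_days
        # The room stops at k people if the first k are distinct
        # and the next arrival matches one of those k birthdays.
        distribution[k] = p_all_distinct * k / nr_days
    return distribution

pmf = room_size_distribution(NR_DAYS)
mean_size = sum(k * p for k, p in pmf.items())
most_likely = max(pmf, key=pmf.get)
print('Exact mean room size:', round(mean_size, 2))
print('Most likely room size:', most_likely)
```

For 365 days the exact mean is about 23.6, while the most likely single outcome is 19, so a histogram of simulated results should be skewed to the right.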

## Follow-up Question - How about the Birthday Paradox on Different Planets?!
So far, we have assumed that a year has 365 days and thus that each birthday has probability $1/365$.

However, not all planets have 365-day years! What about all the other Earth-like exoplanets out there? Their inhabitants would certainly be curious about their own birthday paradox.

Below, we will compute the average maximum number of people (as before) for different year lengths. First, let's start by encapsulating our previous logic into a function:

In [None]:
import numpy as np
from random import randint

def compute_average(nr_days_in_year, nr_runs=1000):
    """ Performs a simulation (as before) for a given number of days in the year
    """
    pass

Let's call this function for different numbers of days in the year, and store the corresponding results:

In [None]:
all_nr_days = range(1, 1001, 10)
average_numbers = []

for nr_days in all_nr_days:
    ### compute the average for this number of days, and append to the list of averages
    pass  # placeholder so the loop has a body; replace with your code

Now let's plot how the average maximum number of people varies with the number of days in the year:

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(6, 4))
plt.plot(list(all_nr_days), average_numbers, lw=3)
plt.xlabel('Number of days in the year')
plt.ylabel('Average max number of people with distinct birthdays')

You can try experimenting with 1,000, 3,000 and even 10,000 days. What do you observe?

#### Follow-up Question 1:
In addition to plotting the average number of people, try plotting the standard deviation as well.
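
A minimal sketch of how the mean and standard deviation could be collected together is shown below. The helper `simulate_room_sizes` is a hypothetical name, the year lengths and the seed are arbitrary choices for illustration, and the standard library `statistics` module is used here in place of NumPy.

```python
from random import Random
from statistics import mean, stdev

def simulate_room_sizes(nr_days_in_year, nr_runs=1000, rng=None):
    """Return one room size per run for the given year length."""
    rng = rng or Random()
    sizes = []
    for _ in range(nr_runs):
        room = set()
        while True:
            birthday = rng.randint(1, nr_days_in_year)
            if birthday in room:
                break  # first repeated birthday ends this run
            room.add(birthday)
        sizes.append(len(room))
    return sizes

rng = Random(0)  # seeded only for reproducibility (an assumption)
summary = {}
for nr_days in (10, 100, 365, 1000):
    sizes = simulate_room_sizes(nr_days, nr_runs=4000, rng=rng)
    summary[nr_days] = (mean(sizes), stdev(sizes))
    print(nr_days, round(summary[nr_days][0], 2), round(summary[nr_days][1], 2))
```

The two values per year length could then be drawn as a line with error bars, for example with `plt.errorbar`.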

#### Follow-up Question 2:
Can you find a function that fits the curve above (the average number of people as a function of the number of days)?

*Hint*: `y = x**0.5` is a good start.
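
Building on the hint, one simple approach is to fit a single constant `c` in `y = c * x**0.5` by least squares. The sketch below is an illustration under stated assumptions: to stay deterministic it fits against exact expected values from a hypothetical helper `expected_room_size` rather than simulated averages, and it assumes the same stopping rule as the simulation.

```python
from math import pi, sqrt

def expected_room_size(nr_days):
    """Exact expected number of people with pairwise distinct birthdays."""
    total = 0.0
    p_all_distinct = 1.0  # probability that the first k people are all distinct
    for k in range(1, nr_days + 1):
        p_all_distinct *= (nr_days - (k - 1)) / nr_days
        total += p_all_distinct  # adds P(room holds at least k people)
    return total

all_nr_days = list(range(1, 1001, 10))
averages = [expected_room_size(n) for n in all_nr_days]

# Least squares for y = c * sqrt(x) has the closed form
# c = sum(y * sqrt(x)) / sum(x).
c = sum(y * sqrt(x) for x, y in zip(all_nr_days, averages)) / sum(all_nr_days)
worst_gap = max(abs(y - c * sqrt(x)) for x, y in zip(all_nr_days, averages))

print('Fitted constant c:', round(c, 4))
print('sqrt(pi / 2) for comparison:', round(sqrt(pi / 2), 4))
print('Largest absolute gap to the fit:', round(worst_gap, 3))
```

The fitted constant lands close to `sqrt(pi / 2)`, the leading-order constant known from the analysis of the birthday problem. To fit your own simulation instead, replace `averages` with the `average_numbers` list computed earlier.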