## Monte Carlo simulation of famous probability paradoxes

In [7]:
import pandas as pd
import numpy as np
import seaborn as sns
from random import shuffle

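Probability is an abstract concept. One way to make it concrete is to read it as the long-run frequency of an event when the same experiment is repeated many times, which is what a Monte Carlo simulation estimates.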

## Birthday paradox

There are 23 people in one room. What is the probability that at least two of them share a birthday? We estimate it by repeating the experiment many times.

In [6]:
all_days = pd.Series(range(365))

In [7]:
all_days

0        0
1        1
2        2
3        3
4        4
      ... 
360    360
361    361
362    362
363    363
364    364
Length: 365, dtype: int64

In [8]:
# Sample 23 birthdays with replacement, so the same day can occur more than once
room = all_days.sample(23, replace=True)

In [9]:
room

311    311
256    256
99      99
57      57
252    252
140    140
92      92
256    256
104    104
79      79
124    124
316    316
198    198
80      80
215    215
226    226
289    289
119    119
98      98
337    337
272    272
33      33
258    258
dtype: int64

In [10]:
# Check if there are duplicates
room.duplicated()

311    False
256    False
99     False
57     False
252    False
140    False
92     False
256     True
104    False
79     False
124    False
316    False
198    False
80     False
215    False
226    False
289    False
119    False
98     False
337    False
272    False
33     False
258    False
dtype: bool

In [11]:
# To squeeze the previous check into a single True/False we can use max(): it is True if any value is True
room.duplicated().max()

True

In [12]:
all_days.sample(23, replace=True).duplicated().max()

False

In [13]:
# Repeat the experiment 10,000 times, recording whether each room has a shared birthday
rooms = [all_days.sample(23, replace=True).duplicated().max() for _ in range(10000)]

In [15]:
rooms[:10]

[False, False, True, False, True, False, False, False, False, False]

In [17]:
# Calculate the proportion of rooms in which a shared birthday occurred
np.mean(rooms)

0.5082
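
As a cross-check on the simulated value, the same probability can be computed exactly. The sketch below assumes, as the simulation does, 365 equally likely days and independent birthdays; the function name `birthday_collision_probability` is a hypothetical helper, not part of the notebook above. It multiplies the chances that each new person misses all earlier birthdays and takes the complement.

```python
from math import prod  # available in Python 3.8+

def birthday_collision_probability(n_people, n_days=365):
    """Exact chance that at least two of n_people share a birthday.

    Assumes n_days equally likely days and independent birthdays.
    """
    # Person k (counting from 0) must avoid the k days already taken.
    p_all_distinct = prod((n_days - k) / n_days for k in range(n_people))
    return 1 - p_all_distinct

p_exact = birthday_collision_probability(23)
print(round(p_exact, 4))  # 0.5073
```

The simulated estimate of 0.5082 is within sampling noise of this exact value.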

## Exam tickets case

An exam is organized as follows: once a student takes a ticket, the professor removes it from the pool. A student has prepared for 20 of the 30 tickets. What is the best strategy for this student? Should he or she go first or second to maximize the chance of drawing a prepared ticket?

In [3]:
tickets = list(range(1,31))

In [4]:
tickets

[1,
 2,
 3,
 4,
 5,
 6,
 7,
 8,
 9,
 10,
 11,
 12,
 13,
 14,
 15,
 16,
 17,
 18,
 19,
 20,
 21,
 22,
 23,
 24,
 25,
 26,
 27,
 28,
 29,
 30]

In [5]:
student = list(range(1,21))

In [6]:
student

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

In [14]:
# shuffle() takes a mutable sequence, such as a list, and randomly reorders its items.
# It modifies the original list in place; it does not return a new list.
shuffle(tickets)

In [15]:
tickets

[23,
 25,
 4,
 1,
 20,
 17,
 12,
 9,
 28,
 21,
 19,
 29,
 2,
 11,
 7,
 16,
 5,
 22,
 15,
 30,
 27,
 13,
 26,
 14,
 8,
 6,
 3,
 18,
 24,
 10]

In [16]:
# Checking if student knows answer to ticket №3
3 in student

True

In [17]:
# Repeat the experiment 100,000 times: the student draws first (index 0 of the shuffled list)
n = 100000
student = list(range(1,21))
tickets = list(range(1,31))
result = []

for _ in range(n):
  shuffle(tickets)
  result.append(tickets[0] in student)

In [19]:
result[:10]

[True, True, True, True, False, True, True, True, False, False]

In [20]:
np.mean(result)

0.66569

In [21]:
# Same experiment, but the student draws second (index 1 of the shuffled list)
n = 100000
student = list(range(1,21))
tickets = list(range(1,31))
result = []

for _ in range(n):
  shuffle(tickets)
  result.append(tickets[1] in student)

In [22]:
np.mean(result)

0.66568

In [23]:
# Same experiment, but the student draws third (index 2 of the shuffled list)
n = 100000
student = list(range(1,21))
tickets = list(range(1,31))
result = []

for _ in range(n):
  shuffle(tickets)
  result.append(tickets[2] in student)
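
The cells for the first, second and third draw differ only in the index they inspect, so they could be folded into one helper. This is a minimal sketch rather than the notebook's own code: the name `estimate_known_ticket`, its parameters, and the fixed seed are assumptions introduced for illustration. It shuffles once per trial and checks every requested position in that same shuffle.

```python
from random import Random

def estimate_known_ticket(positions, n_trials=100_000, seed=0):
    """Estimate, for each draw position, the chance of a prepared ticket."""
    rng = Random(seed)              # seeded only to make the run repeatable
    known = set(range(1, 21))       # the 20 tickets the student prepared for
    tickets = list(range(1, 31))    # all 30 tickets
    hits = {pos: 0 for pos in positions}
    for _ in range(n_trials):
        rng.shuffle(tickets)        # one random draw order per trial
        for pos in positions:
            hits[pos] += tickets[pos] in known  # True counts as 1
    return {pos: hits[pos] / n_trials for pos in positions}

estimates = estimate_known_ticket(positions=[0, 1, 2])
for pos, p in estimates.items():
    print(f"draw position {pos + 1}: {p:.4f}")
```

Storing the prepared tickets in a `set` makes each membership check constant-time, which matters little for 20 tickets but keeps the loop cheap as the number of trials grows.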

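So it does not matter whether the student goes first, second or third. The first two runs both give about 0.666, and by symmetry every position in a shuffled list is equally likely to hold any given ticket, so the probability of drawing a prepared ticket is 20/30 ≈ 0.667 at every position.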
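
The same conclusion can be checked exactly for the second draw with the law of total probability, conditioning on whether the first ticket taken was one the student had prepared. The sketch below uses exact fractions; the variable names are illustrative and not part of the notebook.

```python
from fractions import Fraction

total, known = 30, 20          # tickets in the pool, tickets prepared
unknown = total - known

# Going first: 20 prepared tickets out of 30.
p_first = Fraction(known, total)

# Going second: condition on what the first student removed.
p_second = (
    Fraction(known, total) * Fraction(known - 1, total - 1)    # a prepared ticket was removed
    + Fraction(unknown, total) * Fraction(known, total - 1)    # an unprepared ticket was removed
)

print(p_first, p_second)  # 2/3 2/3
```

Both values equal 2/3, which agrees with the simulated estimates of roughly 0.666.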