#### Use counting
We can apply the naive definition of probability, with the numerator being the number of ways that Bob can match Alice on 4 out of 5 adjectives, plus the number of ways that Bob can match Alice on 5 out of 5 adjectives. The denominator is just the number of possible adjective combinations that Bob can recieve. [ref](https://www.quora.com/On-a-dating-site-users-can-select-5-out-of-24-adjectives-to-describe-themselves-A-match-is-declared-between-two-users-if-they-match-on-at-least-4-adjectives-If-Alice-and-Bob-randomly-pick-adjectives-what-is-the-probability-that-they-will-form-a-match)

In [2]:
# binomial: incorrect
p = (5 / 24) ** 2
p ** 4 * (1 - p) ** 1 + p ** 5

3.5487066552944635e-06

In [3]:
# correct {(5C4 * 19C1) + 5C5} / 24C5
4 / 1771

0.002258610954263128

#### Simulation

In [4]:
import random

n_trials = 100000
success = 0
for _ in range(n_trials):
    alice = random.sample(range(1, 24), 5)
    bob = random.sample(range(1, 24), 5)

    if len(set(alice).intersection(set(bob))) >= 4:
        success += 1
        
success / n_trials

0.00286

#### Why Binomial does not apply
Here we are sampling without replacement, with small population (size 24) to choose from. As population gets larger, however, binomial assumption is reasonable.

In [5]:
# binomial 
p = (5 / 1000) ** 2
p ** 4 * (1 - p) ** 1 + p ** 5

3.906250000000001e-19

In [6]:
n_trials = 100000
success = 0
for _ in range(n_trials):
    alice = random.sample(range(1, 1000), 5)
    bob = random.sample(range(1, 1000), 5)

    if len(set(alice).intersection(set(bob))) >= 4:
        success += 1
        
success / n_trials

0.0