### Question: Given the set of numbers {49, 8, 48, 15, 47, 4, 16, 23, 43, 44, 42, 45, 46}, a function picks a random subset of size 6 and returns the minimum of that subset. What is the expected value of this function?

This can be done by __Monte Carlo simulation__, by going through all possible __combinations__, or even __analytically__.

## Monte Carlo simulation

First, we can simply simulate the requested task: repeat MC times the process of selecting 6 items without replacement and taking their minimum. The mean of all these minima should tend to the expected value as MC grows:

In [1]:
%%time
import numpy as np
S = np.array([49, 8, 48, 15, 47, 4, 16, 23, 43, 44, 42, 45, 46]) # our set of numbers
MC = 10**6 # size of Monte Carlo simulation
exp_val = 0 # running sum of the sampled minima (divided by MC at the end)
for i in range(MC):
    exp_val += np.min(np.random.choice(S,6,replace=False))
print (f'Expected value = {exp_val/MC}')

Expected value = 8.817975
Wall time: 22 s
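
The loop above draws one subset per iteration. As a rough sketch of an alternative (not the method used above), the draws can be vectorized and the estimate reported together with its standard error, which shows how far the simulated mean can be expected to sit from the true value. The function name `mc_expected_min` and the parameters `n_trials` and `seed` are illustrative assumptions.

```python
import numpy as np


def mc_expected_min(values, k, n_trials, seed=0):
    """Monte Carlo estimate of the expected minimum of a random k-subset.

    Returns the estimate and its standard error. Hypothetical helper.
    """
    values = np.asarray(values)
    rng = np.random.default_rng(seed)
    # Argsorting a row of i.i.d. uniform numbers gives a uniformly random
    # permutation of the indices; its first k entries are therefore a
    # uniform sample of k indices without replacement.
    idx = rng.random((n_trials, values.size)).argsort(axis=1)[:, :k]
    minima = values[idx].min(axis=1)
    return minima.mean(), minima.std(ddof=1) / np.sqrt(n_trials)


numbers = [49, 8, 48, 15, 47, 4, 16, 23, 43, 44, 42, 45, 46]
estimate, std_err = mc_expected_min(numbers, 6, n_trials=200_000)
print(f"Expected value = {estimate:.3f} +/- {std_err:.3f}")
```

The standard error shrinks like $1/\sqrt{\text{n\_trials}}$, so each extra digit of precision costs roughly 100 times more samples.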


## Combinations (itertools)

We can also go through all $C^6_{13} = 1716$ possible combinations to get the exact expected value:

In [2]:
from itertools import combinations
comb = list(combinations(S, 6)) # list of combinations
mins = [min(cc) for cc in comb] # list of minima
print (f'Expected value = {sum(mins)/len(mins)}')

Expected value = 8.818181818181818


This is the exact expected value of the minimum.

##  Probabilities

Now, we can use simple probabilities and algebra to prove it:

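Let's sort the numbers in ascending order: $S = \{ 4,  8, 15, 16, 23, 42, 43, 44, 45, 46, 47, 48, 49\}$, indexed from $S_0 = 4$ to $S_{12} = 49$. The minimum of a subset of size 6 can't be one of the last five numbers, because fewer than five larger numbers remain to complete the subset. Each of the other eight numbers can be the minimum, but with different probabilities. $S_i$ is the minimum exactly when it is selected together with 5 of the $12-i$ numbers larger than it, which can happen in $C^5_{12-i}$ ways. The expected value is the sum of the possible minima weighted by their probabilities of occurring, and those weights are proportional to these binomial coefficients, defined as $C^p_n = \frac{n!}{p!(n-p)!}$ for selecting $p$ items from $n$. The expected value is then: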

\begin{equation*}
\text{Expected value} = \frac{\sum_{i=0}^{7}C^5_{12-i} S_i}{\sum_{i=5}^{12}C^5_i} = \frac{4 C^5_{12} + 8 C^5_{11} + 15 C^5_{10} + \dots + 43 C^5_{6} + 44 C^5_{5}}{\sum_{i=5}^{12}C^5_i} = \frac{15132}{1716} = \frac{97}{11} \approx 8.818181818181818
\end{equation*}
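
As a sanity check on the denominator, here is a short worked calculation in the same $C^p_n$ notation. The weights must add up to the total number of 6-element subsets of 13 numbers, which is the hockey-stick identity for binomial coefficients:

\begin{equation*}
\sum_{i=5}^{12} C^5_i = 1 + 6 + 21 + 56 + 126 + 252 + 462 + 792 = 1716 = C^6_{13}
\end{equation*}

Dividing each weight by this total gives the probability of each minimum. For example, $P(\min = 4) = C^5_{12}/C^6_{13} = 792/1716 = 6/13 \approx 0.46$, which is simply the probability that 4 is among the 6 selected numbers.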

Here, you can check the calculations:

In [3]:
import math # np.math is no longer available in recent NumPy versions
def binom_coef(p,n):
    """ Computes the binomial coefficient: number of ways to select p items from n """
    return math.comb(n, p)
expected_value = 0 
S.sort() # sort S in ascending order
for i in range(len(S)-5): # only the 8 smallest numbers can be the minimum
    comb_nbr = binom_coef(5,12-i) # number of subsets whose minimum is S[i]
    expected_value += S[i] * comb_nbr
expected_value /= sum([binom_coef(5,12-i) for i in range(len(S)-5)])
print (f'Expected value = {expected_value}')

Expected value = 8.818181818181818
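
The same counting argument does not depend on the particular numbers or on the subset size. As a minimal sketch, assuming the values are distinct integers, it can be wrapped in a function that returns the result as an exact fraction and checks it against brute-force enumeration. The name `expected_min` and its signature are illustrative, not part of the original solution.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def expected_min(values, k):
    """Exact expected minimum of a uniformly random k-subset of distinct integers.

    Hypothetical helper generalizing the calculation above.
    """
    ordered = sorted(values)
    n = len(ordered)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")
    # ordered[i] is the minimum when the other k-1 picks all come from the
    # n-1-i larger values; math.comb returns 0 when that is impossible.
    numerator = sum(v * comb(n - 1 - i, k - 1) for i, v in enumerate(ordered))
    return Fraction(numerator, comb(n, k))


numbers = [49, 8, 48, 15, 47, 4, 16, 23, 43, 44, 42, 45, 46]
exact = expected_min(numbers, 6)
print(exact, float(exact))  # 97/11 8.818181818181818

# Cross-check against enumerating every 6-element subset
brute = Fraction(sum(min(c) for c in combinations(numbers, 6)), comb(len(numbers), 6))
assert exact == brute
```

Returning a `Fraction` avoids floating-point rounding and makes the closed form $97/11$ visible directly.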
