# Problem 53

### Combinatoric selections

There are exactly ten ways of selecting three from five, 12345:

    123, 124, 125, 134, 135, 145, 234, 235, 245, and 345

In combinatorics, we use the notation, 5C3 = 10.

In general,

${}^nC_r = \frac{n!}{r!(n-r)!}$

where r ≤ n, n! = n×(n−1)×...×3×2×1, and 0! = 1.

It is not until n = 23, that a value exceeds one-million: 23C10 = 1144066.

How many, not necessarily distinct, values of  nCr, for 1 ≤ n ≤ 100, are greater than one-million?
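As a quick sanity check of the definition and of the values quoted in the statement, the formula can be evaluated directly. This is only a sketch, written for Python 3; the helper name `n_choose_r` is made up for illustration and is not used in the solution.

```python
from math import factorial

def n_choose_r(n, r):
    """Binomial coefficient straight from the definition n! / (r! (n - r)!)."""
    return factorial(n) // (factorial(r) * factorial(n - r))

# Values quoted in the problem statement
five_choose_three = n_choose_r(5, 3)
twenty_three_choose_ten = n_choose_r(23, 10)

# Largest coefficient in row n = 22; it is still below one million
largest_in_row_22 = max(n_choose_r(22, r) for r in range(23))

print(five_choose_three, twenty_three_choose_ten, largest_in_row_22)
```

The last value is the central coefficient of row 22, which is consistent with the claim that no value exceeds one million before $n = 23$.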

### Solution

First, we cache the factorials. Only the factorials up to $n/2$ (that is, 50 for $n = 100$) are needed, thanks to the way *combinations* is implemented (see below).

In [1]:
from math import factorial

factorials = [factorial(n) for n in range(51)]

A naive implementation would compute the whole factorial in the numerator. Since those factorials are big, and most of their factors are cancelled by the denominator anyway, we can avoid computing them in full.

The idea is to compute only $n \times (n - 1) \times \dots \times (n - k + 1)$, the $k$ factors of $n!$ that are not cancelled by $(n - k)!$, and then divide by $k!$. Since *combinations(n, k) = combinations(n, n - k)*, we can replace *k* with *n - k* whenever *n - k* is the smaller of the two. That way, fewer numbers have to be multiplied to build the numerator (the smaller $k$ is, the better).

To compute *factorial(k)*, we use the cached numbers. Since the implementation guarantees $k \leq n/2$, we don't need to cache factorials of numbers greater than $100 / 2 = 50$.

In [2]:
from utils.math import prod_of_list

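def combinations(n, k):

    # C(n, 0) = C(n, n) = 1
    if k == 0 or k == n:
        return 1

    # Use the symmetry C(n, k) = C(n, n - k) to keep k <= n / 2
    if n - k < k:
        k = n - k

    # Numerator: n * (n - 1) * ... * (n - k + 1), k factors in total
    num = prod_of_list([x for x in xrange(n, n - k, -1)])
    # Denominator: k! taken from the cache
    den = factorials[k]

    # The division is exact because the result is an integer
    return num // den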
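The function above relies on `prod_of_list` from the author's `utils.math` module, whose source is not shown here. A self-contained sketch of the same idea, assuming Python 3 and replacing that helper with a plain loop, could look like this (the name `combinations_sketch` is hypothetical):

```python
from math import factorial

def combinations_sketch(n, k):
    """C(n, k) via the falling product n * (n - 1) * ... * (n - k + 1)."""
    # Symmetry: C(n, k) == C(n, n - k), so work with the smaller index
    k = min(k, n - k)

    # Multiply the k largest factors of n!
    num = 1
    for x in range(n, n - k, -1):
        num *= x

    # Dividing by k! is exact because the result is an integer
    return num // factorial(k)

print(combinations_sketch(5, 3), combinations_sketch(23, 10))
```

Unlike the cached version, this sketch calls `factorial(k)` on every invocation, so it trades a little speed for not depending on a precomputed table.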

##### Solution 1

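Iterate over all *n* and *k*. The cases $k = 0$ and $k = 1$ can be skipped, since they give 1 and $n \leq 100$ respectively, which never exceed one million.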

In [3]:
sum(1 for n in xrange(2, 101) for k in xrange(2, n + 1) if combinations(n, k) > 1000000)

4075
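One way to cross-check this count is to redo it with the standard library and without skipping any pair. The sketch below assumes Python 3.8 or later, where `math.comb` is available; it is not part of the original notebook.

```python
from math import comb  # available in Python 3.8+

LIMIT = 1000000

# Count every (n, k) pair with 1 <= n <= 100 and 0 <= k <= n
count = sum(1 for n in range(1, 101) for k in range(n + 1) if comb(n, k) > LIMIT)
print(count)
```

Including $n = 1$ and $k < 2$ does not change the result, because none of those values comes close to one million.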

##### Solution 2

A more efficient solution stops, for each $n$, at the first $k$ whose value is above 1000000. Since the combinations are symmetric and grow towards the middle of the row, we already know that exactly the combinations from $k$ to $n - k$ are above 1000000, which makes $n - 2k + 1$ of them.

Since the implementation of *combinations* is already very efficient and the numbers are not that big, I couldn't notice any performance improvement. If you set $n$ to 1000 (with the factorial cache extended accordingly), the first solution doesn't yield a result even after several seconds, while this second solution returns immediately.

In [4]:
combinations_greater_than_one_million = 0

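for n in xrange(2, 101):
    # Values grow towards the middle of the row, so k only needs to go a little past n / 2
    for k in xrange(2, n // 2 + 2):

        if combinations(n, k) > 1000000:
            # By symmetry, every value from k to n - k exceeds the limit
            combinations_greater_than_one_million += n - 2 * k + 1
            break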
            
print combinations_greater_than_one_million

4075
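To see where the $n - 2k + 1$ term comes from, take $n = 23$: the first value above one million is at $k = 10$, so the values at $k = 10, 11, 12, 13$ all exceed it, giving $23 - 2 \times 10 + 1 = 4$. A minimal per-row sketch of this early exit, assuming Python 3.8+ for `math.comb` and using the hypothetical name `count_above_limit`:

```python
from math import comb  # available in Python 3.8+

LIMIT = 1000000

def count_above_limit(n, limit=LIMIT):
    """Number of k in 0..n with C(n, k) > limit, using the early exit."""
    for k in range(n // 2 + 1):
        if comb(n, k) > limit:
            # By symmetry, exactly k, k + 1, ..., n - k exceed the limit
            return n - 2 * k + 1
    # No value in this row exceeds the limit
    return 0

row_23 = count_above_limit(23)  # first hit at k = 10
total = sum(count_above_limit(n) for n in range(1, 101))
print(row_23, total)
```

Each row costs at most about $n/2$ evaluations, and usually far fewer for large $n$, because the threshold is crossed at a small $k$.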
