In [3]:
# HIDDEN
import numpy as np

A *range* is an array of numbers in increasing or decreasing order, each separated by a regular interval. Ranges are defined using the `np.arange` function, which takes one, two, or three arguments.

    np.arange(end): An array of increasing integers from 0 up to end
    np.arange(start, end): An array of increasing integers from start up to end
    np.arange(start, end, step): A range with step between each pair of consecutive values

A range always includes its `start` value, but does not include its `end` value. The `step` can be either positive or negative and may be a whole number or a fraction. 

In [4]:
by_four = np.arange(1, 100, 4)
by_four

array([ 1,  5,  9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61, 65,
       69, 73, 77, 81, 85, 89, 93, 97])
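The other argument forms can be sketched the same way. The cell below is only an illustration (the variable names are not from the original text): it shows a one-argument range, a decreasing range with a negative step, and a range with a fractional step.

```python
import numpy as np

# One argument: the range starts at 0 and the end value is excluded.
first_five = np.arange(5)            # 0, 1, 2, 3, 4

# Negative step: values decrease from start and stop before reaching end.
countdown = np.arange(10, 0, -2)     # 10, 8, 6, 4, 2

# Fractional step: the result is an array of floats.
halves = np.arange(0, 2, 0.5)        # 0.0, 0.5, 1.0, 1.5

print(first_five, countdown, halves)
```

In each case the `start` value appears in the result and the `end` value does not.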

Ranges have many uses. For instance, a range can be used to compute part of the Leibniz formula for π, which is typically written as

$$\pi = 4 \cdot \left(1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} \dots\right)$$

In [5]:
4 * sum(1 / by_four - 1 / (by_four + 2))

3.1215946525910097
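The result above uses only the terms with denominators below 100, so it is still about 0.02 away from π. As a rough sketch, the same expression can be wrapped in a function (the name `leibniz_pi` and its `end` parameter are illustrative assumptions) to see how the approximation behaves as the range grows.

```python
import numpy as np

def leibniz_pi(end):
    """Approximate pi from Leibniz terms (hypothetical helper for illustration)."""
    # Denominators of the positive terms: 1, 5, 9, ... up to but not including end.
    positive = np.arange(1, end, 4)
    # Each positive term 1/n is paired with the negative term 1/(n + 2).
    return 4 * np.sum(1 / positive - 1 / (positive + 2))

for end in (100, 10_000, 1_000_000):
    print(end, leibniz_pi(end))
```

Longer ranges give values closer to π, although the series converges slowly.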

### The Birthday Problem

There are `k` students in a class. What is the chance that at least two of the students have the same birthday?

*Assumptions*

1. No leap years; every year has 365 days
2. Births are distributed evenly throughout the year
3. No student's birthday is affected by any other (e.g. twins)

Let's start with an easy case: `k` is 4. We'll first find the chance that all four people have different birthdays. 

In [6]:
all_different = (364/365)*(363/365)*(362/365)

The chance that there is at least one pair with the same birthday is equivalent to the chance that the birthdays are not all different. Given the chance of some event occurring, the chance that the event does not occur is one minus the chance that it does. With only 4 people, the chance that at least two share a birthday is less than 2%.

In [7]:
1 - all_different

0.016355912466550215

Using a range, we can express this same computation compactly. We begin with the numerators of each factor.

In [8]:
k = 4
numerators = np.arange(364, 365-k, -1)
numerators

array([364, 363, 362])

Then, we divide each numerator by 365 to form the factors, multiply them all together, and subtract from 1.

In [9]:
1 - np.prod(numerators/365)

0.016355912466550215
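Written as a formula, the computation for a class of `k` students takes the following form (the index `i` is introduced here only for illustration):

$$P(\text{at least one match}) = 1 - \prod_{i=1}^{k-1} \frac{365 - i}{365}$$

For `k` equal to 4, the product has three factors, $\frac{364}{365} \cdot \frac{363}{365} \cdot \frac{362}{365}$, which are exactly the values in `numerators` divided by 365.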

If `k` is 40, the chance of two birthdays being the same is much higher: almost 90%!

In [10]:
k = 40
numerators = np.arange(364, 365-k, -1)
1 - np.prod(numerators/365)

0.89123180981794903
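Since the same two lines are repeated for each class size, they could be collected into a function. The sketch below assumes a hypothetical helper named `birthday_chance`; the `days` parameter is also an assumption, added so that the 365-day year is stated explicitly.

```python
import numpy as np

def birthday_chance(k, days=365):
    """Chance that at least two of k people share a birthday (hypothetical helper)."""
    # Numerators days-1, days-2, ..., days-k+1: one factor per person after the first.
    numerators = np.arange(days - 1, days - k, -1)
    return 1 - np.prod(numerators / days)

for k in (4, 23, 40):
    print(k, birthday_chance(k))
```

Calling `birthday_chance(4)` and `birthday_chance(40)` reproduces the two results computed above.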

Using ranges, we can investigate how this chance changes as `k` increases. The `np.cumprod` function computes the cumulative product of an array. That is, it computes a new array that has the same length as its input, but the `i`th element is the product of the first `i` terms. Below, the fourth term of the result is `1 * 2 * 3 * 4`.

In [11]:
one_to_eight = np.arange(1, 9)
np.cumprod(one_to_eight)

array([    1,     2,     6,    24,   120,   720,  5040, 40320])
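One way to check this description of `np.cumprod` is to compare each element of the result against `np.prod` applied to the corresponding leading part of the input. The sketch below uses slicing (`values[:i]`), which this section has not introduced, purely to select the first `i` values.

```python
import numpy as np

values = np.arange(1, 9)
running = np.cumprod(values)

# Each element of the running product equals the product of the first i values.
for i in range(1, len(values) + 1):
    assert running.item(i - 1) == np.prod(values[:i])

# The fourth element is 1 * 2 * 3 * 4.
print(running.item(3))
```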

The following cell computes the chance of matching birthdays for every class size from 2 up to 365. Scrolling through the result, you will see that as `k` increases, the chance of matching birthdays reaches 1 long before the end of the array. In fact, for any `k` no larger than 365, there is some chance that all `k` students in a class have different birthdays. For large `k` that chance is so small, however, that the difference from 1 is rounded away by the computer.

In [12]:
numerators = np.arange(364, 0, -1)
chances = 1 - np.cumprod(numerators/365)
chances

array([ 0.00273973,  0.00820417,  0.01635591,  0.02713557,  0.04046248,
        0.0562357 ,  0.07433529,  0.09462383,  0.11694818,  0.14114138,
        0.16702479,  0.19441028,  0.22310251,  0.25290132,  0.28360401,
        0.31500767,  0.34691142,  0.37911853,  0.41143838,  0.44368834,
        0.47569531,  0.50729723,  0.53834426,  0.5686997 ,  0.59824082,
        0.62685928,  0.65446147,  0.68096854,  0.70631624,  0.73045463,
        0.75334753,  0.77497185,  0.79531686,  0.81438324,  0.83218211,
        0.84873401,  0.86406782,  0.87821966,  0.89123181,  0.90315161,
        0.91403047,  0.92392286,  0.93288537,  0.9409759 ,  0.94825284,
        0.9547744 ,  0.96059797,  0.96577961,  0.97037358,  0.97443199,
        0.97800451,  0.98113811,  0.98387696,  0.98626229,  0.98833235,
        0.99012246,  0.99166498,  0.99298945,  0.99412266,  0.9950888 ,
        0.99590957,  0.99660439,  0.99719048,  0.99768311,  0.9980957 ,
        0.99844004,  0.99872639,  0.99896367,  0.99915958,  0.99
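The rounding effect can be checked directly. The sketch below recomputes the array and keeps the intermediate product in a separate variable (the name `chance_all_different` is illustrative) so the two quantities can be compared for the largest class size, 365, which sits at position 363.

```python
import numpy as np

numerators = np.arange(364, 0, -1)
chance_all_different = np.cumprod(numerators / 365)
chances = 1 - chance_all_different

# The chance that all 365 students have different birthdays is tiny but positive...
print(chance_all_different.item(363))
# ...yet subtracting it from 1 gives exactly 1.0 in floating-point arithmetic.
print(chances.item(363) == 1.0)
```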

The `item` method of an array allows us to select a particular element from the array by its position. Positions start at 0, so `item(0)` is the first element. Finding the chance of matching birthdays in a 40-person class therefore involves extracting `item(40-2)`, because the first item is for a 2-person class.

In [13]:
chances.item(40-2)

0.891231809817949
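The offset of 2 is easy to get wrong, so it may help to hide it in a small function. The sketch below assumes a hypothetical helper `chance_for_class` and also uses `np.argmax`, which is not covered in this section, to find the smallest class size for which a match is more likely than not.

```python
import numpy as np

numerators = np.arange(364, 0, -1)
chances = 1 - np.cumprod(numerators / 365)

def chance_for_class(k):
    """Chance of a shared birthday in a k-person class (hypothetical helper)."""
    # Position 0 holds the 2-person class, so a k-person class is at position k - 2.
    return chances.item(k - 2)

# np.argmax on a boolean array returns the position of the first True value.
smallest_k = np.argmax(chances > 0.5) + 2
print(smallest_k, chance_for_class(smallest_k))
```

This reports a class size of 23, which agrees with the array printed above: 0.50729723 is the first value greater than 0.5.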