# Probability and counting
 
This notebook is the Python equivalent of the R code in section 1.8 R, [Introduction to Probability, 1st Edition](https://www.crcpress.com/Introduction-to-Probability/Blitzstein-Hwang/p/book/9781466575578), Blitzstein & Hwang.

----

## Vectors

Rather than using the usual Python list (array), for probability and statistics it is more convenient to use a [Numpy](https://docs.scipy.org/doc/numpy/user/basics.creation.html) array.

In Python, you can import the `numpy` module using the conventional `np` alias. Here we pass a list of values to the [Numpy array](https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.array.html) function to obtain a Numpy array, and then print the array as follows:

In [1]:
import numpy as np

v = np.array([3, 1, 4, 5, 9])
print(v)

[3 1 4 5 9]


The `numpy` module (shortened to `np`) offers the following functions on an `array`:

* [`sum`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.sum.html) - sum array elements with respect to a given axis
* [`max`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.max.html#numpy.ndarray.max) - return the maximum value of the array with respect to a given axis
* [`min`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.min.html#numpy.ndarray.min) - return the minimum value of the array with respect to a given axis

In [2]:
print(v.sum())

print(v.max())

print(v.min())

22
9
1
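The "with respect to a given axis" wording above only matters for arrays with more than one dimension. The following is a minimal sketch, assuming a small two-dimensional array `a` that is made up purely for illustration and does not appear in the book:

```python
import numpy as np

# Hypothetical 2-D array, used only to illustrate the axis argument
a = np.array([[3, 1, 4],
              [5, 9, 2]])

print(a.sum())        # sum over all elements
print(a.sum(axis=0))  # sum down each column
print(a.max(axis=1))  # maximum of each row
print(a.min(axis=0))  # minimum of each column
```

With no `axis` argument the reduction runs over every element, which is the behavior used for the one-dimensional `v` in this notebook.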


To obtain the number of elements in the `array`, you can use either:

* the Python built-in [`len`](https://docs.python.org/3.6/library/functions.html#len) function, passing in the `array`
* the [`size`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.size.html#numpy.ndarray.size) attribute of the Numpy `array`

In [3]:
print(len(v))

print(v.size)

5
5
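The two approaches agree for a one-dimensional `array`, but they can differ once an array has more than one dimension. A small sketch, assuming a hypothetical 2-by-3 array `a` chosen only for illustration:

```python
import numpy as np

# Hypothetical 2-D array with 2 rows and 3 columns
a = np.array([[3, 1, 4],
              [5, 9, 2]])

print(len(a))   # length of the first axis, i.e. the number of rows
print(a.size)   # total number of elements
print(a.shape)  # length of every axis
```

For the vectors used in this notebook either `len` or `size` will do.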


To create an `array` from the sequence $(1, 2, ..., n)$, use the Numpy [`arange`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html) function. In general, an ascending-order sequence $(m, ..., n)$ can be generated with `arange(m, n+1)`. Note that the second argument `stop` to the `arange` function is _not_ inclusive, so for an ascending-order sequence you must increment `stop` by 1.

A descending-order sequence can be generated by providing $-1$ as the third argument `step` to the `arange` function. However, for the reason mentioned above, you must conversely _decrement_ the second argument `stop` if you want to include this value in the `array`.

In [4]:
m = 1
n = 10

v = np.arange(m, n+1)
print(v)

#v = np.arange(n, m-1, -1)
#print(v)

[ 1  2  3  4  5  6  7  8  9 10]
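The descending case described above can be sketched as follows, reusing the same `m` and `n`; the name `v_desc` is introduced here only for illustration:

```python
import numpy as np

m = 1
n = 10

# stop is exclusive, so pass m-1 to make sure m itself is included
v_desc = np.arange(n, m-1, -1)
print(v_desc)
```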


Like the Python `list`, Numpy `array` is zero-indexed. To access the `i`<sup>th</sup> entry of an `array` `v`, use `v[i-1]`.

In [5]:
print(v[0])

print(v[9])

1
10


To access a subset of the `array` members, you can pass another Numpy `array` of the target indices to the indexer of your `array`.

You can also use a Numpy `array` of boolean values (`True`, `False`) to index your `array`, keeping any elements where the corresponding index is `True` and filtering out elements where the corresponding index is `False`.

In [6]:
# Numpy array for the indices 1, 3, 5
subset_ind = np.array([1, 3, 5])
print(v[subset_ind])

# boolean indexer: elements at True positions are included in the subset,
# while elements at False positions are filtered out
# keep all EXCEPT for the array values at indices 2, 4, and 6
boolean_ind = np.array([True, True, False, True, False, True, False, True, True, True])
print(v[boolean_ind])

[2 4 6]
[ 1  2  4  6  8  9 10]
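In practice a boolean indexer is often produced by a comparison rather than typed out by hand. The sketch below assumes we want the even entries of the sequence $1, \ldots, 10$; the name `is_even` is hypothetical and used only for this example:

```python
import numpy as np

v = np.arange(1, 11)

# A componentwise comparison yields a boolean array of the same length as v
is_even = v % 2 == 0
print(is_even)

# Indexing with that boolean array keeps only the entries where it is True
print(v[is_even])
```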


Many operations on Numpy `array` are interpreted _componentwise_.  For example, in math the cube of a vector doesn't have a standard definition, but

    v**3
    
simply cubes each entry individually.

In [7]:
print(v**3)

[   1    8   27   64  125  216  343  512  729 1000]


Similarly, 

    1 / (np.arange(1, 101)**2)

is a very compact way to get the vector $1, \frac{1}{2^2}, \frac{1}{3^2},  \ldots , \frac{1}{100^2}$.
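As a quick check of that expression, the sketch below builds the vector and inspects a few entries; the names `k` and `w` are introduced here only for illustration:

```python
import numpy as np

k = np.arange(1, 101)  # the integers 1, 2, ..., 100
w = 1 / k**2           # componentwise: 1/1^2, 1/2^2, ..., 1/100^2

print(w[:3])   # first three entries
print(w[-1])   # last entry, 1/100^2
print(w.size)  # number of entries
```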

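In math, $v+w$ is undefined if $v$ and $w$ are vectors of different lengths. With a Numpy `array`, the operation is still defined whenever the two shapes are compatible for [broadcasting](https://docs.scipy.org/doc/numpy-1.13.0/user/basics.broadcasting.html): a scalar or a length-1 array is stretched to match the other operand, whereas two one-dimensional arrays of different lengths greater than 1 raise a `ValueError`. For example, $v+3$ adds 3 to each entry of $v$.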

In [8]:
print(v+3)

[ 4  5  6  7  8  9 10 11 12 13]
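Broadcasting applies only when the shapes are compatible. The sketch below, using illustrative operands that are not from the book, contrasts a length-1 array (which is broadcast) with a length-3 array (which is not); the flag `mismatch_ok` is a hypothetical name used only to record the outcome:

```python
import numpy as np

v = np.arange(1, 11)

# A length-1 array is stretched to match v, just like the scalar 3
print(v + np.array([3]))

# Arrays of lengths 10 and 3 are not compatible, so Numpy raises an error
try:
    v + np.array([1, 2, 3])
    mismatch_ok = True
except ValueError as err:
    mismatch_ok = False
    print("ValueError:", err)
```

This differs from R, where a shorter vector is recycled to the length of the longer one.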


## Factorials and binomial coefficients 

## Sampling and simulation

## Matching problem simulation

## Birthday problem calculation and simulation

----

&copy; Blitzstein, Joseph K.; Hwang, Jessica. Introduction to Probability (Chapman & Hall/CRC Texts in Statistical Science).