# Introduction to Computation and Python Programming

## Lecture 10

### Today
----------

- Numpy

### Vectorization

- Numpy is all about **Vectorization**
- Start thinking in terms of homogeneous, multidimensional arrays - **vectors**
- see code - start with an example
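
As a minimal sketch of the idea (not the lecture's own code), the same element-by-element addition can be written as an explicit Python loop or as a single array expression. The function names `add_python` and `add_numpy` are hypothetical, chosen only for this illustration.

```python
import numpy as np

def add_python(xs, ys):
    # Pure-Python version: visit the elements pair by pair
    return [x + y for x, y in zip(xs, ys)]

def add_numpy(xs, ys):
    # Vectorized version: one expression applied to whole arrays
    return np.add(xs, ys)

a = [1, 2, 3]
b = [10, 20, 30]

python_result = add_python(a, b)
numpy_result = add_numpy(np.array(a), np.array(b))
```

Both versions compute the same values; the vectorized one hands the loop to Numpy's compiled code instead of running it in the Python interpreter.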

### Itertools

“module \[that\] implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML… Together, they form an ‘iterator algebra’ making it possible to construct specialized tools succinctly and efficiently in pure Python.”

- a module for creating complex iterators
- iterable - any Python object that implements `__iter__()` or `__getitem__()`; lists, for example, are iterables
- `iter()` built-in function returns an `iterator` when called on an iterable
- iterators are composable 
- "iterator algebra" - collection of building blocks that can be combined to form specialized "data pipelines"
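
The points above can be sketched in a few lines. This is an illustrative example under the assumption that a small "pipeline" of squares and a filter is representative; the variable names are made up for the sketch.

```python
from itertools import count, islice

numbers = [1, 2, 3]
it = iter(numbers)   # a list is an iterable; iter() returns an iterator over it
first = next(it)     # consumes the first element
rest = list(it)      # the iterator remembers its position

# Composing iterators into a pipeline:
# infinite counter -> squares -> even squares -> first five results
squares = map(lambda n: n * n, count(1))
even_squares = filter(lambda n: n % 2 == 0, squares)
first_five = list(islice(even_squares, 5))
```

Nothing is computed until `islice` asks for values, which is why the infinite `count(1)` at the start of the pipeline is not a problem.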


### Itertools - example

- `better_grouper()` is better because it can take any iterable as an argument (even infinite iterators)
- by returning an iterator, it can process large iterables using much less memory
- in our example ~630x less memory, ~4x less time
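
The lecture's code is not reproduced in these notes, so the following is only a sketch of one common way to write such a grouper. `naive_grouper` is a hypothetical baseline added for contrast, and the body of `better_grouper` is an assumption about what the lecture version looks like.

```python
from itertools import count, islice

def naive_grouper(inputs, n):
    # Needs len() and slicing, so it only works on sequences,
    # and it builds every group in memory at once.
    num_groups = len(inputs) // n
    return [tuple(inputs[i * n:(i + 1) * n]) for i in range(num_groups)]

def better_grouper(inputs, n):
    # n references to the SAME iterator: zip pulls n consecutive
    # items from it to build each tuple, lazily.
    iters = [iter(inputs)] * n
    return zip(*iters)

eager = naive_grouper(list(range(1, 11)), 2)
lazy = list(better_grouper(range(1, 11), 2))

# Works on an infinite iterator too, as long as only part of it is consumed
from_infinite = list(islice(better_grouper(count(1), 3), 2))
```

In this sketch both functions drop trailing items that do not fill a complete group.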


### Random Walk with Iterators

- `accumulate` - make an iterator that returns accumulated sums, or accumulated results of other binary functions
- No loops
- Saved 85% of the computation time compared to the previous version without `itertools`
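
A minimal sketch of a one-dimensional random walk built on `accumulate`, assuming steps of -1 or +1 and a start at position 0. The function name, the step set, and the `seed` parameter are assumptions for illustration, not the lecture's implementation.

```python
import random
from itertools import accumulate

def random_walk(n_steps, seed=None):
    rng = random.Random(seed)
    # Draw all steps at once: each one is -1 or +1
    steps = rng.choices([-1, 1], k=n_steps)
    # The positions are the running sums of the steps, starting from 0
    return [0] + list(accumulate(steps))

walk = random_walk(1000, seed=42)
```

There is no explicit `for` loop: `choices` produces the steps and `accumulate` produces the running position.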

### Numpy Vectorization

- Easy to translate from **`itertools`** to **`numpy`**
- 500x faster computation
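
To show how directly the translation can go, here is a sketch of the same kind of walk in Numpy, where `np.cumsum` plays the role of `accumulate`. The function name and the choice of -1/+1 steps are assumptions made for this example.

```python
import numpy as np

def random_walk_numpy(n_steps, seed=None):
    rng = np.random.default_rng(seed)
    # All steps drawn in one call: an array of -1 and +1 values
    steps = rng.choice([-1, 1], size=n_steps)
    # Cumulative sum gives the position after each step; prepend the start
    return np.concatenate(([0], np.cumsum(steps)))

walk_np = random_walk_numpy(1000, seed=42)
```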

### Numpy Arrays

- A grid of values, all of the same type, indexed by a tuple of nonnegative integers
- Number of dimensions is the **rank** of the array
- The **shape** of the array is a tuple giving its size along each dimension
- see code
- Numpy provides many functions to create arrays
- [Array Indexing](https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html)
    - Slicing: Similar to Python lists - specify a slice for each dimension of the array
        - basic syntax: `start:stop:step`
        - Unlike slicing a Python list, in numpy, a slice is a view into the array - modifying it will modify the original array
    - Integer Indexing: Unlike a slice (which is always a subarray of the original array), integer indexing can be used to create arbitrary arrays
        - interesting trick: mutating a single element in each row of a matrix (see code)
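
The following sketch walks through the points above on a small example array. The array contents and variable names are made up for illustration and are not taken from the lecture code.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])
rank = a.ndim      # number of dimensions: 2
shape = a.shape    # size along each dimension: (3, 4)

# A few of the array-creation functions
zeros = np.zeros((2, 2))
ones = np.ones((1, 2))
identity = np.eye(2)

# Slicing returns a view: writing through it changes the original array
block = a[:2, 1:3]   # rows 0-1, columns 1-2
block[0, 0] = 77     # this is a[0, 1]

# Integer indexing builds a new array from arbitrary positions
picked = a[[0, 1, 2], [0, 1, 0]]   # a[0, 0], a[1, 1], a[2, 0]

# The trick: mutate one chosen element in each row
cols = np.array([0, 2, 0])         # column to touch in each row
a[np.arange(3), cols] += 10
```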

### Datatypes

- Every numpy array is homogeneous (same type of elements)
- Numpy provides a large set of numeric datatypes
- Numpy tries to guess the datatype when an array is created, but array-creation functions also accept an explicit datatype as an optional argument
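
A short sketch of datatype inference and the optional `dtype` argument; the variable names are illustrative only.

```python
import numpy as np

ints = np.array([1, 2])                        # inferred as an integer type
floats = np.array([1.0, 2.0])                  # inferred as float64
forced = np.array([1, 2], dtype=np.float32)    # explicit datatype
mixed = np.array([1, 2.5])                     # integers are upcast to float

dtypes = [arr.dtype for arr in (ints, floats, forced, mixed)]
```

The exact integer type chosen for `ints` depends on the platform and Numpy version, so the sketch does not assume a particular width.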

### Array Math

- Basic math functions operate elementwise on arrays
- Available as operator overloads and as functions
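
A small sketch showing the operator form next to the function form, using made-up 2x2 arrays.

```python
import numpy as np

x = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([[5.0, 6.0], [7.0, 8.0]])

sum_op = x + y          # operator overload
sum_fn = np.add(x, y)   # equivalent function

prod = x * y            # elementwise product, not matrix multiplication
roots = np.sqrt(x)      # elementwise square root
```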

### Other Array Manipulations

- Reshape arrays, e.g. transpose using the `T` attribute of an array
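
For instance, on a small illustrative array (the names here are hypothetical):

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
t = m.T                          # transpose: shape (3, 2)
flat = m.reshape(-1)             # -1 lets Numpy infer the length
```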

### Broadcasting

- Work with arrays of different shapes
- Use the smaller array multiple times to perform some operation on the larger array
- Rules of Numpy broadcasting:
    - If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both shapes have the same length
    - The two arrays are said to be compatible in a dimension if they have the same size in that dimension, or if one of the arrays has size 1 in that dimension
    - The arrays can be broadcast together if they are compatible in all dimensions
    - After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of shapes of the two input arrays
    - In any dimension where one array had size 1 and the other array had size greater than 1, the first array behaves as if it were copied along that dimension
- Functions that support broadcasting are known as **universal functions**
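
The rules above can be traced on a small example. This is an illustrative sketch with made-up arrays: a `(4, 3)` matrix combined with a `(3,)` row, with a `(4, 1)` column, and with an incompatible `(2,)` array.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9],
                   [10, 11, 12]])   # shape (4, 3)

# Shape (3,) is treated as (1, 3), then behaves as if copied along axis 0
row = np.array([1, 0, 1])
added = matrix + row                # shape (4, 3)

# Shape (4, 1) behaves as if copied along axis 1
column = np.array([[10], [20], [30], [40]])
scaled = matrix * column            # shape (4, 3)

# Shape (2,) is treated as (1, 2): sizes 3 and 2 are not compatible
try:
    matrix + np.array([1, 2])
    incompatible = False
except ValueError:
    incompatible = True
```

In each compatible case the result shape is the elementwise maximum of the two (padded) input shapes.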