# NUMPY

Unlike R/MATLAB, Python relies on libraries for numerics.

- No built-in multi-dimensional array type for numeric computation
- However, packages like `numpy` are _quasi-standard_

## Basic array type

`numpy.ndarray` (created with `np.array`), which is a multi-dimensional array of numbers.

In [None]:
import numpy as np # <- import a library, like include/require in other languages

A = np.array([
    [0,1,2],
    [2,3,4],
    [4,5,6],
    [6,7,8]])
A

In [None]:
print(A[0,0])
print(A[0,1])
print(A[1,0])
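
Beyond indexing single elements, an array's attributes describe its layout. The sketch below rebuilds the 4x3 array from above under the illustrative name `grid` (not part of the original cells) and prints its shape, number of dimensions, and size:

```python
import numpy as np

# Same values as the array A defined above; `grid` is just an illustrative name.
grid = np.array([
    [0, 1, 2],
    [2, 3, 4],
    [4, 5, 6],
    [6, 7, 8]])

print(grid.shape)  # (rows, columns): (4, 3)
print(grid.ndim)   # number of dimensions: 2
print(grid.size)   # total number of elements: 12
print(grid[3, 2])  # last row, last column: 8
```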

## Why do we need numpy?

Couldn't we just use lists?

In [None]:
A = np.array([1,2,3])
B = [1, 2, 3]

1. Extra numeric methods
2. Efficiency
3. Expressiveness


In [None]:
A

In [None]:
A.mean()

In [None]:
A.std()

In [None]:
A.max()

You can also use numeric operations with arrays. They work **element-wise**:

In [None]:
A + 1

In [None]:
A * 2

Operations with two arrays also work **element-wise**:

In [None]:
B = np.array([1,1,2])
A

In [None]:
A + B

In [None]:
A * B
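
For contrast, here is a small sketch of what the same operators do on plain Python lists. The variable names are illustrative; the point is that `*` and `+` repeat and concatenate lists instead of doing arithmetic:

```python
import numpy as np

values_list = [1, 2, 3]
values_array = np.array([1, 2, 3])

print(values_list * 2)    # repeats the list: [1, 2, 3, 1, 2, 3]
print(values_array * 2)   # element-wise: [2 4 6]

print(values_list + [1, 1, 2])              # concatenates: [1, 2, 3, 1, 1, 2]
print(values_array + np.array([1, 1, 2]))   # element-wise: [2 3 5]
```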

## Matrix/vector operations

In [None]:
A = np.array([
            [1,0,1],
            [0,2,0],
            [0,0,1]
])
B = np.array([1,2,3])

print(np.dot(A, B))
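
As a hedged aside, the `@` operator computes the same product as `np.dot` for this matrix and vector, while `*` remains element-wise. The names `M` and `v` below are illustrative stand-ins for the `A` and `B` above:

```python
import numpy as np

M = np.array([
    [1, 0, 1],
    [0, 2, 0],
    [0, 0, 1]])
v = np.array([1, 2, 3])

print(np.dot(M, v))   # matrix-vector product: [4 4 3]
print(M @ v)          # same product with the @ operator: [4 4 3]
print(M * v)          # NOT a matrix product: each row is multiplied element-wise by v
```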

## Numpy arrays can be very efficient

A list in Python is an array of pointers to objects:


![python list](./images/python-list.svg)

while a numpy array stores its values directly in one contiguous block of memory:

![numpy array](./images/numpy-array.svg)

Computers are **really good** at processing contiguous blocks of memory.

These arrays can also be passed to other libraries (including those written in C or FORTRAN).
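
The memory layout can be inspected from Python. This is a minimal sketch, assuming an explicitly 64-bit integer array so that the byte counts are the same on every platform:

```python
import numpy as np

data = np.arange(1024, dtype=np.int64)

print(data.itemsize)               # bytes per element: 8
print(data.nbytes)                 # total bytes of data: 8192
print(data.flags['C_CONTIGUOUS'])  # True: stored as one contiguous block
```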

## Timing measurements

One simple example (using the magic command `%timeit`):

In [None]:
a = list(range(1024))
%timeit sum(a)

In [None]:
b = np.arange(1024)
%timeit b.sum()

Actually, that is not a big difference, but the gap grows with larger arrays and more complex operations:

In [None]:
a = list(range(1024*1024))
%timeit sum(v*v for v in a)

In [None]:
b = np.arange(1024*1024)
%timeit (b**2).sum()

Now, it starts to matter.
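
`%timeit` only works inside IPython or Jupyter. Outside a notebook, a rough equivalent can be sketched with the standard-library `timeit` module. The repeat counts below are arbitrary choices, and the measured times depend on the machine, so no specific numbers are claimed:

```python
import timeit
import numpy as np

n = 1024 * 1024
a = list(range(n))
b = np.arange(n, dtype=np.int64)  # explicit 64-bit integers so the squares cannot overflow

# Run each statement a few times and keep the best (lowest) time.
list_time = min(timeit.repeat(lambda: sum(v * v for v in a), number=1, repeat=3))
array_time = min(timeit.repeat(lambda: (b ** 2).sum(), number=1, repeat=3))

print(f"list:  {list_time:.4f} s")
print(f"array: {array_time:.4f} s")
```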

## Numpy arrays are *homogeneous*

- All members of an array have the same type
- Usually integer or floating point (boolean and other types exist too)
- Defined **when you first create the array**

In [None]:
A = np.array([0, 1, 2]) # <- IMPLICIT TYPE
A.dtype

In [None]:
B = np.array([0.5, 1.1, 2.1])
B.dtype

In [None]:
C = np.array([0, 1, 2], dtype=np.float64) # <- EXPLICIT TYPE
C.dtype
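
Because the type is fixed at creation, it helps to know what happens when types meet. The following sketch (with illustrative variable names) shows promotion when mixing arrays, explicit conversion with `astype`, and truncation when a float is assigned into an integer array:

```python
import numpy as np

ints = np.array([0, 1, 2])
floats = np.array([0.5, 1.1, 2.1])

mixed = ints + floats                # the result is promoted to floating point
print(mixed.dtype)                   # float64

truncated = floats.astype(np.int64)  # explicit conversion drops the fraction
print(truncated)                     # [0 1 2]

ints[0] = 3.9                        # the array stays integer, so the value is truncated
print(ints)                          # [3 1 2]
```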

Besides the speed, a fixed element type is also more expressive: the `dtype` states exactly what kind of data the array holds.

## Numpy data types

- `np.int8`, `np.int16`, `np.int32`, `np.int64`
- `np.uint8`, `np.uint16`, `np.uint32`, `np.uint64`
- `np.float32`, `np.float64`, `np.float16`, (and, sometimes, `np.float128`)
- `np.bool_`

Note that integer types have a fixed width, so results can overflow or underflow and wrap around:

In [None]:
A = np.array([1,2,3], np.uint8)
A - 10
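
To make the wrap-around concrete, here is a sketch that subtracts within `uint8` and then shows one possible workaround, widening to a signed type first. The choice of `np.int16` is an assumption for illustration; any signed type wide enough for the result would do:

```python
import numpy as np

small = np.array([1, 2, 3], dtype=np.uint8)

wrapped = small - np.uint8(10)      # stays uint8, so results wrap modulo 256
print(wrapped)                      # [247 248 249]

safe = small.astype(np.int16) - 10  # widen to a signed type before subtracting
print(safe)                         # [-9 -8 -7]
```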

## Reduce along axis operations

If you have a multidimensional array, you can reduce it along one of its axes:

In [None]:
A = np.array([
    [0,0,1],
    [1,2,3],
    [2,4,2],
    [1,0,1]])

In [None]:
A.max(axis=0)

In [None]:
A.max(axis=1)

In [None]:
A.mean(axis=0)
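
A sketch of how the axis argument affects the result, using the same values as above under the illustrative name `table`: `axis=0` collapses the rows (one result per column), `axis=1` collapses the columns (one result per row), and `keepdims=True` keeps the reduced axis with length 1.

```python
import numpy as np

table = np.array([
    [0, 0, 1],
    [1, 2, 3],
    [2, 4, 2],
    [1, 0, 1]])

print(table.max(axis=0))    # one value per column: [2 4 3]
print(table.max(axis=1))    # one value per row: [1 3 4 1]
print(table.mean(axis=0))   # column means: [1.   1.5  1.75]
print(table.sum())          # no axis: reduce over everything: 17

# keepdims=True keeps the reduced axis, giving a (4, 1) column instead of shape (4,)
print(table.max(axis=1, keepdims=True).shape)
```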

## Slicing

In [None]:
A = np.array([
    [0,1,2],
    [2,3,4],
    [4,5,6],
    [6,7,8]])

A.shape

In [None]:
A[0]

In [None]:
A[0].shape

In [None]:
A[1]

In [None]:
A[:,2]
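
A few more slicing patterns, sketched on the same 4x3 values (the name `grid` is illustrative). Ranges use the usual Python `start:stop:step` form, and each axis is sliced independently:

```python
import numpy as np

grid = np.array([
    [0, 1, 2],
    [2, 3, 4],
    [4, 5, 6],
    [6, 7, 8]])

print(grid[1:3])         # rows 1 and 2: [[2 3 4] [4 5 6]]
print(grid[:, :2].shape) # all rows, first two columns: (4, 2)
print(grid[::2, 2])      # every other row, column 2: [2 6]
print(grid[-1])          # last row: [6 7 8]
```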

## Slices share memory!

A slice is a *view* into another array:

In [None]:
A

In [None]:
B = A[0]
B[0] = -1
A
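
One way to check whether two arrays share memory is `np.shares_memory`. The sketch below (with illustrative names) contrasts a view with an explicit copy:

```python
import numpy as np

grid = np.array([
    [0, 1, 2],
    [2, 3, 4]])

row_view = grid[0]          # a view: no data is copied
row_copy = grid[0].copy()   # an independent copy

print(np.shares_memory(grid, row_view))  # True
print(np.shares_memory(grid, row_copy))  # False

row_copy[0] = -1            # does not touch grid
print(grid[0, 0])           # 0
row_view[0] = -1            # writes through to grid
print(grid[0, 0])           # -1
```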

## Argument passing is by reference

In [None]:
def double_array(A):
    A *= 2  # in-place: modifies the caller's array

A = np.arange(20)
double_array(A)

A

You need to be careful, but you can always make a copy:

In [None]:
A = np.arange(20)
B = A[0:10].copy()
double_array(B)
print(A)
print(B)
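
An alternative to copying at the call site is to write the function so that it never modifies its argument. This is a sketch of that design choice; the function name `doubled` is hypothetical and not part of the original notebook:

```python
import numpy as np

def doubled(values):
    # `values * 2` builds a new array; the argument itself is left untouched.
    return values * 2

original = np.arange(5)
result = doubled(original)

print(original)  # [0 1 2 3 4]
print(result)    # [0 2 4 6 8]
```

The in-place version avoids allocating a second array, so which style fits better depends on whether the caller still needs the original values.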

## Logical Arrays

Arrays of booleans:

In [None]:
A = np.array([-1,0,1,2,-2,3,4,-2])
A > 0

In [None]:
( (A > 0) & (A < 3) ).mean()
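
Since `True` counts as 1 and `False` as 0, reductions on boolean arrays answer counting questions. A small sketch with the same values (names are illustrative); note that the parentheses around each comparison are needed because `&` binds more tightly than `>` and `<`:

```python
import numpy as np

values = np.array([-1, 0, 1, 2, -2, 3, 4, -2])
mask = (values > 0) & (values < 3)   # True only for 1 and 2

print(mask.sum())    # how many elements match: 2
print(mask.mean())   # fraction of elements that match: 0.25
print(mask.any())    # does at least one match? True
print(mask.all())    # do all match? False
```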

## Logical indexing

In [None]:
A[A < 0] = 0
# or, equivalently: multiplying by a boolean mask zeroes the negative entries
A *= (A > 0)
A
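
Boolean masks can also select values or build a new array without modifying the original. A minimal sketch, using `np.where` as one possible non-mutating alternative (variable names are illustrative):

```python
import numpy as np

values = np.array([-1, 0, 1, 2, -2, 3, 4, -2])

positives = values[values > 0]             # keep only the matching elements
print(positives)                           # [1 2 3 4]

clipped = np.where(values < 0, 0, values)  # new array: 0 where negative, else the value
print(clipped)                             # [0 0 1 2 0 3 4 0]
print(values)                              # unchanged: [-1  0  1  2 -2  3  4 -2]
```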

## Some helper functions

In [None]:
np.zeros((10,10))

In [None]:
np.ones(10)

In [None]:
A = np.array([1,2,3,4,5])
B = np.zeros_like(A)
B
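
A few related helpers, sketched under the assumption that they are useful next steps (they do not appear in the original cells). Note that `np.zeros_like` copies both shape and dtype from its argument, while `np.zeros` defaults to floating point:

```python
import numpy as np

template = np.array([1, 2, 3, 4, 5])

print(np.zeros_like(template).dtype == template.dtype)  # True: dtype is inherited
print(np.zeros(3).dtype)                                 # float64 by default

print(np.full((2, 3), 7))        # 2x3 array filled with 7
print(np.arange(0, 10, 2))       # start, stop (exclusive), step: [0 2 4 6 8]
print(np.linspace(0.0, 1.0, 5))  # 5 evenly spaced points: [0.   0.25 0.5  0.75 1.  ]
```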