# NumPy: a Tensor Library

<img src='images/scipyeco.png' width=601>

NumPy is a foundational part of Python's "SciPy" scientific computing stack.

NumPy provides:
* Native-backed tensor data structures
* Tensor CRUD (create, read, update, delete), including slicing and reshaping, plus random number generation
* Vectorized mathematical operations
* Buffers and APIs for other tools, such as Pandas or OpenCV, to use

## Getting Started

Although NumPy interacts smoothly with Python numbers and collections, NumPy is backed by native memory buffers, so it makes sense to keep our computation in NumPy as much as possible rather than moving data back and forth between NumPy and Python objects.

In [None]:
import numpy as np

# DON'T DO THIS:
my_list = [1, 2, 3]
numpy_array = np.array(my_list)
# Converting back to a list and calling Python's sum() pulls every element out of NumPy
sum(list(numpy_array))

In [None]:
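# DO THIS:

# Build the array in NumPy and sum it there, with no round trip through a Python list
numpy_array = np.linspace(1, 3, 3)  # three evenly spaced floats: 1.0, 2.0, 3.0
numpy_array.sum()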

We can allocate buffers and manipulate them:

In [None]:
a = np.zeros((3,3))

In [None]:
a

Tensor shape and data type are critical to manipulating arrays.

In [None]:
a.shape

In [None]:
a.dtype
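
As a minimal sketch of why the data type matters, the example below allocates the same shape with two different dtypes and compares their memory footprint. The variable names `floats`, `pixels`, and `as_int` are illustrative only and do not appear elsewhere in this notebook.

```python
import numpy as np

# Default allocation uses 64-bit floats
floats = np.zeros((3, 3))

# Same shape, explicitly 8-bit unsigned integers (a common choice for image data)
pixels = np.zeros((3, 3), dtype=np.uint8)

print(floats.dtype, floats.nbytes)  # float64 72  (9 elements x 8 bytes)
print(pixels.dtype, pixels.nbytes)  # uint8 9     (9 elements x 1 byte)

# astype returns a converted copy; the original array keeps its dtype
as_int = floats.astype(np.int32)
print(as_int.dtype, floats.dtype)   # int32 float64
```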

NumPy supports Python-style indexing and slicing, and extends it to multiple axes.

Python indexing recap:
* ranges are denoted start:end and include the start index but exclude the end index
* start and/or end can be omitted, meaning "from the beginning" or "through the end", respectively
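
The recap above can be sketched on a small array. The name `grid` is an illustrative placeholder; each axis gets its own `start:end` range, separated by commas.

```python
import numpy as np

# A 3x4 array: rows are [0..3], [4..7], [8..11]
grid = np.arange(12).reshape(3, 4)

top_middle = grid[0:2, 1:3]   # rows 0-1, columns 1-2 (end index excluded)
first_col = grid[:, 0]        # every row, column 0
lower_left = grid[1:, :2]     # rows 1 onward, columns before index 2

print(top_middle)   # [[1 2] [5 6]]
print(first_col)    # [0 4 8]
print(lower_left)   # [[4 5] [8 9]]
```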

In [None]:
a[1,1] = 10
a

In [None]:
a[:, 2] = 1
a[2, :] = 3
a

Prefer NumPy functions to their Python equivalents: they support *vectorization*, meaning a single call operates on every element of an array.

In [None]:
np.sqrt(a)

In [None]:
np.transpose(a)
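
To make vectorization concrete, here is a small sketch with nonzero values (the array `values` is made up for illustration). Arithmetic operators and NumPy functions apply element by element without an explicit loop.

```python
import numpy as np

values = np.array([1.0, 4.0, 9.0])

roots = np.sqrt(values)          # elementwise square root: [1. 2. 3.]
scaled = values * 2 + 1          # elementwise arithmetic:  [ 3.  9. 19.]
total = (values * roots).sum()   # 1*1 + 4*2 + 9*3 = 36.0

print(roots, scaled, total)
```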

Arrays can be reshaped as needed, provided the total number of elements stays the same.

A `-1` in a shape means "infer this dimension from the total element count and the other dimensions". At most one dimension can be `-1`.

`reshape` can also change the number of axes, but when the goal is specifically to add an axis, an explicit instruction such as `np.expand_dims` makes the intent clearer.

In [None]:
a.reshape(1, 9)

In [None]:
a.reshape(3, -1)

In [None]:
b = np.expand_dims(a, 2)
b

In [None]:
a.shape

In [None]:
b.shape
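
The following sketch shows how `-1` is resolved and what happens when the element count does not divide evenly. The array `m` and the other names are illustrative only.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)   # 6 elements in total

flat = m.reshape(-1)             # -1 resolves to 6 -> shape (6,)
column = m.reshape(-1, 1)        # -1 resolves to 6 -> shape (6, 1)
three_d = m.reshape(3, -1, 1)    # -1 resolves to 2 -> shape (3, 2, 1)
print(flat.shape, column.shape, three_d.shape)

# Two equivalent ways to append a trailing axis of length 1
expanded = np.expand_dims(m, 2)
indexed = m[:, :, np.newaxis]
print(expanded.shape, indexed.shape)   # (2, 3, 1) (2, 3, 1)

# 6 elements cannot be divided into 4 equal rows, so this raises ValueError
try:
    m.reshape(4, -1)
    reshape_failed = False
except ValueError as e:
    reshape_failed = True
    print(e)
```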

You can reduce dimensionality by indexing an axis with a single integer while slicing the others

In [None]:
b[:,:,0]

In [None]:
b[:,:,0].shape

Another common use case is generating random values and initializing a tensor with them

In [None]:
random_vals = np.random.randn(3,3)
random_vals
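
As an alternative sketch, NumPy also offers a `Generator` object created with `np.random.default_rng`, which can be seeded for reproducible results. The seed value `42` and the variable names below are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator produces the same sequence on every run
rng = np.random.default_rng(seed=42)

normal_vals = rng.standard_normal((3, 3))   # standard normal, like np.random.randn(3, 3)
uniform_vals = rng.random((3, 3))           # uniform samples in [0, 1)

print(normal_vals.shape, uniform_vals.shape)

# Re-creating the generator with the same seed reproduces the first draw
repeat_vals = np.random.default_rng(seed=42).standard_normal((3, 3))
print(np.array_equal(normal_vals, repeat_vals))
```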

When operating on multiple tensors, make sure their shapes are compatible. If they are not, the best case is an error; the worst case is unintended *broadcasting* (which we won't go into here).

In [None]:
try:
    np.append(b, random_vals, axis=2)
except Exception as e:
    print(e)

In [None]:
random_vals.reshape(3,3,1)

In [None]:
c = np.append(b, random_vals.reshape(3,3,1), axis=2)
c

In [None]:
c.shape

In [None]:
c[:,:,0]

In [None]:
c[:,:,1]
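
The same kind of layering can be sketched with `np.stack` and `np.concatenate`, which some readers may find more explicit than `np.append` with an `axis` argument. The arrays `first` and `second` are stand-ins for illustration, not the `b` and `random_vals` used above.

```python
import numpy as np

first = np.zeros((3, 3))
second = np.ones((3, 3))

# np.stack creates a new axis and joins the 2-D arrays along it
stacked = np.stack([first, second], axis=2)

# np.concatenate joins along an existing axis, so each input needs that axis first
joined = np.concatenate(
    [first[:, :, np.newaxis], second[:, :, np.newaxis]], axis=2
)

print(stacked.shape, joined.shape)       # (3, 3, 2) (3, 3, 2)
print(np.array_equal(stacked, joined))   # True
```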

## Faster calculation

NumPy operations are implemented in native code, so they can be much faster than their pure-Python equivalents

In [None]:
import math

py_million = list(range(1000000))
numpy_million = np.array(py_million)

In [None]:
%%timeit

py_sqrt = [math.sqrt(x) for x in py_million]

In [None]:
%%timeit

numpy_sqrt = [np.sqrt(x) for x in numpy_million]

__Hmmm__ ... What went wrong?

1. Looping in Python moves data between NumPy and Python one element at a time, and that overhead dominates. Let's keep the work in NumPy.
2. NumPy supports vectorization: applying an operation to a whole array at once instead of iterating.

In [None]:
%%timeit 

numpy_sqrt = np.sqrt(numpy_million)

Ok, that's better!
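
Outside a notebook, where the `%%timeit` magic is unavailable, the same comparison can be sketched with the standard-library `timeit` module. The array size and repeat count below are arbitrary and smaller than in the cells above so the sketch runs quickly; absolute timings will vary by machine.

```python
import math
import timeit

import numpy as np

n = 100_000
py_values = list(range(n))
np_values = np.arange(n)

# Each call returns total seconds for `number` repetitions
loop_time = timeit.timeit(lambda: [math.sqrt(x) for x in py_values], number=3)
elementwise_time = timeit.timeit(lambda: [np.sqrt(x) for x in np_values], number=3)
vectorized_time = timeit.timeit(lambda: np.sqrt(np_values), number=3)

print(f"Python loop with math.sqrt: {loop_time:.4f} s")
print(f"Python loop with np.sqrt:   {elementwise_time:.4f} s")
print(f"Vectorized np.sqrt:         {vectorized_time:.4f} s")
```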

## Matplotlib image rendering from NumPy

Matplotlib, a foundational Python plotting package, can render NumPy arrays as images.

In [None]:
import matplotlib.pyplot as plt

plt.imshow(a)

In [None]:
plt.imshow(a, cmap='gray')

In [None]:
plt.imshow(c[:,:,1])

## NumPy Image Lab 1

1. Create a buffer for a 100x100 pixel grayscale image
2. Make the background 50% gray
3. Draw 5 concentric squares

*HINT* the intensity within the colormap is relative to the values present, so if you have two values, 10 and 20, and you're rendering in grayscale, the 10 will end up black and the 20 white. Sometimes this is what you want, but for image processing you may need to work around this, for example by passing fixed `vmin` and `vmax` values to `plt.imshow`.
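
The hint can be illustrated with a rough NumPy-only sketch of relative versus fixed scaling. The helper `scale_for_display` is hypothetical and only approximates a linear mapping from data values to the 0-to-1 colormap range; it is not Matplotlib's own code, and the 0 to 255 range is an assumed convention for 8-bit grayscale.

```python
import numpy as np

def scale_for_display(values, vmin=None, vmax=None):
    """Linearly map values to [0, 1]; hypothetical helper for illustration."""
    values = np.asarray(values, dtype=float)
    low = values.min() if vmin is None else vmin
    high = values.max() if vmax is None else vmax
    if high == low:
        # A constant image has no range to stretch; map everything to 0
        return np.zeros_like(values)
    return np.clip((values - low) / (high - low), 0.0, 1.0)

two_tone = np.array([[10, 20], [20, 10]])

# Relative scaling: 10 becomes 0.0 (black) and 20 becomes 1.0 (white)
relative = scale_for_display(two_tone)
print(relative)

# Fixed scaling over an assumed 0-255 range: both values stay very dark
fixed = scale_for_display(two_tone, vmin=0, vmax=255)
print(fixed)
```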