# Numpy Shapes

In this notebook we will see:

- how you can create numpy arrays 
- how you can reshape arrays 
- what consequences this might have for linear algebra 
- indices in numpy 

## Why Bother 

We'll be dealing with numpy arrays a lot, and probably the most annoying part of working with them has to do with their shapes.

# Comparison and Difference with Lists

Numpy arrays are very different from Python lists. You can easily convert between the two, but note that they
behave differently around operators.

In [None]:
import numpy as np

In [None]:
py_arr = [1, False, 1.]
np_arr = np.array(py_arr)

print(py_arr, type(py_arr))
print(np_arr, type(np_arr))

In [None]:
py_arr * 2

In [None]:
np_arr * 2
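
To make the operator difference concrete, here is a minimal sketch that puts both containers side by side. The names `py_list`, `np_ints` and `back_to_list` are only illustrative and do not appear elsewhere in this notebook.

```python
import numpy as np

py_list = [1, 2, 3]
np_ints = np.array(py_list)

# `*` repeats a list, but multiplies an array element by element.
print(py_list * 2)         # [1, 2, 3, 1, 2, 3]
print(np_ints * 2)         # [2 4 6]

# `+` concatenates lists, but adds arrays element by element.
print(py_list + py_list)   # [1, 2, 3, 1, 2, 3]
print(np_ints + np_ints)   # [2 4 6]

# Mixed input types are upcast to one common dtype (float64 here).
mixed = np.array([1, False, 1.])
print(mixed.dtype)         # float64

# Converting back to a plain Python list.
back_to_list = np_ints.tolist()
print(back_to_list, type(back_to_list))
```

Note that the list keeps each element's own type, while the array stores everything with a single dtype.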

# Creating Arrays 

You can declare arrays very easily in numpy. Here are some common initialization functions.

In [None]:
np.zeros(3)

In [None]:
np.ones(3)

In [None]:
np.zeros((3, 3))

In [None]:
np.ones((3, 3))

In [None]:
z = np.zeros((3,10))
np.ones_like(z)

In [None]:
np.full((3,3), fill_value=10)

In [None]:
z = np.zeros((3,10))
np.full_like(z, fill_value=5)

In [None]:
np.eye(3)

In [None]:
np.arange(1, 4)

In [None]:
np.arange(1, 4, 0.1)

In [None]:
np.linspace(start=1, stop=10, num=20)

In [None]:
np.logspace(start=1, stop=7, num=7, base=10)
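
Since this notebook is about shapes, it can help to check what these functions hand back. The following is a small sketch; the variable names are illustrative.

```python
import numpy as np

z = np.zeros((3, 10))
like = np.ones_like(z)           # copies the shape and dtype of `z`
print(like.shape, like.dtype)    # (3, 10) float64

# `arange` excludes the stop value; `linspace` includes it by default.
ints = np.arange(1, 4)
grid = np.linspace(start=1, stop=10, num=20)
print(ints)                      # [1 2 3]
print(grid.shape, grid[0], grid[-1])

# `logspace` returns base ** exponent for evenly spaced exponents.
powers = np.logspace(start=1, stop=7, num=7, base=10)
print(powers.shape)              # (7,)
```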

# Structured Arrays

Apart from regular arrays, numpy also allows you to create structured arrays. Structured arrays let us mix datatypes in a single array by giving each field its own name and dtype.

In [None]:
cities = np.array([
    ('Paris', 'France', 2240000), 
    ('Amsterdam', 'The Netherlands', 820000)],
    dtype=[('city', 'U50'), ('country', 'U50'), ('population', 'i4')])


In [None]:
cities['city']

In [None]:
cities['population'] *= 2
cities

In many cases like this, though, we would rather use pandas.
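
As a minimal sketch of what can be done with the fields of a structured array in plain numpy, the example below filters and sorts records. The name `cities_demo` is only illustrative, chosen so that the `cities` array above is left untouched.

```python
import numpy as np

cities_demo = np.array(
    [("Paris", "France", 2240000), ("Amsterdam", "The Netherlands", 820000)],
    dtype=[("city", "U50"), ("country", "U50"), ("population", "i4")],
)

# The array is one-dimensional: one entry per record.
print(cities_demo.shape)                  # (2,)

# Each field behaves like a regular array with its own dtype.
print(cities_demo["population"].dtype)    # int32

# A boolean mask built from one field selects whole records.
big = cities_demo[cities_demo["population"] > 1_000_000]
print(big["city"])

# `np.sort` accepts `order` to sort the records by a field.
by_population = np.sort(cities_demo, order="population")
print(by_population["city"])
```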

# Shapes 

Let's change the shapes of some numpy arrays.

In [None]:
np.arange(15)

In [None]:
np.arange(15).shape

In [None]:
np.arange(15).reshape((3, 5))

In [None]:
np.arange(15).reshape((3, 5)).shape

We can even reshape back to the original shape.

In [None]:
np.arange(15).reshape((3, 5)).reshape(15).shape

The shape of an array can be changed by calling the `.reshape` method. Note that this will not work if the number of elements in the original array does not fit the new shape.

In [None]:
# NBVAL_RAISES_EXCEPTION
np.arange(15).reshape((2,5))
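
The failure above comes down to element counts: 2 * 5 = 10, but the array holds 15 values. The sketch below catches the error and also shows that passing `-1` lets numpy infer one dimension. Variable names are illustrative.

```python
import numpy as np

flat15 = np.arange(15)

# 2 * 5 = 10 elements requested, 15 available: numpy raises a ValueError.
try:
    flat15.reshape((2, 5))
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)

# With -1, numpy infers the missing dimension from the element count.
inferred = flat15.reshape((3, -1))
print(inferred.shape)   # (3, 5)
```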

Also note this subtle difference.

In [None]:
np.arange(15).reshape([3, 5]).reshape((1, 15)).shape

In [None]:
np.arange(15).reshape([3, 5]).reshape((15, 1)).shape

This change might feel very subtle, but it can have important consequences, especially when we use numpy to perform linear algebra.

We'll show this in a moment, but we want to show a few other utility functions that can be used to change the shape of an ndarray.

In [None]:
np.arange(1, 4)

In [None]:
np.arange(1, 4).repeat(5)

In [None]:
np.arange(1, 4).repeat([1,5,10])

In [None]:
np.tile(np.arange(1,4), 2)

In [None]:
np.tile(np.arange(1,4), (3,4))
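
The difference between `repeat` and `tile` is easy to mix up, so here is a small side-by-side sketch with the resulting shapes. The names `base`, `repeated` and `tiled` are illustrative.

```python
import numpy as np

base = np.arange(1, 4)        # [1 2 3]

repeated = base.repeat(2)     # repeats each element in place
tiled = np.tile(base, 2)      # repeats the array as a whole

print(repeated)               # [1 1 2 2 3 3]
print(tiled)                  # [1 2 3 1 2 3]

# With per-element counts, the result length is the sum of the counts.
print(base.repeat([1, 5, 10]).shape)   # (16,)

# A tuple of repetitions tiles along several axes: 3 rows, 4 copies per row.
print(np.tile(base, (3, 4)).shape)     # (3, 12)
```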

Another subtle topic that can cause headaches when dealing with numpy arrays is single-dimensional entries in an array's shape.

In [None]:
print(np.zeros(3).shape)
print(np.zeros((3, 1)).shape)

These arrays do not have the same shape, which can influence your computations. Thankfully, they are pretty easily transformed into one another.

For example by adding a new dimension to an existing array:

In [None]:
a = np.zeros(3)

# These have the same effect since `np.newaxis is None`
print(a[..., None].shape)
print(a[..., np.newaxis].shape)

Or by removing the single-dimensional entries from the other array:

In [None]:
np.zeros((3,1)).squeeze().shape
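
To illustrate how such a length-1 axis can influence a computation, the sketch below adds a `(3,)` array to a `(3, 1)` array. Numpy broadcasts the two into a `(3, 3)` grid rather than adding them element by element. The names `flat` and `col` are illustrative.

```python
import numpy as np

flat = np.arange(3)            # shape (3,)
col = flat[:, np.newaxis]      # shape (3, 1)

# Same shapes: plain elementwise addition.
print((flat + flat).shape)     # (3,)

# (3,) + (3, 1) is broadcast to a (3, 3) result.
print((flat + col).shape)      # (3, 3)

# Two other ways to add the trailing axis.
print(flat.reshape(-1, 1).shape)             # (3, 1)
print(np.expand_dims(flat, axis=-1).shape)   # (3, 1)

# `squeeze` removes every axis of length 1.
print(np.zeros((1, 3, 1)).squeeze().shape)   # (3,)
```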

# Linear Algebra 

Let's do a little bit of linear algebra to confirm that shapes matter. 

In [None]:
mat = np.arange(15).reshape([3, 5])
mat

In [None]:
mat.T

In [None]:
mat * mat

In [None]:
np.dot(mat.T, mat)

In [None]:
np.dot(mat, mat.T)

Since Python 3.5 we have a nifty `@` operator for matrix multiplication.

In [None]:
mat @ mat.T

But even with this, the matrices still need to fit.

In [None]:
# NBVAL_RAISES_EXCEPTION
mat @ mat
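
The rule behind "fitting" is that an `(n, k)` array can be multiplied with a `(k, m)` array, giving an `(n, m)` result. The sketch below checks this for `mat` and for the `(1, 15)` and `(15, 1)` shapes from earlier. The names `row` and `col` are illustrative.

```python
import numpy as np

mat = np.arange(15).reshape((3, 5))
row = np.arange(15).reshape((1, 15))
col = np.arange(15).reshape((15, 1))

# (n, k) @ (k, m) -> (n, m): the inner dimensions must agree.
print((mat @ mat.T).shape)   # (3, 3)
print((mat.T @ mat).shape)   # (5, 5)
print((row @ col).shape)     # (1, 1)
print((col @ row).shape)     # (15, 15)

# (3, 5) @ (3, 5) has inner dimensions 5 and 3, so it fails.
try:
    mat @ mat
    matmul_failed = False
except ValueError as err:
    matmul_failed = True
    print("matmul failed:", err)
```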

# Selections

NumPy supports indexing and slicing just like lists do, but offers some extra flexibility.

In [None]:
arr = np.array([9, 4, 3, 5, 2, 1, 5, 3])
arr[1]

In [None]:
arr[1:5]

In [None]:
arr[1:5] = 3
arr 

In [None]:
arr == 3

(This is called a boolean mask.)

In [None]:
arr[arr == 3]

In [None]:
arr[arr != 3]
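
Masks can also be combined. The following is a minimal sketch using the element-wise `&` operator; the names `values` and `mask` are illustrative and separate from `arr` above.

```python
import numpy as np

values = np.array([9, 4, 3, 5, 2, 1, 5, 3])

# Parentheses are needed because `&` binds tighter than the comparisons.
mask = (values > 2) & (values < 6)

print(mask.dtype)             # bool
print(values[mask])           # [4 3 5 5 3]

# Summing a mask counts the True entries.
print(mask.sum())             # 5

# Positions of the True entries.
print(np.flatnonzero(mask))   # [1 2 3 6 7]
```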

Boolean masks work on multidimensional arrays too!

In [None]:
orig = np.random.normal(loc=0, scale=1, size=[4,4])
orig

In [None]:
deriv = orig.copy()

In [None]:
deriv < 0

In [None]:
deriv[deriv < 0] = 0 

In [None]:
orig

In [None]:
deriv
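
The `.copy()` call above matters because basic slices share memory with the array they came from. As a hedged sketch of that behaviour (the names `source`, `view`, `copied` and `masked` are illustrative):

```python
import numpy as np

source = np.arange(6)

view = source[1:4]             # basic slice: a view on the same memory
copied = source.copy()         # independent copy
masked = source[source > 2]    # boolean-mask indexing returns a copy

print(np.shares_memory(source, view))     # True
print(np.shares_memory(source, copied))   # False
print(np.shares_memory(source, masked))   # False

# Writing through the view changes the source, but not the copy.
view[:] = 0
print(source)   # [0 0 0 0 4 5]
print(copied)   # [0 1 2 3 4 5]
```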

# Axis 

In [None]:
mat.ndim

In [None]:
mat.reshape((3,5)).ndim

In [None]:
mat.max()

In [None]:
mat.max(axis=1)

In [None]:
mat.max(axis=0)
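
One way to read the `axis` argument is that it names the axis that gets collapsed. The sketch below checks the resulting shapes for the same `(3, 5)` matrix and shows the `keepdims` option, which keeps the collapsed axis with length 1.

```python
import numpy as np

mat = np.arange(15).reshape((3, 5))

# axis=0 collapses the rows: one maximum per column.
col_max = mat.max(axis=0)
print(col_max.shape)    # (5,)

# axis=1 collapses the columns: one maximum per row.
row_max = mat.max(axis=1)
print(row_max.shape)    # (3,)

# keepdims=True keeps the reduced axis with length 1.
row_max_2d = mat.max(axis=1, keepdims=True)
print(row_max_2d.shape)   # (3, 1)
```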