# Intro to NumPy

NumPy provides efficient storage and manipulation for **numerical** arrays. NumPy arrays are similar to Python's built-in list type, but every element of an array shares a single data type. Many kinds of data can be thought of as n-dimensional arrays of numbers:
* Digital images: two-dimensional arrays of numbers representing pixel brightness across the image area.
* Sound clips: one-dimensional arrays of intensity versus time.


In [1]:
import numpy as np

## Creating NumPy Arrays from Scratch
Using the routines built into NumPy is a quick and efficient way to build arrays.

In [2]:
# Create an array filled with zeroes
np.zeros(10, dtype=int)

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

In [3]:
# Create an n x n matrix filled with ones
n = 3
np.ones((n,n), dtype=float)

array([[ 1.,  1.,  1.],
       [ 1.,  1.,  1.],
       [ 1.,  1.,  1.]])

In [4]:
# Create a 3x5 matrix filled with the value x
x = 5
np.full((3,5), x)

array([[5, 5, 5, 5, 5],
       [5, 5, 5, 5, 5],
       [5, 5, 5, 5, 5]])

In [5]:
# Create array of n values evenly spaced between 0 and 1
n = 5
np.linspace(0, 1, n)

array([ 0.  ,  0.25,  0.5 ,  0.75,  1.  ])

In [6]:
# Create an n x n matrix of uniformly distributed random values in [0, 1)
n = 3
np.random.random((n,n))

array([[ 0.74307657,  0.20648467,  0.2057672 ],
       [ 0.98554723,  0.7288012 ,  0.91423251],
       [ 0.0778825 ,  0.53669756,  0.28672899]])

In [7]:
# Create a 3x3 array of random integers in the half-open interval [0, 10)
np.random.randint(0, 10, (3,3))

array([[3, 5, 9],
       [2, 9, 4],
       [8, 1, 4]])

In [8]:
# Create an n x n identity matrix
n=5
np.eye(n, dtype=int)

array([[1, 0, 0, 0, 0],
       [0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1]])
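
Two more routines are worth a brief sketch: np.array, which converts an existing Python list, and np.arange, which is used later in this notebook. The variable names below are illustrative only and do not appear elsewhere in the notebook.

```python
import numpy as np

# Build an array from an existing Python list
from_list = np.array([1, 4, 2, 5, 3])

# Evenly spaced integers: start at 0, stop before 20, step by 2
stepped = np.arange(0, 20, 2)

print(from_list)
print(stepped)
```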

## NumPy Array Attributes
Before we discuss NumPy array attributes, we will define three arrays with one, two, and three dimensions.

In [9]:
np.random.seed(0) # seed the random number generator for reproducibility

x1 = np.random.randint(10, size=6) # One dimensional array size 6
x2 = np.random.randint(10, size=(3,4)) # Two-dimensional
x3 = np.random.randint(10, size=(3,4,5)) # Three-dimensional

Each NumPy array has the following attributes:
  * ndim: number of dimensions
  * shape: size of each dimension
  * size: total number of elements in the array
  * dtype: data type of the array elements
      * remember that NumPy arrays have a uniform data type
  * Byte size:
      * itemsize: size (in bytes) of each array element
      * nbytes: total size (in bytes) of the array
     

In [10]:
print("x3 ndim: ", x3.ndim)
print("x3 shape: ", x3.shape)
print("x3 size: ", x3.size)
print("x3 dtype: ", x3.dtype)
print("x3 itemsize: ", x3.itemsize)
print("x3 nbytes: ", x3.nbytes)

x3 ndim:  3
x3 shape:  (3, 4, 5)
x3 size:  60
x3 dtype:  int64
x3 itemsize:  8
x3 nbytes:  480
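
The attributes are related to one another: size is the product of the entries in shape, and nbytes is itemsize times size. The short sketch below checks this on a hypothetical array named demo with the same shape as x3. Note that the default integer dtype (and therefore itemsize) can differ by platform and NumPy version, so the byte counts printed above are not guaranteed everywhere.

```python
import numpy as np

# Hypothetical array with the same shape as x3
demo = np.random.randint(10, size=(3, 4, 5))

# size is the product of the entries in shape
print(demo.size == 3 * 4 * 5)

# nbytes is itemsize multiplied by size
print(demo.nbytes == demo.itemsize * demo.size)
```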


# Array Indexing: Accessing Elements
NumPy arrays are indexed much like Python lists, so indexing will feel familiar.

In [11]:
x1

array([5, 0, 3, 3, 7, 9])

In [12]:
x1[0]

5

In [13]:
#index from end of the array
x1[-1]

9

For n-dimensional arrays, items can be accessed using a comma-separated tuple of indices.

In [14]:
x2[0,2]

2

In [15]:
x3[0,1,1]

4

Just like normal Python lists, entries in NumPy arrays can be modified. However, NumPy arrays have a fixed type, so any value assigned must fit the array's dtype. For example, a floating-point value assigned into an integer array is silently truncated.

In [16]:
x2[0,2] = 4
x2[0,2]

4
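
The fixed dtype is easy to trip over. As a minimal sketch (the array name ints is illustrative), assigning a floating-point value into an integer array does not raise an error; the fractional part is simply dropped.

```python
import numpy as np

ints = np.array([5, 0, 3, 3, 7, 9])  # integer dtype

# The float is truncated to fit the integer dtype; no error is raised
ints[0] = 3.14159
print(ints)  # [3 0 3 3 7 9]
```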

## Array slicing, subarrays
Square bracket notation can be extended to access subarrays within arrays using the following notation:

<center>x[start:stop:step]</center>

If any of these are omitted, they default to start=0, stop=*size of dimension*, and step=1.

### One-dimensional subarrays

In [17]:
x = np.arange(10)
x[:5] #first five elements

array([0, 1, 2, 3, 4])

In [18]:
x[5:] #last five elements

array([5, 6, 7, 8, 9])

In [19]:
x[2:8] #middle subarray

array([2, 3, 4, 5, 6, 7])

In [20]:
x[::2] #every other element

array([0, 2, 4, 6, 8])

In [21]:
x[::-1] #every element reversed

array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
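
When step is negative, the defaults for start and stop are swapped, so the slice runs backward from the given start toward the beginning of the array. A small sketch of that behavior:

```python
import numpy as np

x = np.arange(10)

# Every other element, counting backward from index 5
backward = x[5::-2]
print(backward)  # [5 3 1]
```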

### Multi-dimensional subarrays

In [22]:
x2

array([[3, 5, 4, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])

In [24]:
x2[:2, :3] #two rows, three columns

array([[3, 5, 4],
       [7, 6, 8]])

In [25]:
x2[:3, ::2] #all rows, every other column

array([[3, 4],
       [7, 8],
       [1, 7]])

The rows and columns of a multi-dimensional array can even be reversed together:

In [26]:
x2[::-1, ::-1]

array([[7, 7, 6, 1],
       [8, 8, 6, 7],
       [4, 4, 5, 3]])
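
Combining an index with an empty slice (a single colon) picks out a whole row or column. The sketch below uses a hypothetical array named grid that holds the same values as x2 above.

```python
import numpy as np

grid = np.array([[3, 5, 4, 4],
                 [7, 6, 8, 8],
                 [1, 6, 7, 7]])

first_column = grid[:, 0]  # every row, column 0
first_row = grid[0, :]     # row 0, every column

print(first_column)  # [3 7 1]
print(first_row)     # [3 5 4 4]
print(grid[0])       # the trailing empty slice can be omitted for rows
```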

### Subarrays as no-copy views
One crucial difference between NumPy arrays and Python lists is that array slices return *views* rather than *copies* of the array data. What does this mean?


In [27]:
print(x2)

[[3 5 4 4]
 [7 6 8 8]
 [1 6 7 7]]


Notice what happens when we extract a 2 x 2 subarray and change its values.

In [30]:
x2_sub = x2[:2, :2] #two rows, two columns
print(x2_sub)

[[3 5]
 [7 6]]


In [31]:
x2_sub[0,0] = 5
print(x2_sub)

[[5 5]
 [7 6]]


What do we think happened to the original array?

In [32]:
print(x2)

[[5 5 4 4]
 [7 6 8 8]
 [1 6 7 7]]


Notice how the corresponding part of the original array changed when we changed x2_sub. This default behavior allows us to access and process pieces of large datasets without copying the underlying data.
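
To make the contrast with Python lists concrete, here is a minimal side-by-side sketch. The names are illustrative; np.shares_memory is used only to confirm that the slice and the original array point at the same data.

```python
import numpy as np

py_list = [1, 2, 3, 4]
list_slice = py_list[:2]  # slicing a list makes a copy
list_slice[0] = 99
print(py_list)            # [1, 2, 3, 4], the original is untouched

arr = np.array([1, 2, 3, 4])
arr_slice = arr[:2]       # slicing an array makes a view
arr_slice[0] = 99
print(arr)                # [99  2  3  4], the original changed

print(np.shares_memory(arr, arr_slice))  # True
```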

## Creating copies of arrays
If we want an independent copy of an array or subarray instead of a view, we simply use the copy() method.

In [33]:
x2_sub_copy = x2[:2, :2].copy()
print(x2_sub_copy)

[[5 5]
 [7 6]]


This allows us to modify the copy without changing the original array. Notice how this differs from Python lists, where slicing copies by default: for NumPy arrays, copying is NOT the default behavior and must be requested explicitly.

In [35]:
x2_sub_copy[0,0] = 99
print(x2_sub_copy)

[[99  5]
 [ 7  6]]


In [36]:
print(x2)

[[5 5 4 4]
 [7 6 8 8]
 [1 6 7 7]]


## Reshaping of Arrays
A useful array operation is reshaping with the reshape() method. For example, putting the numbers 1 through 9 into a 3x3 grid is simply a rearrangement of the one-dimensional sequence 1 through 9.

In [37]:
grid = np.arange(1,10).reshape(3,3)
grid

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

This only works if the number of elements in the original array matches the number of elements in the reshaped array.
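
As a quick sketch of that constraint (variable names are illustrative), a mismatched shape raises a ValueError. The example also shows -1, which asks NumPy to infer one dimension from the total number of elements.

```python
import numpy as np

values = np.arange(1, 10)  # 9 elements

try:
    values.reshape(2, 5)   # 2 * 5 = 10 slots, but only 9 elements
except ValueError as err:
    print("reshape failed:", err)

# -1 lets NumPy infer that dimension from the array size
inferred = values.reshape(-1, 3)
print(inferred.shape)  # (3, 3)
```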

A common use for the reshape() method is to create two-dimensional row or column arrays out of one-dimensional ones.

In [39]:
x = np.array([1,2,3]) # x is a one dimensional array

#row vector via reshape
x.reshape(1,3) 

array([[1, 2, 3]])

As you can see, the result is a two-dimensional array. Note that reshape() returns a new array object; x itself is still one-dimensional.

In [40]:
#column vector via reshape
x.reshape(3,1)

array([[1],
       [2],
       [3]])
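
The same row and column vectors can also be produced by indexing with np.newaxis, which inserts a new axis of length one. This is an alternative to reshape(), sketched here with illustrative names:

```python
import numpy as np

x = np.array([1, 2, 3])

row = x[np.newaxis, :]     # row vector, shape (1, 3)
column = x[:, np.newaxis]  # column vector, shape (3, 1)

print(row.shape, column.shape)  # (1, 3) (3, 1)
```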

# Array Concatenation
What happens if we want to combine arrays?

Concatenation is primarily accomplished by using the following routines:
  * np.concatenate
  * np.vstack - vertical stacking of arrays
  * np.hstack - horizontal stacking of arrays

np.concatenate takes a tuple or list of arrays as its first argument.

In [42]:
x = np.array([1,2,3])
y = np.array([4,5,6])
np.concatenate([x, y])

array([1, 2, 3, 4, 5, 6])

It also works for more than two arrays, and list inputs are converted to arrays automatically:

In [43]:
z = [100, 99, 98]
np.concatenate([x, y, z])

array([  1,   2,   3,   4,   5,   6, 100,  99,  98])

It can also be used for two-dimensional arrays:

In [45]:
grid = np.array([[1,2,3],[4,5,6]])
grid

array([[1, 2, 3],
       [4, 5, 6]])

In [46]:
#concatenate over first axis
np.concatenate([grid, grid])

array([[1, 2, 3],
       [4, 5, 6],
       [1, 2, 3],
       [4, 5, 6]])

In [47]:
#concatenate over second axis
np.concatenate([grid, grid], axis=1)

array([[1, 2, 3, 1, 2, 3],
       [4, 5, 6, 4, 5, 6]])

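But what do we do if we are dealing with arrays of different dimensions?

np.concatenate requires all of its inputs to have the same number of dimensions. When mixing, for example, a one-dimensional array with a two-dimensional grid, it is clearer to stack them vertically with np.vstack or horizontally with np.hstack.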

In [49]:
x = np.array([1,2,3])
grid = np.array([[4,5,6],[7,8,9]])

#vertically stack arrays
np.vstack([x, grid])

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [51]:
y = np.array([[99], [99]])

#horizontally stack the arrays
np.hstack([grid, y])

array([[ 4,  5,  6, 99],
       [ 7,  8,  9, 99]])

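To stack arrays properly, their sizes must match along the axis that is not being stacked: np.vstack requires the same number of columns, and np.hstack requires the same number of rows.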
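
A short sketch of what happens when the sizes do not line up (the array too_short is a made-up example): the stacking routines raise a ValueError rather than padding or trimming.

```python
import numpy as np

grid = np.array([[4, 5, 6],
                 [7, 8, 9]])   # shape (2, 3)
too_short = np.array([1, 2])   # 2 columns, but grid has 3

try:
    np.vstack([too_short, grid])
except ValueError as err:
    print("vstack failed:", err)

# hstack needs matching row counts: (2, 3) next to (2, 1) works
stacked = np.hstack([grid, np.array([[99], [99]])])
print(stacked.shape)  # (2, 4)
```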