# Numpy Tutorial

For the rest of the assignments (and your final project), you will be using one of Python's scientific computing packages, _numpy_. This is a short tutorial on the numpy basics that you'll likely encounter in your assignments. However, you are not limited to the numpy methods introduced in this tutorial; feel free to explore more complex numpy functions.

**Credit**: This tutorial is an interactive adaptation of the Quickstart tutorial on the SciPy Documentation website. All credit goes to the official SciPy Documentation page: https://docs.scipy.org/doc/numpy-dev/user/quickstart.html

## Resources

Here are some other helpful online resources that you can also review to learn numpy:

- https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf
- http://cs231n.github.io/python-numpy-tutorial/
- https://www.tutorialspoint.com/numpy/index.htm

In [1]:
import numpy as np
from timeit import Timer

## Why numpy?

Numpy is a package that contains many routines for fast matrix and vector operations. Behind the scenes, rather than executing slow Python code, numpy functions often execute code that is compiled and highly optimized. Let's take a look at a simple example.

Here, we create a 10000-element sequence of increasing numbers (0...9999) with two different approaches: one using numpy and one using regular Python list functions. We use Python Timers to find the time it takes to sum the elements in each sequence 10000 times.

In [2]:
# Create 10000-element sequences
x = np.arange(10000)
y = list(range(10000))  # a regular Python list (range alone is not a list in Python 3)

# Create timers and time sum function
numpy_time = Timer("x.sum()", "from __main__ import x").timeit(10000)
list_time = Timer("sum(y)", "from __main__ import y").timeit(10000)

# Visualize execution time difference
print("numpy execution time: %.3f" % numpy_time)
print("list execution time: %.3f" % list_time)

numpy execution time: 0.086
list execution time: 1.177


As you can see, the numpy function runs significantly faster. In your assignments, you'll be working with large amounts of data, so **using numpy will save you tons of time**. 

## 1: Basic Structure

Numpy works with multidimensional arrays. A numpy array is a table of elements (much like nested lists in Python), except that all elements are of the same type.
 
Numpy's array class is called `ndarray` and some of its most useful attributes include:
 
`ndarray.ndim`: the number of axes (dimensions) of the array 

`ndarray.shape`: the dimensions of the array as a tuple _(m,n)_ for a matrix with _m_ rows and _n_ columns 

`ndarray.size`: the total number of elements in the array  

`ndarray.dtype`: the type of the elements in the array (e.g. `numpy.int32`, `numpy.float64`)

Let's use the above attributes to analyze the following array:

```
[[ 3., 4., 2.],
 [ 1., 0., 2.]]
 ```

In [2]:
a = np.array([[ 3., 4., 2.], [ 1., 0., 2.]])

# Take a look at the array
print("-- Array --\n{}".format(a), end="\n\n")
print("-- Array Attributes --")

# How many dimensions is the array?
print("Number of dimensions: {}".format(a.ndim))

# How many rows and columns are there?
print("Number of rows: {}".format(a.shape[0]))
print("Number of cols: {}".format(a.shape[1]))

# How many elements are in the array?
print("Number of elements: {}".format(a.size))

# What type is the array?
print("Array type: {}".format(type(a)))

# What type are the elements in the array?
print("Type of elements: {}".format(a.dtype))

-- Array --
[[3. 4. 2.]
 [1. 0. 2.]]

-- Array Attributes --
Number of dimensions: 2
Number of rows: 2
Number of cols: 3
Number of elements: 6
Array type: <class 'numpy.ndarray'>
Type of elements: float64
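
The same attributes extend beyond two dimensions. As a minimal sketch (the array `t` is an illustrative name, not part of the original tutorial), a three-dimensional array reports one entry in `shape` per axis:

```python
import numpy as np

# Hypothetical 3-D array: 2 blocks, each with 3 rows and 4 columns
t = np.zeros((2, 3, 4))

print(t.ndim)   # number of axes
print(t.shape)  # one length per axis
print(t.size)   # total number of elements (2 * 3 * 4)
print(t.dtype)  # np.zeros defaults to float64
```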


## 2: Printing Arrays

Numpy prints arrays in the following layout:

- the last axis is printed from left to right
- the second-to-last axis is printed from top to bottom
- the rest are also printed from top to bottom

One-dimensional array example:

```
[0. 1. 2.]
```

Two-dimensional array example: 

```
[[0. 1. 2.]
 [3. 4. 5.]]
```

Three-dimensional array example:

```
[[[0. 1. 2.]]

 [[3. 4. 5.]]]
```

To demonstrate printing numpy arrays, we will use the following function:

`ndarray.reshape`: give a new shape to an array without changing its data.
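
Before using `reshape` for printing, here is a small sketch of what it does on its own. The variable names are illustrative. The new shape must hold the same total number of elements, and a single `-1` asks numpy to infer that dimension:

```python
import numpy as np

flat = np.arange(6)            # six elements: 0 through 5
grid = flat.reshape((2, 3))    # 2 rows, 3 columns
auto = flat.reshape((3, -1))   # -1 lets numpy infer the missing dimension

print(grid)
print(auto.shape)
```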

In [7]:
# What happens when you print an array with a small number of elements?
print("-- Small Array --")
print(np.arange(5), "\n")

# What happens when you print an array with a large number of elements?
print("-- Large Array --")
print(np.arange(1000), "\n") 

# What happens when you print a multidimensional array with a large number of elements?
print("-- Large Multidimensional Array --")
print(np.arange(1000).reshape((200,5)), "\n")

-- Small Array --
[0 1 2 3 4] 

-- Large Array --
[  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35
  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53
  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71
  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89
  90  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 106 107
 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125
 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143
 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161
 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179
 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197
 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215
 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233
 

## 3: Creating Arrays

There are many different ways to create numpy arrays:

`np.array`: creates an array from an existing data type    

`np.zeros`: creates an array full of zeros  

`np.ones`: creates an array full of ones  

`np.empty`: creates an array without initializing its elements, so the contents are whatever happened to be in memory (output varies - use with caution!)

`np.arange`: creates sequences of numbers

### 3.1: Create an array from a regular Python list or tuple

In [12]:
# Single dimension int array (built from a Python list)
a = np.array([0, 1, 2, 3, 4])
print("-- {} array--".format(a.dtype))
print(a, "\n")

# Single dimension float array
b = np.array([1.0,2.0])
print("-- {} array--".format(b.dtype))
print(b, "\n")

# What type is this array?
c = np.array([[1,2], [3.0, 4.5]])
print("-- {} array--".format(c.dtype))
print(c, "\n")

# Multidimensional array
d = None
print("-- {} array--".format(None))
print(d, "\n")

-- int64 array--
[0 1 2 3 4] 

-- float64 array--
[1. 2.] 

-- float64 array--
[[1.  2. ]
 [3.  4.5]] 

-- None array--
None 
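
The multidimensional placeholder `d` above is left for you to fill in. One possible way to do so, shown only as a sketch with arbitrary values, is to pass nested lists (or nested tuples) to `np.array`. The name `e` is introduced here purely for comparison:

```python
import numpy as np

# Nested lists: each inner list becomes one row
d = np.array([[1, 2, 3], [4, 5, 6]])

# Nested tuples are treated the same way as nested lists
e = np.array(((1, 2, 3), (4, 5, 6)))

print("-- {} array--".format(d.dtype))
print(d)
print(np.array_equal(d, e))
```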



### 3.2: Specifying array dtype

You can specify the `dtype` of an array when you create it.

In [11]:
a = np.array([(1.5,2,3), (4,5,6)]) 
b = np.array([[1.5,2,3], [4,5,6]])

# What do you think happens?
print("-- Comparing arrays a & b --")
print(a == b, "\n")

# Specify array dtype as int
print("-- Array c --")
c = np.array([[1.5,2,3], [4,5,6]], dtype = int)
print(c, "\n")

# What do you think happens now?
print("-- Comparing arrays b & c --")
print(b == c, "\n")

-- Comparing arrays a & b --
[[ True  True  True]
 [ True  True  True]] 

-- Array c --
[[1 2 3]
 [4 5 6]] 

-- Comparing arrays b & c --
[[False  True  True]
 [ True  True  True]] 
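
The comparison above shows that converting `1.5` to an integer drops the fractional part rather than rounding. A small sketch of the same effect on an existing array, using `astype` and `np.rint` (neither is introduced elsewhere in this tutorial; the variable names are illustrative):

```python
import numpy as np

b = np.array([[1.5, 2, 3], [4, 5, 6]])

truncated = b.astype(int)          # drops the fractional part: 1.5 -> 1
rounded = np.rint(b).astype(int)   # rounds to the nearest integer first: 1.5 -> 2

print(truncated)
print(rounded)
```

Note that `b` itself is left unchanged; `astype` returns a new array.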



### 3.3: Create an array with autopopulated cells

For the `zeros`, `ones`, and `empty` methods, you can specify the `shape` and `dtype` of the array. If you don't specify a `dtype`, the array is `float64` by default.

In [15]:
# Create an array full of zeros
a = np.zeros((2,4))
print("-- Zeros Array --")
print(a, "\n")

# Create an array full of ones
print("-- Ones Array --")
b = np.ones((2,3))
print(b, "\n")

# Create an uninitialized array (contents are arbitrary leftover memory, not truly random)
print("-- Empty (Random) Array --")
c = np.empty((4,3))
print(c, "\n")

-- Zeros Array --
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]] 

-- Ones Array --
[[1. 1. 1.]
 [1. 1. 1.]] 

-- Empty (Random) Array --
[[ 0.00000000e+000  0.00000000e+000  0.00000000e+000]
 [ 0.00000000e+000  0.00000000e+000  0.00000000e+000]
 [ 0.00000000e+000  0.00000000e+000  0.00000000e+000]
 [ 0.00000000e+000  2.68156159e+154 -1.73059562e-077]] 
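
As a sketch of the `dtype` option described above, the following creates integer zeros instead of the default `float64`. It also shows `np.full`, which is not covered in this tutorial but fills an array with a chosen value, and the usual pattern for `np.empty`: assign every element before reading any. Variable names are illustrative.

```python
import numpy as np

int_zeros = np.zeros((2, 4), dtype=int)   # integer zeros instead of float64
sevens = np.full((2, 3), 7.0)             # every cell initialized to 7.0

# np.empty only allocates memory, so assign values before reading them
c = np.empty((4, 3))
c[:] = 0.0

print(int_zeros.dtype)
print(sevens)
```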



### 3.4: Create an array as a sequence of numbers

In [17]:
# Create array using only end value
print("-- Array created using only end value --")
a = np.arange(5)
b = None
print(a)
print(b, "\n")

# Create array using start / end values
print("-- Array created using start / end values --")
c = np.arange(2,7)
print(c, "\n")

# Create array using start / end / step values
print("-- Array created using start / end / step values --")
d = np.arange(2,10,3)
print(d, "\n")

-- Array created using only end value --
[0 1 2 3 4]
None 

-- Array created using start / end values --
[2 3 4 5 6] 

-- Array created using start / end / step values --
[2 5 8] 
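
`np.arange` also accepts non-integer steps. The sketch below compares it with `np.linspace`, a related function not covered above that takes the number of points instead of a step and includes the end value by default. The variable names are illustrative.

```python
import numpy as np

# arange with a float step; the end value is still excluded
stepped = np.arange(0, 1, 0.25)

# linspace takes the number of points and includes the end value by default
spaced = np.linspace(0, 1, 5)

print(stepped)
print(spaced)
```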



## 4: Indexing, Slicing, and Iterating

Similar to normal Python lists, you can perform indexing, slicing, and iterating operations on numpy arrays.

In [18]:
# Single dimension array operations
print("-- Indexing / slicing on single dimension array ---")
a = np.arange(10)
print(a)
print(a[2]) # Index
print(a[2:5]) # Slice
print(a[2:5:2]) # Slice with skips
print()

print("-- Iterating through a single dimension numpy array --")
# Iterate

-- Indexing / slicing on single dimension array ---
[0 1 2 3 4 5 6 7 8 9]
2
[2 3 4]
[2 4]

-- Iterating through a single dimension numpy array --
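
The iteration step above is left blank as an exercise. One possible sketch, with illustrative variable names, loops over a slice just as you would over a list. The last line shows the equivalent element-wise expression, which avoids the Python loop:

```python
import numpy as np

a = np.arange(10)

# Iterating over a one-dimensional array yields its elements in order
squares = []
for value in a[2:5]:
    squares.append(int(value) ** 2)

# The same computation without an explicit loop
vectorized = a[2:5] ** 2

print(squares)
print(vectorized)
```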


In [23]:
# Multidimensional operations
print("-- Indexing / slicing on multidimensional array ---")
b = np.array([[1,2,3], [4,5,6], [7,8,9]])
print(b)
print(b[:]) # Full slice
print(b[1,2]) # Index
print(b[1:2, 1:]) # Slice
print()

print("-- Iterating through rows of a multidimensional numpy array --")
# Iterate
for row in b:
    print(row)

-- Indexing / slicing on multidimensional array ---
[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[1 2 3]
 [4 5 6]
 [7 8 9]]
6
[[5 6]]

-- Iterating through rows of a multidimensional numpy array --
[1 2 3]
[4 5 6]
[7 8 9]
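
A few more indexing patterns on the same array, as a sketch with illustrative variable names: selecting a whole column, using a negative index, and iterating over individual elements with the `flat` attribute instead of over rows.

```python
import numpy as np

b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

middle_column = b[:, 1]   # every row, column 1
last_row = b[-1]          # negative indices count from the end

# .flat iterates over every element, row by row
elements = [int(x) for x in b.flat]

print(middle_column)
print(last_row)
print(elements)
```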


## 5: Important Numpy Operations

For this course, it will be important to get familiar with some of the basic numpy matrix operations. In this section, we will introduce the numpy operations that you will likely need when completing future assignments.

### 5.1: Basic Arithmetic Operations

Arithmetic operators in numpy are applied element-wise. Note especially that this includes `*`: unlike in matrix-oriented languages such as MATLAB, multiplying two arrays multiplies matching elements rather than performing matrix multiplication. An operator can combine an array with a constant, or two arrays with matching dimensions.

In [26]:
# Operating on arrays and constants
print("-- Array-constant operations --")
a = np.array([1,2,3,4])
print(a+4) # Add
print(a-4) # Subtract
print(a*4) # Multiply
print(a/4, "\n") # Divide

# Operating on arrays
print("-- Array-array operations --")
b = np.array([1,2,3,4])
print(a+b) # Add
print(a-b) # Subtract
print(a*b) # Multiply
print(a/b, "\n") # Divide

# Using numpy functions (equivalent to the operators above)
print("-- Using numpy array operations --")
print(np.add(a,b)) # Add
print(np.subtract(a,b)) # Subtract
print(np.multiply(a,b)) # Multiply
print(np.divide(a,b)) # Divide

-- Array-constant operations --
[5 6 7 8]
[-3 -2 -1  0]
[ 4  8 12 16]
[0.25 0.5  0.75 1.  ] 

-- Array-array operations --
[2 4 6 8]
[0 0 0 0]
[ 1  4  9 16]
[1. 1. 1. 1.] 

-- Using numpy array operations --
[2 4 6 8]
[0 0 0 0]
[ 1  4  9 16]
[1. 1. 1. 1.]
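
To make the element-wise behaviour of `*` concrete, this sketch contrasts it with the `@` operator, which performs matrix multiplication. It also shows that combining arrays whose shapes do not match raises a `ValueError`. The matrices and variable names are arbitrary illustrations.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

elementwise = A * B      # multiplies matching entries
matrix_product = A @ B   # row-by-column matrix multiplication

print(elementwise)
print(matrix_product)

# Arrays with incompatible shapes cannot be combined
mismatch_error = None
try:
    np.array([1, 2, 3]) + np.array([1, 2])
except ValueError as err:
    mismatch_error = err
print(type(mismatch_error).__name__)
```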


### 5.2: Dot Product

Numpy has a built-in function for computing the dot product (inner product) of two arrays:

`np.dot(a,b)`: computes the inner product of _a_ and _b_. For one-dimensional arrays the same result can be written in several ways, such as `np.dot(a,b)`, `a.dot(b)`, or `b.dot(a)`.

Suppose we have two arrays _a_ and _b_:

```
a = [1, 2, 3, 4]
b = [0, 1, 2, 3]
```

A dot product of arrays _a_ and _b_ is calculated as follows:

```
(1 * 0) + (2 * 1) + (3 * 2) + (4 * 3) = 20
```


In [28]:
a = np.array([1,2,3,4])
b = np.array([0,1,2,3])

# What type of result do you get when you multiply two arrays?
print("-- Element-wise multiplication of two arrays --")
print(a*b, "\n") 

# What type of result do you get when you find the dot product of two arrays?
print("-- Dot Product of two arrays --")
print(np.dot(a,b))
print(a.dot(b))
print(b.dot(a))
print()

-- Element-wise multiplication of two arrays --
[ 0  2  6 12] 

-- Dot Product of two arrays --
20
20
20
None
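
The dot product is the sum of the element-wise products, which the sketch below checks directly. It also shows, with an arbitrary illustrative matrix `M`, that `np.dot` performs a matrix-vector product when its first argument is two-dimensional.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([0, 1, 2, 3])

# Multiply element-wise, then add up the products
manual = np.sum(a * b)
print(manual == np.dot(a, b))

# With a 2-D first argument, np.dot performs a matrix-vector product
M = np.array([[1, 0, 0, 0],
              [0, 2, 0, 0]])
projected = np.dot(M, b)
print(projected)
```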


### 5.3: Useful Math Operations

Here are some mathematical operations you may find useful:

`np.sin(a)`: takes the sine of each element of _a_ (in radians)

`np.cos(a)`: takes the cosine of each element of _a_ (in radians)

`np.tan(a)`: takes the tangent of each element of _a_ (in radians)

`np.sqrt(a)`: returns the non-negative square root of each element of _a_

`np.exp(a)`: computes $e^a$ for each element of _a_

In [29]:
# Numpy Math Operations
print("-- Numpy Math Operations --")
a = np.array([1,2,3,4])
print(np.sin(a)) # sin
print(np.cos(a)) # cos
print(np.tan(a)) # tan
print(np.sqrt(a)) # sqrt
print(np.exp(a)) # exp

-- Numpy Math Operations --
[ 0.84147098  0.90929743  0.14112001 -0.7568025 ]
[ 0.54030231 -0.41614684 -0.9899925  -0.65364362]
[ 1.55740772 -2.18503986 -0.14254654  1.15782128]
[1.         1.41421356 1.73205081 2.        ]
[ 2.71828183  7.3890561  20.08553692 54.59815003]
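
The trigonometric functions expect angles in radians, which is why the values above may look unfamiliar. A short sketch with well-known angles follows; `np.deg2rad` is not part of the list above and is used here only to convert from degrees. Results are compared with `np.allclose` because floating-point values are rarely exact.

```python
import numpy as np

angles = np.array([0.0, np.pi / 2, np.pi])   # radians, not degrees

sines = np.sin(angles)
cosines = np.cos(angles)

# Floating-point results are only approximately exact, so compare with allclose
print(np.allclose(sines, [0.0, 1.0, 0.0]))
print(np.allclose(cosines, [1.0, 0.0, -1.0]))

# np.deg2rad converts degrees to radians
from_degrees = np.sin(np.deg2rad(90.0))
print(np.allclose(from_degrees, 1.0))
```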


### 5.4: Other Useful Operations

`np.sum(a)`: computes the sum of all elements in the array 

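To sum along the rows or columns instead, add an `axis` parameter. The axis you name is the one that gets collapsed:

- `axis=0` sums down the rows, giving one result per COLUMN

- `axis=1` sums across the columns, giving one result per ROW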

In [32]:
a = np.array([[1,2],[3,4]])

# Find summation of all elements in array
print("-- Summation of entire array --")
print(np.sum(a), "\n")

# Find summation of elements along the columns
print("-- Summation along columns --")
print(np.sum(a, axis = 0), "\n")

# Find summation of elements along the rows
print("-- Summation along rows --")
print(np.sum(a, axis = 1), "\n")

-- Summation of entire array --
10 

-- Summation along columns --
[4 6] 

-- Summation along rows --
[3 7] 
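
With a square array it is easy to mix up the two axes. The sketch below repeats the summation on a non-square array, with illustrative names, so that the shape of each result shows which axis was collapsed.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)

column_sums = np.sum(a, axis=0)    # collapses the 2 rows -> shape (3,)
row_sums = np.sum(a, axis=1)       # collapses the 3 columns -> shape (2,)

print(column_sums, column_sums.shape)
print(row_sums, row_sums.shape)
```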



`np.mean(a)`: computes the average of all elements in the array. It accepts the same `axis` parameter as `np.sum`.

In [33]:
a = np.array([[3, 4, 2], [1, 0, 2]])

# Find mean of all elements in array
print("-- Mean of entire array --")
print(np.mean(a), "\n")

# Find mean of elements along the columns
print("-- Mean along columns --")
print(np.mean(a, axis = 0), "\n")

# Find mean of elements along the rows
print("-- Mean along rows --")
print(np.mean(a, axis=1), "\n")

-- Mean of entire array --
2.0 

-- Mean along columns --
[2. 2. 2.] 

-- Mean along rows --
[3. 1.] 



`np.log2(a)`: computes the base-2 logarithm of each element of _a_

In [34]:
a = np.array([1,2,3])
print(np.log2(a))

[0.        1.        1.5849625]


`np.copy(a)`: returns a new array with its own copy of the data. This differs from assignment (`b = a`), which only gives the same array a second name.

In [36]:
# Incorrect array copying method
print("-- Incorrect way to copy arrays --") # b will be changed as a changes
a = np.array([1,2,3])
b = a
print("Array a before:", a)
a[1] = 0
print("Array a after:", a)
print("Array b:", b)
print()

# Using numpy copy function
print("-- Using numpy copy function --")
a = np.array([1,2,3])
b = np.copy(a)
print("Array a before:", a)
a[1] = 0
print("Array a after:", a)
print("Array b:", b)

-- Incorrect way to copy arrays --
Array a before: [1 2 3]
Array a after: [1 0 3]
Array b: [1 0 3]

-- Using numpy copy function --
Array a before: [1 2 3]
Array a after: [1 0 3]
Array b: [1 2 3]
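
Slices behave like assignment in this respect: a basic slice is a view onto the original data, not a copy. The following sketch, with illustrative variable names, shows the difference and uses `np.shares_memory` to check whether two arrays overlap in memory.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

view = a[1:3]                  # basic slicing returns a view onto a's data
independent = a[1:3].copy()    # an explicit copy owns its own data

a[1] = 0

print(view)          # reflects the change made through a
print(independent)   # unaffected by the change
print(np.shares_memory(a, view), np.shares_memory(a, independent))
```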


`np.argsort(a)`: returns the indices that would sort _a_ (the array itself is not modified)

In [39]:
# Sort array
print("-- Sort array --")
a = np.array([3, 1, 2])
print(np.argsort(a),  "\n")

# Argsort along each axis of a two-dimensional array
b = np.array([[4, 1, 2], [3, 6, 5]])

print("-- Sort array along columns --")
print(np.argsort(b, axis = 0), "\n")

print("-- Sort array along rows --")
print(np.argsort(b, axis = 1))

-- Sort array --
[1 2 0] 

-- Sort array along columns --
[[1 0 0]
 [0 1 1]] 

-- Sort array along rows --
[[1 2 0]
 [0 2 1]]
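
The indices returned by `np.argsort` become useful once you index with them. In this sketch the `names` array is a hypothetical set of labels aligned with `scores`; the same ordering sorts both arrays consistently, and reversing it gives descending order.

```python
import numpy as np

scores = np.array([3, 1, 2])
names = np.array(["c", "a", "b"])   # hypothetical labels aligned with scores

order = np.argsort(scores)          # indices that would sort scores
sorted_scores = scores[order]       # scores in ascending order
sorted_names = names[order]         # labels reordered the same way
descending = scores[order[::-1]]    # reverse the ordering for descending order

print(order)
print(sorted_scores, sorted_names)
print(descending)
```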


`np.argmin(a)`: return the indices of the minimum values along an axis  

`np.argmax(a)`: return the indices of the maximum values along an axis

In [43]:
a = np.arange(1,7).reshape(2,3)
print("-- Original matrix --")
print(a, "\n")

# By default, index is into flattened array
print("-- Flattened array --")
print("Min:", np.argmin(a))
print("Max:", np.argmax(a), "\n")

# Indices of min and max values along columns
print("-- Along Columns --")
print("Min:", None)
print("Max:", None, "\n")

# Indices of min and max values along rows
print("-- Along Rows --")
print("Min:", None)
print("Max:", None)

-- Original matrix --
[[1 2 3]
 [4 5 6]] 

-- Flattened array --
Min: 0
Max: 5 

-- Along Columns --
Min: None
Max: None 

-- Along Rows --
Min: None
Max: None
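
The column and row cases above are left as `None` for you to complete. One possible sketch passes the `axis` parameter, following the same convention as `np.sum`. The variable names are illustrative. `np.unravel_index`, which is not introduced above, converts a flattened index back into a (row, column) position.

```python
import numpy as np

a = np.arange(1, 7).reshape(2, 3)   # rows are [1 2 3] and [4 5 6]

col_min = np.argmin(a, axis=0)   # row index of the smallest value in each column
col_max = np.argmax(a, axis=0)   # row index of the largest value in each column
row_min = np.argmin(a, axis=1)   # column index of the smallest value in each row
row_max = np.argmax(a, axis=1)   # column index of the largest value in each row

# Convert the flattened index from np.argmax(a) back to (row, column)
position = np.unravel_index(np.argmax(a), a.shape)

print(col_min, col_max)
print(row_min, row_max)
print(position)
```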


`np.linalg.norm(a)`: returns a vector or matrix norm (by default, the 2-norm for a one-dimensional array and the Frobenius norm for a matrix)

In [20]:
a = np.array([1,2,3])
print(None)

None
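
The cell above is left for you to complete. As a separate sketch with different, arbitrary values, the following computes the default norm of a vector, a 1-norm via the `ord` parameter, and the default norm of a small matrix.

```python
import numpy as np

v = np.array([3.0, 4.0])

l2 = np.linalg.norm(v)           # sqrt(3^2 + 4^2)
l1 = np.linalg.norm(v, ord=1)    # |3| + |4|

m = np.array([[1.0, 2.0], [2.0, 4.0]])
fro = np.linalg.norm(m)          # Frobenius norm: sqrt of the sum of squared entries

print(l2, l1, fro)
```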


`np.transpose`: permute the dimensions of an array

In [44]:
a = np.arange(4).reshape((2,2))
print("-- Normal Matrix --")
print(a, "\n")

print("-- Transposed Matrix --")
print(np.transpose(a))

-- Normal Matrix --
[[0 1]
 [2 3]] 

-- Transposed Matrix --
[[0 2]
 [1 3]]
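
On a non-square array the effect on the shape is easier to see. This sketch also uses the `.T` attribute, a shorthand for the same operation; variable names are illustrative.

```python
import numpy as np

a = np.arange(6).reshape((2, 3))   # shape (2, 3)

t = np.transpose(a)                # shape (3, 2)
same = a.T                         # attribute shorthand for the same operation

print(t)
print(t.shape)
```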


`np.intersect1d`: find the sorted, unique values that appear in both arrays (multidimensional inputs are flattened first)

In [54]:
a = np.intersect1d([1, 3, 4, 3], [3, 1, 2, 1])
b = np.intersect1d([[1, 3], [4, 3]],[ [3, 1], [2, 1]])
print(a)
print(b)

[1 3]
[1 3]


`np.union1d`: find the sorted, unique values that appear in either of two arrays

In [23]:
a = np.union1d([1, 3, 2], [5, 2, 4])
print(None)

None
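
The `print` call above is left for you to complete. One possible sketch prints the union itself; the name `u` is illustrative. The duplicate value `2` appears only once in the result, and the values come back sorted.

```python
import numpy as np

u = np.union1d([1, 3, 2], [5, 2, 4])
print(u)
```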


`np.fill_diagonal`: fill the main diagonal of the given array with some value (the array is modified in place)

In [49]:
a = np.ones((4, 4))
print("-- Before Matrix --")
print(a, "\n")

print("-- After Matrix --")
# Fill the main diagonal with 2 (modifies a in place)
np.fill_diagonal(a,2)
print(a)

-- Before Matrix --
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]] 

-- After Matrix --
[[2. 1. 1. 1.]
 [1. 2. 1. 1.]
 [1. 1. 2. 1.]
 [1. 1. 1. 2.]]
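
Because `np.fill_diagonal` changes its argument directly, it returns `None`, so assigning its result to a variable is a common mistake. The sketch below shows this, along with one way to build the same matrix without mutation using `np.eye` (an identity-matrix function not covered above). Variable names are illustrative.

```python
import numpy as np

a = np.ones((3, 3))
result = np.fill_diagonal(a, 2)   # modifies a directly

print(result)   # nothing is returned
print(a)

# Building the same matrix without mutation: ones plus the identity matrix
b = np.ones((3, 3)) + np.eye(3)
print(np.array_equal(a, b))
```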
