# Numpy Demo
We'll go through some examples here to refresh the basics of numpy.  There are also plenty of other guides online:
* [Numpy quickstart tutorial](https://docs.scipy.org/doc/numpy-dev/user/quickstart.html)
* [Numpy for MATLAB users](https://docs.scipy.org/doc/numpy-dev/user/numpy-for-matlab-users.html) -- great reference if you are used to working with MATLAB

First, to use a package we need to import it.

In [4]:
import numpy as np

# Motivation

Question:  Why bother with numpy?

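Answer:  Numpy arrays store values of a single type in one block of memory and run their operations in compiled code, so they are typically much faster and use less memory than lists when working with numerical data.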

## Example:  Pure Python vs Numpy
Let's compare the running time for a basic operation in
pure python and numpy.  The following code blocks
create two random matrices $A$ and $B$, then compute
$C = AB$.

### Pure Python
The first code block is with pure python using lists (no numpy).

In [5]:
import random
import time

def create_rand_matrix(n):
    # Create random matrix without numpy
    M = []
    for i in range(n):
        row = []
        for j in range(n):
            row.append(random.random())
        M.append(row)
    return M
    
start = time.time()

n = 100

A = create_rand_matrix(n)
B = create_rand_matrix(n)
    
# Compute C = AB
C = []
for i in range(n):
    row = []
    for j in range(n):
        total = 0  # named total to avoid shadowing the built-in sum()
        for k in range(n):
            total += A[i][k] * B[k][j]
        row.append(total)
    C.append(row)
    
stop = time.time()
print(stop - start)

0.3747551441192627


### Numpy
The same thing using numpy arrays.

In [6]:
start = time.time()

n = 100

A = np.random.random((n, n))
B = np.random.random((n, n))
C = A @ B

stop = time.time()
print(stop - start)

0.004807233810424805
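A single wall-clock measurement like the ones above is noisy, and both timings include building the random matrices. As a rough sketch (the helper name `matmul_lists` is made up for this example, and `n = 50` is an arbitrary choice), the following times only the multiplication with `time.perf_counter` and checks that both approaches compute the same matrix. The exact timings will vary from machine to machine.

```python
import random
import time

import numpy as np


def matmul_lists(A, B):
    # Hypothetical helper: triple-loop product of two square list-of-lists matrices
    n = len(A)
    C = []
    for i in range(n):
        row = []
        for j in range(n):
            total = 0.0
            for k in range(n):
                total += A[i][k] * B[k][j]
            row.append(total)
        C.append(row)
    return C


n = 50
A_list = [[random.random() for _ in range(n)] for _ in range(n)]
B_list = [[random.random() for _ in range(n)] for _ in range(n)]

start = time.perf_counter()
C_list = matmul_lists(A_list, B_list)
python_seconds = time.perf_counter() - start

# Convert the same data to arrays so the two results are comparable
A_arr = np.array(A_list)
B_arr = np.array(B_list)

start = time.perf_counter()
C_arr = A_arr @ B_arr
numpy_seconds = time.perf_counter() - start

print("pure python:", python_seconds)
print("numpy:      ", numpy_seconds)
print("same result:", np.allclose(C_list, C_arr))
```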


# Creating Arrays
Numerous ways of creating arrays are available.

## Creating arrays from a list

In [7]:
vals_list = [1, 3, 2, 8]
vals_array = np.array(vals_list)

print("vals_list: ", vals_list)
print("vals_array: ", vals_array)

vals_list:  [1, 3, 2, 8]
vals_array:  [1 3 2 8]


## Creating arrays using built-in functions
* [np.arange()](http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.arange.html)
* [np.linspace()](http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.linspace.html)
* How do we know how to call them?
    * See documentation
    * ipython help
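
As a small illustration of the help options listed above, the docstring of any numpy function can be read from plain Python. This sketch prints only the first few lines; `help(np.linspace)` prints the whole thing, and in IPython or Jupyter `np.linspace?` shows the same text.

```python
import numpy as np

# Every documented function carries its help text in __doc__
doc = np.linspace.__doc__

# Print just the first few non-empty lines as a preview
preview = [line for line in doc.strip().splitlines() if line.strip()][:3]
print("\n".join(preview))
```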

### np.arange()
Evenly spaced numbers in an interval (meant for use with an integer step size).  The stop value is not included.  Examples:

In [8]:
print(np.arange(10)) # stop
print(np.arange(4,12)) # start and stop
print(np.arange(4,12,2)) # start, stop, and step

[0 1 2 3 4 5 6 7 8 9]
[ 4  5  6  7  8  9 10 11]
[ 4  6  8 10]


### np.linspace()
Evenly spaced numbers in an interval.  Examples:

In [9]:
print(np.linspace(0,1,11))
print(np.linspace(0,10,11))

[0.  0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]
[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
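
To make the difference between the two functions concrete, here is a short sketch: `np.arange` takes a step and leaves out the stop value, while `np.linspace` takes a number of points and includes the stop value unless `endpoint=False` is passed. The step of 0.25 is chosen only because it is exactly representable in floating point.

```python
import numpy as np

# arange takes a step; the stop value is excluded
by_step = np.arange(0, 1, 0.25)

# linspace takes a count; the stop value is included by default
by_count = np.linspace(0, 1, 5)

# endpoint=False drops the stop value, matching arange's half-open interval
half_open = np.linspace(0, 1, 4, endpoint=False)

print(by_step)
print(by_count)
print(half_open)
```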


### Other special vectors
We often need to initialize an array to all zeros or all ones.

In [10]:
print(np.zeros(5))
print(np.ones(4))

[0. 0. 0. 0. 0.]
[1. 1. 1. 1.]


# Data Types
Recall, Python is dynamically typed.  Types are changed automatically as needed.  And, lists can hold anything.  A single list could hold strings and integers.

What about arrays?
Numpy arrays are statically typed: every element shares a single `dtype`, which is fixed when the array is created.
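
One practical consequence, sketched below with made-up values: assigning a float into an integer array does not change the array's type. The value is converted to fit the array instead.

```python
import numpy as np

mixed = [1, "three", 2.0]  # a list happily holds different types
print(mixed)

ints = np.array([1, 3, 2, 8])
ints[0] = 2.7  # the array keeps its integer dtype, so the fraction is dropped
print(ints)
print(ints.dtype)
```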

So, what are the data types of the arrays we created above?  What are the available datatypes?  How do we specify what datatype we want? 

In [11]:
vals_list = [1,3,2,8]
vals_array = np.array(vals_list)
vals_arrayf = np.array(vals_list, dtype=float)

print("vals_array: ", vals_array)
print("vals_arrayf: ", vals_arrayf)

print(type(vals_list))
print(type(vals_array))
print(type(vals_arrayf))
print(vals_array.dtype)
print(vals_arrayf.dtype)

vals_array:  [1 3 2 8]
vals_arrayf:  [1. 3. 2. 8.]
<class 'list'>
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
int64
float64


The `dtype` argument is valid for most array-creation functions, including
`np.zeros`, `np.ones`, and `np.arange`.
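
For example, a quick sketch of passing `dtype` to each of those functions (the particular types are picked only for illustration):

```python
import numpy as np

int_zeros = np.zeros(3, dtype=int)
small_ones = np.ones(2, dtype=np.float32)
float_range = np.arange(4, dtype=float)

print(int_zeros, int_zeros.dtype)
print(small_ones, small_ones.dtype)
print(float_range, float_range.dtype)
```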

In Python 3, the `dtype` of an array that results from a mathematical operation
adjusts automatically: multiplying an integer array by an integer keeps integers,
while dividing with `/` produces floats.

In [12]:
print('integers: ', vals_array)
print('more integers: ', vals_array * 3)
print('floats: ', vals_array / 3)

integers:  [1 3 2 8]
more integers:  [ 3  9  6 24]
floats:  [0.33333333 1.         0.66666667 2.66666667]


You can also copy an array and change the `dtype`.

In [13]:
arr = np.arange(10.0)
x = arr.astype(int)
print('arr: ', arr)
print('x: ', x)

arr:  [0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
x:  [0 1 2 3 4 5 6 7 8 9]
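
One thing to watch for, shown in this small sketch with made-up values: converting floats to integers with `astype` drops the fractional part rather than rounding, and the original array is left alone.

```python
import numpy as np

arr = np.array([1.7, -1.7, 2.0])
as_int = arr.astype(int)  # new array; fractions are truncated toward zero

print(arr)     # the original array is untouched
print(as_int)  # holds the values 1, -1, 2

rounded = np.round(arr).astype(int)  # round first if that is what you want
print(rounded)  # holds the values 2, -2, 2
```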


# Accessing Array Elements
Now that we actually have arrays, how do we get things from them?
Arrays are indexed from 0, and elements are accessed with bracket notation.

In [14]:
print(vals_arrayf)
print(vals_arrayf[0])

[1. 3. 2. 8.]
1.0


Negative indexing is also allowed.

In [15]:
print(vals_arrayf[-1])

8.0


What if I want a section of an array?  Use array slicing.  As with lists, the start index is included and the end index is excluded.

In [16]:
print(vals_arrayf[1:3])
print(vals_arrayf[1:2])

[3. 2.]
[3.]


In addition to a start and end, you can also choose a step for the slice.

In [17]:
print(vals_arrayf)
print(vals_arrayf[::2])  # even indices (0, 2, ...)
print(vals_arrayf[1::2])  # odd indices (1, 3, ...)
print(vals_arrayf[::-1])  # handy way to reverse an array

[1. 3. 2. 8.]
[1. 2.]
[3. 8.]
[8. 2. 3. 1.]


# Copies vs. Views (Accidentally changing your array)

You need to be careful with `numpy` arrays if you are
* trying to copy part of an array, or
* passing an array to a function

You might be in for a nasty surprise if you change an element.

In [18]:
simple = np.arange(5)
small = simple[:2]
print(simple)
print('')
print(small)
print('')

small[0] = 7
print(small)
print('')
print(simple)  # shouldn't have changed, right?

[0 1 2 3 4]

[0 1]

[7 1]

[7 1 2 3 4]


This happens because `small` is something called a "view" of
`simple`, rather than a copy. This helps `numpy` save memory and
speed up your program, but it can lead to tricky bugs if it
is not your intent. It can be difficult to tell at a glance
whether something will be a view or a copy. As a rule of thumb, basic
slicing (like `simple[:2]`) gives a view, while indexing with a list of
indices or a boolean array gives a copy.
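
When in doubt, you can ask `numpy` directly. The following sketch uses `np.shares_memory` and the `.base` attribute to check whether two arrays refer to the same data; the variable names are just for illustration.

```python
import numpy as np

simple = np.arange(5)

view = simple[:2]       # basic slicing returns a view
fancy = simple[[0, 1]]  # indexing with a list of indices returns a copy

print(np.shares_memory(simple, view))   # True
print(np.shares_memory(simple, fancy))  # False
print(view.base is simple)              # True: the view refers back to simple
```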

Functions also do not make copies of their input arrays.

In [19]:
def foo(x):  # notice that x is not returned
    x[0] = 100


foo(simple)
print(simple)

[100   1   2   3   4]


If you think you are accidentally changing your array elsewhere in your code,
you can copy it to be on the safe side. This will slow your program down
and use more memory, but it can help with debugging and save a lot of headaches.

In [20]:
simple = np.arange(5)
print('before:')
print(simple)

my_copy = simple[:2].copy()
my_copy[1] = 10

foo(simple.copy())

print('after:')
print(simple)

before:
[0 1 2 3 4]
after:
[0 1 2 3 4]


# Multi-dimensional Arrays

*Note:* There is a `numpy.matrix` class, but you should avoid using it.
Use two-dimensional arrays instead.

How do we create multi-dimensional arrays?

### Creating from multi-dimensional lists

In [21]:
mat = np.array([[1,4,8],[3,2,9],[0,5,7]], float)
print(mat)
print('')

[[1. 4. 8.]
 [3. 2. 9.]
 [0. 5. 7.]]



### Creating special matrices
As with 1d arrays, it is easy to create 2d arrays of 0s or 1s, as well as identity matrices.

In [22]:
print(np.zeros((2,3), dtype=float))
print('')
# np.zeros_like creates an array with the same shape and datatype as an existing array
print(np.zeros_like(mat))
print('')
print(np.identity(3, dtype=float))

[[0. 0. 0.]
 [0. 0. 0.]]

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


How do we access multi-dimensional arrays?  First index gives row, second gives column.

In [23]:
print(mat)
print(mat[1,2])

[[1. 4. 8.]
 [3. 2. 9.]
 [0. 5. 7.]]
9.0


In [24]:
print(mat[0,0], mat[1,1], mat[2,2])

1.0 2.0 7.0


What about array slicing?  How do we get some subset of columns and/or rows?

In [25]:
print(mat[1,:]) # Single row, all columns

[3. 2. 9.]


In [26]:
print(mat[:,1]) # Single column, all rows

[4. 2. 5.]


In [27]:
print(mat[:,:2]) # All rows, subset of columns

[[1. 4.]
 [3. 2.]
 [0. 5.]]


What if we want an array of a different shape?
`reshape` arranges the same elements in a new shape, and `flatten` turns any array back into a 1d array.
This can be a convenient way of initializing matrices.

In [28]:
arr = np.arange(8)
two_four = arr.reshape(2, 4)
four_two = arr.reshape(4, 2)
eight_none = arr.flatten()
print('array:')
print(arr)
print('')
print('2 x 4:')
print(two_four)
print('')
print('4 x 2:')
print(four_two)
print('')
print('back to array:')
print(eight_none)
print(eight_none.shape)

array:
[0 1 2 3 4 5 6 7]

2 x 4:
[[0 1 2 3]
 [4 5 6 7]]

4 x 2:
[[0 1]
 [2 3]
 [4 5]
 [6 7]]

back to array:
[0 1 2 3 4 5 6 7]
(8,)
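
Two details are worth a quick sketch here: `reshape` accepts `-1` for one dimension and infers it from the total size, and for a simple contiguous array like this one `reshape` gives a view of the same data, while `flatten` always makes a copy.

```python
import numpy as np

arr = np.arange(8)

# -1 asks numpy to infer that dimension from the total number of elements
two_rows = arr.reshape(2, -1)
print(two_rows.shape)  # (2, 4)

flat = two_rows.flatten()

# The reshaped array shares memory with arr here; the flattened one does not
print(np.shares_memory(arr, two_rows))  # True
print(np.shares_memory(arr, flat))      # False
```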


# Array functions
We'll go through some array functions here.  There are plenty more available.  The best way to find the function you need is to search online for what you want to do and read the documentation (there is probably a function that already does it).

`len()` gives the length of an array.
For a multi-dimensional array, this is the length of the first axis.

In [29]:
new_mat = mat[:,:2]
print(vals_arrayf)
print(new_mat)
print(len(vals_arrayf))
print(len(new_mat))

[1. 3. 2. 8.]
[[1. 4.]
 [3. 2.]
 [0. 5.]]
4
3


What about shape?

In [30]:
print(new_mat.shape) # not a function

(3, 2)


Note:  shape is an attribute, not a function
(so it's not followed by parentheses).

### Sort array

What about sorting an array?
There are two different ways,
[np.sort()](http://docs.scipy.org/doc/numpy-1.10.1/reference/generated/numpy.sort.html) or 
[myarray.sort()](http://docs.scipy.org/doc/numpy-1.10.0/reference/generated/numpy.ndarray.sort.html).
The first is a numpy function (called as `np.sort(myarray)`) and returns a copy of the array in sorted order.
The second is a method of the array itself and sorts the array in place.

**Important point:**
* Some functions operate in place, others return copies.
* How do you know which you are using?  Look at the documentation.

In [31]:
print(np.sort(vals_arrayf)) # returns a copy
print(vals_arrayf)
vals_arrayf.sort() # inplace
print(vals_arrayf)

[1. 2. 3. 8.]
[1. 3. 2. 8.]
[1. 2. 3. 8.]
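
A common slip that follows from this, sketched with a throwaway array: because the in-place method returns `None`, writing `vals = vals.sort()` throws the array away.

```python
import numpy as np

vals = np.array([1.0, 3.0, 2.0, 8.0])

result = vals.sort()  # sorts vals in place and returns None
print(result)         # None
print(vals)

# np.sort returns a new array, so it can be chained, e.g. reversed with a slice
descending = np.sort(vals)[::-1]
print(descending)
```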


### Checking if items are in an array

In [32]:
print(9 in mat)
print(9 in vals_arrayf)

True
False


## Elementwise Array Operations

Lots of elementwise operations happen automatically with arrays. These include:
* addition
* subtraction
* multiplication
* division
* comparisons

Some examples:

In [33]:
mat_2 = np.array([[1,3],[2,5]], float)
id_2 = np.identity(2, float)

print(mat_2)
print('')
print(id_2)
print('')
print('sum:')
print(mat_2 + id_2)
print('')
print('difference:')
print(mat_2 - id_2)
print('')
print('product:')
print(mat_2 * id_2)  # not matrix multiplication
print('')
print('quotient:')
print(id_2 / mat_2)
print('power:')
print(mat_2**3)  # not matrix exponentiation
print('')
print(mat_2 == id_2)
print(' ')
print(mat_2 > id_2)

[[1. 3.]
 [2. 5.]]

[[1. 0.]
 [0. 1.]]

sum:
[[2. 3.]
 [2. 6.]]

difference:
[[0. 3.]
 [2. 4.]]

product:
[[1. 0.]
 [0. 5.]]

quotient:
[[1.  0. ]
 [0.  0.2]]
power:
[[  1.  27.]
 [  8. 125.]]

[[ True False]
 [False False]]
 
[[False  True]
 [ True  True]]


### Other Elementwise Array Functions

Numpy also provides element-wise versions of many other functions for performing mathematical operations, rounding results, etc.

In [34]:
# Misc Math Operations
print(np.exp(id_2))
print(np.log2(vals_arrayf))
print(np.reciprocal(mat_2))

[[2.71828183 1.        ]
 [1.         2.71828183]]
[0.        1.        1.5849625 3.       ]
[[1.         0.33333333]
 [0.5        0.2       ]]


In [35]:
# Trig Functions
print(np.sin(mat_2))
print(np.tan(id_2))

[[ 0.84147098  0.14112001]
 [ 0.90929743 -0.95892427]]
[[1.55740772 0.        ]
 [0.         1.55740772]]


In [36]:
# Rounding
print(np.round(np.sin(mat_2), 2))

[[ 0.84  0.14]
 [ 0.91 -0.96]]


## Matrix and Vector Operations
If `*` is elementwise multiplication, how do we do other useful operations like matrix-vector product or inner product?

In [37]:
# Matrix Multiplication and Dot Product
print(np.dot(mat_2, id_2))
print('')
print(mat_2 @ id_2)
print('')
print(np.dot(vals_arrayf, np.array([0,2,6,1])))

[[1. 3.]
 [2. 5.]]

[[1. 3.]
 [2. 5.]]

30.0
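
The cell above covers a matrix-matrix product and an inner product. As a small sketch with a made-up vector, the same `@` operator also handles the matrix-vector case:

```python
import numpy as np

mat_2 = np.array([[1, 3], [2, 5]], float)
vec = np.array([1.0, 2.0])

mat_vec = mat_2 @ vec  # matrix-vector product: values 7 and 12
vec_mat = vec @ mat_2  # vector-matrix product: values 5 and 13
inner = vec @ vec      # inner product: 5.0

print(mat_vec)
print(vec_mat)
print(inner)
```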


Along those same lines, built-in functions exist to transpose matrices.

In [38]:
# Matrix transpose
print(np.transpose(mat_2))
print(mat_2.T)

[[1. 2.]
 [3. 5.]]
[[1. 2.]
 [3. 5.]]


# Broadcasting (Element-wise operations on arrays of different shapes)

Note: `numpy` tries to make simple broadcasting fairly intuitive.
However, if you are having trouble understanding what is going on, [reading the short broadcasting article in the numpy documentation](https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
can be helpful.

The simplest case is adding an array and a scalar.
What we want here should be fairly intuitive, and that is what `numpy` does.

In [39]:
bmat = np.arange(12).reshape(4, 3)
print(bmat)
print('')
print(3 + bmat)

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]

[[ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]
 [12 13 14]]


What about an array and a vector? What do we even mean by that?

In [40]:
vec3 = np.arange(3)
print(vec3)
print('')
print(bmat + vec3)

[0 1 2]

[[ 0  2  4]
 [ 3  5  7]
 [ 6  8 10]
 [ 9 11 13]]


What if we want to add (or subtract, multiply, etc.) a different value to each row? We will need a vector of four elements, one per row, for this to make sense.

In [41]:
vec4 = np.arange(4)
print(bmat + vec4)

ValueError: operands could not be broadcast together with shapes (4,3) (4,) 

It turns out that `numpy` is comparing the 4 with the 3 and doesn't like it.
For a detailed explanation of why, you can read the article linked above.

The gist is more or less that `numpy` prepends 1s to the shape of your vector when broadcasting,
and each pair of dimensions of the two arrays either needs to be the same number or have one of the
numbers be a 1 for the operation to be valid.

In our case, one operand is (4, 3), and the other becomes (1, 4), so the 3 and the 4
in the last spot produce a `ValueError`.
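
The rule can be written out as a few lines of plain Python. This is only a sketch of the description above, not how `numpy` implements it, and `broadcast_shape` is a hypothetical helper name.

```python
def broadcast_shape(shape_a, shape_b):
    # Hypothetical helper that mirrors the rule described above
    ndim = max(len(shape_a), len(shape_b))
    # Prepend 1s so both shapes have the same number of dimensions
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for dim_a, dim_b in zip(a, b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"cannot broadcast {shape_a} with {shape_b}")
    return tuple(result)


print(broadcast_shape((4, 3), (3,)))    # (4, 3)
print(broadcast_shape((4, 3), (4, 1)))  # (4, 3)
try:
    broadcast_shape((4, 3), (4,))       # (4,) becomes (1, 4); 3 vs 4 fails
except ValueError as err:
    print(err)
```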

So, how can we get around this?

In [42]:
print(bmat + vec4[:, np.newaxis])

[[ 0  1  2]
 [ 4  5  6]
 [ 8  9 10]
 [12 13 14]]


The `np.newaxis` in the second dimension forces `numpy` to put the 1 in that spot,
giving us a shape of (4, 1).  The (4, 1) shape lines up with the (4, 3) shape of
our matrix, so we are good to go.
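
There are a couple of other spellings of the same idea that you may run into. This sketch shows that they all produce a (4, 1) column and the same result.

```python
import numpy as np

bmat = np.arange(12).reshape(4, 3)
vec4 = np.arange(4)

col_a = vec4[:, np.newaxis]
col_b = vec4[:, None]        # np.newaxis is an alias for None
col_c = vec4.reshape(-1, 1)  # -1 lets numpy infer the length

print(col_a.shape, col_b.shape, col_c.shape)
print(np.array_equal(bmat + col_a, bmat + col_c))
```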

Note that `np.newaxis` is a constant, not a function, so it does not have parentheses at the end.

# Reduction Operations
There are other operations beyond the common matrix/vector operations that do not return an array of the same shape as the input.
For example, you can find out the minimum or maximum value in the entire array,
or the sum of all entries.

In [43]:
print(bmat)
print(bmat.min())
print(bmat.max())
print(bmat.sum())

[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
0
11
66


What if I want the smallest number in every row?
All of these reduction operations take an optional `axis` argument that allows
us to target a particular dimension of the array.

In [44]:
print('row minimum:')
print(bmat.min(axis=1))
print('column minimum:')
print(bmat.min(axis=0))

row minimum:
[0 3 6 9]
column minimum:
[0 1 2]


Notice that when we pass an `axis` argument, we lose that dimension of our
array, but the shape is otherwise unchanged. So, a (4, 3) array becomes
a (3,) array if we pass `axis=0`, and it becomes a (4,) array if we
pass `axis=1`.
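
If you would rather keep the reduced dimension around (with length 1), the reduction methods also accept `keepdims=True`. A short sketch of why that can be handy: the result then broadcasts against the original array without needing `np.newaxis`.

```python
import numpy as np

bmat = np.arange(12).reshape(4, 3)

row_min = bmat.min(axis=1)
row_min_kept = bmat.min(axis=1, keepdims=True)

print(row_min.shape)       # (4,)
print(row_min_kept.shape)  # (4, 1)

# The (4, 1) result lines up with the (4, 3) array directly
shifted = bmat - row_min_kept
print(shifted)
```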

# Random Numbers

We often need to generate random numbers.
`numpy` can generate numbers from a variety of distributions,
and it can generate lots of them at once and put them in a convenient shape.

The [np.random](https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.random.html) documentation gives a helpful overview.

Common functions used are:
* [np.random.rand](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.random.rand.html#numpy.random.rand) (uniform on [0, 1)),
* [np.random.randn](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.random.randn.html#numpy.random.randn) (standard normal),
* [np.random.randint](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.random.randint.html#numpy.random.randint) (integers in a given range, upper bound excluded).

All of these routines give you the option of generating an array of a specified shape.

Examples:

In [45]:
uniform_nums = np.random.rand(10)
print(uniform_nums)
print('')
normal_nums = np.random.randn(3, 5)
print(normal_nums)
print('')
integers = np.random.randint(0, 10, (4, 2))
print(integers)
print('')

[0.6547803  0.67290584 0.32344985 0.32566901 0.98575444 0.79288886
 0.84329287 0.02835835 0.20946802 0.09798485]

[[ 0.3101997   1.28268904 -0.27361291 -0.13634871 -0.25373209]
 [ 0.60767192  0.59099515  0.60850301 -0.12297693  0.68525586]
 [-0.87977991  1.24707837  1.1964423   0.43796593 -0.47681014]]

[[6 3]
 [9 0]
 [3 9]
 [9 3]]
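
The numbers above will be different every time the cell runs. If you need repeatable results (for debugging, say), one option is to set the seed first. This is a minimal sketch; the seed value 0 is arbitrary.

```python
import numpy as np

np.random.seed(0)
first = np.random.rand(3)

np.random.seed(0)  # resetting the seed replays the same sequence
second = np.random.rand(3)

print(np.array_equal(first, second))  # True

# randint draws from [low, high): the upper bound is never returned
integers = np.random.randint(0, 10, (4, 2))
print(integers.min() >= 0, integers.max() < 10)
```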

