## Setup
First, check that the appropriate libraries are installed in your virtual environment. If you're running this notebook outside of your virtual environment, close it and:

activate your virtual environment at the command line with

`source [virtualenv dir]/bin/activate`

After activation, you should see the virtual environment name at the left of your terminal prompt, e.g.,

`(venv_name) PythonForMATLABUsers >> ` 

Now you can run `jupyter notebook` from terminal and try to execute the following code. If the import statements fail, you don't have the libraries installed in your virtual environment. To fix this, close the notebook, execute 

`pip install numpy scipy sympy matplotlib` 

from the terminal, and relaunch `jupyter notebook`.

In [5]:
import numpy

In [7]:
# let's create a vector of zeros by calling the numpy zeros function
z = numpy.zeros(10)
print(z)

[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]


Note that to use any numpy function, we need to call it with a `numpy` prefix, like `numpy.zeros(10)`, `numpy.ones(10)`, etc. This can become unwieldy to type if we need to include it every time we call a numpy function. There must be a better way.

We could have imported `numpy` a few ways:

```python
# import numpy without a "name"
# call everything using numpy.function(args)
import numpy
z = numpy.zeros(10)

# import all functions from numpy, no name needed
from numpy import *
z = zeros(10)
```

The `from numpy import *` form seems nice because we don't need to refer to numpy at all, and *for small scripts* this can be very convenient. But for large projects it is not a good idea, because functions in the numpy namespace can conflict with functions from other libraries imported with `*`, or with custom functions you wrote yourself. For example, suppose you have a small script that calculates the output power of some device. You might want to define a `power` function (we'll get to function definitions later):
```python
def power(voltage, current):
    # do power calculation, e.g. P = V * I
    return voltage * current
```
But if you imported numpy with `from numpy import *`, the `numpy.power()` function, which raises array elements to given powers and is now called simply as `power()`, is shadowed by your own definition and can no longer be reached under that name! We have inadvertently polluted our namespace. There must be a better way.


Python lets us import libraries with a 'nickname' so we don't have to type `numpy.function()` every time, but we also keep the namespace separate. Now we have the best of both worlds.

```python
# BEST PRACTICE
# import numpy, but give it a name for ease of typing
import numpy as np
z = np.zeros(10)
```
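
As a small sketch of why the alias helps, the snippet below defines its own `power` function next to numpy's. The `power(voltage, current)` helper and its arguments are hypothetical, chosen only for illustration. Because numpy's version is reached through `np.`, the two names never collide.

```python
import numpy as np

def power(voltage, current):
    # hypothetical helper for illustration: electrical power P = V * I
    return voltage * current

# our own function and numpy's element-wise power coexist
device_power = power(5.0, 2.0)
squares = np.power(np.array([1, 2, 3]), 2)

print(device_power)
print(squares)
```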

## Tour of the `numpy` array

When we call `np.zeros`, we are initializing our first numpy array object. The numpy array is the central feature of the library.

These objects are multidimensional arrays which all hold the same type (usually a `float` or `int`). The entries of a numpy array are stored contiguously in memory, which makes accessing and operating on `numpy.ndarray` objects fast and generally much more efficient than more general Python data structures like lists or dictionaries.
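
To make the "same type" and "contiguous" claims concrete, here is a minimal sketch that inspects a couple of small arrays. The variable names are arbitrary, and the exact integer type reported depends on your platform.

```python
import numpy as np

# mixing ints and floats: every entry is stored as a float
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)

# all ints: an integer dtype is chosen (its width is platform dependent)
ints = np.array([1, 2, 3])
print(ints.dtype)

# a freshly created array is laid out contiguously in memory
print(mixed.flags['C_CONTIGUOUS'])

# bytes per entry and total bytes used by the data
print(mixed.itemsize, mixed.nbytes)
```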

Let's explore the `np.zeros` function more. We can initialize the array with different numeric types if we'd like. 

In [17]:
z_float = np.zeros(5) # filled with float by default
print('z_float: {}'.format(z_float))

z_int = np.zeros(5, dtype=int)
print('z_int: {}'.format(z_int))

z_bool = np.zeros(5, dtype=bool)
print('z_bool: {}'.format(z_bool))

z_float: [ 0.  0.  0.  0.  0.]
z_int: [0 0 0 0 0]
z_bool: [False False False False False]


### array shapes
Numpy array objects all have a `shape` attribute (not a method, so no parentheses), which lets us see the shape of our arrays. It is a tuple giving the length along each dimension.

In [48]:
# declare vector, 2D array, and multidim array of zeros
z_vector = np.zeros(5)
z_2D_vector = np.zeros((5,1))
z_2D = np.zeros((5,5))
z_multidim = np.zeros((6,7,2))

print('z_vector.shape: {}'.format(z_vector.shape))
print('z_2D_vector.shape: {}'.format(z_2D_vector.shape))
print('z_2D.shape: {}'.format(z_2D.shape))
print('z_multidim.shape: {}'.format(z_multidim.shape))

z_vector.shape: (5,)
z_2D_vector.shape: (5, 1)
z_2D.shape: (5, 5)
z_multidim.shape: (6, 7, 2)
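
Alongside `shape`, arrays carry a few related attributes. The sketch below shows `ndim` and `size`, and uses `reshape` to rearrange the same entries into a different shape. The array names are placeholders.

```python
import numpy as np

a = np.zeros((6, 7, 2))
print(a.shape)   # (6, 7, 2)
print(a.ndim)    # 3, the number of dimensions
print(a.size)    # 84, the total number of entries

# reshape keeps the entries but changes the shape;
# the total number of entries must stay the same
b = a.reshape((42, 2))
print(b.shape)   # (42, 2)
```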


An important note is that numpy distinguishes a vector from a 2D array with a single row or column. Note the difference in shape between `z_vector` and `z_2D_vector`. They have different dimensions, and cannot be treated as the same object:

In [49]:
print('z_vector: {}'.format(z_vector))
print('z_2D_vector:\n {}\n'.format(z_2D_vector))
print('z_2D_vector + z_vector:\n {}'.format(z_vector + z_2D_vector))
print('\nunexpected behavior!')

z_vector: [ 0.  0.  0.  0.  0.]
z_2D_vector:
 [[ 0.]
 [ 0.]
 [ 0.]
 [ 0.]
 [ 0.]]

z_2D_vector + z_vector:
 [[ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]
 [ 0.  0.  0.  0.  0.]]

unexpected behavior!
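
What happened above is numpy *broadcasting*: the two operands are stretched to a common shape before they are added. The sketch below repeats the experiment with non-zero entries so the effect is visible, then shows one possible way to get an entry-by-entry sum. The names `col` and `vec` are just for illustration.

```python
import numpy as np

col = np.arange(3).reshape((3, 1))   # shape (3, 1)
vec = np.array([10, 20, 30])         # shape (3,)

# broadcasting stretches both operands to shape (3, 3)
total = col + vec
print(total.shape)
print(total)

# to add entry by entry, bring both operands to the same shape first
elementwise = col.ravel() + vec
print(elementwise)
```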


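The sum is a `(5, 5)` array because numpy broadcasts the `(5,)` vector against the `(5, 1)` column instead of adding them entry by entry. This can lead to bugs, so keep track of whether you are using a 2D representation of a vector or an actual numpy vector! Let's illustrate this with the `np.dot` function.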

For 2-D arrays `np.dot` is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors (without complex conjugation).

In [53]:
# this should fail, even though we are trying to 'dot' two vectors
# because one is a (5,) shape and the other is a (5,1) shape
np.dot(z_2D_vector, z_vector)

ValueError: shapes (5,1) and (5,) not aligned: 1 (dim 1) != 5 (dim 0)

In [59]:
# of course, it's fine if we dot two things of same shape
print(np.dot(z_vector, z_vector))

# or if we transpose z_2D_vector to make z^T * z
# in numpy, we can transpose an array with array.T
print(np.dot(z_2D_vector.T, z_2D_vector))

0.0
[[ 0.]]


To remove singleton dimensions, we can call `np.squeeze` (or the equivalent `squeeze` method of the array, used below), just like `squeeze` in MATLAB.

In [78]:
b = np.zeros((10,1))
print('b.shape: {}'.format(b.shape))
print( 'b.squeeze().shape: {}'.format(b.squeeze().shape))


b.shape: (10, 1)
b.squeeze().shape: (10,)
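
Going the other way, from a vector to a 2D column or row, is just as common. Here is a short sketch of a few ways to move between the `(n,)` and `(n, 1)` representations. Which one to use is mostly a matter of taste.

```python
import numpy as np

v = np.arange(5)               # shape (5,)

col = v.reshape((-1, 1))       # shape (5, 1); -1 lets numpy infer the length
col2 = v[:, np.newaxis]        # same result, using indexing
row = v[np.newaxis, :]         # shape (1, 5)
flat = col.ravel()             # back to shape (5,)

print(col.shape, col2.shape, row.shape, flat.shape)

# with matching representations, np.dot behaves as expected
print(np.dot(v, flat))         # inner product of two vectors
print(np.dot(col.T, col))      # (1, 5) times (5, 1) gives a (1, 1) array
```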


### non-zero initializations

Although we've been using the `np.zeros` function, `numpy` provides many other ways to initialize arrays, similar to those in MATLAB.

In [65]:
# np.ones: takes the same arguments as np.zeros
print( np.ones((3,4)) )

[[ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]]


In [66]:
# np.linspace
print( np.linspace(3, 4, 5) )

[ 3.    3.25  3.5   3.75  4.  ]


In [73]:
# random array
print( np.random.random((2,3)) )
print()

# random vector
print( np.random.rand(3) )

[[ 0.17692761  0.32985797  0.94537789]
 [ 0.43159242  0.23583615  0.34222707]]

[ 0.05907334  0.52607722  0.37428888]


In [69]:
# initialize ones_like / zeros_like another array
arr = np.random.random((3,4))
b = np.zeros_like(arr)
print(b)

[[ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]
 [ 0.  0.  0.  0.]]


In [82]:
# 'counting' vectors 
print( np.arange(10) )
print( np.arange(0, 10, step=2) )

[0 1 2 3 4 5 6 7 8 9]
[0 2 4 6 8]


In [89]:
# identity matrix 
print('create identity matrix')
print( np.eye(3) )
print()

# create diag matrix
d = np.random.rand(3)
print('create diagonal matrix from vector')
print( np.diag(d) )
print()

# extract matrix diagonal
random_matrix = np.random.random((4,4))
print('extract matrix diagonal')
print( np.diag(random_matrix) )

create identity matrix
[[ 1.  0.  0.]
 [ 0.  1.  0.]
 [ 0.  0.  1.]]

create diagonal matrix from vector
[[ 0.73192843  0.          0.        ]
 [ 0.          0.77172521  0.        ]
 [ 0.          0.          0.93236   ]]

extract matrix diagonal
[ 0.62348371  0.97365792  0.04926858  0.95929805]
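
For readers translating from MATLAB, the sketch below lines up a few of these initializers with their MATLAB counterparts in the comments, and flags two differences that are easy to trip over. The variable names are arbitrary.

```python
import numpy as np

# MATLAB: zeros(3, 4)  -> in numpy the shape is passed as a single tuple
a = np.zeros((3, 4))

# MATLAB: zeros(5) is a 5x5 matrix, but np.zeros(5) is a length-5 vector
b = np.zeros(5)

# MATLAB: 0:2:8  -> in numpy the stop value is excluded
c = np.arange(0, 10, 2)

# MATLAB: linspace(3, 4, 5)  -> both endpoints are included in numpy too
d = np.linspace(3, 4, 5)

print(a.shape, b.shape)
print(c)
print(d)
```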


### array indexing
#### vectors
Indexing works much like indexing for Python lists.

In [109]:
# vector indexing
arr = np.arange(10)
print('arr: {}'.format(arr))

arr: [0 1 2 3 4 5 6 7 8 9]


In [110]:
# 0 based indexing into individual elements
print('arr[0] = {}'.format(arr[0]))

arr[0] = 0


In [114]:
# index into ranges, with optional step
print('arr[5:7] = {}'.format(arr[5:7]))
print('arr[0:10:2] = {}'.format(arr[0:10:2]))

arr[5:7] = [5 6]
arr[0:10:2] = [0 2 4 6 8]


In [116]:
# get last element, index backward from last element
print('arr[-1] = {}'.format(arr[-1]))
print('arr[-2] = {}'.format(arr[-2]))

arr[-1] = 9
arr[-2] = 8


In [117]:
# slice to end
print('arr[3:] = {}'.format(arr[3:]))

arr[3:] = [3 4 5 6 7 8 9]


In [118]:
# slice from beginning to index
print('arr[:3] = {}'.format(arr[:3]))

arr[:3] = [0 1 2]
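
One difference from Python lists is worth a quick sketch: slicing a numpy array gives a *view* onto the same memory, whereas slicing a list gives a copy. Writing into a slice therefore changes the original array. Call `.copy()` when you want an independent array.

```python
import numpy as np

arr = np.arange(10)
view = arr[2:5]        # a view onto arr's memory, not a copy
view[0] = 100
print(arr)             # arr[2] has changed as well

lst = list(range(10))
sub = lst[2:5]         # list slices are copies
sub[0] = 100
print(lst)             # the original list is unchanged

safe = arr[2:5].copy() # explicit copy: edits no longer reach arr
safe[0] = -1
print(arr[2])
```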


#### multidimensional arrays
The same rules apply as for vectors, but with more "slots": one index or slice per dimension.

In [141]:
# create vector 0-15 and reshape to a (4,4) 2D array
arr = np.arange(16).reshape((4,4))
print('arr: \n{}'.format(arr))

arr: 
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]


In [133]:
# index into particular entry
print('arr[0,2] = {}'.format(arr[0,2]))

arr[0,2] = 2


In [137]:
# row, col slice: note objects returned by slicing are vectors!
print('arr[2, :] = {}'.format(arr[2,:]))
print('arr[:, 2] = {}'.format(arr[:,2]))

arr[2, :] = [ 8  9 10 11]
arr[:, 2] = [ 2  6 10 14]
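
If you need the row or column to stay 2D, as MATLAB would give you, one option is to index with a length-one slice or a one-element list instead of a bare integer. This is a sketch of that idea, with placeholder names.

```python
import numpy as np

arr = np.arange(16).reshape((4, 4))

row_1d = arr[2, :]      # integer index drops the dimension: shape (4,)
row_2d = arr[2:3, :]    # length-one slice keeps it: shape (1, 4)
col_2d = arr[:, [2]]    # one-element index list keeps it: shape (4, 1)

print(row_1d.shape, row_2d.shape, col_2d.shape)
```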


In [139]:
# submatrix
print('arr[1:3, 1:3] = \n{}'.format(arr[1:3, 1:3]))

arr[1:3, 1:3] = 
[[ 5  6]
 [ 9 10]]


In [145]:
# with index vectors
idx = np.array([0,3])
print('arr[idx, :] = \n{}'.format(arr[idx, :]))

arr[idx, :] = 
[[ 0  1  2  3]
 [12 13 14 15]]


In [155]:
# with boolean mask to select corners
mask = np.ones((4,4), dtype=bool)
print('mask = np.ones((4,4), dtype=bool) \n{}'.format(mask))

mask[1:3, :] = 0
mask[:, 1:3] = 0
print('\nmask[1:3, :] = 0\nmask[:, 1:3] = 0 \n{}'.format(mask))

print('\narr[mask] \n{}'.format(arr[mask]))

mask = np.ones((4,4), dtype=bool) 
[[ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]]

mask[1:3, :] = 0
mask[:, 1:3] = 0 
[[ True False False  True]
 [False False False False]
 [False False False False]
 [ True False False  True]]

arr[mask] 
[ 0  3 12 15]


In [157]:
# index by condition
print('arr > 3 \n{}'.format(arr > 3))
print('arr[arr > 3] \n{}'.format(arr[arr > 3]))

arr > 3 
[[False False False False]
 [ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]]
arr[arr > 3] 
[ 4  5  6  7  8  9 10 11 12 13 14 15]
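
Boolean masks can also be used on the left-hand side of an assignment, and `np.where` returns the indices at which a condition holds (similar in spirit to MATLAB's `find`). The sketch below shows both on the same `(4, 4)` array. The threshold values are arbitrary.

```python
import numpy as np

arr = np.arange(16).reshape((4, 4))

# assign through a mask: cap every entry above 10 at 10
clipped = arr.copy()
clipped[clipped > 10] = 10
print(clipped)

# row and column indices of the entries satisfying the condition
rows, cols = np.where(arr > 12)
print(rows, cols)

# count how many entries satisfy a condition
n_big = np.count_nonzero(arr > 3)
print(n_big)
```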


#### Array Concatenation

Just like in MATLAB with `[A, B]` or `[A; B]`, it is possible to concatenate arrays in `numpy` with `np.concatenate`, `np.vstack`, and `np.hstack`. In general, though, array concatenation is not a great practice: to keep entries contiguous in memory, a new array is allocated and the entries of the old arrays are copied into it. This can be a performance killer, especially if you concatenate arrays in a loop as a misguided way to grow them dynamically.

Therefore, for performance reasons, I generally avoid array concatenation unless I absolutely need it. Instead, try to allocate an array of the correct size from the start and simply fill it. If you need a data structure that can be dynamically resized, consider the standard Python lists and dictionaries.
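
The two alternatives suggested above can be sketched as follows, assuming the final length `n` is known in advance for the first approach. The computation (squaring the loop index) is only a stand-in for real work.

```python
import numpy as np

n = 1000

# option 1: allocate once, then fill in place
squares = np.empty(n)
for i in range(n):
    squares[i] = i ** 2

# option 2: grow a Python list, then convert to an array once at the end
values = []
for i in range(n):
    values.append(i ** 2)
squares_from_list = np.array(values, dtype=float)

print(np.array_equal(squares, squares_from_list))
```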

In [176]:
# let's check the time of dynamically adding to a list vs array
import time

# time list append
a = [1, 2, 3]
start = time.perf_counter()  # time.clock() was removed in Python 3.8
for i in range(100000):
    a.append(1)
end = time.perf_counter()
print('elapsed: {}'.format(end-start))

# time array concatenate
# note: the result is discarded, so arr never grows; this measures
# only the cost of each concatenate call (allocate + copy)
arr = np.array([1, 2, 3])
arr2 = np.array([1])

start = time.perf_counter()
for i in range(100000):
    np.concatenate((arr, arr2))
end = time.perf_counter()
print('elapsed: {}'.format(end-start))

elapsed: 0.012590000000000323
elapsed: 0.10311599999999999


On my machine, appending to a list is close to a factor of 10 faster than calling `np.concatenate`! Be wary of absolutes like "numpy arrays are faster than lists".
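
For a comparison in which the array really does grow on every iteration, here is a rough sketch using the standard `timeit` module. The three helper functions and the problem size `n` are made up for this example, and the timings will vary from machine to machine, so no numbers are quoted here.

```python
import timeit
import numpy as np

def grow_list(n):
    # append to a Python list
    out = []
    for i in range(n):
        out.append(i)
    return out

def grow_array(n):
    # concatenate onto an array: reallocates and copies every iteration
    out = np.empty(0, dtype=int)
    for i in range(n):
        out = np.concatenate((out, [i]))
    return out

def fill_preallocated(n):
    # allocate once, then fill in place
    out = np.empty(n, dtype=int)
    for i in range(n):
        out[i] = i
    return out

n = 2000
for fn in (grow_list, grow_array, fill_preallocated):
    elapsed = timeit.timeit(lambda: fn(n), number=5)
    print('{}: {:.4f} s'.format(fn.__name__, elapsed))
```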

## Operating on `numpy` arrays

`numpy` implements many common mathematical operations which can be efficiently applied to numpy vectors and arrays. Operating on `numpy` arrays is best done by calling these functions on arrays rather than iterating over the entries and applying them individually.

Examples include `np.abs`, `np.exp`, and `np.sin`.

In [179]:
# take the sine of an array
x = np.linspace(0, 2*np.pi, 6)
sinx = np.sin(x)
print(sinx)

[  0.00000000e+00   9.51056516e-01   5.87785252e-01  -5.87785252e-01
  -9.51056516e-01  -2.44929360e-16]
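
To see what "calling these functions on arrays" replaces, the sketch below computes the same sine values with an explicit Python loop and with a single `np.sin` call, then checks that they agree. It also shows that these functions compose naturally in one expression.

```python
import math
import numpy as np

x = np.linspace(0, 2 * np.pi, 6)

# entry-by-entry loop using the standard library
looped = np.array([math.sin(xi) for xi in x])

# one call that operates on the whole array
vectorized = np.sin(x)

print(np.allclose(looped, vectorized))

# array functions can be combined in a single expression
y = np.exp(-x) * np.abs(np.sin(x))
print(y.shape)
```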


`numpy` arrays can be efficiently added, multiplied, etc. with standard operators. NOTE: the `*` operator performs element-wise multiplication, NOT matrix multiplication as in MATLAB. Think of `*` in `numpy` as `.*` in MATLAB.

In [185]:
arr = np.arange(9).reshape((3,3))
print('arr \n{}'.format(arr))
print()
print('arr * arr \n{}'.format(arr * arr))

arr 
[[0 1 2]
 [3 4 5]
 [6 7 8]]

arr * arr 
[[ 0  1  4]
 [ 9 16 25]
 [36 49 64]]
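
For an actual matrix product, Python 3.5 and later also provide the `@` operator, which for 2D arrays gives the same result as `np.dot`. A short sketch contrasting the two kinds of product on the same array:

```python
import numpy as np

arr = np.arange(9).reshape((3, 3))

elementwise = arr * arr   # like arr .* arr in MATLAB
matmul = arr @ arr        # like arr * arr in MATLAB

print(elementwise)
print(matmul)
print(np.array_equal(matmul, np.dot(arr, arr)))
```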


`np.dot` implements inner products, matrix-matrix, AND matrix-vector multiplication.

In [192]:
x = np.ones(3, dtype=int)
print('arr \n{}\nx\n{}\n'.format(arr,x))
print('np.dot(arr, x) \n{}'.format(np.dot(arr, x)))

arr 
[[0 1 2]
 [3 4 5]
 [6 7 8]]
x
[1 1 1]

np.dot(arr, x) 
[ 3 12 21]
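
The three uses of `np.dot` mentioned above can be put side by side. In this sketch the shapes of the inputs determine which kind of product is computed. The names `A` and `x` are placeholders.

```python
import numpy as np

A = np.arange(9).reshape((3, 3))
x = np.ones(3, dtype=int)

inner = np.dot(x, x)      # vector . vector -> scalar
mat_vec = np.dot(A, x)    # matrix . vector -> vector of row sums here
vec_mat = np.dot(x, A)    # vector . matrix -> vector of column sums here
mat_mat = np.dot(A, A)    # matrix . matrix -> matrix

print(inner)
print(mat_vec)
print(vec_mat)
print(mat_mat.shape)
```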
