# Numerical and Scientific Computing

## <img src='https://az712634.vo.msecnd.net/notebooks/python_course/v1/flask.png' alt="Smiley face" width="42" height="42" align="left">Learning Objectives
* * *
<br>
* Become comfortable defining, slicing, and assigning to numpy arrays (1D and multidimensional)
* Become familiar with boolean indexing
* See some examples of set logic
* Get started with some linear algebra in numpy

## What is `numpy` and what does it do?
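* `numpy` ("numerical python") is built around an array type called the "ndarray"
* ndarrays are 1 to N dimensional arrays; a 1D ndarray behaves like a vector
* operations on ndarrays are vectorized: they apply to every element at once, without an explicit Python loop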
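
As a minimal sketch of what "vectorized" means here, the cell below doubles the same values once with a list comprehension and once with a single ndarray expression. The variable names are only illustrative.

```python
import numpy as np

# Plain Python: visit the elements one at a time
values = [1, 2, 3, 4, 5]
doubled_list = [v * 2 for v in values]

# NumPy: the multiplication is applied to every element of the array
arr = np.array(values)
doubled_arr = arr * 2

print(doubled_list)  # [2, 4, 6, 8, 10]
print(doubled_arr)   # [ 2  4  6  8 10]

# A few attributes every ndarray carries
print(arr.ndim, arr.shape, arr.dtype)
```

The two results hold the same numbers; the difference is that the ndarray version does its looping inside numpy rather than in Python.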

### 1D arrays

In [None]:
import numpy as np

# Here are three ways to create a 1D array in numpy
a = np.array([1, 2, 3, 4, 5])
a = np.array(range(1, 6))
a = np.arange(1, 6)

# what is a's type?  (use type(a))
# print a here...


<b>Slicing a 1D array</b>
* IMPORTANT: Slicing in NumPy does not return a copy, but rather what we call a <b>view</b>.
* If we modify the view, the original np array is <b>also modified</b>.
* If we truly want a copy we would use the `copy()` method.

<br>
Here's some proof that a <b>slice is a view for ndarrays</b> and changes to it will modify the original data:

In [None]:
a = np.array([1, 2, 3, 4, 5])

# let's hold on to the original
a_orig = a.copy()

# A slice (3rd through 4th elements)
view = a[2:4]

# Reassignment
view[1] = 0

# What changed if anything?  print "a" and "a_orig"
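
One possible way to check the result, sketched with `np.shares_memory`, which reports whether two arrays overlap in memory. Try the cell above yourself first; this is only a confirmation.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
a_orig = a.copy()   # independent copy of the data
view = a[2:4]       # slice: a view onto a's data

view[1] = 0         # writes through to a[3]

print(a)       # [1 2 3 0 5]
print(a_orig)  # [1 2 3 4 5]

# The view shares memory with a; the copy does not
print(np.shares_memory(a, view))    # True
print(np.shares_memory(a, a_orig))  # False
```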


<b>Assignment via slicing</b>

In [None]:
a = np.array([1, 2, 3, 4, 5])
print(a)

# Set 3rd through 4th elements to a scalar (note: indexing is not inclusive of the last index number)
a[2:4] = 0
print(a)

### 2D arrays

In [None]:
a2d = np.array([range(5), range(5, 10), range(10, 15)])
print(a2d)

<b>Subscription</b>
* Two ways to access an element of a 2D array

In [None]:
# First way
x = a2d[0][3]
print(x)

# What do you think the second way is?  If you like R, this is familiar.  
# Write some code here...


In [None]:
# Alternate subscription of 2D array

# This way (one index with row and column)
x = a2d[0, 3]
print(x)

# Or that (chained: pick the row first, then the element)
y = a2d[0][3]
print(y)

<b>Using `reshape()`</b>
* Will conform a 1D array to an ND array (the total number of elements must stay the same)

In [None]:
# Instead of (to create one array containing three arrays)
a2d = np.array([range(5), range(5, 10), range(10, 15)])

# We could just do this
a2d = np.arange(15).reshape(3, 5)

a2d

In [None]:
a2d = np.arange(0, 16).reshape(4, 4)

# We saw how to index 1D arrays, e.g. a1d[1]; for a 2D array, a2d[1] is a 1D array (the second row)

# If we want rows 2 and 3 we would use a range of indices
a2d[1:3]

In [None]:
# What about column 4?
a2d[:, 3]
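
Row and column slices can also be combined in one subscript. The sketch below uses the same 4x4 array; the names `block`, `col` and `col_2d` are only illustrative.

```python
import numpy as np

a2d = np.arange(0, 16).reshape(4, 4)

# Rows 2 and 3, columns 3 and 4
block = a2d[1:3, 2:]
print(block)

# An integer index drops a dimension, a slice keeps it
col = a2d[:, 3]       # shape (4,)
col_2d = a2d[:, 3:4]  # shape (4, 1)
print(col.shape, col_2d.shape)
```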

EXERCISE 1: Initializing and slicing 2D arrays
* Initialize a 3x4 2D array with all even numbers between 1 and 25 (HINT: use `arange`'s argument `step`)
```python
# e.g. Even numbers from 0 up to (but not including) 10
np.arange(0, 10, step=2)
```
* Slice out the 1st and 3rd element in the 1st row

In [None]:
# Code up your solution here...

### 3D arrays

In [None]:
# 2x2x5 array (explicitly initialized with literals)
a3d = np.array([[[1, 2, 3, 4, 5], 
                 [6, 7, 8, 9, 10]], 
                [[11, 12, 13, 14, 15], 
                 [16, 17, 18, 19, 20]]])
print(a3d)

# Or just do this
a3d = np.arange(1, 21).reshape(2, 2, 5)
print(a3d)

EXERCISE 2: Slicing 3D arrays
* Use what you've seen about slicing 1D and 2D arrays to:
  1. Slice out the first 2D array from the above 3D array
  2. Slice out the second 1D array from the first 2D array
  3. Slice out the third element in the second 1D array from the first 2D array

In [None]:
# Code up your solution here...

### Indexing

<b>Boolean indexing</b>
* IMPORTANT: Boolean indexing creates a <b>copy</b> unlike slicing

In [None]:
# 7x4 2D array with some normally distributed random numbers
data = np.random.randn(7, 4)
print(data)

# Our boolean indexer
bindex = np.array([True, False, False, True, False, True, False])

# Are we slicing by rows or columns?

In [None]:
# let's slice with the boolean indexer now
data[bindex]
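
To see that boolean indexing returns a copy, here is a small sketch that uses a predictable array instead of random numbers (an assumption made only so the result is easy to read).

```python
import numpy as np

# 7x4 array holding 0..27 so the rows are easy to recognize
data = np.arange(28).reshape(7, 4)
bindex = np.array([True, False, False, True, False, True, False])

selected = data[bindex]  # rows 1, 4 and 6 (indices 0, 3, 5)
print(selected.shape)    # (3, 4)

# Changing the selection leaves the original untouched
selected[0, 0] = -1
print(data[0, 0])                        # 0
print(np.shares_memory(data, selected))  # False
```

The indexer has one entry per row, so it selects along the first axis.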

In [None]:
# How do you think you would reverse select with our boolean indexer?


# The answer is in solution to exercise 3

EXERCISE 3: Create a 2D boolean indexer and use it to slice a 2D array
* Fill in the missing pieces of the following code to create this slicer.

Copy this code into a cell below and fill in
```python
# Exercise code - fill in the blank or missing parts below

import numpy as np

# 2x3 2D array with some normally distributed random numbers
data = np.random.___(2, 3)
print(data)

# our 2D boolean indexer
bindex = np.array([___]).reshape(2, 3)
print(bindex)

# Will it work?

# Also, apply the reverse of the boolean indexer to the data (check solution for this...)

```

In [None]:
# Try your solution here

<b>Booleans to set a swath of values</b>

In [None]:
data = np.random.randn(7, 4)
data[data < 0] = 0
data

In [None]:
# Set a whole row or set of rows to a scalar

# Our boolean indexer
bindex = np.array([True, False, False, True, False, True, False])

data[bindex] = 8
data

In [None]:
# Show how you would set a whole column (or set of columns) to a value using a boolean indexer
bindex = np.array([True, False, False, True])
data[:, bindex] = 5
data

<b>Additional indexing options we are not going to cover</b>
* Fancy indexing
  1. Selecting out particular rows in a specified order
  2. Selecting elements corresponding to tuples of indices
  3. Use of `np.ix_` function to select a square region

### Transposing

In [None]:
# Initialize 2D array with some random integers between 1 and 5
data = np.random.randint(1, 6, size = 12).reshape(3, 4)
print(data)

dataT = data.T # could also use transpose() method
print(dataT)

### Universal functions aka `ufuncs`

In [None]:
# 1D array
a = np.arange(10)
print(a)

# A couple of unary functions

sqroot = np.sqrt(a)
print(sqroot)

expon = np.exp(a)
print(expon)

In [None]:
# let's make a couple of 1D arrays
a = np.random.randn(10)
b = np.random.randn(10)

# A couple of binary functions

maxes = np.maximum(a, b)
print(maxes)

added = np.add(a, b)
print(added)

> For more information on available `ufuncs` visit the scipy/numpy docs [here](http://docs.scipy.org/doc/numpy/reference/ufuncs.html#available-ufuncs).  Also, just note `numpy` uses the concept of <b>broadcasting</b> when performing operations (like `+ - * /` and `ufuncs`) on ndarrays to decide how to handle disparately shaped inputs.  More info [here](http://docs.scipy.org/doc/numpy/reference/ufuncs.html#broadcasting).
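
A minimal sketch of broadcasting, assuming a small 2x3 array: a 1D array of length 3 is stretched across the rows, a 2x1 array is stretched across the columns, and shapes that cannot be matched raise an error.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)  # [[0 1 2], [3 4 5]]
row = np.array([10, 20, 30])       # shape (3,)
col = np.array([[100], [200]])     # shape (2, 1)

# row is added to each row of grid
plus_row = grid + row
print(plus_row)

# col is added to each column of grid
plus_col = grid + col
print(plus_col)

# Shapes (2, 3) and (2,) are not compatible
error_raised = False
try:
    grid + np.array([1, 2])
except ValueError as err:
    error_raised = True
    print('Broadcast failed:', err)
```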

### Conditional logic with `np.where`
* A great way to vectorize: `x if condition else y`
* In the following example, a boolean array (the condition) is used to choose each element from either the first or the second 1D array.

In [None]:
# Two 1D arrays
a = np.arange(5)
b = np.arange(5, 10)

# A boolean indexer as the condition (decides from which 1D array to choose an element)
# i.e. if value is True - choose element from a, else choose element from b
cond = np.array([True, False, False, True, False])

In [None]:
# Old familiar way: list comprehension with our friendly zip function
result = [x if c else y for x, y, c in zip(a, b, cond)]
print(list(zip(a, b, cond)))
print(result)

# The better faster numpy way (especially for larger arrays!)
result = np.where(cond, a, b)
print(result)

# could cond be an anonymous function?  Try!
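
As a hint for the question above: `np.where` expects the condition to be a boolean array (or something convertible to one), so a function would need to be called on the data first. The sketch below builds the condition from a comparison, from an inline `lambda`, and also shows a scalar used as a fallback value.

```python
import numpy as np

a = np.arange(5)
b = np.arange(5, 10)

# Condition computed from the data itself
cond = a % 2 == 0
from_expr = np.where(cond, a, b)
print(from_expr)    # [0 6 2 8 4]

# An anonymous function, called on a to produce the boolean array
from_lambda = np.where((lambda v: v % 2 == 0)(a), a, b)
print(from_lambda)  # [0 6 2 8 4]

# Scalars work too: keep values above 2, otherwise use 0
clipped = np.where(a > 2, a, 0)
print(clipped)      # [0 0 0 3 4]
```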

### Some set logic for 1D arrays
* Functions to find unique elements, unions, intersections, set differences, symmetric set differences

<b>`np.unique()` will return a sorted, unique array</b>

In [None]:
a = ['a'] * 10
b = ['b'] * 12
c = ['c'] * 5
d = ['d'] * 8
ones = [1] * 20
z = ['z'] * 6

arr = z + a + d + c + b + ones
print(arr)

np.unique(arr)

# Compare to using set methods or operators learnt about previously


<b>`np.intersect1d()` and `np.union1d()` will return sorted, unique intersection and union, respectively</b>

In [None]:
a = ['a'] * 10
b = ['b'] * 12
c = ['c'] * 5
d = ['d'] * 8

a1 = a + b + c
a2 = b + c + d
np.intersect1d(a1, a2)

# Show the union of a1 and a2...try this
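
The set differences mentioned at the start of this section follow the same pattern. A short sketch with the same two lists, using `np.setdiff1d`, `np.setxor1d` and `np.isin`:

```python
import numpy as np

a = ['a'] * 10
b = ['b'] * 12
c = ['c'] * 5
d = ['d'] * 8

a1 = a + b + c
a2 = b + c + d

# Elements of a1 that are not in a2
diff = np.setdiff1d(a1, a2)
print(diff)      # ['a']

# Elements in exactly one of the two (symmetric difference)
sym_diff = np.setxor1d(a1, a2)
print(sym_diff)  # ['a' 'd']

# Membership test, one boolean per element tested
member = np.isin(['a', 'd', 'z'], a2)
print(member)    # a: False, d: True, z: False
```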


EXERCISE 4:  Classic sets and ndarrays - a comparison exercise<br><br>
Using a larger version of the array from above
```python
a = ['a'] * 100
b = ['b'] * 120
c = ['c'] * 50
d = ['d'] * 80
ones = [1] * 200
z = ['z'] * 60

arr = z + a + d + c + b + ones
```
Do the following:
* Create a classic set with it
* Sort it (HINT: you'll have to convert to a list of `str`)
* Convert it to an ndarray

For a fun bonus, compare an intersection using classic set methods versus numpy methods of the following two arrays:
```python
a1 = z + d + c + ones
a2 = a + d + b + ones
```

In [None]:
# Code up your solution here...

### Sorting NumPy arrays
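* Unlike the built-in `list.sort()`, `np.sort()` does <b>not</b> sort in place; it returns a sorted copy
* Also unlike the built-in `list.sort()`, `np.sort()` can take a second argument, `axis`, indicating the axis along which to sort
* `np.sort()` sorts along the last axis by default, so for a 2D array it sorts the values within each row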

In [None]:
# 2D array
a2d = np.random.randn(4, 3)
print(a2d)

a2dsort = np.sort(a2d)
print(a2dsort)

# By default np.sort() sorts along the last axis, i.e. within each row

# To sort within each column instead, pass axis 0
a2dsort = np.sort(a2d, 0)
print(a2dsort)
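
A small sketch with fixed values (chosen only so the result is easy to follow) that contrasts `np.sort()`, which returns a copy, with the ndarray method `sort()`, which sorts in place:

```python
import numpy as np

arr = np.array([[3, 1, 2],
                [0, 7, -1]])

# np.sort returns a new array; arr itself is unchanged
by_row = np.sort(arr)          # sort within each row
by_col = np.sort(arr, axis=0)  # sort within each column
print(by_row)
print(by_col)
print(arr)

# The ndarray method sorts in place and returns None
in_place = arr.copy()
result = in_place.sort()
print(result)    # None
print(in_place)
```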

### Numpy and linear algebra

In [None]:
import numpy as np
# 2x3 array
x = np.array([[9., 3., 8.], [4., 7., 2.]])

# 3x2 array
y = np.array([[2., 4.], [-1, 9], [8, -5]])

print(x)
print(y)

# Two ways to get the dot product (ndarray method vs. numpy function)
dot1 = x.dot(y)
dot2 = np.dot(x, y)
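
For 2D arrays like these, the `@` operator (available since Python 3.5) is a third spelling of the same matrix product. A short sketch that also shows how the shapes combine:

```python
import numpy as np

x = np.array([[9., 3., 8.], [4., 7., 2.]])   # 2x3
y = np.array([[2., 4.], [-1, 9], [8, -5]])   # 3x2

dot1 = x.dot(y)
dot2 = np.dot(x, y)
dot3 = x @ y

# (2x3) times (3x2) gives a 2x2 result
print(dot3)
print(dot3.shape)

# Swapping the order gives (3x2) times (2x3), a 3x3 result
print((y @ x).shape)
```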

<b>Linear algebra library `numpy.linalg`</b>
* Like MATLAB and R, this library relies (under the hood) on the BLAS and LAPACK linear algebra libraries, which were originally written in Fortran
* Also, this is just the very tip of the iceberg

In [None]:
from numpy.linalg import inv, eig

# let's take a numpy array and treat it like a matrix
X = np.random.randn(3, 3)
print('X:', X)

# Transpose
X = X.T
print('X.T:', X)

# Inverse matrix
Xi = inv(X)
print('X^-1:', Xi)

# Decomposition example: eigenvalues/eigenvectors
evalues, evectors = eig(X)
print('Eigenvalues: ', evalues)
print('Eigenvectors: ', evectors)
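
One way to sanity-check results like these is to plug them back into their definitions. The sketch below uses a fixed, invertible matrix (an assumption made so the run is repeatable) and `np.allclose` to compare within floating point tolerance.

```python
import numpy as np
from numpy.linalg import inv, eig

# A fixed symmetric matrix with a nonzero determinant
X = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 4.]])

# A matrix times its inverse should be close to the identity
Xi = inv(X)
print(np.allclose(X @ Xi, np.eye(3)))

# Each column v of evectors should satisfy X v = lambda v
evalues, evectors = eig(X)
print(np.allclose(X @ evectors, evectors * evalues))
```

Exact equality is not expected here because of floating point rounding, which is why the comparison uses a tolerance.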

<b>Use `help()` to see all that `numpy.linalg` has to offer</b>

In [None]:
import numpy as np
help(np.linalg)

Created by a Microsoft Employee.
	
The MIT License (MIT)<br>
Copyright (c) 2016