### Learning about NumPy


In [1]:
import numpy as np

Turn a regular list into a NumPy array with `np.array()`.

Mean: `np.mean()`

Median: `np.median()`

Standard deviation: `np.std()`
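
A minimal sketch of the four calls above. The sample list is made up purely for illustration.

```python
import numpy as np

# Hypothetical sample data, chosen only so the results are round numbers
values = [2, 4, 4, 4, 5, 5, 7, 9]
arr = np.array(values)      # list -> ndarray

mean = np.mean(arr)         # arithmetic mean
median = np.median(arr)     # middle value of the sorted data
std = np.std(arr)           # population standard deviation (ddof=0 by default)

print(mean, median, std)    # 5.0 4.5 2.0
```

Note that `np.std` divides by N by default; pass `ddof=1` for the sample standard deviation.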

Some useful NumPy operations for joining arrays are `np.concatenate`, `np.vstack`, `np.hstack`, and `np.dstack`.

In [2]:
x = np.array([1,2,3])
y = np.array([4,5,6])
print(np.concatenate([x,y]))

[1 2 3 4 5 6]


In [3]:
# for 1-D inputs, this gives the same result as adding two plain Python lists
print(np.array([1, 2, 3] + [4, 5, 6]))


[1 2 3 4 5 6]
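
One caveat worth a quick sketch: the `+` operator only concatenates plain Python lists. Applied to NumPy arrays it adds element by element, so the equivalence in the cell above holds for `np.concatenate`, not for `+` on arrays.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

list_sum = [1, 2, 3] + [4, 5, 6]   # list concatenation
array_sum = x + y                  # element-wise addition

print(list_sum)            # [1, 2, 3, 4, 5, 6]
print(array_sum.tolist())  # [5, 7, 9]
```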


In [4]:
# np.vstack is for vertical stacking

print(np.vstack([x,y]))

[[1 2 3]
 [4 5 6]]


In [5]:
# np.hstack is for horizontal stacking

# for 1-D arrays this matches np.concatenate; for 2-D and higher it joins along axis 1 (columns)
x = np.array([[1, 2, 3],
              [4, 5, 6]])
y = np.array([[7],
              [8]])
print(np.hstack([x, y]))
try:
    print(np.concatenate([x,y]))
except Exception as e:
    print('Error:', e)

[[1 2 3 7]
 [4 5 6 8]]
Error: all the input array dimensions except for the concatenation axis must match exactly
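
The error arises because `np.concatenate` joins along axis 0 by default, and the two arrays have different column counts. As a small sketch, passing `axis=1` should reproduce the `np.hstack` result for these 2-D arrays:

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])
y = np.array([[7],
              [8]])

# Join along columns (axis=1): both arrays have 2 rows, so the shapes are compatible
joined = np.concatenate([x, y], axis=1)
same_as_hstack = np.array_equal(joined, np.hstack([x, y]))

print(joined.tolist())   # [[1, 2, 3, 7], [4, 5, 6, 8]]
print(same_as_hstack)    # True
```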


In [6]:
# np.dstack?

In [7]:
x = np.array([1,2,3])
y = np.array([7,8,9])
print(np.dstack([x,y]))
# looks odd at first: the 1-D inputs are stacked along a new third axis, giving shape (1, 3, 2)

[[[1 7]
  [2 8]
  [3 9]]]
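
To make the `np.dstack` output less mysterious, here is a short sketch that inspects the shape. For 1-D inputs of length N, `np.dstack` treats each one as shape `(1, N, 1)` and joins them along the third axis.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([7, 8, 9])

stacked = np.dstack([x, y])
print(stacked.shape)                # (1, 3, 2)

# Indexing the last axis recovers the original arrays
print(stacked[0, :, 0].tolist())    # [1, 2, 3]
print(stacked[0, :, 1].tolist())    # [7, 8, 9]
```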


Copy a NumPy array with `.copy()` or `np.copy()`.
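
Why copying matters, as a brief sketch: basic slicing returns a view onto the same data, so writing to the slice changes the original. An explicit copy is independent. The variable names here are illustrative only.

```python
import numpy as np

original = np.array([1, 2, 3, 4])
view = original[:2]            # basic slicing returns a view, not a copy
duplicate = original.copy()    # independent copy of the data

view[0] = 99                   # writes through to `original`
duplicate[1] = -1              # leaves `original` untouched

print(original.tolist())   # [99, 2, 3, 4]
print(duplicate.tolist())  # [1, -1, 3, 4]
```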

Reshape with `.reshape((rows, cols))`; the total number of elements must stay the same.


In [8]:
stuff = np.arange(1,20,2)
print(stuff)
print(stuff.reshape((2,5)))

[ 1  3  5  7  9 11 13 15 17 19]
[[ 1  3  5  7  9]
 [11 13 15 17 19]]
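
A short sketch of the element-count rule: one dimension can be given as `-1` so NumPy infers it, and a shape whose product does not match the element count raises a `ValueError`.

```python
import numpy as np

stuff = np.arange(1, 20, 2)            # 10 elements

inferred = stuff.reshape((5, -1))      # -1 lets NumPy work out the second dimension
print(inferred.shape)                  # (5, 2)

reshape_failed = False
try:
    stuff.reshape((3, 4))              # 3 * 4 = 12 != 10, so this raises
except ValueError as e:
    reshape_failed = True
    print('Error:', e)
```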


Split NumPy arrays with `np.split`, `np.hsplit`, `np.vsplit`, and `np.dsplit`. Pass a list of indices giving the split points; N split points produce N + 1 sub-arrays.

In [9]:
print('stuff', stuff)
print(np.split(stuff, [3,9]))
a, b, c = np.split(stuff, [4,9])
print('regular split:', a, b, c)
a, b, c = np.hsplit(stuff, [4,9])
print('horizontal split:', a, b, c)
stuff = stuff.reshape((10,1))
print('reshaped stuff:', stuff)
a, b, c = np.vsplit(stuff, [4,9])
print('vertical split:', a, b, c)
stuff = stuff.reshape((1,2,5))
print('reshaped 3d stuff:', stuff)
a, b, c = np.dsplit(stuff, [1,5])
print('depth split:', a, b, c)

stuff [ 1  3  5  7  9 11 13 15 17 19]
[array([1, 3, 5]), array([ 7,  9, 11, 13, 15, 17]), array([19])]
regular split: [1 3 5 7] [ 9 11 13 15 17] [19]
horizontal split: [1 3 5 7] [ 9 11 13 15 17] [19]
reshaped stuff: [[ 1]
 [ 3]
 [ 5]
 [ 7]
 [ 9]
 [11]
 [13]
 [15]
 [17]
 [19]]
vertical split: [[1]
 [3]
 [5]
 [7]] [[ 9]
 [11]
 [13]
 [15]
 [17]] [[19]]
reshaped 3d stuff: [[[ 1  3  5  7  9]
  [11 13 15 17 19]]]
depth split: [[[ 1]
  [11]]] [[[ 3  5  7  9]
  [13 15 17 19]]] []
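
One way to read the split points, sketched for the 1-D case: each piece is the ordinary slice between consecutive indices. This also explains the empty last piece in the depth split, where the final split point equals the length of that axis.

```python
import numpy as np

stuff = np.arange(1, 20, 2)
points = [3, 9]
pieces = np.split(stuff, points)

# Each piece corresponds to a plain slice between consecutive split points
expected = [stuff[:3], stuff[3:9], stuff[9:]]
matches = all(np.array_equal(p, e) for p, e in zip(pieces, expected))

print(len(pieces))  # 3
print(matches)      # True
```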


Use NumPy ufuncs (universal functions) for vectorized, element-wise operations. The operators are the usual Python arithmetic operators; on arrays, each one calls the corresponding ufunc. Copied table:



| Operator | Equivalent ufunc | Description                         |
|----------|------------------|-------------------------------------|
| `+`      | np.add           | Addition (e.g., 1 + 1 = 2)          |
| `-`      | np.subtract      | Subtraction (e.g., 3 - 2 = 1)       |
| `-`      | np.negative      | Unary negation (e.g., -2)           |
| `*`      | np.multiply      | Multiplication (e.g., 2 * 3 = 6)    |
| `/`      | np.divide        | Division (e.g., 3 / 2 = 1.5)        |
| `//`     | np.floor_divide  | Floor division (e.g., 3 // 2 = 1)   |
| `**`     | np.power         | Exponentiation (e.g., 2 ** 3 = 8)   |
| `%`      | np.mod           | Modulus/remainder (e.g., 9 % 4 = 1) |
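
A quick sketch checking the operator and ufunc pairs from the table against each other. The arrays are arbitrary example values.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([2, 2, 2, 2])

# Each operator should give the same result as its named ufunc
checks = {
    'add': np.array_equal(a + b, np.add(a, b)),
    'subtract': np.array_equal(a - b, np.subtract(a, b)),
    'negative': np.array_equal(-a, np.negative(a)),
    'multiply': np.array_equal(a * b, np.multiply(a, b)),
    'divide': np.array_equal(a / b, np.divide(a, b)),
    'floor_divide': np.array_equal(a // b, np.floor_divide(a, b)),
    'power': np.array_equal(a ** b, np.power(a, b)),
    'mod': np.array_equal(a % b, np.mod(a, b)),
}
print(all(checks.values()))  # True
```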


In [None]:
def slow_mult(arr, val):
    arr = [val*x for x in arr]
    return arr
def fast_mult(arr, val):
    return val*arr

arr = np.array([1,2,3,4,5])
print('slow multiplication')
%timeit slow_mult(arr, 5)
print('fast vectorized multiplication')
%timeit fast_mult(arr, 5)

slow multiplication
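
`%timeit` is an IPython magic and only works inside a notebook or IPython shell. As a rough sketch of the same comparison in plain Python, the standard-library `timeit` module can be used instead. The array size and repeat count below are arbitrary assumptions; a larger array than the five-element one above is used because the gap tends to be clearer at scale.

```python
import timeit

import numpy as np


def slow_mult(arr, val):
    # Python-level loop: one multiplication per element
    return [val * x for x in arr]


def fast_mult(arr, val):
    # Vectorized: a single ufunc call over the whole array
    return val * arr


arr = np.arange(100_000)   # assumed size, chosen for illustration

slow = timeit.timeit(lambda: slow_mult(arr, 5), number=10)
fast = timeit.timeit(lambda: fast_mult(arr, 5), number=10)
print(f'slow: {slow:.4f}s, fast: {fast:.4f}s (10 runs each)')
```

Timings depend on the machine, so no expected output is shown here.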


Aggregation functions and their NaN-safe versions:

| Function Name | NaN-safe Version | Description                               |
|---------------|------------------|-------------------------------------------|
| np.sum        | np.nansum        | Compute sum of elements                   |
| np.prod       | np.nanprod       | Compute product of elements               |
| np.mean       | np.nanmean       | Compute mean of elements                  |
| np.std        | np.nanstd        | Compute standard deviation                |
| np.var        | np.nanvar        | Compute variance                          |
| np.min        | np.nanmin        | Find minimum value                        |
| np.max        | np.nanmax        | Find maximum value                        |
| np.argmin     | np.nanargmin     | Find index of minimum value               |
| np.argmax     | np.nanargmax     | Find index of maximum value               |
| np.median     | np.nanmedian     | Compute median of elements                |
| np.percentile | np.nanpercentile | Compute rank-based statistics of elements |
| np.any        | N/A              | Evaluate whether any elements are true    |
| np.all        | N/A              | Evaluate whether all elements are true    |
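
A small sketch of what "NaN-safe" means in practice: the regular aggregates propagate a `NaN`, while the `nan*` versions skip it. The data values are made up for illustration.

```python
import numpy as np

data = np.array([1.0, 2.0, np.nan, 4.0])

plain_sum = np.sum(data)          # NaN propagates through the regular aggregate
safe_sum = np.nansum(data)        # NaN is ignored
safe_argmax = np.nanargmax(data)  # index of the largest non-NaN value

print(plain_sum)    # nan
print(safe_sum)     # 7.0
print(safe_argmax)  # 3
```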

(Table made with http://www.tablesgenerator.com/markdown_tables)

**Broadcasting**

Fairly intuitive: NumPy stretches arrays with compatible shapes so that element-wise operations work between them. Details [here](https://jakevdp.github.io/PythonDataScienceHandbook/02.05-computation-on-arrays-broadcasting.html).
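
A minimal broadcasting sketch, with arbitrary example arrays: a column of shape `(3, 1)` and a row of shape `(3,)` combine into a `(3, 3)` grid without either one being copied out by hand.

```python
import numpy as np

row = np.array([0, 1, 2])            # shape (3,)
col = np.array([[0], [10], [20]])    # shape (3, 1)

# Shapes (3, 1) and (3,) broadcast to (3, 3)
grid = col + row

print(grid.shape)     # (3, 3)
print(grid.tolist())  # [[0, 1, 2], [10, 11, 12], [20, 21, 22]]
```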
