## Python NumPy

### Table of Contents
* [About SciPy](#about-scipy)

* References:
  * [NumPy reference](https://docs.scipy.org/doc/numpy/reference/)
  * [NumPy Basics](https://docs.scipy.org/doc/numpy/user/basics.html)
  * [Stanford CS231n: Python Numpy Tutorial](http://cs231n.github.io/python-numpy-tutorial/)
  * [Stanford CS228: Notebook: Python Numpy Tutorials](https://github.com/kuleshov/cs228-material/blob/master/tutorials/python/cs228-python-tutorial.ipynb)

## About SciPy

SciPy is a Python-based ecosystem of open-source software for mathematics, science, and engineering.
Some of its core packages are:

* NumPy - base N-dimensional array package
* Pandas - data structures and analysis
* Matplotlib - plotting

In [1]:
import numpy as np

In [3]:
arr = np.array([1, 2, 3])  # 1D array

print(arr)  
print(arr.shape)  # shape tuple with one entry per dimension; here a single dimension of size 3
print(arr[1])

[1 2 3]
(3,)
2


In [6]:
mat = np.array([[1,2,3], [4,5,6]]) # 2D array - matrix
print(mat)
print(mat.shape)
print(mat[0,0])

[[1 2 3]
 [4 5 6]]
(2, 3)
1


In [13]:
i = 2
j = 3
k = 4
arr_3d = np.zeros((i, j, k)) # 3D array

arr_3d[0][0][0] = 1
arr_3d[0][0][1] = 2
arr_3d[0][0][2] = 3
arr_3d[0][0][3] = 4

print(arr_3d)

print(arr_3d.shape)  # gives the size in each dimension.
print(len(arr_3d.shape)) # gives the rank / number of dimensions of the N-dim array

[[[ 1.  2.  3.  4.]
  [ 0.  0.  0.  0.]
  [ 0.  0.  0.  0.]]

 [[ 0.  0.  0.  0.]
  [ 0.  0.  0.  0.]
  [ 0.  0.  0.  0.]]]
(2, 3, 4)
3
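The rank can also be read directly from the `ndim` attribute, and a whole row can be assigned through a single slice instead of one element at a time. The following is a minimal sketch using the same shape as above; the name `arr_3d_alt` is only illustrative and is not part of the original notebook.

```python
import numpy as np

arr_3d_alt = np.zeros((2, 3, 4))  # same shape as arr_3d above (illustrative name)

# Assign the first row of the first 3x4 block in one step using a slice
arr_3d_alt[0, 0, :] = [1, 2, 3, 4]

print(arr_3d_alt.ndim)  # number of dimensions, same as len(arr_3d_alt.shape)
print(arr_3d_alt.size)  # total number of elements: 2 * 3 * 4
```

Indexing with a single tuple (`arr[0, 0, 1]`) addresses the same element as chained indexing (`arr[0][0][1]`), but does so in one lookup.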


## Create Arrays

In [14]:
a1 = np.zeros((2,2)) # 2x2 matrix with zeros
print(a1)

[[ 0.  0.]
 [ 0.  0.]]


In [15]:
a2 = np.ones((2,1))  # matrix with ones
print(a2)

[[ 1.]
 [ 1.]]


In [16]:
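a3 = np.full((3,4), 5)  # 3x4 matrix filled with the constant 5 (newer NumPy versions infer an integer dtype here and print 5 instead of 5.)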
print(a3)

[[ 5.  5.  5.  5.]
 [ 5.  5.  5.  5.]
 [ 5.  5.  5.  5.]]




In [18]:
a4 = np.eye(2) # identity matrix
print(a4)

[[ 1.  0.]
 [ 0.  1.]]


## Array Elementwise Math Operations

In [22]:
x = np.array([[1,2], [3,4]], dtype=np.float64)
y = np.array([[5,6],[7,8]], dtype=np.float64)

print(type(x))
print(x)
print(type(y))
print(y)

<type 'numpy.ndarray'>
[[ 1.  2.]
 [ 3.  4.]]
<type 'numpy.ndarray'>
[[ 5.  6.]
 [ 7.  8.]]


In [23]:
print(x+y)
print(np.add(x,y))

[[  6.   8.]
 [ 10.  12.]]
[[  6.   8.]
 [ 10.  12.]]


In [24]:
print(x-y)
print(np.subtract(x,y))

[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]


In [25]:
print(x*y)
print(np.multiply(x,y))

[[  5.  12.]
 [ 21.  32.]]
[[  5.  12.]
 [ 21.  32.]]
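Note that `*` and `np.multiply` multiply matching entries; they do not compute the matrix product. As a short sketch for contrast, the same `x` and `y` can be passed to `np.dot`; the variable names `elementwise` and `matrix_prod` are only illustrative.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

elementwise = x * y          # multiplies matching entries
matrix_prod = np.dot(x, y)   # rows of x times columns of y

print(elementwise)
print(matrix_prod)
```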


In [26]:
print(x/y)
print(np.divide(x,y))

[[ 0.2         0.33333333]
 [ 0.42857143  0.5       ]]
[[ 0.2         0.33333333]
 [ 0.42857143  0.5       ]]


In [27]:
print(np.sqrt(x))

[[ 1.          1.41421356]
 [ 1.73205081  2.        ]]


In [30]:
print(x)
print(np.sum(x))  # sum of all elements
print(np.sum(x, axis=0))  # column-wise sum (collapses axis 0, the rows)
print(np.sum(x, axis=1))  # row-wise sum (collapses axis 1, the columns)

[[ 1.  2.]
 [ 3.  4.]]
10.0
[ 4.  6.]
[ 3.  7.]


In [31]:
print(x)
print(x.T) # Transpose

[[ 1.  2.]
 [ 3.  4.]]
[[ 1.  3.]
 [ 2.  4.]]


## Broadcasting

Broadcasting is the mechanism NumPy uses to apply arithmetic operations to arrays of different shapes.
For example, to add a vector to every row of a matrix, one option is to loop over the rows in Python, which is slow for large arrays.
With **broadcasting**, NumPy instead treats the vector as if it were stacked to the shape of the matrix and performs the element-wise addition in compiled code, without actually making the copies. This is both shorter to write and faster to run.


In [34]:
arr_b = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12], [13,14,15,16]])
print(arr_b)
vect = np.array([1,0, 1, 0])
print(vect)

result = arr_b + vect  # vect is broadcast across every row of arr_b
print(result)

[[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]]
[1 0 1 0]
[[ 2  2  4  4]
 [ 6  6  8  8]
 [10 10 12 12]
 [14 14 16 16]]
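To make the equivalence concrete, the sketch below computes the same sum three ways: with an explicit loop over rows, by physically stacking the vector with `np.tile`, and with broadcasting. The names `looped`, `stacked`, and `broadcast` are only illustrative.

```python
import numpy as np

arr_b = np.arange(1, 17).reshape(4, 4)  # same values as arr_b above
vect = np.array([1, 0, 1, 0])

# 1. Explicit loop: add the vector to each row in turn
looped = np.empty_like(arr_b)
for row in range(arr_b.shape[0]):
    looped[row, :] = arr_b[row, :] + vect

# 2. Explicit stacking: build a 4x4 copy of the vector, then add
stacked = arr_b + np.tile(vect, (arr_b.shape[0], 1))

# 3. Broadcasting: NumPy handles the shape mismatch itself
broadcast = arr_b + vect

print(np.array_equal(looped, broadcast))
print(np.array_equal(stacked, broadcast))
```

All three give the same result; broadcasting avoids both the Python-level loop and the temporary stacked copy.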


## Misc functions

In [38]:
x = np.arange(1,5)  # values from 1 up to, but excluding, 5
print(x)
x2 = np.arange(1.0, 8.0, 2.0) # use step
print(x2)
x3 = np.arange(5)  # start is optional and defaults to 0; step defaults to 1
print(x3)
x4 = np.arange(2.0)
print(x4)

[1 2 3 4]
[ 1.  3.  5.  7.]
[0 1 2 3 4]
[ 0.  1.]


## Random number generation using NumPy

In [44]:
y1 = np.random.rand(3,2) # 3x2 matrix of uniform random numbers in [0, 1)
print(y1)

y2 = np.random.randn(3,2) # 3x2 matrix drawn from the standard normal distribution
print(y2)

y3 = np.random.randint(2,20,10)  # low (inclusive), high (exclusive), size: 10 random integers from 2 to 19
print(y3)

[[ 0.55092111  0.34497153]
 [ 0.94088855  0.61080355]
 [ 0.21494819  0.98226827]]
[[-1.05014625 -0.1745795 ]
 [-0.81198759 -0.82851676]
 [-1.02064802  0.68235553]]
[10 12 14  5 10  7  9 17 19 13]
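The outputs above change on every run unless the generator is seeded. The sketch below shows that re-seeding reproduces the same draw and checks that `high` is excluded by `np.random.randint`. The second half uses the newer `Generator` interface and assumes NumPy 1.17 or later; all variable names are illustrative.

```python
import numpy as np

# Re-seeding the legacy global generator reproduces the same draw
np.random.seed(0)
first = np.random.randint(2, 20, 10)
np.random.seed(0)
second = np.random.randint(2, 20, 10)
print(np.array_equal(first, second))

# high is exclusive: every value lies in [2, 19]
samples = np.random.randint(2, 20, 1000)
print(samples.min() >= 2, samples.max() <= 19)

# Newer Generator interface (assumes NumPy >= 1.17)
rng = np.random.default_rng(0)
samples_excl = rng.integers(2, 20, size=1000)                 # high excluded by default
samples_incl = rng.integers(2, 20, size=1000, endpoint=True)  # high included
```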


In [5]:
import numpy as np

np.random.seed(1234)

my_mat = np.array(range(1,121)).reshape(12,10)

print(my_mat)

# do stratified sampling of 5 elements in each column,
# i.e., sample without replacement so that no number repeats
for col in range(my_mat.shape[1]):
    print('Group {} {}'.format(col + 1, np.random.choice(my_mat[:, col], 5, replace=False)))
    


[[  1   2   3   4   5   6   7   8   9  10]
 [ 11  12  13  14  15  16  17  18  19  20]
 [ 21  22  23  24  25  26  27  28  29  30]
 [ 31  32  33  34  35  36  37  38  39  40]
 [ 41  42  43  44  45  46  47  48  49  50]
 [ 51  52  53  54  55  56  57  58  59  60]
 [ 61  62  63  64  65  66  67  68  69  70]
 [ 71  72  73  74  75  76  77  78  79  80]
 [ 81  82  83  84  85  86  87  88  89  90]
 [ 91  92  93  94  95  96  97  98  99 100]
 [101 102 103 104 105 106 107 108 109 110]
 [111 112 113 114 115 116 117 118 119 120]]
Group 1 [ 71  81 111  21  91]
Group 2 [ 72 112  22  32  12]
Group 3 [ 43  73  13 103  93]
Group 4 [114  64  94  74  54]
Group 5 [ 85  25 115  95  65]
Group 6 [76 26 36 86 66]
Group 7 [107   7 117  97  47]
Group 8 [ 8 58 18 38 78]
Group 9 [39 59 89 99 69]
Group 10 [ 70  80  40  10 110]
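If the samples are needed later rather than only printed, they can be collected into a single array. The following is a minimal sketch under that assumption: the name `groups` is illustrative, and the checks simply confirm that each group has no repeats and comes only from its own column.

```python
import numpy as np

np.random.seed(1234)
my_mat = np.arange(1, 121).reshape(12, 10)

# One row of 5 samples per column, giving an array of shape (10, 5)
groups = np.array([
    np.random.choice(my_mat[:, col], 5, replace=False)
    for col in range(my_mat.shape[1])
])

for col, group in enumerate(groups):
    assert len(set(group.tolist())) == 5            # no value repeats within a group
    assert np.all(np.isin(group, my_mat[:, col]))   # every value comes from its own column

print(groups.shape)
```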
