# NumPy Tutorial

NumPy is the fundamental library for scientific computing with Python. NumPy is centered around a powerful N-dimensional array object, and it also contains useful linear algebra, Fourier transform, and random number functions.

In [1]:
from __future__ import division, print_function, unicode_literals

Importing NumPy

In [2]:
import numpy as np

### Basic NumPy functions

#### Vocabulary

* In NumPy, each dimension is called an **axis**.
* The number of axes is called the **rank** (NumPy exposes it as the `ndim` attribute).
    * For example, a 3x4 matrix is an array of rank 2 (it is 2-dimensional).
    * The first axis has length 3, the second has length 4.
* An array's list of axis lengths is called the **shape** of the array.
    * For example, a 3x4 matrix's shape is `(3, 4)`.
    * The rank is equal to the shape's length.
* The **size** of an array is the total number of elements, which is the product of all axis lengths (e.g. 3*4=12).

In [3]:
y = np.ones((3,4))
print('Shape', y.shape)
print('Rank', y.ndim)  # equal to len(y.shape)
print('Size', y.size)

Shape (3, 4)
Rank 2
Size 12


In [4]:
type(y)

numpy.ndarray

#### np.full
Creates an array of the given shape initialized with the given value. Here's a 3x4 matrix full of `π`.

In [5]:
np.full((3,4), np.pi)

array([[3.14159265, 3.14159265, 3.14159265, 3.14159265],
       [3.14159265, 3.14159265, 3.14159265, 3.14159265],
       [3.14159265, 3.14159265, 3.14159265, 3.14159265]])

#### Creating arrays

#### np.ones
Creates an array filled with ones.
#### np.zeros
Creates an array filled with zeros.

In [6]:
print('Examples array', np.array([[1,2,3,4], [10, 20, 30, 40]]))
print('Empty array', np.empty((2,3)))
x = np.zeros(3)
y = np.zeros((3,4))
print('1D zeros', x)
print('2D zeros', y)
print()
y = np.ones((3,4))
print('2D ones', y)


Examples array [[ 1  2  3  4]
 [10 20 30 40]]
Empty array [[1.39913304e-316 0.00000000e+000 0.00000000e+000]
 [0.00000000e+000 0.00000000e+000 0.00000000e+000]]
1D zeros [0. 0. 0.]
2D zeros [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

2D ones [[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]


#### `np.arange`
You can create an `ndarray` using NumPy's `arange` function, which is similar to Python's built-in `range` function:

In [7]:
print(np.arange(1, 5))
print(np.arange(1.0, 5.0))
print(np.arange(1.0, 5.0,0.5))

[1 2 3 4]
[1. 2. 3. 4.]
[1.  1.5 2.  2.5 3.  3.5 4.  4.5]


However, when dealing with floats, the exact number of elements in the array is not always predictable:

In [8]:
print(np.arange(0, 5/3, 0.333333333))
print(np.arange(0, 5/3, 0.333333334))

[0.         0.33333333 0.66666667 1.         1.33333333 1.66666667]
[0.         0.33333333 0.66666667 1.         1.33333334]


#### `np.linspace`
For this reason, it is generally preferable to use the `linspace` function instead of `arange` when working with floats. The `linspace` function returns an array containing a specific number of points evenly distributed between two values (note that the maximum value is *included*, contrary to `arange`):

In [9]:
print(np.linspace(0, 5/3, 6))

[0.         0.33333333 0.66666667 1.         1.33333333 1.66666667]
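
As a small sketch of why `linspace` is more predictable, the example below fixes the number of points up front and asks for the spacing that results. The variable names (`points`, `step`, `half_open`) are illustrative only; `retstep` and `endpoint` are standard `linspace` keyword arguments.

```python
import numpy as np

# Six points from 0 to 5/3, endpoint included; retstep=True also returns the spacing
points, step = np.linspace(0, 5 / 3, 6, retstep=True)
print(points.size)   # always 6, whatever the floating-point rounding
print(step)          # the spacing between consecutive points

# endpoint=False gives arange's half-open interval while still fixing the count
half_open = np.linspace(0, 5 / 3, 5, endpoint=False)
print(half_open)
```

The element count is an input here rather than a by-product of the step size, so rounding in the step cannot add or drop a point.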


#### Random values
3x4 matrix initialized with random floats between 0 and 1 (uniform distribution):

In [10]:
np.random.rand(3,4)

array([[0.50455734, 0.32671071, 0.45904318, 0.82799272],
       [0.81496384, 0.27753759, 0.0837185 , 0.87082448],
       [0.79514065, 0.62602698, 0.03786832, 0.86368615]])

Here's a 3x4 matrix containing random floats sampled from a univariate [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution) (Gaussian distribution) of mean 0 and variance 1:

In [11]:
np.random.randn(3,4)

array([[ 0.56733366,  0.89593942, -0.04366409, -0.66856701],
       [ 0.05191678, -0.65602992,  0.9410359 , -0.30406321],
       [ 0.69985026, -0.28342348, -1.67395005,  1.19895461]])
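
The random matrices above change on every run. When reproducible values are needed, one option is a seeded generator. This is a sketch assuming NumPy 1.17 or later, where `np.random.default_rng` is available; the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same stream of values
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

uniform_a = rng_a.random((3, 4))        # uniform floats in [0, 1)
uniform_b = rng_b.random((3, 4))
print(np.array_equal(uniform_a, uniform_b))   # True

normal = rng_a.standard_normal((3, 4))  # mean 0, variance 1
print(normal.shape)                     # (3, 4)
```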

#### Reshape

In [12]:
g = np.arange(24)
print('Variable', g)
print('with shape', g.shape)
g2 = g.reshape(4,6)
print()
print('Reshaped variable')
print(g2)
print('with shape', g2.shape)

Variable [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
with shape (24,)

Reshaped variable
[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]
with shape (4, 6)


In [13]:
g[4] = 1000
print(g)

[   0    1    2    3 1000    5    6    7    8    9   10   11   12   13
   14   15   16   17   18   19   20   21   22   23]
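
The cells above modify `g` after reshaping it. The sketch below makes the consequence explicit: for a contiguous array like this one, `reshape` returns a view, so the reshaped array sees the change. The name `g_copy` is introduced here for illustration.

```python
import numpy as np

g = np.arange(24)
g2 = g.reshape(4, 6)            # a view on g's data in this case
g[4] = 1000                     # modify the original...
print(g2[0, 4])                 # ...and the view shows 1000
print(np.shares_memory(g, g2))  # True

g_copy = g.reshape(4, 6).copy() # explicit copy when independent data is needed
g_copy[0, 0] = -1
print(g[0])                     # still 0
```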


### `ravel`
The `ravel` function returns a new one-dimensional `ndarray` that also points to the same data:

In [14]:
print(g2.shape)
g3 = g2.ravel()
print(g3.shape)

(4, 6)
(24,)
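
To illustrate what "points to the same data" means in practice, here is a minimal comparison of `ravel` with `flatten`, which always copies. The arrays are rebuilt so the snippet runs on its own; `g4` is a name added for this example.

```python
import numpy as np

g2 = np.arange(24).reshape(4, 6)

g3 = g2.ravel()      # a view here, because g2 is contiguous in memory
g3[0] = -5
print(g2[0, 0])      # -5: the write went through to g2

g4 = g2.flatten()    # flatten always returns a copy
g4[1] = 99
print(g2[0, 1])      # 1: g2 is unaffected
```

Note that `ravel` only returns a view when the memory layout allows it; otherwise it falls back to a copy.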


### Arithmetic operations
All the usual arithmetic operators (`+`, `-`, `*`, `/`, `//`, `**`, etc.) can be used with `ndarray`s. They apply *elementwise*:

In [15]:
a = np.array([14, 23, 32, 41])
b = np.array([5,  4,  3,  2])
print("a + b  =", a + b)
print("a - b  =", a - b)
print("a * b  =", a * b) # Not matrix multiplication
print("a / b  =", a / b)
print("a // b  =", a // b)
print("a % b  =", a % b)
print("a ** b =", a ** b) 

a + b  = [19 27 35 43]
a - b  = [ 9 19 29 39]
a * b  = [70 92 96 82]
a / b  = [ 2.8         5.75       10.66666667 20.5       ]
a // b  = [ 2  5 10 20]
a % b  = [4 3 2 1]
a ** b = [537824 279841  32768   1681]


### Broadcasting

### First rule
*If the arrays do not have the same rank, then a 1 will be prepended to the shapes of the lower-rank arrays until their ranks match.*

In [16]:
h1 = np.arange(5).reshape(1, 5)
print('h1', h1)
print('h1_shape', h1.shape)
print()
y = np.array([10, 20, 30, 40, 50])
print('y', y)
print('y_shape',y.shape)
print()
h2 = h1 + y
print('h2', h2)  # same as: h1 + [[10, 20, 30, 40, 50]]
print('h2_shape', h2.shape)

h1 [[0 1 2 3 4]]
h1_shape (1, 5)

y [10 20 30 40 50]
y_shape (5,)

h2 [[10 21 32 43 54]]
h2_shape (1, 5)


### Second rule
*Arrays with a 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension. The value of the array element is repeated along that dimension.*

In [17]:
h1 = np.arange(6).reshape(3, 2)
print('h1')
print(h1)
print('h1_shape',h1.shape)
print()
y = np.array([[100, 200]])
print('y', y)
print('y_shape',y.shape)
print()
h2 = h1 + y  # same as: h1 + [[100, 200], [100, 200], [100, 200]]
print('h2')
print(h2)
print('h2_shape', h2.shape)

h1
[[0 1]
 [2 3]
 [4 5]]
h1_shape (3, 2)

y [[100 200]]
y_shape (1, 2)

h2
[[100 201]
 [102 203]
 [104 205]]
h2_shape (3, 2)


### Third rule
*After rules 1 & 2, the sizes of all arrays must match.*

In [18]:
k = np.zeros((2,3))
try:
    k + np.array([33, 44])
except ValueError as e:
    print(e)
    

operands could not be broadcast together with shapes (2,3) (2,) 


In [19]:
k = np.zeros((2,3))
try:
    k + np.array([33, 44]).reshape(2,1)  # shape (2,1) broadcasts against (2,3), so no error is raised
except ValueError as e:
    print(e)
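
The three rules can be checked without doing any arithmetic. The sketch below uses `np.broadcast`, which computes the broadcast shape of its arguments and raises `ValueError` when they are incompatible. The names `column` and `compatible` are illustrative.

```python
import numpy as np

k = np.zeros((2, 3))
column = np.array([33, 44]).reshape(2, 1)

# Shapes (2, 3) and (2, 1) are compatible: the length-1 axis is stretched to 3
print(np.broadcast(k, column).shape)   # (2, 3)
result = k + column
print(result)

# Shapes (2, 3) and (2,) are not: (2,) becomes (1, 2), and 2 does not match 3
try:
    np.broadcast(k, np.array([33, 44]))
    compatible = True
except ValueError:
    compatible = False
print(compatible)                      # False
```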

### Conditional operators
The conditional operators also apply elementwise:

In [20]:
m = np.array([20, -5, 30, 40])
m < [15, 16, 35, 36]

array([False,  True,  True, False])

In [21]:
m < 25

array([ True,  True, False, False])

In [22]:
m[m < 25]

array([20, -5])
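
Boolean masks can also be combined. The following sketch uses the elementwise `&` operator and `np.where`; the names `in_range` and `replaced` are made up for this example.

```python
import numpy as np

m = np.array([20, -5, 30, 40])

# Parentheses are required: & binds more tightly than the comparisons
in_range = (m > 0) & (m < 35)
print(in_range)        # [ True False  True False]
print(m[in_range])     # [20 30]

# np.where picks from the second argument where the mask is True, else the third
replaced = np.where(m < 25, 0, m)
print(replaced)        # [ 0  0 30 40]
```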

### NumPy array methods

In [23]:
a = np.array([[-2.5, 3.1, 7], [10, 11, 12]])

In [24]:
for func in (a.mean, a.min, a.max, a.sum, a.prod, a.std, a.var):
    print(func.__name__, "=", func())

mean = 6.766666666666667
min = -2.5
max = 12.0
sum = 40.6
prod = -71610.0
std = 5.084835843520964
var = 25.855555555555554
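
By default these methods reduce over every element. They also accept an `axis` argument to reduce along one axis only, as sketched below with names (`col_sums`, `row_sums`, `row_means`) chosen for illustration.

```python
import numpy as np

a = np.array([[-2.5, 3.1, 7], [10, 11, 12]])

col_sums = a.sum(axis=0)   # collapse axis 0 (the rows): one sum per column
row_sums = a.sum(axis=1)   # collapse axis 1 (the columns): one sum per row
print(col_sums.shape)      # (3,)
print(row_sums.shape)      # (2,)

# keepdims=True keeps the reduced axis with length 1, which is handy for broadcasting
row_means = a.mean(axis=1, keepdims=True)
print(row_means.shape)     # (2, 1)
centered = a - row_means   # each row now has mean 0
```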


### Universal functions
NumPy also provides fast elementwise functions called *universal functions*, or **ufuncs**. They are vectorized wrappers of simple functions. For example, `square` returns a new `ndarray` which is a copy of the original `ndarray` except that each element is squared:

In [25]:
a = np.array([[-2.5, 3.1, 7], [10, 11, 12]])

print("Original ndarray")
print(a)
for func in (np.square, np.abs, np.sqrt, np.exp, np.log, np.sign, np.ceil, np.modf, np.isnan, np.cos):
    print("\n", func.__name__)
    print(func(a))

Original ndarray
[[-2.5  3.1  7. ]
 [10.  11.  12. ]]

 square
[[  6.25   9.61  49.  ]
 [100.   121.   144.  ]]

 absolute
[[ 2.5  3.1  7. ]
 [10.  11.  12. ]]

 sqrt
[[       nan 1.76068169 2.64575131]
 [3.16227766 3.31662479 3.46410162]]

 exp
[[8.20849986e-02 2.21979513e+01 1.09663316e+03]
 [2.20264658e+04 5.98741417e+04 1.62754791e+05]]

 log
[[       nan 1.13140211 1.94591015]
 [2.30258509 2.39789527 2.48490665]]

 sign
[[-1.  1.  1.]
 [ 1.  1.  1.]]

 ceil
[[-2.  4.  7.]
 [10. 11. 12.]]

 modf
(array([[-0.5,  0.1,  0. ],
       [ 0. ,  0. ,  0. ]]), array([[-2.,  3.,  7.],
       [10., 11., 12.]]))

 isnan
[[False False False]
 [False False False]]

 cos
[[-0.80114362 -0.99913515  0.75390225]
 [-0.83907153  0.0044257   0.84385396]]
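
Two further points may be worth a short sketch: ufuncs also come in binary form (taking two arrays and broadcasting them), and the `nan` entries that `sqrt` and `log` produce for negative inputs come with a runtime warning that can be silenced locally with `np.errstate`. The array `b` and the names `larger` and `roots` are assumptions made for this example.

```python
import numpy as np

a = np.array([[-2.5, 3.1, 7], [10, 11, 12]])
b = np.array([1.0, 5.0, 6.0])

# Binary ufunc: elementwise maximum, with b broadcast across both rows of a
larger = np.maximum(a, b)
print(larger)

# Suppress the "invalid value" warning only inside this block
with np.errstate(invalid="ignore"):
    roots = np.sqrt(a)
print(np.isnan(roots))   # True only where a is negative
```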




### Array indexing
### One-dimensional arrays
One-dimensional NumPy arrays can be accessed more or less like regular Python lists:

In [26]:
a = np.array([1, 5, 3, 19, 13, 7, 3])
print(a[3])
print(a[2:5]) # Note: slices are views on the same data buffer,
print(a[2:-1]) # so modifying a slice modifies the original array
print(a[:2])
print(a[2:2])
print(a[::])
print(a[::2])
print(a[::-1])


19
[ 3 19 13]
[ 3 19 13  7]
[1 5]
[]
[ 1  5  3 19 13  7  3]
[ 1  3 13  3]
[ 3  7 13 19  3  5  1]
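
Because slices are views, writing through a slice changes the original array. A minimal sketch of this, with `view` and `independent` as illustrative names:

```python
import numpy as np

a = np.array([1, 5, 3, 19, 13, 7, 3])

view = a[2:5]          # no data is copied
view[0] = 1000
print(a)               # [   1    5 1000   19   13    7    3]

independent = a[2:5].copy()   # explicit copy when the original must stay intact
independent[1] = -1
print(a[3])            # 19: unchanged
```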


### Array modification

In [27]:
a[2:5] = [997, 998, 999]
print(a)

a[2:5] = -1
print(a)


[  1   5 997 998 999   7   3]
[ 1  5 -1 -1 -1  7  3]


In [28]:
x = [1,4,5]
print(a[x])

[ 5 -1  7]
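
Indexing with a list of integers behaves differently from slicing: reading this way returns a copy, while assigning through such an index still writes into the original array. The sketch below starts from the array state printed above; `picked` is an illustrative name.

```python
import numpy as np

a = np.array([1, 5, -1, -1, -1, 7, 3])

picked = a[[1, 4, 5]]  # integer-list indexing returns a copy, not a view
picked[0] = 0
print(a[1])            # 5: the original is untouched

a[[1, 4, 5]] = 0       # assignment through the index does modify a
print(a)               # [ 1  0 -1 -1  0  0  3]
```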


In [29]:
a = np.array([[1,2,3],[3,4,5]])
for x in a:
    for y in x:
        print(y)

1
2
3
3
4
5


In [30]:
for i in a.flat:
    print("Item:", i)

Item: 1
Item: 2
Item: 3
Item: 3
Item: 4
Item: 5


### Stacking

In [31]:
q1 = np.full((3,4), 1.0)
q2 = np.full((4,4), 2.0)
q3 = np.full((3,4), 3.0)
print('q1')
print(q1)
print()
print('q2')
print(q2)
print()
print('q3')
print(q3)
# Stack vertically
q4 = np.vstack((q1, q2, q3))
print()
print('q4')
print(q4)
print()
# Stack horizontally
q5 = np.hstack((q1, q3))
print('q5')
print(q5)
q6 = np.concatenate((q1, q2, q3), axis=0)  # Equivalent to vstack
print()
print('q6')
print(q6)


q1
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]

q2
[[2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]]

q3
[[3. 3. 3. 3.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]]

q4
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]]

q5
[[1. 1. 1. 1. 3. 3. 3. 3.]
 [1. 1. 1. 1. 3. 3. 3. 3.]
 [1. 1. 1. 1. 3. 3. 3. 3.]]

q6
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]]
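
For completeness, here is a sketch of the horizontal counterpart (`concatenate` with `axis=1`) and of `np.stack`, which joins arrays along a new axis instead of an existing one. The names `side_by_side` and `stacked` are illustrative.

```python
import numpy as np

q1 = np.full((3, 4), 1.0)
q3 = np.full((3, 4), 3.0)

# For 2D arrays, concatenating along axis 1 gives the same result as hstack
side_by_side = np.concatenate((q1, q3), axis=1)
print(np.array_equal(side_by_side, np.hstack((q1, q3))))   # True

# stack requires identical shapes and adds a new leading axis
stacked = np.stack((q1, q3))
print(stacked.shape)   # (2, 3, 4)
```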


### Splitting

In [32]:
print('Before splitting')
print('q6',q6)
print()
q7 = np.vsplit(q6, 2)
print('After splitting')
print('q7',q7)

Before splitting
q6 [[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [2. 2. 2. 2.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]
 [3. 3. 3. 3.]]

After splitting
q7 [array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.]]), array([[2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [3., 3., 3., 3.],
       [3., 3., 3., 3.]])]
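
Passing an integer to `vsplit` asks for equal parts. To recover the three original blocks, one option is to pass a list of row indices instead, as in this sketch (the array is rebuilt so the snippet is self-contained, and `top`, `middle`, `bottom`, `uneven` are illustrative names).

```python
import numpy as np

q6 = np.vstack((np.full((3, 4), 1.0), np.full((4, 4), 2.0), np.full((3, 4), 3.0)))

# Split before row 3 and before row 7
top, middle, bottom = np.vsplit(q6, [3, 7])
print(top.shape, middle.shape, bottom.shape)   # (3, 4) (4, 4) (3, 4)

# array_split tolerates a count that does not divide the axis length evenly
uneven = np.array_split(q6, 3)
print([part.shape[0] for part in uneven])      # [4, 3, 3]
```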


#### Multiplying

In [33]:
x = np.full((2,3),np.pi)
print(x.shape)
y = np.full((3,2),np.pi)

(2, 3)


In [34]:
z = x.dot(y)
print('Dot',z)
print(z.shape)

Dot [[29.6088132 29.6088132]
 [29.6088132 29.6088132]]
(2, 2)
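
On Python 3.5 and later, the `@` operator is an alternative spelling of the same matrix product. This is a sketch under that version assumption; the `__future__` import at the top of this tutorial suggests it may also target Python 2, where `@` is not available.

```python
import numpy as np

x = np.full((2, 3), np.pi)
y = np.full((3, 2), np.pi)

z = x @ y                            # matrix product, same as x.dot(y)
print(np.allclose(z, x.dot(y)))      # True
print(z.shape)                       # (2, 2)
```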


In [35]:
print('Transpose',x.T)

Transpose [[3.14159265 3.14159265]
 [3.14159265 3.14159265]
 [3.14159265 3.14159265]]


In [36]:
import numpy.linalg as linalg
x = np.array([[1,2,3],[5,7,11],[21,29,31]])
xx = linalg.inv(x)
print('inv', xx)
yy = linalg.pinv(x)
print('pinv', yy)
ww = linalg.det(x) 
print('det', ww)
U, S_diag, V = linalg.svd(x)  # note: the third return value is V transposed (often named Vh)
print('U', U)
print('S_diag', S_diag)
print('V', V)

inv [[-2.31818182  0.56818182  0.02272727]
 [ 1.72727273 -0.72727273  0.09090909]
 [-0.04545455  0.29545455 -0.06818182]]
pinv [[-2.31818182  0.56818182  0.02272727]
 [ 1.72727273 -0.72727273  0.09090909]
 [-0.04545455  0.29545455 -0.06818182]]
det 43.99999999999997
U [[-0.07372121 -0.29088011 -0.95391506]
 [-0.27800655 -0.91260667  0.29976894]
 [-0.95774607  0.28729396 -0.01358811]]
S_diag [49.44315706  2.69540635  0.33015831]
V [[-0.43638843 -0.60409015 -0.66681349]
 [ 0.43750723  0.50512534 -0.74393267]
 [ 0.78622679 -0.61637933  0.04386296]]
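
A quick way to sanity-check these results is to multiply them back together. The sketch below verifies the inverse and the SVD factors up to floating-point tolerance, and shows `linalg.solve` as a way to solve a linear system without forming the inverse. The right-hand side `b` is an assumption chosen so that the exact solution is `[1, 1, 1]`.

```python
import numpy as np
import numpy.linalg as linalg

x = np.array([[1, 2, 3], [5, 7, 11], [21, 29, 31]])

# x times its inverse should be the identity, up to rounding error
print(np.allclose(x.dot(linalg.inv(x)), np.eye(3)))   # True

# svd returns U, the singular values, and V transposed
U, S_diag, Vh = linalg.svd(x)
reconstructed = U.dot(np.diag(S_diag)).dot(Vh)
print(np.allclose(reconstructed, x))                  # True

# Solve x @ solution = b directly
b = np.array([6, 23, 81])      # equals x.dot([1, 1, 1])
solution = linalg.solve(x, b)
print(np.allclose(solution, [1, 1, 1]))               # True
```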
