# NumPy

NumPy is a library for working with multidimensional arrays. It also provides many useful functions for statistics and linear algebra. Its core is implemented in C, so it is fast and memory efficient.

In [1]:
import numpy as np

In [2]:
a = np.array([0, 1, 2, 3])
a

array([0, 1, 2, 3])

# Speed

Because NumPy operations run in compiled C code, they can be orders of magnitude faster than the equivalent pure-Python loop.

In [3]:
L = range(1000)
%timeit [i**2 for i in L]

189 µs ± 5.24 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


In [4]:
a = np.arange(1000)
%timeit a**2

848 ns ± 12.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
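
Outside IPython, where `%timeit` is not available, a similar comparison can be sketched with the standard-library `timeit` module. The repeat count of 1000 is an arbitrary choice, and the absolute timings will vary from machine to machine.

```python
import timeit

import numpy as np

L = range(1000)
a = np.arange(1000)

# Time 1000 runs of each approach (number=1000 is an arbitrary choice)
t_list = timeit.timeit(lambda: [i**2 for i in L], number=1000)
t_numpy = timeit.timeit(lambda: a**2, number=1000)

# Convert total seconds to microseconds per run
print(f"list comprehension: {t_list / 1000 * 1e6:.2f} µs per loop")
print(f"numpy:              {t_numpy / 1000 * 1e6:.2f} µs per loop")

# Both approaches compute the same values
same = [i**2 for i in L] == (a**2).tolist()
print(same)
```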


# Creating Arrays

Arrays can easily be created from Python lists.

In [69]:
a = np.array([0, 1, 2, 3])
print('The array', a)
print('Number of dims ', a.ndim)
print('Array shape', a.shape)

The array [0 1 2 3]
Number of dims  1
Array shape (4,)


In [70]:
b = np.array([[0, 1, 2], [3, 4, 5]])    # 2 x 3 array
print('The array \n',b)
print('Number of dims' ,b.ndim)
print('Array shape', b.shape)

The array 
 [[0 1 2]
 [3 4 5]]
Number of dims 2
Array shape (2, 3)


# Functions for creating arrays

NumPy also provides many useful functions for creating arrays.

In [20]:
np.arange(0,10,2) # start, stop (exclusive), step

array([0, 2, 4, 6, 8])

In [18]:
np.linspace(0,1,5) # start, stop (inclusive), number of points

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

In [21]:
np.ones((5,5))

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [23]:
np.random.rand(5,5)

array([[0.54784054, 0.71362219, 0.99205276, 0.89034868, 0.36573188],
       [0.63582348, 0.74918637, 0.78885567, 0.18737997, 0.8771148 ],
       [0.03046825, 0.23277947, 0.02182427, 0.60268252, 0.5714999 ],
       [0.19813913, 0.16644754, 0.07920804, 0.44633629, 0.07952829],
       [0.89596503, 0.9182767 , 0.3465949 , 0.60332155, 0.35951793]])
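
A few more constructors from the same family, as a small sketch. `np.zeros`, `np.eye`, `np.full` and `np.random.default_rng` are standard NumPy functions that are not used in the cells above, and the seed value 0 is an arbitrary choice made here so that the random values are reproducible.

```python
import numpy as np

z = np.zeros((2, 3))      # 2 x 3 array of zeros
i = np.eye(3)             # 3 x 3 identity matrix
f = np.full((2, 2), 7)    # 2 x 2 array filled with the value 7

# Two generators created with the same seed produce the same values
r1 = np.random.default_rng(0).random((2, 2))
r2 = np.random.default_rng(0).random((2, 2))

print(z.shape, i.shape, f.shape)
print(np.array_equal(r1, r2))
```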

# Built-in functions

Array objects have many useful methods built in.

In [89]:
a = np.array([1,2,3,4,5])
print(a.mean())
print(a.min())
print(a.max())

3.0
1
5


We can also compute these along a specific axis.

In [112]:
a = np.arange(1,13).reshape(3,4)
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [113]:
a.mean(axis=0) # mean of each column (the rows are collapsed)

array([5., 6., 7., 8.])

In [114]:
a.mean(axis=1) # mean of each row (the columns are collapsed)

array([ 2.5,  6.5, 10.5])
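
One way to remember what `axis` does is that the axis you name is the one that disappears from the shape. The sketch below checks this on the same 3 x 4 array. `keepdims=True` is a standard argument of `mean` that is not used in the cells above.

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)

col_means = a.mean(axis=0)   # axis 0 is collapsed: shape (3, 4) -> (4,)
row_means = a.mean(axis=1)   # axis 1 is collapsed: shape (3, 4) -> (3,)

# keepdims=True keeps the reduced axis with length 1: shape (3, 1)
row_means_2d = a.mean(axis=1, keepdims=True)

print(col_means.shape, row_means.shape, row_means_2d.shape)
```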

# Data Types



In plain Python we rarely have to think about numeric data types, but in NumPy they matter. Controlling which data type an array uses lets us write faster and more memory-efficient code.


In [97]:
%%html
<img src="http://images.slideplayer.com/24/7556788/slides/slide_13.jpg" align="middle"  width="800"> 

In [104]:
a = np.random.rand(1,4)
a

array([[0.39827891, 0.93498803, 0.83439139, 0.34571397]])

In [105]:
print('Dtype: ', a.dtype)
print('Size of datatype in bytes: ', a.itemsize)
print('Size in bytes: ', a.nbytes)

Dtype:  float64
Size of datatype in bytes:  8
Size in bytes:  32


In [107]:
a = a.astype('float32')
print('Dtype: ', a.dtype)
print('Size of datatype in bytes: ', a.itemsize)
print('Size in bytes: ', a.nbytes)

Dtype:  float32
Size of datatype in bytes:  4
Size in bytes:  16
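
The smaller type is not free: it trades precision for memory. The sketch below shows both sides of that trade. The array length and the test value `2**24 + 1` are choices made here for illustration.

```python
import numpy as np

x64 = np.ones(1_000_000, dtype=np.float64)
x32 = x64.astype(np.float32)

# float32 needs half the memory of float64 for the same number of elements
print(x64.nbytes, x32.nbytes)

# 2**24 + 1 is exactly representable in float64 but not in float32,
# so the conversion rounds it to the nearest representable value
big = np.array([2**24 + 1], dtype=np.float64)
rounded = big.astype(np.float32)
print(big[0], rounded[0])
```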


# Indexing

Indexing should feel familiar from Python string indexing, since it obeys many of the same rules.

In [35]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [41]:
a[-1] # last element

9

In [40]:
a[1:10:3] # start, stop, step

array([1, 4, 7])

Indexing higher-dimensional arrays follows the same rules, with each dimension separated by a comma.

In [115]:
a = np.arange(1,13).reshape(3,4)
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [116]:
a[1,1:3] #  row, column

array([6, 7])
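
A few more combinations of row and column slices on the same array, as a quick sketch. The variable names are made up for this example.

```python
import numpy as np

a = np.arange(1, 13).reshape(3, 4)

first_col = a[:, 0]      # every row, first column
last_row = a[-1]         # last row; omitted trailing dimensions mean "all"
corners = a[::2, ::3]    # rows 0 and 2, columns 0 and 3

print(first_col)
print(last_row)
print(corners)
```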

In [42]:
%%html
<img src="http://www.scipy-lectures.org/_images/numpy_indexing.png" align="middle"  width="600"> 


Just as in pandas, we can index with boolean arrays.

In [54]:
a = np.arange(50)
a

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49])

In [58]:
a[a % 3 == 0]

array([ 0,  3,  6,  9, 12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42, 45, 48])
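
Boolean masks can also be combined and used for assignment. The sketch below assumes the same `np.arange(50)` array. The thresholds are arbitrary values picked for the example.

```python
import numpy as np

a = np.arange(50)

# Combine conditions with & (and) or | (or); the parentheses are required
mask = (a % 3 == 0) & (a > 40)
selected = a[mask]

# A mask can also be used to assign values in place
b = a.copy()
b[b % 2 == 1] = 0   # set every odd value to zero

print(selected)
print(b[:6])
```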

# Views 

Slicing creates a view of the original array rather than a copy, so modifying a slice also modifies the original. This can be confusing at first, but it saves memory and time.

In [47]:
a = np.ones(10)
a

array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

In [49]:
b = a[1:4]
b

array([1., 1., 1.])

In [50]:
b[1] = 5

In [51]:
a

array([1., 1., 5., 1., 1., 1., 1., 1., 1., 1.])

We can check whether two arrays may share memory (for example, because one is a view of the other).

In [52]:
np.may_share_memory(a,b)

True
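
When an independent array is needed, `.copy()` makes one explicitly. Indexing with a list of integers also returns a copy rather than a view. The sketch below uses `np.shares_memory`, a stricter relative of `np.may_share_memory` that is not used in the cells above.

```python
import numpy as np

a = np.ones(10)

view = a[1:4]             # basic slice: a view onto a's data
copied = a[1:4].copy()    # explicit copy: owns its own data
fancy = a[[1, 2, 3]]      # indexing with a list of integers: also a copy

# Writing to the copies leaves the original untouched
copied[0] = 5
fancy[0] = 5
print(a)

print(np.shares_memory(a, view))
print(np.shares_memory(a, copied))
print(np.shares_memory(a, fancy))
```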

# Element Wise Ops

Unlike with a normal Python list, we don't need a for loop to apply an operation to every element of an array.

In [59]:
a = np.array([1,2,3,4])
a

array([1, 2, 3, 4])

In [60]:
a + 3

array([4, 5, 6, 7])

In [62]:
a * 10

array([10, 20, 30, 40])

In [64]:
X  = np.ones((5,5))
X

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [65]:
X + X

array([[2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2.],
       [2., 2., 2., 2., 2.]])
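
For comparison, here is the list version next to the array version. The sketch also shows that `*` between two arrays multiplies element by element, whereas the matrix product uses the `@` operator, which is not used in the cells above.

```python
import numpy as np

values = [1, 2, 3, 4]
a = np.array(values)

looped = [v + 3 for v in values]   # plain Python: explicit loop
vectorized = a + 3                 # NumPy: applied to every element at once

X = np.ones((2, 2))
elementwise = X * X       # element-by-element product
matrix_product = X @ X    # matrix product

print(looped, vectorized)
print(elementwise)
print(matrix_product)
```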

# Broadcasting

Broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. This is easiest to understand with a picture.

In [73]:
%%html
<img src="https://www.tutorialspoint.com/numpy/images/array.jpg" align="middle"  width="600"> 

In [109]:
a = np.arange(0,40,10).reshape(4,1)
a = np.tile(a,3)
a

array([[ 0,  0,  0],
       [10, 10, 10],
       [20, 20, 20],
       [30, 30, 30]])

In [110]:
b = np.array([0,1,2])
b

array([0, 1, 2])

In [111]:
a + b

array([[ 0,  1,  2],
       [10, 11, 12],
       [20, 21, 22],
       [30, 31, 32]])
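
The same result can be obtained without `np.tile`, because broadcasting stretches the length-1 axis as well. The sketch below adds a (4, 1) column to a (3,) row and then shows a pair of shapes that cannot be broadcast. `np.broadcast_shapes` is assumed to be available, which requires NumPy 1.20 or later.

```python
import numpy as np

col = np.arange(0, 40, 10).reshape(4, 1)   # shape (4, 1)
row = np.array([0, 1, 2])                  # shape (3,)

# Shapes are compared from the right: (4, 1) and (3,) broadcast to (4, 3)
result = col + row
print(result.shape)
print(np.broadcast_shapes(col.shape, row.shape))

# (4, 3) and (4,) do not line up from the right, so this raises ValueError
try:
    np.ones((4, 3)) + np.ones(4)
    failed = False
except ValueError:
    failed = True
print(failed)
```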

# Further Reading

[Numpy intro](http://www.scipy-lectures.org/intro/numpy/index.html)