### Introduction to Numpy

* What is this?
  * A Python library with tools for advanced mathematical calculations and very fast matrix operations

* Why should I care about this?
  * Without Numpy, you can use Python lists to do the same job, but there is a catch: arithmetic on a list needs an explicit loop or comprehension and runs slower
  * Numpy is used internally by most higher-level libraries to keep the math simple
  * Numpy is central to implementing Computer Vision and NLP algorithms, which are at the core of the AI revolution
  * Most machine learning algorithms need Numpy for their matrix/linear algebra implementation
* What does Numpy have to offer?
  * Capabilities
    * Provides tools to perform advanced mathematical operations
      * Linear Algebra
      * Vector Operations
      * Trigonometric Functions
      * Statistical Distributions
  * Properties
    * Very fast (C/C++ based implementation)
    * Homogeneous data
    * Contiguous memory allocation
    * Mutable (can be modified after being created)
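
The properties listed above can be checked directly on a small array. This is a minimal sketch, assuming NumPy is installed; the name `demo` is made up for this illustration only.

```python
import numpy as np

# Homogeneous: mixed int/float input is upcast to one common dtype
demo = np.array([1, 2.5, 3])
print(demo.dtype)                    # float64

# Mutable: an element can be changed in place after creation
demo[0] = 10
print(demo.tolist())                 # [10.0, 2.5, 3.0]

# Contiguous: a freshly created array sits in one C-ordered memory block
print(demo.flags['C_CONTIGUOUS'])    # True
```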

## What are we going to learn today?
* How to import?
* Create our first numpy array
* Demo of a mathematical operation on a python list vs a numpy array, to demonstrate how easy such operations are in Numpy
* Creating a 1D,2D,3D numpy array
* How to find an element in a numpy array (Indexing)
* Quick data generation in numpy
* Mathematical & Logical operations in numpy arrays
* Data access in numpy array / Slicing and Dicing in a numpy array
* What are the shape, size, number of dimensions, and axes of a numpy array?
* What is reshaping?
* Array concatenation in numpy
* Demo of performance differences of python list vs numpy array

In [2]:
# the first line of code to write: import the numpy library (np is the conventional alias)
import numpy as np

In [3]:
# let's create a python list of temperature values in Celsius
cel_values = [22.3,23.5,24.6,26.1,27.7,29.4]

In [4]:
# let's see what is the data type of this
type(cel_values)

list

In [5]:
# let's convert this to a numpy ndarray
np_cel = np.array(cel_values)

In [6]:
# let's see its data type
type(np_cel)

numpy.ndarray

In [7]:
# let's perform a conversion function on the python list created before
cal_fahr = [x*9/5 + 32 for x in cel_values ]

In [10]:
# let's see the result which is again a python list
cal_fahr

[72.14, 74.3, 76.28, 78.98, 81.86, 84.91999999999999]

In [13]:
# let's check the data type
type(cal_fahr)

list

In [11]:
# with a numpy array the same conversion needs no loop: the arithmetic is applied to every element
# this is possible because of the contiguous and homogeneous data storage of numpy ndarrays
cal_fahr1 = np_cel*9/5 + 32

In [12]:
cal_fahr1

array([72.14, 74.3 , 76.28, 78.98, 81.86, 84.92])

In [14]:
# let's check the data type again
type(cal_fahr1)

numpy.ndarray
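
The "catch" with plain lists can be seen in a short sketch like the one below. The names `cel_list` and `cel_array` are illustrative and not part of the cells above.

```python
import numpy as np

cel_list = [22.3, 23.5]
cel_array = np.array(cel_list)

# On a python list, * means repetition, not arithmetic
print(cel_list * 2)               # [22.3, 23.5, 22.3, 23.5]

# On a numpy array, * is applied to every element
print((cel_array * 2).tolist())   # [44.6, 47.0]
```

This is why the list version of the Celsius conversion needed a list comprehension while the numpy version did not.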

## Some common Numpy operations
* Creating numpy arrays - 1D
* Creating numpy arrays - more than one dimensions
* How to find the index of any element in an ND array (with examples of 1D, 2D, 3D arrays)
* Creating numpy arrays (similar to matrices)

In [22]:
# create a one dimensional numpy array
# each dimension of a numpy array is called an "axis"
arr1 = np.array([1,2,3,4])

In [23]:
arr1

array([1, 2, 3, 4])

In [44]:
# how to access a given value
# please note that indexing of numpy arrays starts at 0
print('The first 3 values of the array are :',arr1[0], arr1[1],arr1[2])

The first 3 values of the array are : 1 2 3


In [189]:
# create a two dimensional array
arr2 = np.array([[1,2],[3,4],[5,6]])

In [193]:
arr2

array([[1, 2],
       [3, 4],
       [5, 6]])

In [196]:
# how to access the values of a 2D numpy array
print('The 1st 3 values:',arr2[0][0],arr2[0][1],arr2[1][0])

# what is the error here?
# print('The 1st 3 values:',arr2[0][0],arr2[0][1],arr2[1][2])

The 1st 3 values: 1 2 3


In [40]:
# create a 3 dimensional array
arr3 = np.array([[[1,2,3],[4,5,6],[7,8,9]],[[10,11,12],[13,14,15],[16,17,18]]])

In [41]:
arr3

array([[[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9]],

       [[10, 11, 12],
        [13, 14, 15],
        [16, 17, 18]]])

In [60]:
# how to access the values of a 3D numpy array
arr3[0][0][0], arr3[0][0][1], arr3[0][1][2]

# Quick Check of Understanding
# write down the expression to fetch the value of 14, 17, 9


(1, 2, 6)
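
Chained indexing like `arr[i][j][k]` can also be written with a single pair of brackets. A small sketch on a hypothetical 2x2x2 array (`cube` is not one of the arrays defined above):

```python
import numpy as np

# Hypothetical 2x2x2 array holding the values 0..7, used only here
cube = np.arange(8).reshape(2, 2, 2)

# Chained indexing and comma-separated indexing pick the same element
print(cube[1][0][1])   # 5
print(cube[1, 0, 1])   # 5
```

The comma-separated form is the more common style in numpy code and does the lookup in one step.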

## Numpy functions for quick data generation
* Generating random numbers
* Creating a data range
* Creating matrices of zeros, ones quickly
* Creating matrices of desired dimension quickly

In [74]:
#create a 1d numpy array of 10 elements
# syntax : np.zeros(shape, dtype=float, order='C')
# shape means whether the output array is a 1d, 2d,....nd array and specifies the size ultimately
# if a 2d array then shape can be (1,1) or (2,3) or (3,5)
# if a 3d array then shape can be (1,2,3) or (4,5,4) or like a cube (4,4,4)
arr1 = np.zeros(10)
arr1

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [82]:
# did you notice the . next to every zero value above? this is because the values are stored as floats by default
# let's check that
type(arr1[3])
# define the dtype explicitly
arr1 = np.zeros(10,dtype=int)
arr1
type(arr1[4])


In [52]:
arr2 = np.zeros((2,4))
arr2

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [4]:
arr3 = np.zeros((3,3,3),dtype=int)
arr3

array([[[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]],

       [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]]])

In [34]:
# what we did with zeros can be done with ones too
arr4 = np.ones((3,4),dtype=int)
arr4

array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]])

In [38]:
#creating an identity matrix
arr5 = np.eye(3)
arr5

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [198]:
# how to generate an array filled with a value chosen by the user?
arr1 = np.full((3,3),4)
arr1

array([[4, 4, 4],
       [4, 4, 4],
       [4, 4, 4]])

In [5]:
arr2 = np.full((3,3),4.2,dtype=float)
arr2
# syntax options : np.full(shape, fill_value, dtype=None, order='C')

array([[4.2, 4.2, 4.2],
       [4.2, 4.2, 4.2],
       [4.2, 4.2, 4.2]])
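
When `dtype` is left out, `np.full` takes the type from the fill value. A short sketch under that assumption; the shapes and values here are arbitrary:

```python
import numpy as np

# dtype is inferred from the fill value when it is not given
print(np.full((2, 2), 4).dtype.kind)      # i  (integer)
print(np.full((2, 2), 4.2).dtype.kind)    # f  (float)

# an explicit integer dtype truncates a float fill value
print(np.full((2, 2), 4.2, dtype=int).tolist())   # [[4, 4], [4, 4]]
```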

## A few more important functions
* Creating an equally spaced out series of values
* Creating an equally spaced out series in between 2 values
* Creating randomly generated values of a given shape

In [206]:
arr1 = np.arange(10,100,5)
arr1
# the syntax(default) : np.arange(start,stop,step,dtype) -- default start = 0, step = 1

array([10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90,
       95])

In [53]:
arr2 = np.linspace(start=0,stop=10,num=25,retstep=False)
arr2
# the syntax(default) : np.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)

array([ 0.        ,  0.41666667,  0.83333333,  1.25      ,  1.66666667,
        2.08333333,  2.5       ,  2.91666667,  3.33333333,  3.75      ,
        4.16666667,  4.58333333,  5.        ,  5.41666667,  5.83333333,
        6.25      ,  6.66666667,  7.08333333,  7.5       ,  7.91666667,
        8.33333333,  8.75      ,  9.16666667,  9.58333333, 10.        ])
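
The difference between the two functions is easiest to see side by side. A minimal sketch with arbitrary values; `points` and `step` are illustrative names:

```python
import numpy as np

# arange: fixed step, the stop value is excluded
print(np.arange(0, 10, 2.5).tolist())    # [0.0, 2.5, 5.0, 7.5]

# linspace: fixed number of points, the stop value is included by default
points, step = np.linspace(0, 10, num=5, retstep=True)
print(points.tolist())                   # [0.0, 2.5, 5.0, 7.5, 10.0]
print(step)                              # 2.5
```

With `retstep=True`, `np.linspace` returns the spacing it computed along with the array.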

In [7]:
# let's create a sequence of random numbers between 0 and 1
arr3 = np.random.rand(10)
arr3

array([0.24753664, 0.01827422, 0.51974085, 0.90100773, 0.76402263,
       0.4501163 , 0.0093397 , 0.84169792, 0.66585071, 0.003418  ])

In [8]:
# let's create random numbers in an array of a given shape
arr3 = np.random.rand(3,3)
arr3

array([[0.28339429, 0.27895369, 0.87238637],
       [0.44571937, 0.65836003, 0.27915948],
       [0.41868629, 0.28164999, 0.9059725 ]])

In [9]:
# random number sequence from normal distribution
arr4 = np.random.randn(3,3)
arr4

array([[ 0.5992737 ,  0.08314971,  0.39315471],
       [-0.56111811,  0.11757904,  1.34930414],
       [ 1.67050535, -1.39319112, -0.48922618]])

In [59]:
arr4 = np.random.normal(0,1,(3,4))
arr4

array([[ 0.4476231 ,  0.71561167,  1.71879612,  0.58489125],
       [-0.71196646, -1.27386349,  0.35268059, -0.47780327],
       [-0.80096629, -0.80793971,  1.18084751,  1.19748586]])

In [60]:
# random integers from 1 to 9 (the upper limit 10 is excluded) in an array of shape (2,3)
arr5 = np.random.randint(1,10,(2,3))
arr5

array([[5, 3, 7],
       [7, 1, 1]])

In [11]:
arr6 = np.random.random_sample(5)
arr6

array([0.25905583, 0.91591015, 0.17378153, 0.06462677, 0.83138494])
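
Random outputs change on every run, so the values shown above will not match yours. If repeatable numbers are needed, one option is to set a seed first. A sketch, assuming the legacy `np.random.seed` interface used by the functions above; the seed value 42 is arbitrary:

```python
import numpy as np

np.random.seed(42)             # 42 is an arbitrary choice
first = np.random.rand(3)

np.random.seed(42)             # resetting the seed replays the same sequence
second = np.random.rand(3)

print(np.array_equal(first, second))   # True
```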

## Some mathematical & logical functions on Numpy
* Addition, Subtraction, Multiplication, Exponential
* Logical & Comparison
  * And/Or
  * Matrix And/Or

In [87]:
arr1 = np.arange(1,50,10)
arr1

array([ 1, 11, 21, 31, 41])

In [88]:
arr1 + 1

array([ 2, 12, 22, 32, 42])

In [93]:
arr2 = np.arange(1,10,3)
arr2

array([1, 4, 7])

In [94]:
arr3 = 2**arr2
arr3

array([  2,  16, 128], dtype=int32)

In [96]:
arr3 - arr2

array([  1,  12, 121])

In [97]:
arr1 = np.random.randint(1,5,(3,3))
arr1

array([[2, 3, 2],
       [4, 4, 4],
       [4, 2, 4]])

In [98]:
arr2= np.random.randint(5,10,(3,3))
arr2

array([[5, 5, 6],
       [6, 7, 5],
       [5, 8, 8]])

In [99]:
arr3 = arr1 + arr2
arr3

array([[ 7,  8,  8],
       [10, 11,  9],
       [ 9, 10, 12]])

In [101]:
arr4 = arr1 * arr2
arr4

array([[10, 15, 12],
       [24, 28, 20],
       [20, 16, 32]])

In [102]:
arr5 = arr1.dot(arr2)
arr5

array([[38, 47, 43],
       [64, 80, 76],
       [52, 66, 66]])
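
Note that `*` and `.dot()` give different results: the first multiplies element by element, the second is a matrix product. A small sketch on two hypothetical 2x2 arrays (`a` and `b` are not defined in the cells above):

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print((a * b).tolist())     # [[5, 12], [21, 32]]   element-wise product
print(a.dot(b).tolist())    # [[19, 22], [43, 50]]  matrix product
print((a @ b).tolist())     # [[19, 22], [43, 50]]  @ is the operator form of dot for 2D arrays
```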

In [104]:
arr1 = np.array([1,2,3,4])
arr1

array([1, 2, 3, 4])

In [105]:
arr2 = np.array([2,2,3,5])
arr2

array([2, 2, 3, 5])

In [106]:
#element wise comparison
arr1 == arr2

array([False,  True,  True, False])

In [109]:
arr1 < arr2

array([ True, False, False,  True])

In [110]:
# full array comparison
np.array_equal(arr1,arr2)

False

In [111]:
#logical or
arr1 = np.array([1,1,1,0])
arr2 = np.array([1,0,1,0])
arr3 = np.logical_or(arr1,arr2)
arr3

array([ True,  True,  True, False])

In [112]:
#logical and
arr1 = np.array([1,1,1,0])
arr2 = np.array([1,0,1,0])
arr3 = np.logical_and(arr1,arr2)
arr3

array([ True, False,  True, False])
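
Comparisons and logical operations are often combined. The sketch below uses the `&` operator, which behaves like `np.logical_and` on boolean arrays; `values` and `in_range` are illustrative names.

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5, 6])

# Each comparison yields a boolean array; & combines them element-wise
in_range = (values > 2) & (values < 6)
print(in_range.tolist())           # [False, False, True, True, True, False]

# Same result as the function form
print(np.array_equal(in_range, np.logical_and(values > 2, values < 6)))   # True

# A boolean array can also be used to pick the matching elements
print(values[in_range].tolist())   # [3, 4, 5]
```

The parentheses around each comparison are needed because `&` binds more tightly than `>` and `<`.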

In [116]:
# few transcendental functions
arr1 = np.arange(5)
np.sin(arr1)
np.cos(arr1)
np.exp(arr1)

array([ 1.        ,  2.71828183,  7.3890561 , 20.08553692, 54.59815003])

## Quick access to numpy array elements

In [41]:
# let's create a 5x5 numpy array of random values (1 or 2)
arr1 = np.random.randint(1,3,(5,5))
arr1

array([[2, 1, 1, 1, 1],
       [2, 1, 1, 2, 2],
       [2, 2, 2, 1, 2],
       [1, 1, 1, 2, 1],
       [2, 2, 1, 2, 2]])

In [42]:
# access everything
arr1[0:]


array([[2, 1, 1, 1, 1],
       [2, 1, 1, 2, 2],
       [2, 2, 2, 1, 2],
       [1, 1, 1, 2, 1],
       [2, 2, 1, 2, 2]])

In [44]:
# access from 3rd row
arr1[2:]

array([[2, 2, 2, 1, 2],
       [1, 1, 1, 2, 1],
       [2, 2, 1, 2, 2]])

In [45]:
# access from 3rd row and beyond but only alternate rows
arr1[2::2]

array([[2, 2, 2, 1, 2],
       [2, 2, 1, 2, 2]])

In [16]:
# access the rows before the 3rd row (index 2) - the ending index is excluded
arr1[:2]

array([[1, 2, 1, 2, 2],
       [1, 1, 1, 1, 2]])

In [17]:
# access from 2nd to 4th row (indices 1, 2, 3) - the ending index 4 is excluded
arr1[1:4]

array([[1, 1, 1, 1, 2],
       [1, 1, 2, 1, 2],
       [1, 1, 1, 1, 2]])

In [18]:
# what will happen if I use negative values in the index?
arr1[-3:-1]

array([[1, 1, 2, 1, 2],
       [1, 1, 1, 1, 2]])

In [48]:
# print the output of arr1[:-2], arr1[-3:]
# column access
arr1[:,1::2]

array([[1, 1],
       [1, 2],
       [2, 1],
       [1, 2],
       [2, 2]])

In [21]:
arr1[:,:2]

array([[1, 2],
       [1, 1],
       [1, 1],
       [1, 1],
       [2, 2]])

In [22]:
arr1[:,-2:]

array([[2, 2],
       [1, 2],
       [1, 2],
       [1, 2],
       [2, 1]])

In [23]:
arr1[:,:-3]

array([[1, 2],
       [1, 1],
       [1, 1],
       [1, 1],
       [2, 2]])

In [24]:
arr1[:,2:3]

array([[1],
       [1],
       [2],
       [1],
       [1]])

In [None]:
# both row and column access
arr1[1:3,1:3]
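
Slicing is easier to follow on an array whose values are predictable. A sketch using a hypothetical 5x5 array filled with 0..24 (`grid` is not the random array above):

```python
import numpy as np

grid = np.arange(25).reshape(5, 5)   # row 0 holds 0..4, row 1 holds 5..9, ...

# rows 1-2 and columns 1-2
print(grid[1:3, 1:3].tolist())    # [[6, 7], [11, 12]]

# every second row starting at row 2, first column only
print(grid[2::2, 0].tolist())     # [10, 20]

# negative indices count from the end: rows 2-3, last column
print(grid[-3:-1, -1].tolist())   # [14, 19]
```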

## Understanding shapes and reshaping in Numpy
* Showing why numpy is called mutable

In [25]:
# let's create a 3X4 numpy array of random integers from 5 to 7 (the upper limit 8 is excluded)
arr1 = np.random.randint(5,8,(3,4))
#print
arr1

array([[7, 7, 7, 6],
       [7, 6, 6, 6],
       [5, 5, 5, 7]])

In [26]:
#print the shape
arr1.shape

(3, 4)

In [27]:
#print the size
arr1.size

12

In [28]:
#print the no. of axes/dimensions involved
arr1.ndim

2

In [29]:
arr2 = arr1.reshape(2,6)
arr2

array([[7, 7, 7, 6, 7, 6],
       [6, 6, 5, 5, 5, 7]])

In [30]:
#arr2.ndim
arr3 = arr1.reshape(12)
arr3

array([7, 7, 7, 6, 7, 6, 6, 6, 5, 5, 5, 7])

In [31]:
#arr3.ndim

arr4 = np.arange(1,100,10).reshape(-1,1)
arr4

# does this expression have any error?
# arr5 = np.arange(1,100,10).reshape(3,4)

array([[ 1],
       [11],
       [21],
       [31],
       [41],
       [51],
       [61],
       [71],
       [81],
       [91]])
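
Two points about reshaping can be sketched briefly: `-1` asks numpy to work out that dimension, and the total number of elements must stay the same. The example also shows mutability, since for a contiguous array like this one the reshaped result shares its data with the original. The names `base` and `table` are illustrative.

```python
import numpy as np

base = np.arange(12)
table = base.reshape(3, -1)      # -1 lets numpy infer 4 columns
print(table.shape)               # (3, 4)

# here reshape returns a view, so a change through it shows up in base too
table[0, 0] = 99
print(base[0])                   # 99

# 10 elements cannot fill a 3x4 grid, so reshape raises a ValueError
try:
    np.arange(10).reshape(3, 4)
except ValueError as err:
    print('reshape failed:', err)
```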

## Understanding array concatenation in Numpy

In [76]:
arr1 = np.random.randint(1,10,(2,5))
arr1

array([[8, 7, 1, 7, 3],
       [3, 9, 2, 8, 4]])

In [79]:
arr2 = np.random.randint(20,30,(2,5))
arr2

array([[22, 25, 26, 28, 27],
       [22, 25, 22, 24, 25]])

In [80]:
arr3 = np.concatenate([arr1,arr2])
arr3

array([[ 8,  7,  1,  7,  3],
       [ 3,  9,  2,  8,  4],
       [22, 25, 26, 28, 27],
       [22, 25, 22, 24, 25]])

In [84]:
arr3 = np.concatenate([arr1,arr2],axis=1)
arr3

array([[ 8,  7,  1,  7,  3, 22, 25, 26, 28, 27],
       [ 3,  9,  2,  8,  4, 22, 25, 22, 24, 25]])
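
The effect of the `axis` argument is easiest to read off the resulting shapes. A small sketch with two hypothetical 2x5 arrays (`top` and `bottom` are illustrative names):

```python
import numpy as np

top = np.zeros((2, 5), dtype=int)
bottom = np.ones((2, 5), dtype=int)

# default axis=0 stacks rows: 2 + 2 rows, still 5 columns
print(np.concatenate([top, bottom]).shape)           # (4, 5)

# axis=1 joins side by side: still 2 rows, 5 + 5 columns
print(np.concatenate([top, bottom], axis=1).shape)   # (2, 10)
```

All dimensions other than the one being joined must match.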

## Speed test Demo - Numpy
* The most important benefits of using it are :
 * It consumes less memory.
 * It is fast as compared to the python List.
 * It is convenient to use.

In [33]:
import time
# let's define the size of the list/numpy array
size = 100000
# let's create a python list of size 100000
py_list = list(range(size))
start = time.time()
new_py_list = [x*9/5 + 32 for x in py_list]
print('The time taken for the above operation to execute (in msec) : ', (time.time()-start)*1000)

The time taken for the above operation to execute (in msec) :  17.953872680664062


In [34]:
import time
# let's define the size of the list/numpy array
size = 100000
# let's create a numpy array of size 100000
py_np = np.arange(size)
start = time.time()
new_py_np = py_np*9/5+32
print('The time taken for the above operation to execute (in msec) : ', (time.time()-start)*1000)

The time taken for the above operation to execute (in msec) :  5.981922149658203


In [61]:
# The above two chunks of code show that numpy processes this operation faster than a python list
# In this run, the numpy operation is about 3 times faster; the exact ratio varies with the machine and the array size
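
A single `time.time()` measurement can be noisy. One possible way to get steadier numbers is to average several runs with the standard library's `timeit` module. This is only a sketch: the helper functions `convert_list` and `convert_array` and the run count of 20 are assumptions made for this example, and no timings are shown because they depend on the machine.

```python
import timeit
import numpy as np

size = 100_000
py_list = list(range(size))
np_array = np.arange(size)

def convert_list():
    # Celsius-to-Fahrenheit style arithmetic, one element at a time
    return [x * 9 / 5 + 32 for x in py_list]

def convert_array():
    # same arithmetic applied to the whole array at once
    return np_array * 9 / 5 + 32

# average over several runs and convert seconds to milliseconds
runs = 20
list_ms = timeit.timeit(convert_list, number=runs) / runs * 1000
array_ms = timeit.timeit(convert_array, number=runs) / runs * 1000
print(f'list: {list_ms:.2f} ms, numpy array: {array_ms:.2f} ms')
```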

## Numpy Learning References
* https://docs.scipy.org/doc/numpy/reference/

## What to expect in the next Class?
* More into splitting, stacking functions
  * split, hsplit, vsplit
  * hstack, vstack
* Broadcasting
* Understanding vectors
* Utility Functions
* Common Numpy Utilities