# Introduction to NumPy

#### Mentor: Dr. Prashant Singh Rana <br><br> Author: Aaryaman Singla  <br><br> Topic: Introduction to NumPy

### What is NumPy?

NumPy (short for Numerical Python) is a Python library built around a multidimensional array object, the `ndarray`, together with several derived objects. It provides fast operations on arrays, including mathematical and logical operations, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, and random simulation.

### Why study NumPy if we have Python lists?

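NumPy arrays have several advantages over Python lists:<br><br> 
- Size or Memory - A NumPy array stores its elements in one contiguous block with a single datatype, so it usually takes up less space than a list holding the same values.<br>
- Performance - Operations on whole NumPy arrays run in compiled code and are usually much faster than looping over a Python list.<br>
- Functionality - NumPy has optimized functions, such as linear algebra operations, built in.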
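
The memory and functionality points above can be checked with a small sketch. The names `py_list` and `np_array` and the size `n` are chosen here only for illustration, and the exact byte counts depend on the Python version and platform.

```python
import sys
import numpy as np

n = 100_000
py_list = list(range(n))
np_array = np.arange(n)

# Memory: a list stores pointers to separate int objects,
# while an ndarray stores the raw values in one contiguous buffer.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
array_bytes = np_array.nbytes

print("list bytes :", list_bytes)
print("array bytes:", array_bytes)

# Functionality: one vectorized expression replaces an explicit loop.
doubled_list = [x * 2 for x in py_list]
doubled_array = np_array * 2
```

Speed can be compared in the same spirit with the standard `timeit` module. Timings vary from machine to machine, so none are shown here.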

### What will we cover in this tutorial?

- Installation of NumPy
- Some Important and Useful Functions
- Creating Arrays
- Manipulating and comparing Arrays
- Reshaping and Transposing
- Sorting Arrays
- Turning an Image into a NumPy Array

## Let's get started :)

### 0. Installation of NumPy

To install NumPy with `pip`, Python's package manager, run: `pip install numpy` <br>
To install NumPy with the `Conda` package manager, run: `conda install numpy`

Once the package is installed, import it:

In [1]:
# To use the functions of a library, we need to import it
# np is the conventional short alias for numpy

import numpy as np

### 1. Some Important and Useful Functions

#### DataType and Attributes
NumPy's main datatype is ndarray (stands for N-Dimensional Array)

In [2]:
# Call the "array" function from numpy
# and pass it a list of numbers

A1 = np.array([1,2,3,4])
A1

array([1, 2, 3, 4])

In [3]:
# Gives the type of the object A1 (not the datatype of its elements)

type(A1)

numpy.ndarray

In [4]:
# Gives the shape of A1

A1.shape

(4,)

In [5]:
A2 = np.array([[2,4,6],
             [1,3,5]])
A2

array([[2, 4, 6],
       [1, 3, 5]])

In [6]:
type(A2)

numpy.ndarray

In [7]:
A2.shape

(2, 3)

In [8]:
A3 = np.array([[[1,2,3,4],
               [5,6,7,8]],
              [[9,10,11,12],
              [13,14,15,16]],
              [[17,18,19,20],
              [21,22,23,24.0]]])
A3

array([[[ 1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.]],

       [[ 9., 10., 11., 12.],
        [13., 14., 15., 16.]],

       [[17., 18., 19., 20.],
        [21., 22., 23., 24.]]])

In [9]:
type(A3)

numpy.ndarray

In [10]:
A3.shape

(3, 2, 4)

A 1-dimensional array (like a simple list) has rank 1<br>
A 2-dimensional array (sometimes called a matrix) has rank 2<br>
A 3-dimensional array has rank 3. It is shown here as a stack of matrices<br><br>
It gets difficult to draw arrays with more than 3 dimensions, but NumPy allows many more dimensions than can be drawn.

The rank 1 array above has 4 elements, so its shape is the tuple (4,).<br>
The rank 2 array has 2 rows and 3 columns, so its shape is (2, 3). By convention, we take the matrix shape to be rows first, then columns.<br>
The rank 3 array has 3 planes, each containing 2 rows and 4 columns, so its shape is (3, 2, 4). By convention, we take the shape to be planes first, then rows, then columns.<br>

<img src="Anatomy.png" width = 80%/>
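
The planes, rows, columns convention can be checked by indexing. This is a minimal sketch: it rebuilds an array with the same values as `A3` using `arange` and `reshape` (reshaping is covered in section 4), and the index names `plane`, `row` and `col` are chosen here only for illustration.

```python
import numpy as np

# Same values as A3 above: 3 planes, each with 2 rows and 4 columns
A3 = np.arange(1, 25, dtype=float).reshape(3, 2, 4)

plane, row, col = 1, 0, 2

print(A3.shape)              # (3, 2, 4)
print(A3[plane].shape)       # one plane is a 2-D array: (2, 4)
print(A3[plane, row])        # one row of that plane: [ 9. 10. 11. 12.]
print(A3[plane, row, col])   # a single element: 11.0
```

Indices are given in the same order as the shape: plane first, then row, then column.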

In [11]:
# "ndim" attribute gives the number of dimensions of the array

print("Dimension of the array A1 is:", A1.ndim)
print("Dimension of the array A2 is:", A2.ndim)
print("Dimension of the array A3 is:", A3.ndim)

Dimension of the array A1 is: 1
Dimension of the array A2 is: 2
Dimension of the array A3 is: 3


In [12]:
# "size" attribute gives the total number of elements in the array

print("Size of the array A1 is:", A1.size)
print("Size of the array A2 is:", A2.size)
print("Size of the array A3 is:", A3.size)

Size of the array A1 is: 4
Size of the array A2 is: 6
Size of the array A3 is: 24


In [13]:
# "dtype" attribute gives the datatype of the elements of the array

print("Datatype of elements of the array A1 is:", A1.dtype)
print("Datatype of elements of the array A2 is:", A2.dtype)
print("Datatype of elements of the array A3 is:", A3.dtype)

Datatype of elements of the array A1 is: int32
Datatype of elements of the array A2 is: int32
Datatype of elements of the array A3 is: float64
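
`A3` reports `float64` because one of its elements was written as `24.0`: an array holds a single datatype, so the integers are upcast to floats. The sketch below illustrates this and shows how a datatype can be requested explicitly. The variable names are illustrative, and the default integer type (`int32` above) depends on the platform and NumPy version, so it may show as `int64` elsewhere.

```python
import numpy as np

ints = np.array([1, 2, 3])                       # all integers -> an integer dtype
mixed = np.array([1, 2, 3.0])                    # one float upcasts every element
forced = np.array([1, 2, 3], dtype=np.float64)   # request a dtype explicitly

print(ints.dtype, mixed.dtype, forced.dtype)

# astype returns a converted copy; the original array is unchanged
as_int = forced.astype(np.int64)
print(as_int.dtype, forced.dtype)
```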


### 2. Creating Arrays

In [14]:
Array = np.array([1,2,3])
Array

array([1, 2, 3])

#### 2.1 Ones and Zeros

In [15]:
# "ones" returns a new array of given shape and type FILLED WITH ONES

OnesArray = np.ones((2,3))
OnesArray

array([[1., 1., 1.],
       [1., 1., 1.]])

In [16]:
# default type is float64

OnesArray.dtype

dtype('float64')

In [17]:
# "zeros" returns a new array of given shape and type FILLED WITH ZEROS

ZerosArray = np.zeros((2,3))
ZerosArray

array([[0., 0., 0.],
       [0., 0., 0.]])

In [18]:
# default type is float64

ZerosArray.dtype

dtype('float64')

In [19]:
# arange returns evenly spaced values within a given interval
# Syntax: np.arange(start, stop, step); stop itself is excluded

range_array = np.arange(0,20,2)
range_array

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])
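
A few variations on the creation functions above, as a small sketch (the variable names are illustrative): `arange` falls back to a start of 0 and a step of 1 when they are omitted, and `ones` and `zeros` accept a `dtype` argument when the `float64` default is not wanted.

```python
import numpy as np

first_five = np.arange(5)          # start defaults to 0, step to 1
stepped = np.arange(2, 10, 3)      # 2, 5, 8 (the stop value 10 is excluded)
int_ones = np.ones((2, 3), dtype=int)   # override the float64 default

print(first_five)   # [0 1 2 3 4]
print(stepped)      # [2 5 8]
print(int_ones.dtype)
```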

#### 2.2 Randint

In [20]:
# randint returns random integers from 'low' (inclusive) to 'high' (exclusive)
# Syntax: np.random.randint(low, high, size)

random_array = np.random.randint(0, 5,  size=(2,3,4))
random_array

array([[[2, 3, 1, 2],
        [3, 0, 2, 1],
        [4, 2, 2, 4]],

       [[4, 1, 3, 3],
        [3, 1, 4, 1],
        [3, 2, 0, 3]]])

#### 2.3 Random

In [21]:
# np.random.random returns random floats in the half-open interval [0.0, 1.0)
# Syntax: np.random.random(size), where size is a tuple giving the shape

random_array2 = np.random.random((4,5))
random_array2

array([[0.99749644, 0.33396992, 0.77497742, 0.04811149, 0.49358697],
       [0.59832888, 0.83301766, 0.06626059, 0.57386991, 0.16177398],
       [0.84074394, 0.78717432, 0.29472523, 0.83744396, 0.5316787 ],
       [0.19916427, 0.64723411, 0.24877681, 0.52489074, 0.20867676]])

#### 2.4 Rand

In [22]:
# np.random.rand returns random floats in [0.0, 1.0) in a given shape
# Syntax: np.random.rand(d0, d1, ..., dn), with each dimension as a separate argument

random_array3 = np.random.rand(2,3)
random_array3

array([[0.25773044, 0.35148533, 0.75642245],
       [0.9821571 , 0.84462542, 0.88886284]])
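
The three random functions differ mainly in how the shape is passed and in the range of values returned. A short sketch with illustrative variable names (the values themselves change on every run, so only shapes and bounds are checked):

```python
import numpy as np

a = np.random.randint(0, 5, size=(2, 3, 4))  # integers in [0, 5)
b = np.random.random((4, 5))                 # shape passed as ONE tuple
c = np.random.rand(4, 5)                     # shape passed as separate arguments

print(a.shape, a.min() >= 0, a.max() < 5)
print(b.shape, c.shape)
print(((b >= 0) & (b < 1)).all(), ((c >= 0) & (c < 1)).all())
```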

#### 2.5 Random Seed
`np.random.seed(0)` makes the random numbers reproducible.<br>
If the seed is reset to the same value before generating numbers, the same set of numbers will appear every time.

In [23]:
np.random.seed(0)
random_array4 = np.random.randint(10, size=(2,3))
random_array4

array([[5, 0, 3],
       [3, 7, 9]])
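
Reproducibility can be checked by resetting the seed and drawing again. The sketch below also shows `np.random.default_rng`, the generator-based interface available in newer NumPy releases (1.17 and later); it is included as an alternative, not as something the rest of this tutorial relies on.

```python
import numpy as np

np.random.seed(0)
first = np.random.randint(10, size=(2, 3))

np.random.seed(0)              # reset the seed to the same value
second = np.random.randint(10, size=(2, 3))

print(np.array_equal(first, second))  # True

# Alternative: a local generator with its own seed
rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(0)
print(np.array_equal(rng_a.integers(10, size=(2, 3)),
                     rng_b.integers(10, size=(2, 3))))  # True
```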

### 3. Manipulating and Comparing Arrays

#### 3.1 Arithmetic Operation

In [24]:
# Previously created array A2
A2

array([[2, 4, 6],
       [1, 3, 5]])

In [25]:
# Previously created random array with seed
random_array4

array([[5, 0, 3],
       [3, 7, 9]])

#### 3.1.1 Addition of Arrays

In [26]:
addition = A2 + random_array4
addition

array([[ 7,  4,  9],
       [ 4, 10, 14]])

In [27]:
addition2 = np.add(A2,random_array4)
addition2

array([[ 7,  4,  9],
       [ 4, 10, 14]])

#### 3.1.2 Subtraction of Arrays

In [28]:
subtraction = A2 - random_array4
subtraction

array([[-3,  4,  3],
       [-2, -4, -4]])

#### 3.1.3 Multiplication of Arrays

In [29]:
multi = A2 * random_array4
multi 

array([[10,  0, 18],
       [ 3, 21, 45]])

#### 3.1.4 Division of Arrays

In [30]:
div = (random_array4)/A2
div

array([[2.5       , 0.        , 0.5       ],
       [3.        , 2.33333333, 1.8       ]])

#### 3.1.5 Squaring the elements of array

In [31]:
squ = A2 **2
squ

array([[ 4, 16, 36],
       [ 1,  9, 25]], dtype=int32)

In [32]:
squ2 = np.square(A2)
squ2

array([[ 4, 16, 36],
       [ 1,  9, 25]], dtype=int32)

#### 3.1.6 Exponential of each element in the array

In [33]:
exp = np.exp(A2)
exp

array([[  7.3890561 ,  54.59815003, 403.42879349],
       [  2.71828183,  20.08553692, 148.4131591 ]])

#### 3.1.7 Logarithm of Array

In [34]:
log = np.log(A2)
log

array([[0.69314718, 1.38629436, 1.79175947],
       [0.        , 1.09861229, 1.60943791]])

#### 3.1.8 Modulus of Array

In [35]:
mod = A2 % 3
mod

array([[2, 1, 0],
       [1, 0, 2]], dtype=int32)

#### 3.1.9 Maximum element in an Array

In [36]:
maxi = np.max(A2)
maxi

6

#### 3.1.10 Minimum element of an Array

In [37]:
mini = np.min(A2)
mini

1

#### 3.1.11 Mean of an Array

In [38]:
mean = np.mean(A2)
mean

3.5
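
`np.max`, `np.min` and `np.mean` reduce over the whole array by default. They also accept an `axis` argument to reduce along one dimension only; the sketch below uses the same values as `A2` (the result names are illustrative).

```python
import numpy as np

A2 = np.array([[2, 4, 6],
               [1, 3, 5]])

col_means = np.mean(A2, axis=0)   # one mean per column
row_means = np.mean(A2, axis=1)   # one mean per row
row_max = np.max(A2, axis=1)      # largest value in each row

print(col_means)   # [1.5 3.5 5.5]
print(row_means)   # [4. 3.]
print(row_max)     # [6 5]
```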

#### 3.1.12 Standard Deviation
The standard deviation is a statistic that measures how widely the values of a dataset are spread around its mean.

<img src = "Std.PNG" width = 35%>

In [39]:
Std = np.std(A2)
Std

1.707825127659933

#### 3.1.13 Variance 
The variance is the average of the squared differences from the mean; it is the square of the standard deviation.

In [40]:
Var = np.var(A2)
Var

2.9166666666666665
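
Both statistics can be reproduced directly from their definitions, which is a useful check on what `np.std` and `np.var` compute by default (the population versions, dividing by the number of elements). A minimal sketch using the same values as `A2`:

```python
import numpy as np

A2 = np.array([[2, 4, 6],
               [1, 3, 5]])

mean = A2.sum() / A2.size
variance = ((A2 - mean) ** 2).sum() / A2.size   # average squared difference
std = variance ** 0.5                           # square root of the variance

print(np.isclose(variance, np.var(A2)))  # True
print(np.isclose(std, np.std(A2)))       # True

# ddof=1 divides by (N - 1) instead of N, giving the sample variance
sample_var = np.var(A2, ddof=1)
print(sample_var)
```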

#### 3.2 Comparing Arrays

In [41]:
# Previously created arrays
random_array3

array([[0.25773044, 0.35148533, 0.75642245],
       [0.9821571 , 0.84462542, 0.88886284]])

In [42]:
# Previously created arrays
A2

array([[2, 4, 6],
       [1, 3, 5]])

In [43]:
A2 > random_array3

array([[ True,  True,  True],
       [ True,  True,  True]])

In [44]:
random_array3 < 0.5

array([[ True,  True, False],
       [False, False, False]])

In [45]:
A2>5

array([[False, False,  True],
       [False, False, False]])
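
A comparison returns a boolean array of the same shape, and that array can be put to work directly. The sketch below (with the illustrative name `mask`) selects the matching elements, counts them, and tests whether any or all elements satisfy a condition.

```python
import numpy as np

A2 = np.array([[2, 4, 6],
               [1, 3, 5]])

mask = A2 > 3

print(A2[mask])      # elements where the mask is True: [4 6 5]
print(mask.sum())    # True counts as 1, so this counts matches: 3
print(np.any(A2 > 5), np.all(A2 > 0))  # True True
```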

### 4. Reshaping and Transposing an Array

In [46]:
# Previously created arrays
A2

array([[2, 4, 6],
       [1, 3, 5]])

In [47]:
# Previously created arrays
A3

array([[[ 1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.]],

       [[ 9., 10., 11., 12.],
        [13., 14., 15., 16.]],

       [[17., 18., 19., 20.],
        [21., 22., 23., 24.]]])

In [48]:
A2 + A3

ValueError: operands could not be broadcast together with shapes (2,3) (3,2,4) 

<strong> An error is raised: </strong> `operands could not be broadcast together`<br>
The term `broadcasting` describes how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is “broadcast” across the larger array so that they have compatible shapes.

To operate on two arrays whose shapes are not compatible, at least one of them must first be `RESHAPED` into a compatible shape.

#### General Broadcasting Rules

When operating on two arrays, NumPy compares their shapes element-wise. It starts with the trailing dimensions and works its way forward. Two dimensions are compatible when

- they are equal, or

- one of them is 1
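
The rules above can be written out as a short function. `shapes_are_compatible` is a hypothetical helper written for this tutorial, not part of NumPy. It also assumes the standard NumPy behaviour that, when one shape has fewer dimensions, the missing leading dimensions are treated as 1.

```python
from itertools import zip_longest

import numpy as np


def shapes_are_compatible(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules to two shape tuples."""
    # Walk both shapes from the trailing dimension towards the front;
    # a dimension missing from the shorter shape is treated as 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            return False
    return True


print(shapes_are_compatible((2, 3), (3, 2, 4)))     # False: 3 vs 4 in the last dimension
print(shapes_are_compatible((3, 2, 1), (3, 2, 4)))  # True: 1 stretches to 4
print(shapes_are_compatible((4,), (3, 2, 4)))       # True: missing dimensions count as 1

# NumPy's own answer for a compatible pair
print(np.broadcast(np.empty((3, 2, 1)), np.empty((3, 2, 4))).shape)  # (3, 2, 4)
```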

#### 4.1 Reshaping an Array

In [49]:
# reshape returns an array containing the same data with a new shape
# The new shape must hold the same total number of elements
# Syntax: array_name.reshape(new_shape)

reshaped_A2 = A2.reshape(3,2,1)
reshaped_A2

array([[[2],
        [4]],

       [[6],
        [1]],

       [[3],
        [5]]])

In [50]:
reshaped_A2.shape

(3, 2, 1)

In [51]:
A3.shape

(3, 2, 4)

In [52]:
reshaped_A2 * A3

array([[[  2.,   4.,   6.,   8.],
        [ 20.,  24.,  28.,  32.]],

       [[ 54.,  60.,  66.,  72.],
        [ 13.,  14.,  15.,  16.]],

       [[ 51.,  54.,  57.,  60.],
        [105., 110., 115., 120.]]])

In [53]:
reshaped_A2 + A3

array([[[ 3.,  4.,  5.,  6.],
        [ 9., 10., 11., 12.]],

       [[15., 16., 17., 18.],
        [14., 15., 16., 17.]],

       [[20., 21., 22., 23.],
        [26., 27., 28., 29.]]])
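
Two details of `reshape` are worth a quick sketch: one dimension may be given as `-1`, in which case NumPy works it out from the total number of elements, and a shape that does not match the number of elements raises a `ValueError`. The variable names here are illustrative.

```python
import numpy as np

A2 = np.array([[2, 4, 6],
               [1, 3, 5]])

inferred = A2.reshape(3, -1)   # 6 elements / 3 rows -> 2 columns
flat = A2.reshape(-1)          # flatten to 1-D

print(inferred.shape)   # (3, 2)
print(flat)             # [2 4 6 1 3 5]

try:
    A2.reshape(4, 2)    # 8 slots cannot hold 6 elements
except ValueError as err:
    print("ValueError:", err)
```

Note that reshaping keeps the elements in their flattened order (`2, 4, 6, 1, 3, 5`), so the rows of the reshaped array are not the rows of the original.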

#### 4.2 Transposing an Array

In [54]:
A2

array([[2, 4, 6],
       [1, 3, 5]])

In [55]:
A2.shape

(2, 3)

In [56]:
# T stands for transpose which is formed by turning all the rows of a given matrix into columns and vice-versa

A2_trans = A2.T
A2_trans

array([[2, 1],
       [4, 3],
       [6, 5]])

In [57]:
A2_trans.shape

(3, 2)
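
For arrays with more than two dimensions, `.T` reverses the order of all axes, and `np.transpose` accepts an explicit axis order. The sketch below rebuilds arrays with the same values as `A2` and `A3`; it also shows that `.T` returns a view on the same data, not a copy.

```python
import numpy as np

A2 = np.array([[2, 4, 6],
               [1, 3, 5]])
A3 = np.arange(1, 25, dtype=float).reshape(3, 2, 4)

print(A3.T.shape)                          # axes reversed: (4, 2, 3)
print(np.transpose(A3, (0, 2, 1)).shape)   # swap only the last two axes: (3, 4, 2)

# The transpose shares memory with the original array
print(np.shares_memory(A2, A2.T))          # True
```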

### 5. Sorting Arrays

In [58]:
# Previously created arrays
random_array3

array([[0.25773044, 0.35148533, 0.75642245],
       [0.9821571 , 0.84462542, 0.88886284]])

In [59]:
# np.sort returns a sorted copy of an array
# By default it sorts along the last axis (here, within each row)

np.sort(random_array3)

array([[0.25773044, 0.35148533, 0.75642245],
       [0.84462542, 0.88886284, 0.9821571 ]])

In [60]:
# np.argsort returns the indices that would sort an array (along the last axis by default)

np.argsort(random_array3)

array([[0, 1, 2],
       [1, 2, 0]], dtype=int64)
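
The `axis` argument controls the direction of the sort, and the indices from `np.argsort` can be used to reorder an array. A small sketch on a made-up integer array `arr` (not one of the arrays above), chosen so that each variant gives a visibly different result:

```python
import numpy as np

arr = np.array([[3, 1, 2],
                [0, 5, 4]])

by_row = np.sort(arr)              # default axis=-1: sort within each row
by_col = np.sort(arr, axis=0)      # sort within each column
flat = np.sort(arr, axis=None)     # flatten, then sort

print(by_row)   # [[1 2 3], [0 4 5]]
print(by_col)   # [[0 1 2], [3 5 4]]
print(flat)     # [0 1 2 3 4 5]

# Applying the argsort indices reproduces the sorted array
order = np.argsort(arr)
reordered = np.take_along_axis(arr, order, axis=-1)
print(np.array_equal(reordered, by_row))  # True
```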

### 6. Turning Image into NumPy Arrays

In [61]:
# Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python
# "imread" stands for image read

from matplotlib.image import imread
learn = imread("learningbydoing.jpg")

<img src= "learningbydoing.jpg">

In [62]:
type(learn)

numpy.ndarray

#### Since `imread` returns an `"ndarray"`, the image can be handled like any other NumPy array.
Images are a type of unstructured data in machine learning, and useful features can be extracted from them.

In [63]:
learn

array([[[107, 141, 168],
        [108, 142, 169],
        [108, 142, 169],
        ...,
        [124, 155, 176],
        [122, 153, 174],
        [120, 151, 172]],

       [[108, 142, 169],
        [108, 142, 169],
        [108, 142, 169],
        ...,
        [123, 154, 175],
        [121, 152, 173],
        [120, 151, 172]],

       [[108, 142, 169],
        [108, 142, 169],
        [108, 142, 169],
        ...,
        [123, 154, 175],
        [121, 152, 173],
        [120, 151, 172]],

       ...,

       [[ 71, 110, 139],
        [ 72, 111, 140],
        [ 73, 112, 141],
        ...,
        [  4,   4,   4],
        [  5,   5,   5],
        [  5,   5,   5]],

       [[ 74, 114, 140],
        [ 73, 112, 141],
        [ 73, 112, 143],
        ...,
        [  7,   7,   7],
        [  3,   3,   3],
        [ 10,  10,  10]],

       [[ 74, 114, 140],
        [ 73, 112, 141],
        [ 73, 112, 143],
        ...,
        [  7,   7,   7],
        [  3,   3,   3],
        [ 10,  10,  10]]], dtype=uint8)

In [64]:
learn.shape

(274, 416, 3)

In [65]:
learn.size

341952

In [66]:
learn.ndim

3
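
The shape `(274, 416, 3)` reads as 274 rows of pixels, 416 columns, and 3 colour channels (red, green, blue) per pixel. Because the image is an ordinary array, slicing works on it as on any other array. The sketch below uses a small synthetic array, `image`, as a stand-in for `learn` so that it runs without the image file; the sizes and names are assumptions made for illustration.

```python
import numpy as np

# Synthetic stand-in for `learn`: (rows, columns, RGB channels)
height, width = 4, 6
image = np.zeros((height, width, 3), dtype=np.uint8)
image[:, :, 0] = 255              # fill the red channel

red = image[:, :, 0]              # one channel -> a 2-D array
gray = image.mean(axis=2)         # naive grayscale: average of the three channels
top_left = image[:2, :3]          # crop the first 2 rows and 3 columns

print(image.shape, image.size, image.ndim)   # (4, 6, 3) 72 3
print(red.shape, gray.shape, top_left.shape)
```

The same slices applied to `learn` would give a `(274, 416)` array for a single channel.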