<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/3/31/NumPy_logo_2020.svg/1280px-NumPy_logo_2020.svg.png" />

<div class="alert alert-block alert-success">
    <h1 align="center" >Introduction to Data Analysis in Python</h1>
    <h3 align="center">Session 03: NumPy</h3>
    <h4 align="center"><a href="https://github.com/AtashfarazNavid/">My GitHub link</a></h4>
    <h4 align="center"><a href="https://www.linkedin.com/in/navidatashfaraz/">My LinkedIn link</a></h4>
 
</div>



---
# **Table of Contents**
1. **What is the NumPy Library in Python?** <br> 
2. **Install NumPy** <br>
3. **Creating a NumPy Array** <br> 
4. **The Shape and Reshaping of NumPy Arrays** <br>
5. **Indexing and Slicing of NumPy array** <br>
6. **Stacking and Concatenating NumPy arrays** <br>
7. **Maths with NumPy arrays** <br>
8. **Sorting in NumPy arrays** <br>



---



# **1-** **What is the NumPy library in Python?**
**NumPy** stands for **Numerical Python** and is one of the most useful scientific libraries in Python programming. It provides support for large multidimensional array objects and various tools to work with them. Various other libraries like Pandas, Matplotlib, and Scikit-learn are built on top of this amazing library.

**Arrays** are collections of elements/values that can have one or more dimensions. A one-dimensional array is called a vector, and a two-dimensional array is called a matrix.


<img src = "https://predictivehacks.com/wp-content/uploads/2020/08/numpy_arrays-1024x572.png">


# **2- Install NumPy**
NumPy comes pre-installed when you download Anaconda. But if you want to install NumPy separately on your machine, run the command below in a notebook cell (in a terminal, drop the leading `!`):

In [None]:
!pip install numpy 



## **Now you need to import the library**

In [None]:
import numpy as np

In [None]:
np.__version__

'1.19.5'

np is the de facto abbreviation for NumPy used by the data science community.

# **3- Creating a NumPy Array**

## Basic ndarray
NumPy arrays are very easy to create, given the complex problems they solve. To create a very basic ndarray, you use the **np.array()** function. All you have to pass are the values of the array as a list:

In [None]:
np.array([1,2,3,4])

array([1, 2, 3, 4])

This array contains integer values. You can specify the type of data in the dtype argument:

In [None]:
np.array([1,2,3,4],dtype=np.float32) 

array([1., 2., 3., 4.], dtype=float32)
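If you are not sure which type NumPy picked, you can inspect the `dtype` attribute. The short sketch below uses illustrative variable names (`ints`, `floats`, `mixed`) that are not part of the session's own cells:

```python
import numpy as np

ints = np.array([1, 2, 3, 4])                       # integers only
floats = np.array([1, 2, 3, 4], dtype=np.float32)   # type requested explicitly
mixed = np.array([1, 2.5, 3])                       # one float upcasts the whole array

print(ints.dtype)     # a platform-dependent integer type, e.g. int64
print(floats.dtype)   # float32
print(mixed.dtype)    # float64
```

All elements of an ndarray share one type, which is why a single float in the list turns every value into a float.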

NumPy arrays can be **multi-dimensional** too.

In [None]:
a= np.array([[1,2,3,4], 
             [5,6,7,8]])

In [None]:
a

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [None]:
a= np.array([[4,3], 
             [6,9]])
b= np.array([[-2,9], 
             [-5,2]])

c= a-b
c

array([[ 6, -6],
       [11,  7]])

In [None]:
x= np.array([[5], 
                [5],
                [2],
                [7]])
x

array([[5],
       [5],
       [2],
       [7]])

In [None]:
2*x

array([[10],
       [10],
       [ 4],
       [14]])

In [None]:
u= np.array([[3], 
            [-5],
            [4]])
v= np.array([[1], 
            [2],
            [5]]) 
u

array([[ 3],
       [-5],
       [ 4]])

In [None]:
c= np.transpose(u)
c

array([[ 3, -5,  4]])

In [None]:
c*v

array([[  3,  -5,   4],
       [  6, -10,   8],
       [ 15, -25,  20]])

In [None]:
u= np.array([3,-5,4]) 
v= np.array([[1], 
            [2],
            [5]]) 

In [None]:
u*v

array([[  3,  -5,   4],
       [  6, -10,   8],
       [ 15, -25,  20]])

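In the cells above, we created several **2-dimensional** arrays and combined them with element-wise arithmetic. The `*` operator multiplies element by element: when the shapes differ, as with the 1 x 3 array `c` and the 3 x 1 array `v`, NumPy broadcasts both operands to a common 3 x 3 shape.

Note: A matrix is just a rectangular array of numbers with shape N x M where N is the number of rows and M is the number of columns in the matrix. The first 2-D array defined in this section is a 2 x 4 matrix.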
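To contrast the element-wise `*` used above with a true matrix product, here is a small sketch. It reuses the `u` and `v` values from the earlier cells; the names `row`, `outer` and `inner` are only for illustration:

```python
import numpy as np

u = np.array([[3], [-5], [4]])   # shape (3, 1)
v = np.array([[1], [2], [5]])    # shape (3, 1)

row = np.transpose(u)            # shape (1, 3)

outer = row * v                  # broadcasting: (1, 3) * (3, 1) -> (3, 3)
inner = row @ v                  # matrix product: (1, 3) @ (3, 1) -> (1, 1)

print(outer.shape)   # (3, 3)
print(inner)         # [[13]]
```

The `@` operator performs matrix multiplication, so it returns the single value 3*1 + (-5)*2 + 4*5 = 13 instead of a 3 x 3 table of pairwise products.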

## Array of zeros
NumPy lets you create an array of all zeros using the **np.zeros()** method. All you have to do is pass the shape of the desired array:

In [None]:
np.zeros(5)

array([0., 0., 0., 0., 0.])

The one above is a 1-D array while the one below is a 2-D array:

In [None]:
np.zeros((2,3)) 

array([[0., 0., 0.],
       [0., 0., 0.]])

## Array of ones
You could also create an array of all 1s using the **np.ones()** method:

In [None]:
np.ones(5,dtype=np.int32)

array([1, 1, 1, 1, 1], dtype=int32)

## Random numbers in ndarrays
Another very commonly used way to create ndarrays is the **np.random.rand()** function. It creates an array of a given shape with random values drawn uniformly from [0,1):

In [None]:
#random 
np.random.rand(2,3)

array([[0.36067131, 0.71353006, 0.54564504],
       [0.99454446, 0.46396027, 0.01850135]])
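Because the values are random, your output will differ from the one shown. If you need reproducible numbers, one option is a seeded generator. The sketch below uses `np.random.default_rng`, which is a different interface from `np.random.rand()`, and the seed value 42 is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(seed=42)   # seed chosen arbitrarily
first = rng.random((2, 3))             # uniform values in [0, 1)

rng = np.random.default_rng(seed=42)   # same seed -> same sequence
second = rng.random((2, 3))

print(np.array_equal(first, second))        # True
print(first.min() >= 0 and first.max() < 1) # True
```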

## An array of your choice
Or, in fact, you can create an array filled with any given value using the **np.full()** method. Just pass in the shape of the desired array and the value you want:

In [None]:
np.full((2,2),7)

array([[7, 7],
       [7, 7]])

## Identity matrix in NumPy
Another great method is **np.eye()** that returns an array with 1s along its diagonal and 0s everywhere else.

An Identity matrix is a square matrix that has 1s along its main diagonal and 0s everywhere else. Below is an Identity matrix of shape 3 x 3.

Note: A square matrix has an **N x N shape**. This means it has the **Same number of rows and columns**.

In [None]:
#identity matrix
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [None]:
np.eye(5)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

However, NumPy gives you the flexibility to change the diagonal along which the values are 1s using the `k` argument. A positive `k` moves it above the main diagonal, and a negative `k` moves it below, as the next two cells show:

In [None]:
np.eye(3,k=1)

array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.]])

In [None]:
np.eye(3,k=-2)

array([[0., 0., 0.],
       [0., 0., 0.],
       [1., 0., 0.]])

## Evenly spaced ndarray
You can quickly get an evenly spaced array of numbers using the **np.arange()** method:

In [None]:
np.arange(5)

array([0, 1, 2, 3, 4])

The start, end and step size of the interval of values can be explicitly defined by passing in three numbers as arguments for these values respectively. A point to be noted here is that the interval is defined as **[start,end)** where the last number will not be included in the array:

In [None]:
np.arange(2,10,2)

array([2, 4, 6, 8])

Alternate elements were printed because the step size was defined as 2. Notice that 10 was not printed because the end of the interval is excluded.

<img src = "https://cdn.analyticsvidhya.com/wp-content/uploads/2020/04/np_arange.png">

Another similar function is **np.linspace()**, but instead of step size, it takes in the number of samples that need to be retrieved from the interval. A point to note here is that the last number is included in the values returned unlike in the case of np.arange().

<img src = "https://i0.wp.com/arrayjson.com/wp-content/uploads/2020/07/numpy.linspace-working-with-example.jpg?resize=751%2C212&ssl=1">

In [None]:
np.linspace(0,1,5)

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

In [None]:
np.linspace(1, 2, num=5)

array([1.  , 1.25, 1.5 , 1.75, 2.  ])

In [None]:
np.linspace(1, 2, num=11)

array([1. , 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2. ])
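To see the two functions side by side, here is a short comparison on the same interval. The variable names are illustrative; `endpoint=False` is an optional argument of `np.linspace()` that makes it exclude the end value, like `np.arange()` does:

```python
import numpy as np

by_step = np.arange(0, 1, 0.25)                      # step size given, end excluded
by_count = np.linspace(0, 1, 5)                      # sample count given, end included
no_endpoint = np.linspace(0, 1, 4, endpoint=False)   # sample count given, end excluded

print(len(by_step), by_step[-1])       # 4 0.75
print(len(by_count), by_count[-1])     # 5 1.0
print(np.allclose(by_step, no_endpoint))   # True
```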

# **4- The Shape and Reshaping of NumPy Arrays**

## Dimensions of NumPy arrays
You can easily determine the number of dimensions or axes of a NumPy array using the **ndim** attribute:

In [None]:
import numpy as np

In [None]:
# number of axes
a = np.array([[5,10,15],
              [20,25,20]])

print('Array :','\n',a)
print('Dimensions :','\n',a.ndim)  

Array : 
 [[ 5 10 15]
 [20 25 20]]
Dimensions : 
 2


This array has two dimensions (axes): the first has length 2 (the rows) and the second has length 3 (the columns).

## Shape of NumPy array
The shape is an attribute of the NumPy array that shows how many elements there are along each dimension. You can index the returned shape tuple to get the length along each dimension:

In [None]:
a = np.array([[1,2,3],
              [4,5,6]])

print('Array :','\n',a)
print('Shape :','\n',a.shape)
print('Rows = ',a.shape[0]) 
print('Columns = ',a.shape[1])

Array : 
 [[1 2 3]
 [4 5 6]]
Shape : 
 (2, 3)
Rows =  2
Columns =  3


<img src = "https://cdn.analyticsvidhya.com/wp-content/uploads/2020/04/np_shape-1.png">

## Reshaping a NumPy array 
Reshaping a ndarray can be done using the **np.reshape()** method. It changes the shape of the ndarray without changing the data within the ndarray:

In [None]:
# reshape
a = np.array([3,6,9,12])

np.reshape(a,(2,2))

array([[ 3,  6],
       [ 9, 12]])

In [None]:
# reshape
a = np.array([3,6,9,12])
np.reshape(a,(2,3))

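ValueError: cannot reshape array of size 4 into shape (2,3)

In the first example, I reshaped the ndarray from a 1-D to a 2-D ndarray. The second attempt fails because the new shape must hold exactly as many elements as the original: 4 elements cannot fill a 2 x 3 array, which needs 6.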

While reshaping, if you are unsure about the length of one of the axes, just pass -1 for it. NumPy automatically calculates that length from the total number of elements (only one axis can be -1):

In [None]:
a = np.array([3,6,9,12,18,24])

print('Three rows :','\n',np.reshape(a,(3,-1)))
print('Three columns :','\n',np.reshape(a,(-1,3)))

Three rows : 
 [[ 3  6]
 [ 9 12]
 [18 24]]
Three columns : 
 [[ 3  6  9]
 [12 18 24]]
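The -1 placeholder only works when the total number of elements divides evenly by the lengths you did specify. A minimal sketch of both cases, using the same six-element array (the name `inferred` is illustrative):

```python
import numpy as np

a = np.array([3, 6, 9, 12, 18, 24])   # 6 elements

inferred = np.reshape(a, (2, -1))     # 6 / 2 = 3 columns
print(inferred.shape)                 # (2, 3)

try:
    np.reshape(a, (4, -1))            # 6 is not divisible by 4
except ValueError as err:
    print('Reshape failed:', err)
```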


## Flattening a NumPy array
Sometimes when you have a multidimensional array and want to collapse it to a single-dimensional array, you can either use the **flatten()** method or the **ravel()** method:

In [None]:
a = np.ones((2,2))

b = a.flatten()
c = a.ravel()
print('Original shape :', a.shape)
print('Array :','\n', a)
print('Shape after flatten :',b.shape)
print('Array :','\n', b)
print('Shape after ravel :',c.shape)
print('Array :','\n', c)

Original shape : (2, 2)
Array : 
 [[1. 1.]
 [1. 1.]]
Shape after flatten : (4,)
Array : 
 [1. 1. 1. 1.]
Shape after ravel : (4,)
Array : 
 [1. 1. 1. 1.]


<img src = "https://cdn.analyticsvidhya.com/wp-content/uploads/2020/04/np_flatten.png">

But an important difference between flatten() and ravel() is that the former always returns a copy of the original array, while the latter returns a view of the original array whenever possible and only falls back to a copy when it has to. This means changes made to the array returned from ravel() can also be reflected in the original array, while this will never be the case with flatten().

In [None]:
b[0] = 0

print(a)

[[1. 1.]
 [1. 1.]]


The **change made** was **not** reflected in the **original array**.

In [None]:
c[0] = 0

print(a) 

[[0. 1.]
 [1. 1.]]


But here, the **changed value** is also reflected in the original ndarray.
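If you want to check whether two arrays share the same data without modifying anything, `np.shares_memory()` can tell you. The sketch below is an illustration with made-up names; the last case shows `ravel()` returning a copy because a transposed 2 x 3 array is not laid out contiguously in row order:

```python
import numpy as np

a = np.ones((2, 2))

flat_copy = a.flatten()   # always a copy
flat_view = a.ravel()     # a view here, because `a` is contiguous in memory

print(np.shares_memory(a, flat_copy))   # False
print(np.shares_memory(a, flat_view))   # True

b = np.arange(6).reshape(2, 3)
raveled_t = b.T.ravel()                 # the transposed layout forces a copy
print(np.shares_memory(b, raveled_t))   # False
```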

## Transpose of a NumPy array
Another very interesting reshaping method of NumPy is the **transpose()** method. It takes the input array and swaps its rows and columns, so each row of the input becomes a column of the output:

In [None]:
a = np.array([[1,2,3],
              [4,5,6]]) 
b = np.transpose(a)
print('Original','\n','Shape',a.shape,'\n',a)
print('Transposed:','\n','Shape',b.shape,'\n',b)

Original 
 Shape (2, 3) 
 [[1 2 3]
 [4 5 6]]
Transposed: 
 Shape (3, 2) 
 [[1 4]
 [2 5]
 [3 6]]




<img src = "https://andymath.com/wp-content/uploads/2019/07/TransposeExamples.jpg">

On transposing a 2 x 3 array, we got a 3 x 2 array. Transpose has a lot of significance in linear algebra.
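Every ndarray also exposes the transpose through the shorthand attribute `.T`. A quick sketch, with illustrative names, that checks it against `np.transpose()` and shows two properties worth knowing:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

print(np.array_equal(a.T, np.transpose(a)))   # True: same result
print(np.array_equal(a.T.T, a))               # True: transposing twice restores the array

vec = np.array([1, 2, 3])
print(vec.T.shape)                            # (3,): a 1-D array is unchanged
```

To turn a 1-D array into a column, reshape it instead, for example with `vec.reshape(-1, 1)`.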

# **5- Indexing and Slicing of NumPy array**

So far, we have seen how to create a NumPy array and how to play around with its shape. In this section, we will see how to extract specific values from the array using indexing and slicing.

## Slicing 1-D NumPy arrays
Slicing means retrieving elements from one index to another index. All we have to do is to pass the starting and ending point in the index like this: [start: end].

However, you can even take it up a notch by passing the step-size. What is that? Well, suppose you wanted to print every other element from the array, you would define your step-size as 2, meaning get the element 2 places away from the present index.

Incorporating all this into a single index would look something like this: **[start:end:step-size]**.

In [None]:
a = np.array([1,2,3,4,5,6])

a[1:5:2]

array([2, 4])

Notice that the last element did not get considered. This is because slicing includes the start index but excludes the end index.

To include that element, set the end index to one past the last index you want to retrieve:

In [None]:
a = np.array([1,2,3,4,5,6])
print(a[1:6:2]) 

[2 4 6]


In [None]:
a = np.array([1,2,3,4,5,6])

print(a[:6:2])
print(a[1::2])
print(a[1:6:])

[1 3 5]
[2 4 6]
[2 3 4 5 6]
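One thing the outputs above do not show is that a slice is a view of the original array, not a copy. The sketch below, with illustrative names, demonstrates this and uses `.copy()` when an independent array is wanted:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])

every_other = a[::2]        # view of elements at indexes 0, 2, 4
every_other[0] = 100        # writes through to `a`
print(a[0])                 # 100

independent = a[::2].copy() # explicit copy, detached from `a`
independent[0] = -1
print(a[0])                 # still 100
```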


## Slicing 2-D NumPy arrays
Now, a 2-D array has rows and columns so it can get a little tricky to slice 2-D arrays. But once you understand it, you can slice any dimension array!

Before learning how to slice a 2-D array, let’s have a look at how to retrieve an element from a 2-D array:

In [None]:
a = np.array([[1,2,3],
              [4,5,6]])

print(a[0,0])

print(a[1,2])
print(a[1,0])

1
6
4


In [None]:
arr = np.array([[1, 2, 3, 4, 5], 
                [6, 7, 8, 9, 10]])

print(arr[1, 1:4])

[7 8 9]


In [None]:
arr = np.array([[1, 2, 3, 4, 5], 
                [6, 7, 8, 9, 10]])

print(arr[0:2, 2])

[3 8]


In [None]:
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])

print(arr[0:2, 1:4])

[[2 3 4]
 [7 8 9]]


In [None]:
arr = np.array([[1, 2, 3, 4, 5], 
                [6, 7, 8, 9, 10],
                [1, 8, 2, 7, 3]])

print(arr[0: , 0:2])

[[1 2]
 [6 7]
 [1 8]]
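The pattern in these cells is always `arr[row_selection, column_selection]`, and a bare `:` selects everything along that axis. A few more combinations on the same array, as a sketch (the variable names are illustrative):

```python
import numpy as np

arr = np.array([[1, 2, 3, 4, 5],
                [6, 7, 8, 9, 10],
                [1, 8, 2, 7, 3]])

second_row = arr[1, :]        # whole row at index 1
second_col = arr[:, 1]        # whole column at index 1
block = arr[::2, 1:4]         # rows 0 and 2, columns 1 to 3

print(second_row)   # values 6 to 10
print(second_col)   # values 2, 7, 8
print(block.shape)  # (2, 3)
```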


# **6- Stacking and Concatenating NumPy arrays**

## Stacking ndarrays
You can create a new array by combining existing arrays. This you can do in two ways:

- Either combine the arrays vertically (i.e. along the rows) using the **vstack()** method, thereby increasing the number of rows in the resulting array
- Or combine the arrays horizontally (i.e. along the columns) using the **hstack()** method, thereby increasing the number of columns in the resulting array

<img src = "https://cdn.analyticsvidhya.com/wp-content/uploads/2020/04/Stacking.png">

In [None]:
a = np.arange(0,5)
b = np.arange(5,10)

print('Array 1 :','\n',a)
print('Array 2 :','\n',b)

print('Vertical stacking :','\n',np.vstack((a,b)))

print('Horizontal stacking :','\n',np.hstack((a,b)))

Array 1 : 
 [0 1 2 3 4]
Array 2 : 
 [5 6 7 8 9]
Vertical stacking : 
 [[0 1 2 3 4]
 [5 6 7 8 9]]
Horizontal stacking : 
 [0 1 2 3 4 5 6 7 8 9]


In [None]:
a.shape

(5,)
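With 1-D inputs, `hstack()` simply returns a longer 1-D array. The rows-versus-columns picture is easier to see with 2-D inputs, as in this small sketch (the array values are made up for illustration):

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

tall = np.vstack((a, b))   # rows of b are placed under the rows of a
wide = np.hstack((a, b))   # columns of b are placed beside the columns of a

print(tall.shape)   # (4, 2)
print(wide.shape)   # (2, 4)
```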

## Concatenating ndarrays
While stacking arrays is one way of combining old arrays to get a new one, you could also use the **concatenate()** method where the passed arrays are joined along an existing axis:

In [None]:
a = np.arange(0,5).reshape(1,5)
b = np.arange(5,10).reshape(1,5)

print('Array 1 :','\n',a)
print('Array 2 :','\n',b)

print('Concatenate along rows :','\n',np.concatenate((a,b),axis=0))
print('Concatenate along columns :','\n',np.concatenate((a,b),axis=1))

Array 1 : 
 [[0 1 2 3 4]]
Array 2 : 
 [[5 6 7 8 9]]
Concatenate along rows : 
 [[0 1 2 3 4]
 [5 6 7 8 9]]
Concatenate along columns : 
 [[0 1 2 3 4 5 6 7 8 9]]


In [None]:
a.shape

(1, 5)


<img src = "https://www.tutorialsandyou.com/images/2-Dimensional-concatenate-axis-0.jpg">

The drawback of this method is that every input array must already have the axis along which you want to combine. Otherwise, get ready to be greeted by an error.
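Here is a sketch of that failure and one way around it. The 1-D arrays below have no axis 1, so the call raises an error (NumPy's `AxisError`, which is a subclass of `ValueError`); reshaping them to 2-D first makes the join work:

```python
import numpy as np

a = np.arange(0, 5)    # shape (5,), only axis 0 exists
b = np.arange(5, 10)

try:
    np.concatenate((a, b), axis=1)
except ValueError as err:          # AxisError is a subclass of ValueError
    print('Concatenate failed:', err)

# Give both arrays a second axis, then join along the rows
joined = np.concatenate((a.reshape(1, -1), b.reshape(1, -1)), axis=0)
print(joined.shape)    # (2, 5)
```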

Another very useful function is **np.append()**, which adds new elements to the end of an ndarray. This is useful when you already have an existing ndarray but want to add new values to it. Note that it returns a new array and leaves the original unchanged.

In [None]:
# append values to ndarray
a = np.array([[1,2],
             [3,4]])

np.append(a,[[5,6]], axis=0)

array([[1, 2],
       [3, 4],
       [5, 6]])
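The `axis` argument matters here. As a sketch with illustrative names: when `axis` is given, the new values must match the array's shape along the other axes, and when it is omitted, `np.append()` flattens both inputs first:

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])

with_axis = np.append(a, [[5, 6]], axis=0)   # adds a row -> shape (3, 2)
flattened = np.append(a, [5, 6])             # no axis -> flattened, shape (6,)

print(with_axis.shape)   # (3, 2)
print(flattened)         # [1 2 3 4 5 6]
print(a.shape)           # (2, 2): the original array is untouched
```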

# **7- Maths with NumPy arrays**

Here are some of the most important and useful operations that you will need to perform on your NumPy array.

## Basic arithmetic operations on NumPy arrays
The basic arithmetic operations can easily be performed on NumPy arrays.

In [None]:
a = np.arange(1,6)
b = np.arange(6,11)

print('Subtract :',a-5)
print('Multiply :',a*5)
print('Divide :',a/5)
print('Power :',a**2)
print('Remainder :',a%5)

Subtract : [-4 -3 -2 -1  0]
Multiply : [ 5 10 15 20 25]
Divide : [0.2 0.4 0.6 0.8 1. ]
Power : [ 1  4  9 16 25]
Remainder : [1 2 3 4 0]
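The same operators also work between two arrays of the same shape, pairing up elements by position. The cell above defines `b` without using it, so here is a short sketch that does:

```python
import numpy as np

a = np.arange(1, 6)    # [1 2 3 4 5]
b = np.arange(6, 11)   # [ 6  7  8  9 10]

total = a + b          # element-wise sum
product = a * b        # element-wise product
remainder = b % a      # element-wise remainder

print('Add :', total)            # 7, 9, 11, 13, 15
print('Multiply :', product)     # 6, 14, 24, 36, 50
print('Remainder :', remainder)  # 0, 1, 2, 1, 0
```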


## Mean, Median and Standard deviation
To find the mean, standard deviation and median of a NumPy array, use the **mean()**, **std()** and **median()** methods:

In [None]:
a = np.arange(5,15,2)
print(a)
print('Mean :',np.mean(a))
print('Standard deviation :',np.std(a))
print('Median :',np.median(a)) 

[ 5  7  9 11 13]
Mean : 9.0
Standard deviation : 2.8284271247461903
Median : 9.0
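By default, `np.std()` computes the population standard deviation (it divides by N). If you want the sample standard deviation (dividing by N - 1), you can pass `ddof=1`. A sketch on the same array, with illustrative variable names:

```python
import numpy as np

a = np.arange(5, 15, 2)            # [ 5  7  9 11 13]

population_std = np.std(a)         # default ddof=0, divides by N
sample_std = np.std(a, ddof=1)     # divides by N - 1

print(population_std)   # about 2.828 (square root of 8)
print(sample_std)       # about 3.162 (square root of 10)
```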


## Min-Max values and their indexes
Min and Max values in an ndarray can be easily found using the **min()** and **max()** methods:

In [None]:
a = np.array([[1,6],
              [4,3]])

# minimum along a column
print('Min :',np.min(a,axis=0))
# maximum along a row
print('Max :',np.max(a,axis=1))

Min : [1 3]
Max : [6 4]


In [None]:
# minimum
print('Min :',np.min(a))
# maximum 
print('Max :',np.max(a))

Min : 1
Max : 6


You can also easily determine the index of the minimum or maximum value in the ndarray along a particular axis using the **argmin()** and **argmax()** methods:

In [None]:
a = np.random.randint(10, size=7) 
print(a)

# index of the minimum value
print('Min :',np.argmin(a))
# index of the maximum value (the first one, if there are ties)
print('Max :',np.argmax(a))

[7 8 4 8 7 7 2]
Min : 6
Max : 1
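On a 2-D array, `argmin()` and `argmax()` without an `axis` return an index into the flattened array. The sketch below, using the small matrix from the min/max example, shows how `np.unravel_index()` converts that into a (row, column) pair and how the `axis` argument works:

```python
import numpy as np

a = np.array([[1, 6],
              [4, 3]])

flat_index = np.argmax(a)                          # position in the flattened array
row, col = np.unravel_index(flat_index, a.shape)   # convert to 2-D coordinates
print(flat_index, (row, col))                      # 1 and (0, 1)

# One row index per column: where each column has its minimum
print(np.argmin(a, axis=0))                        # [0 1]
```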


# **8- Sorting in NumPy arrays**
The NumPy library has a range of sorting algorithms that you can use to sort your array elements. It implements **quicksort**, **heapsort**, **mergesort**, and **timsort** under the hood, and the `kind` argument of the **sort()** method accepts 'quicksort' (the default), 'mergesort', 'heapsort' and 'stable':

In [None]:
a = np.array([1,4,2,5,3,6,8,7,9])
np.sort(a, kind='quicksort')

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

You can even sort the array along any axis you desire:

In [None]:
a = np.array([[5,6,7,4],
              [9,2,3,7]])
# sort along the columns (axis=1): each row is sorted independently
print('Sort along column :','\n',np.sort(a, kind='mergesort',axis=1))
# sort along the rows (axis=0): each column is sorted independently
print('Sort along row :','\n',np.sort(a, kind='quicksort',axis=0))

Sort along column : 
 [[4 5 6 7]
 [2 3 7 9]]
Sort along row : 
 [[5 2 3 4]
 [9 6 7 7]]
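Two related tasks come up often: getting the indexes that would sort an array, and sorting in descending order. A minimal sketch with illustrative names; `np.sort()` has no descending option, so reversing the sorted result is one common workaround:

```python
import numpy as np

a = np.array([1, 4, 2, 5, 3])

order = np.argsort(a)            # indexes that put `a` in ascending order
print(order)                     # [0 2 4 1 3]
print(a[order])                  # [1 2 3 4 5]

descending = np.sort(a)[::-1]    # sort ascending, then reverse
print(descending)                # [5 4 3 2 1]
```

`np.argsort()` is handy when you need to reorder a second array in the same way, for example labels that belong to the sorted values.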
