# Introduction to NumPy

**NumPy** (short for Numerical Python) is a library for scientific computing and data analysis.

The most basic object in NumPy is the ```array```, which is **homogeneous** in nature. By homogeneous, we mean that all the elements in a NumPy array have to be of the **same data type**, which is commonly numeric (float or integer).

### Creating NumPy Arrays 

There are multiple ways to create NumPy arrays, the most common ones being:
* Convert lists or tuples to arrays using ```np.array()```
* Initialise arrays of fixed size (when the size is known) 
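
The second option in the list above is not demonstrated elsewhere in this notebook, so here is a minimal sketch of it. The variable names are illustrative only; `np.zeros`, `np.ones`, `np.full` and `np.arange` are standard NumPy functions.

```python
import numpy as np

# Fixed-size arrays: the shape is given up front and NumPy fills in the values
zeros_1d = np.zeros(5)                 # five 0.0 values (float64 by default)
ones_2d = np.ones((2, 3), dtype=int)   # 2 rows x 3 columns of integer 1s
sevens = np.full((2, 2), 7)            # 2 x 2 array filled with the constant 7
evens = np.arange(0, 10, 2)            # start at 0, stop before 10, step 2

print(zeros_1d)
print(ones_2d)
print(sevens)
print(evens)
```

Passing the shape as a tuple, as in `np.ones((2, 3))`, is what produces a multi-dimensional array; a single integer gives a 1-D array.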

In [None]:
! pip install numpy         # to install numpy in the notebook

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
# Import the numpy library
# np is simply an alias, you may use any other alias, though np is quite standard
import numpy as np
print(np.__version__)    # check the version of the numpy library

1.21.6


In [None]:
import numpy as np   #calling numpy as 'np'

In [None]:
numbers=[1,2,3]  # list of the numbers
arr=np.array(numbers)    # creates a numpy array
arr

array([1, 2, 3])

### Numpy vs Lists 

In [None]:
# initialise a list with the following elements: 1, 2, 3, 4, 5
list_1=[1,2,3,4,5]

# try to multiply 2.5 directly to the created list

list_1*2.5

TypeError: can't multiply sequence by non-int of type 'float'

A list cannot apply an arithmetic operation to all of its elements at once. To multiply each element by 2.5, you have to apply a function to every element, for example with _map_ (optionally combined with a _lambda_).
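
Besides `map`, a list comprehension is another common way to do this element-by-element work on a plain list. The sketch below uses illustrative names (`numbers`, `scaled`) that are not part of the notebook.

```python
# A list comprehension applies the expression to each element in turn
numbers = [1, 2, 3, 4, 5]
scaled = [x * 2.5 for x in numbers]
print(scaled)  # [2.5, 5.0, 7.5, 10.0, 12.5]
```

Either way, the loop over the elements is written out explicitly, which is the step a NumPy array takes care of for you.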

In [None]:
def mul(x):
    return x*2.5
print(list(map(mul,list_1)))

[2.5, 5.0, 7.5, 10.0, 12.5]


In [None]:
list_2=list(map(lambda x: x*2.5,list_1))

In [None]:
print(list_2)

[2.5, 5.0, 7.5, 10.0, 12.5]


In [None]:
list_1*2         # multiplying a list by an integer repeats the list instead of scaling its elements

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

Let's try to perform the same using a NumPy array.

In [None]:
list_1

[1, 2, 3, 4, 5]

In [None]:
# Creating a 1-D array using a list
# np.array() takes in a list or a tuple as argument, and converts it into an array

np_list = np.array(list_1)

In [None]:
np_list

array([1, 2, 3, 4, 5])

In [None]:
# Multiply each element with 2.5 

np_list*2.5

array([ 2.5,  5. ,  7.5, 10. , 12.5])

In [None]:
# Print the result using the print command. Compare the structure of list and array.
print(np_list,list_1)

[1 2 3 4 5] [1, 2, 3, 4, 5]


## Creating 1D numpy array using different data types

In [None]:
# Create a 1-D array with the elements (1, "abc", True); the mixed types are all converted to strings

np.array([1,"abc",True])

array(['1', 'abc', 'True'], dtype='<U21')

In [None]:
# Create a 1-D array with the elements (1, "abc", True) and print it

a=np.array([1,"abc",True])
print(a)

['1' 'abc' 'True']


All the elements in a NumPy array have the same data type. You can check the data type of an array using the `.dtype` attribute.

The scalar type object that corresponds to this data type is available as `.dtype.type`.

In [None]:
print('type(a): ',type(a))                  # check the type of the array object itself
print('a.dtype: ',a.dtype)                  # check the data type of the elements of the array
print('a.dtype.type: ',a.dtype.type)        # check the scalar type object behind that data type


type(a):  <class 'numpy.ndarray'>
a.dtype:  <U21
a.dtype.type:  <class 'numpy.str_'>


In [None]:
x=np.array(['Apple','Banana','Orange'])
print(type(x))
print(x.dtype)
print(x.dtype.type)

<class 'numpy.ndarray'>
<U6
<class 'numpy.str_'>


In [None]:
list_1=[5,1,5,9]

In [None]:
list_1=np.array(list_1)
print(list_1.dtype)
print(list_1.dtype.type)

int64
<class 'numpy.int64'>


In [None]:
list_2=[5,1.5,5,9]
np_list_2=np.array(list_2)
print(np_list_2.dtype)

float64


### Checking shape, dimensions, size

NumPy arrays can have any number of dimensions and different lengths along each dimension. We can inspect the length along each dimension using the `.shape` property of an array.


You can get the number of dimensions and the size (total number of elements) of a NumPy array with the `ndim` and `size` attributes of `numpy.ndarray`. The built-in function `len()` returns the size of the first dimension.

- Number of dimensions of the NumPy array: `.ndim`

- Size of the NumPy array: `.size`

- Size of the first dimension of the NumPy array: `len()`
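
For a 1-D array `.size` and `len()` return the same number, so the difference only shows up with more dimensions. A small sketch on a 2-D array (the name `grid` is illustrative):

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

print(grid.shape)  # (2, 3) -> 2 rows, 3 columns
print(grid.ndim)   # 2      -> two axes
print(grid.size)   # 6      -> total number of elements
print(len(grid))   # 2      -> length of the first axis only
```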


In [None]:
list_1=[5,1,5,9]
arr=np.array(list_1)
print('arr.shape',arr.shape)       # check the shape of the numpy array
print('arr.ndim: ',arr.ndim)       # check the number of dimensions of the numpy array
print('arr.size: ',arr.size)       # check the size of the numpy array
print('len(arr): ',len(arr))

arr.shape (4,)
arr.ndim:  1
arr.size:  4
len(arr):  4


In [None]:
# create a 0-dimensional numpy array from a single scalar value
list_2=2
list_2=np.array(list_2)
print('list_2.ndim: ',list_2.ndim)
print('list_2.shape: ',list_2.shape)
print('type(list_2): ',type(list_2))
print('list_2.dtype: ',list_2.dtype)

list_2.ndim:  0
list_2.shape:  ()
type(list_2):  <class 'numpy.ndarray'>
list_2.dtype:  int64


## Creating 2D and 3D array
![](https://i.imgur.com/pMjzthM.png)

NumPy can create multi-dimensional arrays. A 2D array has a shape along two axes: if we consider it as an m x n matrix, axis 0 indexes the rows and axis 1 indexes the columns. A 2D array is built from a list of lists, for example `[[1,2,3],[4,5,6]]`, which has shape (2,3).

A 3D array has a shape along three axes. If we picture it as a cube, axis 0 is the height, axis 1 the length and axis 2 the depth. For example, `[[[1,2,3],[4,5,6],[7,8,9]]]` has shape (1,3,3).
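
The two shapes quoted above can be checked directly. This is a small sketch with illustrative names (`matrix`, `cube`); indexing with one integer per axis picks out a single element.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])             # list of lists -> 2D
cube = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]])  # one extra level of nesting -> 3D

print(matrix.shape)   # (2, 3)
print(cube.shape)     # (1, 3, 3)

# One index per axis: block 0, row 1, column 2
print(cube[0, 1, 2])  # 6
```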

In [None]:
# Creating a 2D array
list_1=[[1,2,3],[4,5,6]]
arr1=np.array(list_1)
arr1

array([[1, 2, 3],
       [4, 5, 6]])

In [None]:
print(arr1.shape)
print(arr1.ndim)
print(arr1.size)

(2, 3)
2
6


In [None]:
# creating a 3D array
list_2=[[[1,2,3],[4,5,6]]]
arr2=np.array(list_2)
print('arr2 \n',arr2)
print('arr2.shape: ',arr2.shape)
print('arr2.ndim: ',arr2.ndim)
print('arr2.size: ',arr2.size)

arr2 
 [[[1 2 3]
  [4 5 6]]]
arr2.shape:  (1, 2, 3)
arr2.ndim:  3
arr2.size:  6


In [None]:
list_x=np.array([[[1],[2],[3]],[[4],[5],[6]]])
list_x

array([[[1],
        [2],
        [3]],

       [[4],
        [5],
        [6]]])

In [None]:
list_x2=np.array([[[1],[2],[3]],[[4],[5],[6]],[[1],[2],[3]],[[4],[5],[6]]])
list_x2

array([[[1],
        [2],
        [3]],

       [[4],
        [5],
        [6]],

       [[1],
        [2],
        [3]],

       [[4],
        [5],
        [6]]])

## Array slicing

![](https://i.imgur.com/tcgquis.png)

To slice a one-dimensional array, provide a start and an end index separated by a colon (:). The slice starts at the start index and stops one element before the end index.
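
A minimal sketch of 1-D slicing, using an illustrative array `arr`. Leaving out the start or the end index, and using negative indices, are standard Python and NumPy slicing behaviour rather than something specific to this notebook.

```python
import numpy as np

arr = np.arange(10)      # [0 1 2 3 4 5 6 7 8 9]

first_three = arr[0:3]   # indices 0, 1, 2 (index 3 is excluded)
middle = arr[3:7]        # indices 3, 4, 5, 6
from_five = arr[5:]      # omitted end: runs to the last element
last_two = arr[-2:]      # negative start counts from the end

print(first_three)  # [0 1 2]
print(middle)       # [3 4 5 6]
print(from_five)    # [5 6 7 8 9]
print(last_two)     # [8 9]
```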




# Manipulating Arrays

![](https://i.imgur.com/mg7tMCb.png)

Here are some techniques that can be used to change or manipulate the structure of arrays:
*  Reshaping
*  Stacking
*  Concatenating
*  Sorting
*  Padding
*  Splitting


### Reshaping

In [None]:
# Creating an array of the ten integers from 0 to 9
arr_4 = np.arange(10)
# Displaying the created array
arr_4



array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
# Reshaping the 1D array into a 2D array with five rows of 2 elements each
arr_5 = arr_4.reshape(5,2)

In [None]:
# Checking the rows and columns in the created array
arr_5

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

In [None]:
# Creating a 2D array with five rows of 4 elements each (elements from 20 to 39)
arr_6 = np.arange(20,40).reshape(5,4)

# Checking the rows and columns in the created array
arr_6

array([[20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31],
       [32, 33, 34, 35],
       [36, 37, 38, 39]])

In [None]:
arr_6.shape

(5, 4)

In [None]:
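# arr_7 is not defined in an earlier cell; the output below implies a (2, 10) reshape of arr_6
arr_7 = arr_6.reshape(2,10)
arr_6,arr_7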

(array([[20, 21, 22, 23],
        [24, 25, 26, 27],
        [28, 29, 30, 31],
        [32, 33, 34, 35],
        [36, 37, 38, 39]]),
 array([[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35, 36, 37, 38, 39]]))

In [None]:
arr_7.flatten()          # returns a flattened 1D copy of the array

array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
       37, 38, 39])

In [None]:
np.ravel(arr_7)         # returns a flattened 1D array (a view of the original when possible)

array([20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
       37, 38, 39])
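
`flatten()` and `ravel()` print the same values, but they differ in whether the result shares memory with the original array. The sketch below illustrates this on a small contiguous array; the names `grid`, `flat_copy` and `flat_view` are illustrative, and `np.shares_memory` is a standard NumPy helper.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)

flat_copy = grid.flatten()   # always a new, independent array
flat_view = grid.ravel()     # a view here, because grid is contiguous

print(np.shares_memory(grid, flat_copy))  # False
print(np.shares_memory(grid, flat_view))  # True

# Writing through the view changes the original; the copy is unaffected
flat_view[0] = 99
print(grid[0, 0])     # 99
print(flat_copy[0])   # 0
```

If the result is going to be modified and the original must stay untouched, `flatten()` is the safer choice.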

### Stacking

In [None]:
#importing the NumPy library
import numpy as np

# Creating two 1-D arrays with 6 elements each using np.array
arr_1 = np.array([1,2,3,4,5,9])
arr_2 = np.array([6,7,8,9,10,20])


In [None]:
# Horizontal stacking - appending the elements in the same row
ar_h = np.hstack((arr_1, arr_2))
# display the stacked array
print(ar_h)


[ 1  2  3  4  5  9  6  7  8  9 10 20]


In [None]:
# Vertical stacking - each input array becomes a row, increasing the number of rows

ar_v = np.vstack((arr_1, arr_2))
# display the stacked array
print(ar_v)




[[ 1  2  3  4  5  9]
 [ 6  7  8  9 10 20]]


### Concatenating

In [None]:
import numpy as np

arr1 = np.array([1, 2, 3])

arr2 = np.array([4, 5, 6])

arr = np.concatenate((arr1, arr2))

print(arr)

[1 2 3 4 5 6]


In [None]:
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])

arr2 = np.array([[5, 6], [7, 8]])

arr = np.concatenate((arr1, arr2), axis=1)
print('arr1')
print(arr1)
print('arr2')
print(arr2)
print('concatenated arr')
print(arr)
print(arr.shape)

arr1
[[1 2]
 [3 4]]
arr2
[[5 6]
 [7 8]]
concatenated arr
[[1 2 5 6]
 [3 4 7 8]]
(2, 4)


In [None]:
import numpy as np

arr1 = np.array([[1, 2], [3, 4]])

arr2 = np.array([[5, 6], [7, 8]])

arr = np.concatenate((arr1, arr2))

print('arr1')
print(arr1)
print('arr2')
print(arr2)
print('concatenated arr')
print(arr)
print(arr.shape)

arr1
[[1 2]
 [3 4]]
arr2
[[5 6]
 [7 8]]
concatenated arr
[[1 2]
 [3 4]
 [5 6]
 [7 8]]
(4, 2)


## Sorting

In [None]:
arr_8=np.array([3,5,4,9,8,7,4,5])

np.sort(arr_8)

array([3, 4, 4, 5, 5, 7, 8, 9])

In [None]:
a = np.array([[4,10],[3,2]])
print(a)
print('-----------')
print(np.sort(a,axis=0)) # sort along the first axis
print('-----------')
print(np.sort(a)) # sort along the last axis

[[ 4 10]
 [ 3  2]]
-----------
[[ 3  2]
 [ 4 10]]
-----------
[[ 4 10]
 [ 2  3]]


In [None]:
np.random.seed(1)
b=np.random.randint(0, 10, (2,3,3))
print('Array b')
print(b)
print('-----------')
print('sorted array b along axis=0')
print(np.sort(b,axis=0)) # sort along the first axis
print('-----------')
print('sorted array b along axis=1')
print(np.sort(b,axis=1)) # sort along the second axis
print('-----------')
print('sorted array b along last axis')
print(np.sort(b)) # sort along the last axis


Array b
[[[5 8 9]
  [5 0 0]
  [1 7 6]]

 [[9 2 4]
  [5 2 4]
  [2 4 7]]]
-----------
sorted array b along axis=0
[[[5 2 4]
  [5 0 0]
  [1 4 6]]

 [[9 8 9]
  [5 2 4]
  [2 7 7]]]
-----------
sorted array b along axis=1
[[[1 0 0]
  [5 7 6]
  [5 8 9]]

 [[2 2 4]
  [5 2 4]
  [9 4 7]]]
-----------
sorted array b along last axis
[[[5 8 9]
  [0 0 5]
  [1 6 7]]

 [[2 4 9]
  [2 4 5]
  [2 4 7]]]


## Padding

In [None]:
np.set_printoptions(threshold=100)
a = [1, 2, 3, 4, 5]
np.pad(a, (2, 3), 'constant', constant_values=(4, 6))


array([4, 4, 1, 2, 3, 4, 5, 6, 6, 6])

In [None]:
np.pad(a, (3, 3), 'constant', constant_values=(0,0))

array([0, 0, 0, 1, 2, 3, 4, 5, 0, 0, 0])

In [None]:
a = [[1, 2], [3, 4]]
print(np.array(a))
print('___________________')

print(np.pad(a, ((3, 2), (2, 3)), 'minimum'))

[[1 2]
 [3 4]]
___________________
[[1 1 1 2 1 1 1]
 [1 1 1 2 1 1 1]
 [1 1 1 2 1 1 1]
 [1 1 1 2 1 1 1]
 [3 3 3 4 3 3 3]
 [1 1 1 2 1 1 1]
 [1 1 1 2 1 1 1]]


In [None]:
np.pad(a, (1, 1), 'constant')

array([[0, 0, 0, 0],
       [0, 1, 2, 0],
       [0, 3, 4, 0],
       [0, 0, 0, 0]])

## Splitting NumPy Arrays

Splitting is the reverse operation of joining.

Joining merges multiple arrays into one, and splitting breaks one array into multiple.

We use `array_split()` for splitting arrays: we pass it the array we want to split and the number of splits.

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 3)

print(newarr)

[array([1, 2]), array([3, 4]), array([5, 6])]


If the array cannot be divided evenly into the requested number of parts, `array_split()` still works: the first few sub-arrays get one extra element and the remaining ones are shorter.



In [None]:
#Split the array in 4 parts:

import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

newarr = np.array_split(arr, 4)

print(newarr)

[array([1, 2]), array([3, 4]), array([5]), array([6])]
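
The same uneven case can be seen with 7 elements split 3 ways. For contrast, the sketch also calls `np.split`, a related NumPy function not used elsewhere in this notebook, which raises an error instead of adjusting the sizes. The names `parts`, `sizes` and `split_failed` are illustrative.

```python
import numpy as np

arr = np.arange(1, 8)            # 7 elements: [1 2 3 4 5 6 7]

# array_split tolerates an uneven division: the earlier parts are one element longer
parts = np.array_split(arr, 3)
sizes = [len(p) for p in parts]
print(parts)
print(sizes)                     # [3, 2, 2]

# np.split requires an exact division and raises ValueError otherwise
split_failed = False
try:
    np.split(arr, 3)
except ValueError as err:
    split_failed = True
    print('np.split failed:', err)
```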


**Splitting 2-D Arrays**

Use the same syntax when splitting 2-D arrays.

Use the `array_split()` function, passing in the array you want to split and the number of splits you want to do.



In [None]:
#Split the 2-D array into three 2-D arrays.

import numpy as np

arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])

newarr = np.array_split(arr, 3)

print(newarr)

[array([[1, 2],
       [3, 4]]), array([[5, 6],
       [7, 8]]), array([[ 9, 10],
       [11, 12]])]


In [None]:
#Split the 2-D array into three 2-D arrays.

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])

newarr = np.array_split(arr, 3)

print(newarr)

[array([[1, 2, 3],
       [4, 5, 6]]), array([[ 7,  8,  9],
       [10, 11, 12]]), array([[13, 14, 15],
       [16, 17, 18]])]


In [None]:
#Split the 2-D array into three 2-D arrays along the columns (axis=1); each piece keeps all six rows.

import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])

newarr = np.array_split(arr, 3, axis=1)

print(newarr)

[array([[ 1],
       [ 4],
       [ 7],
       [10],
       [13],
       [16]]), array([[ 2],
       [ 5],
       [ 8],
       [11],
       [14],
       [17]]), array([[ 3],
       [ 6],
       [ 9],
       [12],
       [15],
       [18]])]


# Exercise

**Border Rows and Columns**

**Description:** Extract all the border rows and columns from a 2-D array.

**Format:**
- Input: A 2-D Python list
- Output: Four NumPy arrays: the first column, the first row, the last column and the last row of the input array, in that order.

Example:<br>
Input 1:<br>
[[11 12 13 14]<br>
 [21 22 23 24]<br>
 [31 32 33 34]]<br>
Output 1:<br>
[11 21 31]<br>
[11 12 13 14]<br>
[14 24 34]<br>
[31 32 33 34]<br>

In [None]:
input_list=[[11, 12, 13, 14],
            [21, 22, 23, 24],
            [31, 32, 33, 34]]

In [None]:
# Convert the input list to a NumPy array
array_2d =np.array(input_list)

In [None]:
array_2d

In [None]:
# Extract the first column, first row, last column and last row respectively using
# appropriate indexing
col_first = array_2d[:,0]
row_first = array_2d[0]
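col_last = array_2d[:,-1]   # -1 selects the last column whatever the width of the array
row_last = array_2d[-1]     # -1 selects the last row whatever the height of the array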
print(col_first)

print(row_first) 

print(col_last) 

print(row_last)

# NumPy Cheat Sheets

[Cheat Sheet 1: DataCamp NumPy](https://s3.amazonaws.com/assets.datacamp.com/blog_assets/Numpy_Python_Cheat_Sheet.pdf)

[Cheat Sheet 2: Basic NumPy](https://cdn.intellipaat.com/mediaFiles/2018/12/Python-NumPy-Cheat-Sheet-.pdf)

[Cheat Sheet 3: A Little Bit of Everything](http://datasciencefree.com/numpy.pdf)

[Cheat Sheet 4: Data Science](https://s3.amazonaws.com/dq-blog-files/numpy-cheat-sheet.pdf)

[Cheat Sheet 5: Scientific Python](https://ipgp.github.io/scientific_python_cheat_sheet/?utm_content=buffer7d821&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer#numpy-import-numpy-as-np)

