# Numpy, Pandas and Visualization [NPV]

## Numpy

#### NumPy is a Python library for mathematical and scientific computing, built around multi-dimensional arrays and matrices. With NumPy, you can easily perform element-wise operations on arrays, such as addition, subtraction, multiplication, and division, as well as more advanced operations like matrix multiplication. It is widely used in scientific computing, data analysis, and machine learning because it handles large amounts of numerical data quickly and efficiently.

In [1]:
# To install NumPy using pip:
! pip install numpy



In [2]:
# To import Numpy:
import numpy as np

### What are NumPy Arrays?

* A NumPy array looks similar to a list.
* An array is a grid of values, indexed by a tuple of non-negative integers.
* It usually contains numeric values. However, it can contain string values too.
* An array can be n-dimensional.

* While working with NumPy for data science, we mostly deal with NumPy arrays. These are generally of three types:
1. 1D Array (One-dimensional array/Vectors):
  * Also called a flat array.
  * Elements are arranged in a single row or column.
  * It's like a list of elements stored in a linear sequence.
  * Example: [1, 2, 3, 4, 5]

2. 2D Array (Two-dimensional array/Matrix):
   * Elements are arranged in rows and columns, forming a grid-like structure.
   * It's like a table with rows and columns.
   * Each element is accessed using two indices: row index and column index.

In [3]:
# example of 2-D Array:
[[1, 2, 3],
 [4, 5, 6],
 [7, 8, 9]]

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

3. 3D Array (Three-dimensional array/3-D tensor):
* Elements are arranged in a three-dimensional space, forming a cuboid-like structure.
* It's like a collection of 2D arrays stacked together.
* Each element is accessed using three indices: depth index, row index, and column index.

In [4]:
# example of 3-D array
[[[1, 2],
  [3, 4]],
 
 [[5, 6],
  [7, 8]]]


[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

#### Limitations of traditional Python data structures:
* Slow for numbers: Regular lists and other structures are slow when dealing with lots of numbers.
* Waste memory: They might use more memory than needed when storing numbers.
* Not good for math: They're not designed for doing lots of math quickly.
* Mixing types is messy: They can handle different types of data, which isn't ideal for math-heavy tasks.

#### Why arrays came into action:
* Save space: Arrays use memory more efficiently, so they're better for storing lots of numbers.
* Stick to one type: Arrays only deal with one type of data, making them faster for math.
* Math made easy: Arrays let you do lots of math really quickly and easily.
* Specialized tools: Arrays are the foundation for special tools like NumPy, which make doing math in Python super fast and powerful.
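
The points above can be illustrated with a small comparison. This is a minimal sketch; the names `prices_list` and `prices_array` and their values are made up for illustration only.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration
prices_list = [10.0, 20.0, 30.0, 40.0]
prices_array = np.array(prices_list)

# With a list, element-wise math needs an explicit loop or comprehension
doubled_list = [p * 2 for p in prices_list]

# With an array, the same operation is applied to every element at once
doubled_array = prices_array * 2

print(doubled_list)
print(doubled_array)

# Multiplying a list by 2 repeats the list instead of scaling its values
repeated_list = prices_list * 2
print(len(repeated_list))   # 8

# Every element of the array shares one data type
print(prices_array.dtype)   # float64
```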




#### Creating array from a list 

* np.array(list) converts a list into an array.

In [5]:
l = [23,45,67,21,89]
print(l)
print(type(l))

[23, 45, 67, 21, 89]
<class 'list'>


In [6]:
ar = np.array(l)
print(ar)
print(type(ar))

[23 45 67 21 89]
<class 'numpy.ndarray'>


In [7]:
# Ques: Create a list of the weights of 12 people. Convert the list to an array and check the data type
weight_list = [45,56,67,78,90,60,76,47,83,39,41,50]
weight_array = np.array(weight_list)
print(weight_array)
print(type(weight_array))

[45 56 67 78 90 60 76 47 83 39 41 50]
<class 'numpy.ndarray'>


In [8]:
# Ques: Convert the list ['hello',78,12.23,True,8+5j] to array and check the datatype of the 
# elements. Remove 'hello' and check the datatype again.
li = ['hello',78,12.23,True,8+5j]
li_array = np.array(li)
print(li_array)

['hello' '78' '12.23' 'True' '(8+5j)']


* Arrays are homogeneous: every element has the same data type. The list li, however, is heterogeneous.
* When we convert li into an array, NumPy converts every element to a string so that the array stays homogeneous.
* When a list mixes types, NumPy picks the most general type present, in this order of precedence:
* String > Complex > Float > Integer > Boolean
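
The second half of the question (removing 'hello' and checking the data type again) can be sketched as follows. The variable names are illustrative, and the exact integer width (int32 or int64) depends on the platform, so only the dtype kind is printed for the integer case.

```python
import numpy as np

# The original mixed list: the string forces every element to become a string
mixed = ['hello', 78, 12.23, True, 8 + 5j]
string_array = np.array(mixed)
print(string_array.dtype.kind)    # 'U' means Unicode string

# Without the string, the complex number is the most general type
complex_array = np.array([78, 12.23, True, 8 + 5j])
print(complex_array.dtype)        # complex128

# Without the complex number, float is the most general type
float_array = np.array([78, 12.23, True])
print(float_array.dtype)          # float64

# With only an integer and a boolean, the result is an integer array
int_array = np.array([78, True])
print(int_array.dtype.kind)       # 'i' means signed integer
```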

#### Creating the numpy array of random numbers 
#### (.random(),.randn(),.randint())

* numpy.random.random(): This function creates an array of the given shape and fills it with random samples from a uniform distribution over [0, 1).


In [9]:
# Creating a 1D array of 5 elements between 0 and 1
random_array = np.random.random(5)
print(random_array)

[0.5884066  0.18939262 0.16072895 0.66139872 0.94449646]


* numpy.random.randn(): This function creates an array of the given shape and fills it with random samples from a standard normal distribution (mean=0, standard deviation=1).

In [10]:
# Creating a 1D array of 5 random numbers from a standard normal distribution 
normal_array = np.random.randn(5)
print(normal_array)

[-1.40686204 -2.50954109  0.8915289   0.85708655  0.12581126]


* numpy.random.randint(): This function returns random integers from low (inclusive) to high (exclusive) with a specified shape.

In [11]:
# creating a 1D array of 5 random integers between 0 and 9
randint_array = np.random.randint(0,10,size=5)
print(randint_array)

[7 4 1 5 2]
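
The outputs above change on every run. When repeatable results are needed, one option is a seeded generator. This is a minimal sketch, assuming NumPy 1.17 or later, where `np.random.default_rng` is available; the seed value 42 is arbitrary, and the generator methods used here are counterparts of the three functions above rather than the same functions.

```python
import numpy as np

# Two generators created with the same (arbitrary) seed
rng = np.random.default_rng(seed=42)
rng_again = np.random.default_rng(seed=42)

uniform_values = rng.random(5)                 # uniform over [0, 1)
normal_values = rng.standard_normal(5)         # mean 0, standard deviation 1
integer_values = rng.integers(0, 10, size=5)   # integers from 0 to 9

print(uniform_values)
print(normal_values)
print(integer_values)

# The same seed reproduces the same sequence of numbers
print(np.array_equal(uniform_values, rng_again.random(5)))   # True
```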


#### Creating array from np.arange(),np.linspace()

* np.arange(): This function creates an array containing evenly spaced values within a given range.

                         syntax : (start,stop,step)

In [12]:
array_arange = np.arange(0,101,10)   # here,
print(array_arange)                  # 0 is the start point and 101 is the stop point (101 is excluded)
                                     # and 10 is the step

[  0  10  20  30  40  50  60  70  80  90 100]


* np.linspace(): This function creates an array of evenly spaced numbers over a specified interval.

                           syntax : (start,stop,num) 

In [13]:
array_linspace = np.linspace(0,101,10,dtype=int)   # here,
print(array_linspace)                              # 0 is the start point and 101 is the stop point (101 is included)
                                                   # and 10 is the number of points to generate
                                                   # dtype=int truncates each value to an integer

[  0  11  22  33  44  56  67  78  89 101]


##### Difference between arange and linspace 

* arange generates evenly spaced values within a range using a specified step size, whereas linspace generates a specified number of evenly spaced values over an interval.
* arange produces an array that starts at the start value, stops before reaching the stop value, and increments by the step value, whereas linspace produces an array of num equally spaced points that includes both the start and stop values.
* The step parameter of arange sets the spacing between values, whereas the num parameter of linspace sets the number of points to generate, not the spacing between them.
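
The difference can be seen side by side in a short sketch. The values 0, 100 and 11 are chosen only so that both functions produce the same points; `retstep=True` asks linspace to also return the spacing it computed.

```python
import numpy as np

# arange: we choose the step, NumPy works out how many points fit
by_step = np.arange(0, 101, 10)

# linspace: we choose the number of points, NumPy works out the step
by_count, step = np.linspace(0, 100, 11, retstep=True)

print(by_step)
print(by_count)
print(step)   # 10.0

# endpoint=False makes linspace exclude the stop value, like arange does
half_open = np.linspace(0, 100, 10, endpoint=False)
print(half_open)
```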

#### Attributes of array 

Attributes are the characteristics of an object that describe it.
A few attributes of the NumPy array are:
* size: It gives you the total count of elements in the array.
* shape: It returns a tuple representing the size of each dimension of the array.
* ndim: It returns an integer representing the number of dimensions in the array.
* dtype: It returns a NumPy data type object representing the type of the elements in the array.

In [14]:
ar.size     # gives the number of elements in an array; here the array 'ar' has 5 elements

5

In [15]:
ar.shape   # returns a tuple with the length of each dimension;
           # (5,) has a single entry, so 'ar' is one-dimensional with 5 elements

(5,)

In [16]:
ar.ndim     #gives the number of dimensions or axes of the array, which means the array 'ar' is 1 dimensional

1

In [17]:
ar.dtype    #describe the datatype of the array, which means the datatype for array 'ar' is integer

dtype('int32')
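
For a 2D array the shape tuple has two entries, rows first and then columns. A minimal sketch with an illustrative array named `grid`; the exact integer width of the dtype depends on the platform, so only its kind is printed.

```python
import numpy as np

# A 3 x 4 array holding the numbers 0 to 11
grid = np.arange(12).reshape(3, 4)

print(grid.size)         # 12 elements in total
print(grid.shape)        # (3, 4): 3 rows and 4 columns
print(grid.ndim)         # 2 dimensions
print(grid.dtype.kind)   # 'i' means signed integer
```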

#### Creating Identity,zero,full and eye matrix 

* An identity matrix is a square matrix with all diagonal elements equal to 1 and all other elements equal to 0.

In [18]:
np.identity(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

* A zero matrix is a matrix where all elements are zero.

In [19]:
np.zeros((4,3),dtype=int)

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

* A full matrix is a matrix where all elements have the same specified value.

In [20]:
np.full((4,4),5)

array([[5, 5, 5, 5],
       [5, 5, 5, 5],
       [5, 5, 5, 5],
       [5, 5, 5, 5]])

* An eye matrix has ones on a diagonal and zeros elsewhere. It does not have to be square.

In [21]:
np.eye(4,3,k=1)  # the k parameter shifts the diagonal of the matrix

array([[0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.],
       [0., 0., 0.]])

* The difference between the identity and eye matrices is that an identity matrix is always square, whereas an eye matrix can have any number of rows and columns.
* np.eye has a k parameter that shifts the diagonal of the matrix, whereas np.identity has no k parameter.
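
A short sketch of both points, using illustrative variable names: for a square shape with the default k the two functions agree, and a negative k moves the diagonal below the main one.

```python
import numpy as np

# For a square matrix and the default k=0, both give the same result
same_result = np.array_equal(np.identity(3), np.eye(3))
print(same_result)   # True

# A 3 x 4 eye matrix with the diagonal shifted one step down
lower_diagonal = np.eye(3, 4, k=-1)
print(lower_diagonal)
```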

#### 2D Array, 3D Array and reshape()

In [22]:
le = [[3,6,9],[4,8,12],[5,10,15]]      # a nested list makes a 2D array
ar2 = np.array(le)
print(ar2)  

[[ 3  6  9]
 [ 4  8 12]
 [ 5 10 15]]


In [23]:
print(ar2.ndim)

2


In [24]:
lee = [[[2,4,5,6],[9,67,89,45],[90,35,67,97]],[[45,67,89,12],[45,78,56,70],[23,45,34,12]]]   # a doubly nested list makes a 3D array
ar3 = np.array(lee)
print(ar3)

[[[ 2  4  5  6]
  [ 9 67 89 45]
  [90 35 67 97]]

 [[45 67 89 12]
  [45 78 56 70]
  [23 45 34 12]]]


In [25]:
print(ar3.ndim)

3


In [26]:
arr1 = np.random.randint(20,50,20)
arr1

array([30, 49, 44, 34, 33, 43, 48, 26, 43, 42, 43, 22, 43, 36, 29, 35, 45,
       49, 21, 39])

In [27]:
arr1.ndim

1

In [28]:
# Converting this 1D array into a 2D array
# We use the .reshape() method for this
# We choose the rows and columns so that their product equals the number of elements in the 1D array
arr2 = arr1.reshape(5,4)
print(arr2)

[[30 49 44 34]
 [33 43 48 26]
 [43 42 43 22]
 [43 36 29 35]
 [45 49 21 39]]


In [29]:
print(arr2.ndim)

2


In [30]:
# Similarly, converting this 1D array into a 3D array
arr3 = arr1.reshape(2,5,2)
print(arr3)

[[[30 49]
  [44 34]
  [33 43]
  [48 26]
  [43 42]]

 [[43 22]
  [43 36]
  [29 35]
  [45 49]
  [21 39]]]


In [31]:
arr3.ndim

3

In [32]:
# To convert a 2D or 3D array into a 1D array, we use .flatten()
arr2.flatten()

array([30, 49, 44, 34, 33, 43, 48, 26, 43, 42, 43, 22, 43, 36, 29, 35, 45,
       49, 21, 39])
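
Two related details, sketched with an illustrative array named `values`: passing -1 to reshape lets NumPy work out that dimension from the number of elements, and flatten() always returns a copy, whereas ravel() returns a view of the same data when it can.

```python
import numpy as np

values = np.arange(20)

# -1 tells NumPy to infer the number of rows: 20 elements / 4 columns = 5 rows
table = values.reshape(-1, 4)
print(table.shape)    # (5, 4)

# flatten() returns a copy, so editing it leaves the table untouched
flat_copy = table.flatten()
flat_copy[0] = 99
print(table[0, 0])    # 0

# ravel() returns a view here, so editing it also changes the table
flat_view = table.ravel()
flat_view[0] = -1
print(table[0, 0])    # -1
```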

### Indexing and Slicing of array 

Index 1D Array:

Each element in the array can be accessed by passing its positional index. Indexing starts at 0 from the left and at -1 from the right.

In [33]:
# generating a random array
arr = np.random.randint(1,100,10)
print(arr)

[24 93 84  6 56 70 95  8 68 10]


In [34]:
# Indexing of this 1D array 
print(arr[0])
print(arr[3])
print(arr[-1])
# Slicing 
print(arr[0:6])
print(arr[0::2])
print(arr[::-1])

24
6
10
[24 93 84  6 56 70]
[24 84 56 95 68]
[10 68  8 95 70 56  6 84 93 24]


Index 2D Array: 

Elements in a 2D array can be accessed by their row and column indices. We can also select a specific row or column by passing the respective index.

In [35]:
arr2 = arr.reshape(5,2)
print(arr2)

[[24 93]
 [84  6]
 [56 70]
 [95  8]
 [68 10]]


In [36]:
# Indexing in 2D array
print(arr2[0])
#Slicing in 2D array: arr2[rows,column]
print(arr2[1:3,0:])
print(arr2[3:,0:])
print(arr2[2:3])

[24 93]
[[84  6]
 [56 70]]
[[95  8]
 [68 10]]
[[56 70]]
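
Selecting a single element and a whole column can be sketched as below. The array `grid` is rebuilt here with the same values as the arr2 printed above so that the example runs on its own.

```python
import numpy as np

# Same values as the 5 x 2 array shown above
grid = np.array([[24, 93], [84, 6], [56, 70], [95, 8], [68, 10]])

single_element = grid[1, 0]   # row 1, column 0
third_row = grid[2]           # the whole row at index 2
second_column = grid[:, 1]    # every row, column 1

print(single_element)   # 84
print(third_row)        # [56 70]
print(second_column)    # [93  6 70  8 10]
```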


Index 3D Array:

In [37]:
arr3 

array([[[30, 49],
        [44, 34],
        [33, 43],
        [48, 26],
        [43, 42]],

       [[43, 22],
        [43, 36],
        [29, 35],
        [45, 49],
        [21, 39]]])

In [38]:
#Indexing
print(arr3[0])        #will give the first matrix 
print(arr3[1])        #will give the second matrix
#Slicing : [matrix, rows, columns]
print(arr3[1,1:3,0:1])
print(arr3[0,1:3,:])
print(arr3[:,:,1:])

[[30 49]
 [44 34]
 [33 43]
 [48 26]
 [43 42]]
[[43 22]
 [43 36]
 [29 35]
 [45 49]
 [21 39]]
[[43]
 [29]]
[[44 34]
 [33 43]]
[[[49]
  [34]
  [43]
  [26]
  [42]]

 [[22]
  [36]
  [35]
  [49]
  [39]]]


### Comparison Operation on Array

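With a list, we cannot apply a comparison operator to every element directly (for example, comparing a list with a number using > raises a TypeError), whereas with an array we can use the comparison operators directly and they are applied element by element.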

In [39]:
arr

array([24, 93, 84,  6, 56, 70, 95,  8, 68, 10])

In [40]:
arr>70      # the output is a boolean array; to filter the elements, we pass it inside square brackets

array([False,  True,  True, False, False, False,  True, False, False,
       False])

In [41]:
arr[arr>70]    #Here, we are filtering the elements of arr which are greater than 70

array([93, 84, 95])

In [42]:
arr[arr%3==0]

array([24, 93, 84,  6])

In [43]:
# Now suppose we want the elements of an array that are both greater than 70 and divisible by 3
#arr[arr>70 and arr%3==0]     # using `and` throws an error here, so the correct way to write this is

arr[(arr>70) & (arr%3==0)]   # wrap each condition in parentheses and combine them with bitwise and (&)

array([93, 84])
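
Conditions can be combined in other ways as well. A minimal sketch, rebuilding the same values as arr under the illustrative name `values`: `|` means "or", `~` negates a condition, and the try block shows the error that the plain `and` keyword produces.

```python
import numpy as np

# Same values as arr above
values = np.array([24, 93, 84, 6, 56, 70, 95, 8, 68, 10])

# Elements greater than 70 OR divisible by 3
either = values[(values > 70) | (values % 3 == 0)]
print(either)      # [24 93 84  6 95]

# Elements that are NOT greater than 70
not_large = values[~(values > 70)]
print(not_large)   # [24  6 56 70  8 68 10]

# The `and` keyword needs a single True/False, which an array cannot provide
try:
    values[values > 70 and values % 3 == 0]
    error_name = None
except ValueError as error:
    error_name = type(error).__name__
print(error_name)  # ValueError
```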

### Arithmetic Operations on Array

In [44]:
# Addition on Array 
ar_1 = np.random.randint(10,50,7)
ar_2 = np.random.randint(10,50,7)

In [45]:
ar_1

array([14, 41, 37, 29, 17, 48, 26])

In [46]:
ar_2

array([34, 19, 34, 14, 34, 49, 13])

In [47]:
ar_1+ar_2

array([48, 60, 71, 43, 51, 97, 39])

In [48]:
# subtraction on array 
ar_1 - ar_2

array([-20,  22,   3,  15, -17,  -1,  13])

In [49]:
# Addition and Subtraction of matrix
mat_1 = np.random.randint(10,50,12).reshape(3,4)
mat_2 = np.random.randint(10,50,12).reshape(3,4)

In [50]:
mat_1

array([[12, 24, 28, 49],
       [40, 39, 12, 20],
       [24, 26, 16, 21]])

In [51]:
mat_2

array([[23, 44, 33, 26],
       [29, 35, 22, 13],
       [17, 37, 10, 35]])

In [52]:
# addition of matrix
mat_1 + mat_2

array([[35, 68, 61, 75],
       [69, 74, 34, 33],
       [41, 63, 26, 56]])

In [53]:
# subtraction of matrix 
mat_1 - mat_2

array([[-11, -20,  -5,  23],
       [ 11,   4, -10,   7],
       [  7, -11,   6, -14]])

In [54]:
# multiplication 
# element wise multiplication
mat_1*mat_2

array([[ 276, 1056,  924, 1274],
       [1160, 1365,  264,  260],
       [ 408,  962,  160,  735]])

In [55]:
# Matrix multiplication 
# Rule for matrix multiplication: the number of columns of the first matrix must equal the number of rows of the second matrix


In [56]:
mat1 = np.random.randint(10,50,12).reshape(4,3)
mat2 = np.random.randint(10,50,12).reshape(3,4)

In [57]:
mat1

array([[32, 40, 18],
       [40, 49, 35],
       [23, 10, 49],
       [33, 44, 29]])

In [58]:
mat2

array([[34, 44, 16, 34],
       [10, 44, 15, 13],
       [18, 14, 34, 14]])

In [59]:
# mat1 ==> shape ==> 4 x 3 
# mat2 ==> shape ==> 3 x 4
# Col1 == row2
# resultant matrix ==> 4 x 4 

mat1.dot(mat2)    # .dot() is an array method used for matrix multiplication

array([[1812, 3420, 1724, 1860],
       [2480, 4406, 2565, 2487],
       [1764, 2138, 2184, 1598],
       [2084, 3794, 2174, 2100]])
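
For 2D arrays the `@` operator gives the same result as .dot(). The sketch below rebuilds mat1 and mat2 with the values printed above so that it runs on its own, and also shows what happens when the shape rule is broken.

```python
import numpy as np

# Same values as mat1 (4 x 3) and mat2 (3 x 4) above
mat1 = np.array([[32, 40, 18], [40, 49, 35], [23, 10, 49], [33, 44, 29]])
mat2 = np.array([[34, 44, 16, 34], [10, 44, 15, 13], [18, 14, 34, 14]])

product = mat1 @ mat2
print(product.shape)                             # (4, 4)
print(np.array_equal(product, mat1.dot(mat2)))   # True

# First entry: 32*34 + 40*10 + 18*18 = 1812
print(product[0, 0])                             # 1812

# (4 x 3) @ (4 x 3) breaks the rule: 3 columns do not match 4 rows
try:
    mat1 @ mat1
    shape_error = None
except ValueError as error:
    shape_error = type(error).__name__
print(shape_error)                               # ValueError
```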

### Arithmetic Functions on Array

In [60]:
arr

array([24, 93, 84,  6, 56, 70, 95,  8, 68, 10])

In [61]:
# min()
# array 
print(min(arr))
print(arr.min())  # a better way to do it: the array's own method

6
6


In [62]:
# Matrix 
mat1.min(axis = 0)
# axis = 0 : moves down the rows, giving one result per column (here, the min of each column)
# axis = 1 : moves across the columns, giving one result per row (here, the min of each row)

array([23, 10, 18])

In [63]:
# max()
# array 
print(arr.max())
# matrix 
print(mat1.max(axis=0))

95
[40 49 49]


In [64]:
# sum()
# array 
print(arr.sum())
# matrix
print(mat1.sum())         # sum of all the elements in the matrix 
print(mat1.sum(axis=0))   # sum down the rows: one total per column
print(mat1.sum(axis=1))   # sum across the columns: one total per row

514
402
[128 143 131]
[ 90 124  82 106]
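
The axis argument is easier to follow on a tiny matrix whose totals can be checked by hand. A minimal sketch with an illustrative 2 x 3 array named `small`; `keepdims=True` is an optional argument that keeps the result two-dimensional.

```python
import numpy as np

small = np.array([[1, 2, 3],
                  [4, 5, 6]])

column_totals = small.sum(axis=0)   # 1+4, 2+5, 3+6
row_totals = small.sum(axis=1)      # 1+2+3, 4+5+6

print(column_totals)   # [5 7 9]
print(row_totals)      # [ 6 15]

# keepdims=True keeps the row totals as a 2 x 1 column
row_totals_2d = small.sum(axis=1, keepdims=True)
print(row_totals_2d.shape)   # (2, 1)
```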


In [65]:
# mean()
# array 
print(arr.mean())
# Matrix 
print(mat1.mean())         # mean of all the elements in matrix 
print(mat1.mean(axis=0))   # mean down the rows: one mean per column
print(mat1.mean(axis=1))   # mean across the columns: one mean per row

51.4
33.5
[32.   35.75 32.75]
[30.         41.33333333 27.33333333 35.33333333]


In [66]:
# square()
# array 
print(np.square(arr))  # square all the elements in an array 
# matrix
print(np.square(mat1)) 

[ 576 8649 7056   36 3136 4900 9025   64 4624  100]
[[1024 1600  324]
 [1600 2401 1225]
 [ 529  100 2401]
 [1089 1936  841]]


In [67]:
# power()
# array 
print(np.power(arr,3))
# matrix
print(np.power(mat1,3))

# cube could also be found in this way 
print(mat1**3)

[ 13824 804357 592704    216 175616 343000 857375    512 314432   1000]
[[ 32768  64000   5832]
 [ 64000 117649  42875]
 [ 12167   1000 117649]
 [ 35937  85184  24389]]
[[ 32768  64000   5832]
 [ 64000 117649  42875]
 [ 12167   1000 117649]
 [ 35937  85184  24389]]


In [68]:
# transpose()
# it swaps the rows and columns of a matrix
print(mat1)
print(np.transpose(mat1))

# another way 
mat1.T

[[32 40 18]
 [40 49 35]
 [23 10 49]
 [33 44 29]]
[[32 40 23 33]
 [40 49 10 44]
 [18 35 49 29]]


array([[32, 40, 23, 33],
       [40, 49, 10, 44],
       [18, 35, 49, 29]])

### Concatenation of 1D Array and 2D Array

In [69]:
arr1

array([30, 49, 44, 34, 33, 43, 48, 26, 43, 42, 43, 22, 43, 36, 29, 35, 45,
       49, 21, 39])

In [70]:
arr

array([24, 93, 84,  6, 56, 70, 95,  8, 68, 10])

In [71]:
np.concatenate([arr1,arr])

array([30, 49, 44, 34, 33, 43, 48, 26, 43, 42, 43, 22, 43, 36, 29, 35, 45,
       49, 21, 39, 24, 93, 84,  6, 56, 70, 95,  8, 68, 10])

In [72]:
mat1

array([[32, 40, 18],
       [40, 49, 35],
       [23, 10, 49],
       [33, 44, 29]])

In [73]:
mat22=mat2.reshape(4,3)

In [74]:
np.concatenate([mat1,mat22],axis=0)   # the matrices must have the same number of columns

array([[32, 40, 18],
       [40, 49, 35],
       [23, 10, 49],
       [33, 44, 29],
       [34, 44, 16],
       [34, 10, 44],
       [15, 13, 18],
       [14, 34, 14]])
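
Matrices can also be joined side by side with axis=1, in which case the number of rows must match instead. A minimal sketch with small illustrative arrays:

```python
import numpy as np

base = np.array([[1, 2],
                 [3, 4]])
extra_row = np.array([[5, 6]])        # 1 x 2: same number of columns as base
extra_column = np.array([[7], [8]])   # 2 x 1: same number of rows as base

# axis=0 stacks the arrays vertically (adds rows)
stacked_rows = np.concatenate([base, extra_row], axis=0)
print(stacked_rows.shape)      # (3, 2)

# axis=1 stacks the arrays horizontally (adds columns)
stacked_columns = np.concatenate([base, extra_column], axis=1)
print(stacked_columns.shape)   # (2, 3)
print(stacked_columns)
```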