# NumPy Arrays 

### What is NumPy?

**NumPy stands for "Numerical Python". It is a Python library for working with arrays, and it also provides functions for domains such as linear algebra, Fourier transforms, and matrices.**

### Creation of NumPy >>>

>It was created by Travis Oliphant in 2005.  

### Why use NumPy?

>Plain Python lists can serve as arrays, but they are slow for large-scale numerical processing. NumPy has its own array object, called "ndarray", which can be up to about 50x faster than lists for numerical work because its elements are stored contiguously in memory and processed by optimized low-level code. 

**Data Science is a branch of computer science in which we study how to store, use, and analyze data to derive information from it. NumPy is used heavily in Data Science.**

Below are the one-character kind codes NumPy uses for its data-types (the value of dtype.kind) >>>

i - integer

b - boolean

u - unsigned integer

f - float

c - complex float

m - timedelta

M - datetime

O - object

S - byte string

U - unicode string

V - fixed chunk of memory for other type ( void )
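
The codes listed above can be checked directly. The following is a minimal sketch, assuming the codes correspond to the `dtype.kind` attribute; the sample arrays and the `samples` dictionary are made up for illustration.

```python
import numpy as np

# Hypothetical sample arrays, one per kind code from the list above.
samples = {
    "integer": np.array([1, 2, 3]),
    "boolean": np.array([True, False]),
    "unsigned integer": np.array([1, 2], dtype="uint8"),
    "float": np.array([1.5, 2.5]),
    "complex float": np.array([1 + 2j]),
    "timedelta": np.array([5], dtype="timedelta64[D]"),
    "datetime": np.array(["2024-01-01"], dtype="datetime64[D]"),
    "object": np.array([{"a": 1}, None], dtype=object),
    "byte string": np.array([b"abc"]),
    "unicode string": np.array(["abc"]),
}

# dtype.kind holds the one-character code of each array's data type.
kinds = {name: arr.dtype.kind for name, arr in samples.items()}
for name, kind in kinds.items():
    print(f"{kind} - {name}")
```

The void kind (V) is left out of the sketch because it mostly appears with structured or raw-memory data types.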

In [1]:
import numpy as np   #importing numpy
print(np.__version__)   #Checking for numpy version

#Creating an array 
array_sample = np.array([1, 2, 3]) 
print(array_sample)         #printing the array
print(array_sample.dtype)   #printing the type of data stored in the array
print(type(array_sample))   #printing the type of the array object

2.3.5
[1 2 3]
int64
<class 'numpy.ndarray'>


## N-Dimensional Arrays >>>

**Arrays can have any number of dimensions, from 0 to n. These multidimensional arrays let us carry out scientific computations and data analysis.**

1. 0 Dimensional array >>> It holds a single value (a scalar) and has no rows or columns.

In [2]:
import numpy as np

#Creating a zero dimensional array
zero_arr = np.array(10)   
print(zero_arr)

10


2. 1 Dimensional array >>> It is a single sequence of elements along one axis (a vector).

In [3]:
import numpy as np

#Creating a one dimensional array
one_arr = np.array([10, 5, 2, 3, 70])    
print(one_arr)

[10  5  2  3 70]


3. 2 Dimensional array >>> It has multiple rows and columns. It is also called a 'Matrix'.

In [4]:
import numpy as np

#creating a two dimensional array.
two_arr = np.array([[1, 3, 5, 7], [2, 4, 6, 8], [9, 11, 13, 15], [10, 12, 14, 16]]) 
print(two_arr)

[[ 1  3  5  7]
 [ 2  4  6  8]
 [ 9 11 13 15]
 [10 12 14 16]]


4. 3 Dimensional array >>> It has 2-D arrays (matrices) nested inside it.

In [5]:
import numpy as np
three_arr = np.array([[[1, 1, 0, 1], [2, 3, 2, 2]], [[1, 4, 5, 1], [6, 2, 3, 6]]])
print(three_arr)

[[[1 1 0 1]
  [2 3 2 2]]

 [[1 4 5 1]
  [6 2 3 6]]]


Let us take a look at the array attributes >>>

1) .shape >> It gives the length of each dimension as a tuple, e.g. (rows, cols) for a 2-D array.
2) .ndim >> It gives the number of dimensions of the array.
3) .size >> It gives the total number of elements stored in the array.
4) .dtype >> It gives the data type of the elements in the array.

In [6]:
import numpy as np
arr1 = np.array([[1, 2, 3, 4], [10, 20, 30, 40]])
arr2 = np.array(["house", "cat", "jar", "monkey"])
print(f"Array 1: \n{arr1}\nArray 2: {arr2}")
print(f"Shape of Array 1: {arr1.shape}\nShape of Array 2: {arr2.shape}")
print(f"Size of Array 1: {arr1.size}\nSize of Array 2: {arr2.size}")
print(f"No. of dimensions of Array 1: {arr1.ndim}\nNo. of dimensions of Array 2: {arr2.ndim}")
print(f"Datatype of Array 1: {arr1.dtype}\nDatatype of Array 2: {arr2.dtype}")

Array 1: 
[[ 1  2  3  4]
 [10 20 30 40]]
Array 2: ['house' 'cat' 'jar' 'monkey']
Shape of Array 1: (2, 4)
Shape of Array 2: (4,)
Size of Array 1: 8
Size of Array 2: 4
No. of dimensions of Array 1: 2
No. of dimensions of Array 2: 1
Datatype of Array 1: int64
Datatype of Array 2: <U6
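
To tie the attributes back to the N-dimensional arrays introduced earlier, here is a small sketch that prints `.ndim`, `.shape`, and `.size` for arrays of 0 to 3 dimensions. The sample values and the `arrays` list are illustrative only.

```python
import numpy as np

# One array per dimensionality, from 0-D (a scalar) up to 3-D.
arrays = [
    np.array(10),
    np.array([10, 5, 2]),
    np.array([[1, 3], [2, 4]]),
    np.array([[[1, 1], [2, 3]], [[1, 4], [6, 2]]]),
]

for arr in arrays:
    # A 0-D array has an empty shape tuple but still holds one element.
    print(f"ndim={arr.ndim}, shape={arr.shape}, size={arr.size}")
```

The length of the shape tuple always equals `.ndim`, and the product of its entries equals `.size`.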


### Data-types operations on array >>>

1. array_name = np.array([...], dtype = 'data-type') >> It sets the data-type of the array when it is created.
2. new_array_name = array_name.astype('...') >> It returns a converted copy of the array with the given data-type; the original array stays unchanged.

In [7]:
import numpy as np 
bool_arr = np.array([1, 0, 0, 1, -1, 100, 6, 0.4], dtype = 'bool')   #named 'bool_arr' so the built-in 'list' is not shadowed

print(bool_arr)    #returns 'True' for every non-zero value and 'False' for 0.
print(bool_arr.dtype)

[ True False False  True  True  True  True  True]
bool


In [8]:
import numpy as np
mat = np.array([[1, 2, 3, 4], [11, 12, 13, 14]])  
print(mat)
print(mat.dtype)   #returns an integer data-type as the array contains integer elements.

new_mat = mat.astype('S')   #astype() returns a copy converted to the given type ('S' = byte string).
print(new_mat)
print(new_mat.dtype)    #returns a byte-string data-type as it was converted from int.

[[ 1  2  3  4]
 [11 12 13 14]]
int64
[[b'1' b'2' b'3' b'4']
 [b'11' b'12' b'13' b'14']]
|S21
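
As a further illustration of `astype()`, the sketch below converts floats to integers and checks that the result is an independent copy. The `prices` array is a made-up example, and `np.shares_memory` is used only to confirm that no data is shared.

```python
import numpy as np

prices = np.array([1.9, 2.5, -3.7])

# Converting float to int drops the fractional part (truncation toward zero).
whole = prices.astype("int64")

print(whole)
print(prices)                            # the original array is unchanged
print(np.shares_memory(prices, whole))   # False: astype() made a copy
```

Note that the conversion truncates rather than rounds, so `np.round()` should be applied first when rounding is wanted.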


### Indexing >> 

**It allows us to access a specific element of an array. As NumPy arrays are zero-indexed, the first element is at index 0; we can also access a whole row or a column.**

You can access any data in an array through indexing, using one index per dimension.
array_name[i][j][k]... or, equivalently, array_name[i, j, k, ...]

The code below demonstrates indexing >>

In [9]:
import numpy as np
arr1 = np.array([[1, 2, 3, 4], [10, 20, 30, 40]])
print(arr1)
print(arr1[0])       #returns the first row elements
print(arr1[0, 0])    #returns the first element of the first row
print(arr1[0][2])    #returns '3' located at 1st row(0th index) and 3rd column(2nd index)

[[ 1  2  3  4]
 [10 20 30 40]]
[1 2 3 4]
1
3
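
The same idea extends to more dimensions. The sketch below indexes a 3-D array (the same values as `three_arr` earlier, here named `cube` for illustration) with chained brackets and with a single comma-separated index, which select the same element.

```python
import numpy as np

cube = np.array([[[1, 1, 0, 1], [2, 3, 2, 2]],
                 [[1, 4, 5, 1], [6, 2, 3, 6]]])

chained = cube[1][0][2]     # second matrix, first row, third column
combined = cube[1, 0, 2]    # same element with one index per dimension
last = cube[-1, -1, -1]     # negative indices count from the end

print(chained, combined, last)
```

The comma-separated form is usually preferred, since chained brackets create an intermediate array at each step.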


*Indexing also helps us with arithmetic operations on the elements of an array or on whole arrays >>*

In [10]:
#Operation on elements of the array.
import numpy as np
array_ = np.array([20, 40, 60, 80])
total = 0    #named 'total' so the built-in sum() is not shadowed
print(array_)
print("Summation of all elements in the array: ")
for i in array_:
    total += i
print(total)

[20 40 60 80]
Summation of all elements in the array: 
200


In [11]:
#Operation on the arrays themselves.
import numpy as np
array1 = np.array([[20, 40, 60], [2, 4, 6], [1, 2, 3]])
array2 = np.array([[10, 30, 50], [1, 3, 5], [4, 5, 6]])
array3 = np.array([[0, 0, 0], [0, 0, 0], [0, 0, 0]])
row_1 = len(array1)
row_2 = len(array2)
cols_1 = len(array1[0])
cols_2 = len(array2[0])
print(f"First matrix:\n{array1}\n\nSecond matrix:\n{array2}\n")
print(f"Addition:\n{array1 + array2}\n")
print(f"Subtraction:\n{array1 - array2}\n")

if cols_1 == row_2:    #columns of the first matrix must equal rows of the second
    for i in range(len(array1)):
        for j in range(len(array2[0])):
            for k in range(len(array2)):
                array3[i][j] += array1[i][k] * array2[k][j]
    print(f"Multiplication:\n{array3}\n")
else:
    print("Matrix Multiplication is not possible!")

First matrix:
[[20 40 60]
 [ 2  4  6]
 [ 1  2  3]]

Second matrix:
[[10 30 50]
 [ 1  3  5]
 [ 4  5  6]]

Addition:
[[ 30  70 110]
 [  3   7  11]
 [  5   7   9]]

Subtraction:
[[10 10 10]
 [ 1  1  1]
 [-3 -3 -3]]

Multiplication:
[[ 480 1020 1560]
 [  48  102  156]
 [  24   51   78]]
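
The triple loop above spells out the definition of matrix multiplication. As a shorter alternative, the sketch below uses the `@` operator (equivalent to `np.matmul` for 2-D arrays) on the same two matrices; the name `product` is illustrative.

```python
import numpy as np

array1 = np.array([[20, 40, 60], [2, 4, 6], [1, 2, 3]])
array2 = np.array([[10, 30, 50], [1, 3, 5], [4, 5, 6]])

# '@' performs matrix multiplication; '*' would multiply element-wise instead.
product = array1 @ array2
print(product)

# The operator and the function form give the same result.
print(np.array_equal(product, np.matmul(array1, array2)))
```

If the inner dimensions do not match, `@` raises a `ValueError`, so the manual shape check is not needed.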



### Slicing >> It selects a specified part of the data stored in a sequence such as a list or an array. 

#### slice over print >>

Slicing is preferred in NumPy when only part of the data is needed: a single start:stop:step expression selects a specified part, or the whole array, which can then be printed without writing a loop.

#### Slicing on 1-D array >>>

In [12]:
import numpy as np
arr1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

#slicing whole array >>>
print(arr1[:])

#slicing specific part >>>
print(arr1[2: 6])    #returns the elements at index 2 to 5, i.e. 3 to 6
print(arr1[0::2])    #returns every second element from index 0, here the odd numbers

#Negative Indexing >>> 
print(arr1[-2])      #returns second last element
print(arr1[-1: :-1]) #returns the array in reverse manner

[ 1  2  3  4  5  6  7  8  9 10]
[3 4 5 6]
[1 3 5 7 9]
9
[10  9  8  7  6  5  4  3  2  1]


#### slicing a 2-D array >>>

In [13]:
import numpy as np
arr2 = np.array([[3, 6, 9, 12, 15, 18, 21], [4, 8, 12, 16, 20, 24, 28]])

print(arr2[1, 1:5])    #returns the elements of second row from 2nd element to 5th.
print(arr2[0: 2, 2])    #returns the third element of each row, i.e. the third column.

#reverse slicing >>>
print(arr2[:, -1])    #returns the last element of each row, i.e. the last column.
print(arr2[-1:: -1, -1:: -1])   #returns each row reversed, with the 1st and 2nd rows interchanged.
print(arr2[-1:: -1])   #returns the rows as they are, but with the 1st and 2nd rows interchanged.

[ 8 12 16 20]
[ 9 12]
[21 28]
[[28 24 20 16 12  8  4]
 [21 18 15 12  9  6  3]]
[[ 4  8 12 16 20 24 28]
 [ 3  6  9 12 15 18 21]]


### Slice vs Copy vs View >>

1. Slice creates a view (a reference to the original data) and not a copy. It does not use extra memory for the data. Modifying the sliced array directly modifies the original array.
2. Copy creates an independent copy of the original array. It allocates new memory and duplicates the data, which makes it slower than creating a slice. Modifying the copy does not affect the original array, and vice versa. 
3. View creates a new array object that shares the data of the original array instead of copying it. Modifying the view or the original array affects both simultaneously. 

In [14]:
#copy() in numpy..

import numpy as np
array_first = np.array([12, 24, 36, 48, 60])

#creating copy of the original array.
array_second = array_first.copy()

print("Before changing any array: ")
print(array_first)
print(array_second)

#Changing the copy array element
array_second[0] = 13

#The printed arrays show that copy() made an independent copy of the original array, so changes made
#in either of the arrays do not affect the other array..

print("After changing the copy of the original array: ")
print(array_first)
print(array_second)

#now changing the original array.
array_first[0] = 11

print("After changing the original array itself: ")
print(array_first)
print(array_second)

Before changing any array: 
[12 24 36 48 60]
[12 24 36 48 60]
After changing the copy of the original array: 
[12 24 36 48 60]
[13 24 36 48 60]
After changing the original array itself: 
[11 24 36 48 60]
[13 24 36 48 60]


In [15]:
#view() in numpy..

import numpy as np
array_first = np.array([12, 24, 36, 48, 60])

#creating view of the original array.
array_second = array_first.view()

print("Before changing any array: ")
print(array_first)
print(array_second)

#Changing the view array element
array_second[0] = 13

#The printed arrays show that view() shares the data of the original array, so changes made
#in either of the arrays affect the other array too..

print("After changing the view of the original array: ")
print(array_first)
print(array_second)

#now changing the original array.
array_first[0] = 11

print("After changing the original array itself: ")
print(array_first)
print(array_second)

Before changing any array: 
[12 24 36 48 60]
[12 24 36 48 60]
After changing the view of the original array: 
[13 24 36 48 60]
[13 24 36 48 60]
After changing the original array itself: 
[11 24 36 48 60]
[11 24 36 48 60]
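
The first point of the list above, that a slice is a view, can be checked in the same way. This is a minimal sketch; the names `original`, `part`, and `safe` are illustrative.

```python
import numpy as np

original = np.array([12, 24, 36, 48, 60])

# A basic slice is a view: it shares memory with the original array.
part = original[1:4]
part[0] = 99
print(original)                           # the change shows up here too
print(np.shares_memory(original, part))   # True

# Calling .copy() on the slice makes it independent of the original.
safe = original[1:4].copy()
safe[0] = 0
print(original)                           # unchanged by the edit to 'safe'
```

Taking a `.copy()` of a slice is a common way to edit part of an array without touching the source data.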


### .base attribute tells the user whether an array owns its data or borrows it from another array >>

In [16]:
import numpy as np
og_array = np.array([1, 2, 3, 4, 5])
copy_array = og_array.copy()
view_array = og_array.view()

print(f"Original array = {og_array}")
print(f"Copy of the original array = {copy_array}")
print(f"View of the original array = {view_array}")

print(f"Base of Copy array = {copy_array.base}")   #returns 'None' as the copy owns its own data.
print(f"Base of View array = {view_array.base}")   #returns the original array as the view borrows its data.

Original array = [1 2 3 4 5]
Copy of the original array = [1 2 3 4 5]
View of the original array = [1 2 3 4 5]
Base of Copy array = None
Base of View array = [1 2 3 4 5]


### Reshaping an array >>> 

We have seen how .shape returns the length of each dimension, e.g. (rows, columns), of the array.
Now we will see how reshaping an array works. 

1. Reshaping a flat (1-D) array into more dimensions >> 

In [17]:
import numpy as np
s_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(s_array) 
print(s_array.reshape(3, 3))   #the product of the dimensions in .reshape(..., ..., ...) must equal the number of elements in the array, otherwise it raises a ValueError...

#9 elements can be arranged into 3 rows and 3 columns forming a 3x3 matrix.
#One can use -1 as an unknown dimension at most one time in the arguments.. 
print(s_array.reshape(3, -1))   #NumPy automatically infers the remaining dimension.
print("Checking if the array is a copy or the view: ")
print(s_array.reshape(3, 3).base)
print("Thus, it borrows the data of the original array, making it a view.")

[1 2 3 4 5 6 7 8 9]
[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Checking if the array is a copy or the view: 
[1 2 3 4 5 6 7 8 9]
Thus, it borrows the data of the original array, making it a view.
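
The error mentioned in the comments above can be reproduced safely. The sketch below tries a shape that does not fit 9 elements and catches the resulting `ValueError`; the names `message` and `inferred` are illustrative.

```python
import numpy as np

s_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

message = None
try:
    # 2 * 5 = 10 does not equal the 9 elements available, so this fails.
    s_array.reshape(2, 5)
except ValueError as err:
    message = str(err)
    print(f"Reshape failed: {message}")

# With -1, NumPy works out the missing dimension (9 / 3 = 3 rows).
inferred = s_array.reshape(-1, 3)
print(inferred.shape)
```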


2. Flattening a multidimensional array >>

In [18]:
import numpy as np
p_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(p_array) 
print(p_array.reshape(-1)) 
print("Checking if the array is a copy or the view: ")
print(p_array.reshape(-1).base)   #.base is the original 2-D array whose data is borrowed
print("Thus, it borrows the data of the original array, making it a view.")

[[1 2 3]
 [4 5 6]
 [7 8 9]]
[1 2 3 4 5 6 7 8 9]
Checking if the array is a copy or the view: 
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Thus, it borrows the data of the original array, making it a view.


## Vectorized Operations >>>

### Element-wise Operations >>
In this kind of operation, one does not need a loop to operate on all the elements of an array. One can simply apply the operation to the array itself, and NumPy applies it to every element. Thus, rather than looping over a list or another iterable, one can use NumPy to operate on all the elements at once.

In [19]:
import numpy as np
vec_array = np.array([10, 20, 30, 40, 50])

print(vec_array)

#Operating Element-wise.
add = vec_array + 10
print(f"Adding 10 to array: {add}")

sub = vec_array - 5
print(f"Subtracting 5 from array: {sub}")

mul = vec_array * 3
print(f"Multiplying 3 to the array: {mul}")

div = np.round(vec_array / 7, 2)   #np.round() rounds each result to two decimal places.
print(f"Dividing array by 7: {div}")

floor = vec_array // 8
print(f"Floor division of the array by 8: {floor}")

modulo = vec_array % 9
print(f"Remainder of array by division by 9: {modulo}")

power = vec_array ** 2
print(f"Exponentiation of array by power 2: {power}")

[10 20 30 40 50]
Adding 10 to array: [20 30 40 50 60]
Subtracting 5 from array: [ 5 15 25 35 45]
Multiplying 3 to the array: [ 30  60  90 120 150]
Dividing array by 7: [1.43 2.86 4.29 5.71 7.14]
Floor division of the array by 8: [1 2 3 5 6]
Remainder of array by division by 9: [1 2 3 4 5]
Exponentiation of array by power 2: [ 100  400  900 1600 2500]


### Array-to-array Operations >>
It also lets us operate on two or more arrays without writing a loop. NumPy pairs up the elements of the arrays index by index, so the result is well defined as long as the shapes are compatible.
The requirement here is that both arrays have the same shape (or shapes that NumPy can broadcast to a common shape). This is faster than using Python loops because NumPy runs the operation in optimized low-level code, which can use the CPU's vector instructions, instead of interpreting each iteration.

In [20]:
import numpy as np
first_array = np.array([100, 200, 300, 400, 500])
second_array = np.array([7, 16, 27, 39, 55])

#Operations on two arrays: 

print(f"Addition of both arrays : {first_array + second_array}")

print(f"Subtraction of first by second : {first_array - second_array}")

print(f"Multiplication of both arrays : {first_array * second_array}")

print(f"Division of first array by second : {np.round((first_array / second_array), 2)}")

print(f"Floor division of first array by second : {first_array // second_array}")

print(f"Modulo of both arrays : {first_array % second_array}")

Addition of both arrays : [107 216 327 439 555]
Subtraction of first by second : [ 93 184 273 361 445]
Multiplication of both arrays : [  700  3200  8100 15600 27500]
Division of first array by second : [14.29 12.5  11.11 10.26  9.09]
Floor division of first array by second : [14 12 11 10  9]
Modulo of both arrays : [ 2  8  3 10  5]
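
To illustrate the shape requirement described above, the sketch below first combines two arrays of different lengths, which fails, and then a 2-D array with a 1-D row of matching length, which NumPy broadcasts across every row. The names `short_array`, `matrix`, and `row` are made up for this example.

```python
import numpy as np

first_array = np.array([100, 200, 300, 400, 500])
short_array = np.array([7, 16, 27])

error_message = None
try:
    # Shapes (5,) and (3,) cannot be paired element by element.
    first_array + short_array
except ValueError as err:
    error_message = str(err)
    print(f"Incompatible shapes: {error_message}")

# Shapes (2, 3) and (3,) are compatible: the row is applied to each matrix row.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([10, 20, 30])
broadcast_sum = matrix + row
print(broadcast_sum)
```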


### Comparison Operations >>
It is the comparison between two arrays, or between an array and a constant. The result is a boolean array holding one True/False value per element. Compared to looping over a list in Python, it is significantly faster and more concise.

In [21]:
import numpy as np
vec1_array = np.array([10, 20, 30, 40, 50])
vec2_array = np.array([10, 22, 30, 40, 55])

#Element-wise as well as array-to-array...

print(vec1_array > 10)
print(vec2_array < 20)

print(vec1_array >= vec2_array)
print(vec1_array <= vec2_array)

print(vec1_array == vec2_array)
print(vec1_array != vec2_array)

[False  True  True  True  True]
[ True False False False False]
[ True False  True  True False]
[ True  True  True  True  True]
[ True False  True  True False]
[False  True False False  True]
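
The boolean arrays printed above can be put to use directly. The following sketch, using the same two arrays, counts the matching positions, summarises a comparison with `np.any` and `np.all`, and selects elements with the boolean result; the variable names are illustrative.

```python
import numpy as np

vec1_array = np.array([10, 20, 30, 40, 50])
vec2_array = np.array([10, 22, 30, 40, 55])

mask = vec1_array == vec2_array            # one True/False per position

matches = np.count_nonzero(mask)           # how many positions are equal
any_bigger = np.any(vec1_array > vec2_array)
all_smaller_or_equal = np.all(vec1_array <= vec2_array)
selected = vec1_array[mask]                # keep only the matching elements

print(matches, any_bigger, all_smaller_or_equal, selected)
```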


### Basic Arithmetic Operation functions by numpy library >>

**(operation takes place index/element-wise)**

1. np.add() >> adds two arrays.
2. np.subtract() >> subtracts the second array from the first.
3. np.multiply() >> returns the product of two arrays.
4. np.divide() >> divides the first array by the second.
5. np.mod() or np.remainder() >> returns the remainder after the division.
6. np.divmod() >> returns the quotient and remainder separately after the division.
7. np.power() >> raises each element of the first array to the corresponding power in the second array, keeping the integer type for integer inputs.
8. np.float_power() >> raises the first array to the corresponding powers and also promotes the result to at least float64 precision.
9. np.absolute() >> returns the absolute value of each element, whether it is negative or positive.

In [22]:
#Operations on 1D arrays.
import numpy as np
one_arr = np.array([10, -5, 2, -3, 70]) 
two_arr = np.array([12, 9, 6, 5, 1])

print(one_arr)
print(two_arr)

#Arithmetic operations of 1 D array using functions. 
print("Operations: ")
print(np.add(one_arr, two_arr))

print(np.subtract(one_arr, two_arr))

print(np.multiply(one_arr, two_arr))

print(np.divide(one_arr, two_arr))

print(np.mod(one_arr, two_arr))    #or np.remainder() gives same result.

print(np.divmod(one_arr, two_arr))

print(np.power(one_arr, two_arr))

print(np.float_power(one_arr, two_arr))

print(np.absolute(one_arr))

[10 -5  2 -3 70]
[12  9  6  5  1]
Operations: 
[22  4  8  2 71]
[ -2 -14  -4  -8  69]
[120 -45  12 -15  70]
[ 0.83333333 -0.55555556  0.33333333 -0.6        70.        ]
[10  4  2  2  0]
(array([ 0, -1,  0, -1, 70]), array([10,  4,  2,  2,  0]))
[1000000000000      -1953125            64          -243            70]
[ 1.000000e+12 -1.953125e+06  6.400000e+01 -2.430000e+02  7.000000e+01]
[10  5  2  3 70]
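
These functions mirror the arithmetic operators used earlier. The sketch below, on the same two arrays, checks that the function and operator forms agree and that the two results of `np.divmod` recombine into the original values; `quotient`, `remainder`, and `rebuilt` are illustrative names.

```python
import numpy as np

one_arr = np.array([10, -5, 2, -3, 70])
two_arr = np.array([12, 9, 6, 5, 1])

# Each function gives the same result as its operator.
print(np.array_equal(np.add(one_arr, two_arr), one_arr + two_arr))
print(np.array_equal(np.subtract(one_arr, two_arr), one_arr - two_arr))
print(np.array_equal(np.multiply(one_arr, two_arr), one_arr * two_arr))
print(np.array_equal(np.mod(one_arr, two_arr), one_arr % two_arr))

# divmod returns floor quotient and remainder: quotient * divisor + remainder
# gives back the dividend, including for the negative values.
quotient, remainder = np.divmod(one_arr, two_arr)
rebuilt = quotient * two_arr + remainder
print(rebuilt)
```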


#### Still there are a lot more functions that cannot be covered in a single code cell. They cover logarithmic, exponential, matrix, LCM, GCD, trigonometric, hyperbolic, and set operations, as well as summations, products, differences, and many more... These functions are useful across many fields of work...

### Aggregation Functions >>>

1. np.sum() - It adds all the elements in the array.
2. np.cumsum() - It gives the array with cumulative addition of elements.
3. np.diff() - It gives the differences between consecutive elements of the array.
4. np.prod() - It returns product of all the elements.
5. np.cumprod() - It gives the array with cumulative product of elements.
6. np.mean() - Returns the mean of all the elements in the array as a float value.
7. np.median() - Returns the median (middle value) of the array as a float value.
8. np.max() - Returns the maximum valued element.
9. np.min() - Returns the minimum valued element.

#### Functions on 1D Array >>

In [23]:
import numpy as np
og_array = np.array([1, 2, 3, 4, 5, 6])

print(np.sum(og_array))
print(np.cumsum(og_array))
print(np.diff(og_array))
print(np.prod(og_array))
print(np.cumprod(og_array))
print(np.mean(og_array))
print(np.median(og_array))
print(np.max(og_array))
print(np.min(og_array))

21
[ 1  3  6 10 15 21]
[1 1 1 1 1]
720
[  1   2   6  24 120 720]
3.5
3.5
6
1


#### Functions on 2D Array >>

*Remember: for a 2D array, axis = 0 operates down the rows, giving one result per column (column-wise), and axis = 1 operates across the columns, giving one result per row (row-wise). axis = 2 exists only if the array has at least 3 dimensions. In general, for an n-dimensional array, axis = 0 refers to the outermost dimension, while axis = n - 1 refers to the innermost arrays.*

In [24]:
import numpy as np
og_array1 = np.array([[1, 2, 3], [4, 5, 6]])

print(np.sum(og_array1))
print(np.sum(og_array1, axis = 0))
print(np.sum(og_array1, axis = 1))
print("\n")
print(np.cumsum(og_array1))
print(np.cumsum(og_array1, axis = 0))
print(np.cumsum(og_array1, axis = 1))
print("\n")
print(np.diff(og_array1))
print(np.diff(og_array1, axis = 0))
print(np.diff(og_array1, axis = 1))
print("\n")
print(np.prod(og_array1))
print(np.prod(og_array1, axis = 0))
print(np.prod(og_array1, axis = 1))
print("\n")
print(np.cumprod(og_array1))
print(np.cumprod(og_array1, axis = 0))
print(np.cumprod(og_array1, axis = 1))
print("\n")
print(np.mean(og_array1))
print(np.mean(og_array1, axis = 0))
print(np.mean(og_array1, axis = 1))
print("\n")
print(np.median(og_array1))
print(np.median(og_array1, axis = 0))
print(np.median(og_array1, axis = 1))
print("\n")
print(np.max(og_array1))
print(np.max(og_array1, axis = 0))
print(np.max(og_array1, axis = 1))
print("\n")
print(np.min(og_array1))
print(np.min(og_array1, axis = 0))
print(np.min(og_array1, axis = 1))

21
[5 7 9]
[ 6 15]


[ 1  3  6 10 15 21]
[[1 2 3]
 [5 7 9]]
[[ 1  3  6]
 [ 4  9 15]]


[[1 1]
 [1 1]]
[[3 3 3]]
[[1 1]
 [1 1]]


720
[ 4 10 18]
[  6 120]


[  1   2   6  24 120 720]
[[ 1  2  3]
 [ 4 10 18]]
[[  1   2   6]
 [  4  20 120]]


3.5
[2.5 3.5 4.5]
[2. 5.]


3.5
[2.5 3.5 4.5]
[2. 5.]


6
[4 5 6]
[3 6]


1
[1 2 3]
[1 4]
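
The axis rule also covers 3D arrays, which the cell above does not exercise. The sketch below sums a 3D array along each axis and shows that the chosen axis disappears from the shape; the array `cube` is built with `np.arange` purely for illustration.

```python
import numpy as np

# 24 consecutive numbers arranged as 2 matrices of 3 rows and 4 columns.
cube = np.arange(24).reshape(2, 3, 4)

sum_axis0 = np.sum(cube, axis=0)   # adds the two matrices together
sum_axis1 = np.sum(cube, axis=1)   # adds the rows inside each matrix
sum_axis2 = np.sum(cube, axis=2)   # adds the elements of each innermost row

print(sum_axis0.shape, sum_axis1.shape, sum_axis2.shape)
print(sum_axis2)
```

Here axis = 2 is the same as axis = -1, the innermost dimension.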


## Mini Task 

### 1D Array

In [25]:
import numpy as np

#Creating an array named 'data'...
data = np.array([10, 20, 30, 40, 50])
print(data)
#Creating a var to store mean value of the array 'data'...
mean_value = np.mean(data)

#Creating a new array (not a view) that stores the mean-subtracted values...
subtracted_array = data - mean_value
print(subtracted_array)

#here the mean subtracted array contains values ranging from negative to positive...
#this shows why mean subtraction is common in data preprocessing...
#It centers the data around 'zero' so that its mean becomes 0, called 'mean centering'...

[10 20 30 40 50]
[-20. -10.   0.  10.  20.]
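
A quick check of the result above: the sketch below confirms that the centered values have a mean of zero and that the subtraction produced a separate float array, leaving `data` untouched. The name `centered` is illustrative.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])
centered = data - np.mean(data)

print(centered.mean())                      # mean of the centered data
print(centered.dtype)                       # float, because the mean is a float
print(np.shares_memory(data, centered))     # False: arithmetic creates a new array
print(data)                                 # original values are unchanged
```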


### 2D Array

In [26]:
import numpy as np

matrix = np.array([[10, 20, 30], 
                   [40, 50, 60]])
print(matrix)

row_wise_sum = np.sum(matrix, axis = 1)   #axis = 1 returns the sum of each row in the array...
print(row_wise_sum)

column_wise_sum = np.sum(matrix, axis = 0)    #axis = 0 returns the sum of each column in the array...
print(column_wise_sum)

[[10 20 30]
 [40 50 60]]
[ 60 150]
[50 70 90]


#### *What's the biggest difference between Python lists and NumPy arrays?*

##### Difference : 
**Performance :** 
Plain Python needs an explicit loop to operate on every element of a list, whereas NumPy applies an operation to the whole array in a single line of code. Although both approaches have the same time complexity O(n), NumPy is significantly faster in practice and more memory-efficient due to contiguous storage and low-level optimizations.


**Vectorization :**
Python needs loops and assignment operators (+=, -=, *=, etc.) to operate on a list/matrix either by a constant or by another list/matrix, whereas with NumPy arrays you can operate on another array or a constant directly using arithmetic operators (+, -, *, etc.) or the built-in NumPy functions, enabling concise, readable, and high-performance numerical computation.
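
The difference can be seen side by side. The sketch below doubles 100,000 numbers once with a Python loop over a list and once with a vectorized NumPy expression, then times both with the standard `timeit` module. The function names and the array size are arbitrary choices, and the timings vary by machine, so no expected numbers are shown.

```python
import timeit
import numpy as np

values = list(range(100_000))
array = np.array(values)

def double_with_loop():
    # Plain Python: visit every element explicitly.
    return [v * 2 for v in values]

def double_vectorized():
    # NumPy: one expression applied to the whole array.
    return array * 2

# Both approaches produce the same numbers.
same_result = double_with_loop() == double_vectorized().tolist()
print(f"Same result: {same_result}")

loop_time = timeit.timeit(double_with_loop, number=20)
vector_time = timeit.timeit(double_vectorized, number=20)
print(f"loop: {loop_time:.4f}s, vectorized: {vector_time:.4f}s")
```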