# NumPy

NumPy is a fundamental library for scientific computing in Python. It provides support for arrays and matrices, along with a collection of mathematical functions to operate on these data structures. In this lesson, we will cover the basics of NumPy, focusing on arrays and vectorized operations.

In [3]:
import numpy as np

# Create 1D Array 

In [7]:
arr = np.array([1,2,3,4,5])
print(arr)
print(type(arr))
print(arr.shape)

[1 2 3 4 5]
<class 'numpy.ndarray'>
(5,)


# Reshaping an Array .reshape()
Reshaping means converting a 1D array to 2D, 2D to 3D, etc., or changing the shape of a multi-dimensional array to another shape with the same number of elements.

In [11]:
arr2 = np.array([1,2,3,4,5])
arr2.reshape(1,5) # 1 Row 5 columns 

array([[1, 2, 3, 4, 5]])
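As a minimal sketch of the rule that the element count must stay the same, the example below reshapes a six-element array (an illustrative array, not one from the cells above). It uses `-1` to let NumPy infer one dimension and shows that an incompatible shape raises a `ValueError`.

```python
import numpy as np

arr = np.arange(1, 7)          # [1 2 3 4 5 6]

# -1 asks NumPy to infer that dimension from the total element count.
two_rows = arr.reshape(2, -1)  # 2 rows, 3 columns
column = arr.reshape(-1, 1)    # 6 rows, 1 column
print(two_rows.shape, column.shape)

# reshape returns a new array object; the original keeps its shape.
print(arr.shape)

# 4 x 2 = 8 elements does not match the 6 available, so this fails.
try:
    arr.reshape(4, 2)
    reshape_failed = False
except ValueError:
    reshape_failed = True
print("Incompatible reshape failed:", reshape_failed)
```

Note that `arr2.reshape(1,5)` in the cell above does not change `arr2` itself; assign the result to a name if you want to keep it.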

In [18]:
# 2D Array 
# Using nested lists 
arr3 = np.array([[1,2,3,4,5],[2,3,4,5,6]])
print(arr3.shape)
print(arr3)

(2, 5)
[[1 2 3 4 5]
 [2 3 4 5 6]]


# Using .arange()
np.arange() generates evenly spaced numeric sequences as NumPy arrays. It is similar to Python's range(), but it returns an ndarray and accepts non-integer steps. It is useful in simulations, indexing, plotting, and initializing arrays for further processing.

In [43]:
np.arange(0,10,1).reshape(10,1)

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])
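The sketch below illustrates the `start`, `stop`, and `step` arguments of `np.arange()`. The variable names are illustrative only. As with `range()`, the stop value is excluded.

```python
import numpy as np

evens = np.arange(0, 10, 2)      # stop value 10 is excluded
countdown = np.arange(5, 0, -1)  # a negative step counts down
halves = np.arange(0, 2, 0.5)    # non-integer step, which range() does not allow

print(evens)
print(countdown)
print(halves)
```

For non-integer steps, floating-point rounding can make the number of elements hard to predict, so `np.linspace()` is often the safer choice when an exact count matters.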

# Using .ones() 
np.ones() creates an array of a specified shape and data type, filled entirely with the value 1 (float64 by default). It is commonly used for initialization in numerical computations and simulations, or as a starting point for algorithms that update values iteratively.

In [22]:
arr_of_ones = np.ones((3,4)) 
print(arr_of_ones)

[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
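The output above shows `1.` rather than `1` because the default data type is float. A small sketch of the `dtype` argument, and of scaling a ones array to get another constant, might look like this (names are illustrative):

```python
import numpy as np

default_ones = np.ones(3)              # float64 unless told otherwise
int_ones = np.ones((2, 3), dtype=int)  # integer ones
sevens = np.ones((2, 2)) * 7           # scale to fill with another constant

print(default_ones.dtype)
print(int_ones)
print(sevens)
```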


# Using .eye()

The np.eye() function creates a 2D identity matrix or a matrix with ones on a specified diagonal and zeros elsewhere.

In [25]:
identity_matrix = np.eye(3)
print(identity_matrix)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
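To illustrate the "specified diagonal" part of the description, the sketch below uses the `k` offset and a non-square shape. It also checks the defining property of an identity matrix; the matrix `a` is an arbitrary example.

```python
import numpy as np

upper = np.eye(3, k=1)  # ones on the first diagonal above the main one
wide = np.eye(2, 3)     # 2 rows, 3 columns, ones on the main diagonal

print(upper)
print(wide)

# Multiplying by the identity matrix leaves a matrix unchanged.
a = np.array([[2.0, 5.0, 1.0],
              [0.0, 3.0, 4.0],
              [7.0, 8.0, 9.0]])
unchanged = a @ np.eye(3)
print(np.array_equal(unchanged, a))
```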


# Attributes of a NumPy Array

In [26]:
arr = np.array([[1,2,3] ,[4,5,6]])

print("Array : \n ",arr)
print("Shape: ",arr.shape) 
print("Number of dimensions:" , arr.ndim)
print("Size (number of elements): ",arr.size) 
print("Data type:",arr.dtype)
print("Item size (in bytes):",arr.itemsize)

Array : 
  [[1 2 3]
 [4 5 6]]
Shape:  (2, 3)
Number of dimensions: 2
Size (number of elements):  6
Data type: int64
Item size (in bytes): 8
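The `int64` shown above is the platform default and can differ between systems. As a rough sketch, setting `dtype` explicitly makes the item size predictable, and `nbytes` ties the attributes together as `size * itemsize`:

```python
import numpy as np

small = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
print(small.itemsize)  # bytes per element for int32
print(small.nbytes)    # total bytes = size * itemsize

# astype returns a converted copy; the original array is unchanged.
as_float = small.astype(np.float64)
print(as_float.dtype, as_float.nbytes)
```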


# NumPy Vectorized Operations
Vectorized operations apply a computation element-wise across entire arrays without explicit loops. Instead of processing one element at a time, as a Python for loop does, NumPy runs the loop in compiled C code. This makes code more concise and readable and usually much faster, which is a core reason NumPy is preferred in scientific and machine learning applications.

In [28]:
arr1 = np.array([1,2,3,4,5]) 
arr2 = np.array([10,20,30,40,50])

# Element-wise addition
print("Addition : ", arr1+arr2)

# Element-wise subtraction
print("Subtraction : ",arr1-arr2)

# Element-wise multiplication
print("Multiplication:" , arr1*arr2)

# Element-wise division
print("Division:", arr1/arr2)

Addition :  [11 22 33 44 55]
Subtraction :  [ -9 -18 -27 -36 -45]
Multiplication: [ 10  40  90 160 250]
Division: [0.1 0.1 0.1 0.1 0.1]
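To make the contrast with explicit loops concrete, the sketch below computes the same sum twice: once with a Python loop and once with a vectorized expression. It only checks that the results agree and makes no timing claim.

```python
import numpy as np

arr1 = np.array([1, 2, 3, 4, 5])
arr2 = np.array([10, 20, 30, 40, 50])

# Explicit loop: one pair of elements at a time, in Python.
loop_result = np.array([a + b for a, b in zip(arr1, arr2)])

# Vectorized: one expression, with the loop running inside NumPy.
vectorized_result = arr1 + arr2

print(np.array_equal(loop_result, vectorized_result))

# A scalar operand is applied to every element.
doubled = arr1 * 2
print(doubled)
```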


# Universal Functions 
A universal function (ufunc) is a NumPy function that operates element-wise on ndarrays (NumPy arrays) and supports broadcasting and type casting.

In [None]:
arri = np.array([2,3,4,5,6 ]) 

# Square root 
print(np.sqrt(arri)) 

# Exponential 
print(np.exp(arri)) 

# Sin 
print(np.sin(arri))

# Natural log 
print(np.log(arri))

[1.41421356 1.73205081 2.         2.23606798 2.44948974]
[  7.3890561   20.08553692  54.59815003 148.4131591  403.42879349]
[ 0.90929743  0.14112001 -0.7568025  -0.95892427 -0.2794155 ]
[0.69314718 1.09861229 1.38629436 1.60943791 1.79175947]
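The cell above shows element-wise behavior; the sketch below illustrates the other two properties mentioned, broadcasting and type casting. The `matrix` and `row` arrays are made up for illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])
row = np.array([10, 20, 30])

# Broadcasting: the 1D row is added to every row of the 2D matrix.
summed = np.add(matrix, row)
print(summed)

# Type casting: the square root of an integer array is a float array.
roots = np.sqrt(np.array([1, 4, 9]))
print(roots, roots.dtype)

# Functions such as np.sqrt and np.add are instances of np.ufunc.
print(isinstance(np.sqrt, np.ufunc))
```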


# Array slicing and Indexing 

In [6]:
import numpy as np 
arrz= np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print("Array : \n",arrz)

Array : 
 [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [7]:
# From the array above, access element 7 (row 1, column 2)
print(arrz[1, 2]) # Output : 7

7


In [23]:
'''Array : 
[[  1  2  3  4  ]
 [  5  6  7  8  ]
 [  9 10 11 12  ]]'''

# Print the 1st and 2nd rows
print("1st and 2nd Rows : \n" ,arrz[0:2]) # stop index 2 is excluded, so rows 0 and 1 are returned

print(" ")

# Access 2 & 3 rows
print("2nd and 3rd rows : \n",arrz[1:])

print(" ")

# Print 7, 8, 11, 12 (see the arrz matrix above)
# 7, 8 are in row 1 (indexing starts at 0) and 11, 12 are in row 2
# all four values are in columns 2 and 3
print("Print 7 , 8 , 11 , 12 : \n",arrz[1:,2:])

print(" ")
# [0:2] selects rows 0 and 1
# 2: selects from column 2 through the last column
print("Print 3 ,4 ,7 ,8 : \n" , arrz[0:2,2:])

print(" ") 


# arrz[1:] selects from row 1 through the last row
# 1:3 selects columns 1 and 2
print("6 , 7 , 10 , 11 : \n" , arrz[1:,1:3])



1st and 2nd Rows : 
 [[1 2 3 4]
 [5 6 7 8]]
 
2nd and 3rd rows : 
 [[ 5  6  7  8]
 [ 9 10 11 12]]
 
Print 7 , 8 , 11 , 12 : 
 [[ 7  8]
 [11 12]]
 
Print 3 ,4 ,7 ,8 : 
 [[3 4]
 [7 8]]
 
6 , 7 , 10 , 11 : 
 [[ 6  7]
 [10 11]]
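One detail worth knowing about slices like the ones above: basic slicing returns a view onto the original data, not a copy. A minimal sketch, using a fresh array named `grid` so the arrays above are not touched:

```python
import numpy as np

grid = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

# A basic slice is a view: it shares memory with grid.
view = grid[1:, 2:]
view[0, 0] = 70
print(grid[1, 2])                    # the change is visible through grid
print(np.shares_memory(view, grid))

# .copy() gives an independent array.
independent = grid[0:2, 2:].copy()
independent[0, 0] = -1
print(grid[0, 2])                    # grid is unaffected
```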




# Modifying array elements 


In [25]:
import numpy as np 
arrx= np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
print("Array : \n",arrx)

Array : 
 [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]


In [27]:
# Modify arrx[0][0] to 100
arrx[0][0] = 100
print(arrx)

[[100   2   3   4]
 [  5   6   7   8]
 [  9  10  11  12]]


In [29]:
# Modifying Multiple values 
# change the values of rows 1 and 2 to 100
arrx[1:] = 100
print(arrx)

[[100   2   3   4]
 [100 100 100 100]
 [100 100 100 100]]
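Assignment works with any index expression, not only single elements and row slices. The sketch below, on an illustrative array named `scores`, sets a whole column and then uses a boolean condition to replace selected values.

```python
import numpy as np

scores = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

# Set every value in column 0 to 0.
scores[:, 0] = 0

# Replace every value greater than 10 with -1.
scores[scores > 10] = -1

print(scores)
```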


# Real World Data Science Applications 

## 1. Normalization
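Normalization (more precisely, standardization or z-scoring) rescales data to have mean = 0 and standard deviation = 1 by subtracting the mean and dividing by the standard deviation.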


In [None]:
data = np.array([1,2,3,4,5]) 

# Calculate the mean and standard deviation
mean = np.mean(data)
std_dev = np.std(data)

# Normalize the data
normalized_data = (data - mean)/std_dev
print("Original Data : ") 
print(data) 
print(" ")
print("Normalized Data: ") 
for values in normalized_data: 
    print(values)

Original Data : 
[1 2 3 4 5]
 
Normalized Data: 
-1.414213562373095
-0.7071067811865475
0.0
0.7071067811865475
1.414213562373095
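Real datasets usually have several features on different scales. As a sketch under that assumption, the example below applies the same formula column by column with `axis=0`; the `features` array is made-up data, with one row per sample.

```python
import numpy as np

# Hypothetical data: 3 samples (rows) and 2 features (columns).
features = np.array([[1.0, 100.0],
                     [2.0, 200.0],
                     [3.0, 300.0]])

# axis=0 computes one mean and one standard deviation per column.
col_mean = features.mean(axis=0)
col_std = features.std(axis=0)

# Broadcasting applies each column's statistics to that column.
standardized = (features - col_mean) / col_std

print(standardized.mean(axis=0))
print(standardized.std(axis=0))
```

After this step each column has mean 0 and standard deviation 1 (up to floating-point rounding), so both features are on a comparable scale.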


In [34]:
data2 = np.array([1,2,3,4,5])

# Mean 
mean = np.mean(data2) 
print("Mean : ",mean)

# Median 
median = np.median(data2) 
print("Median : ",median)

# Standard Deviation 
std_dev = np.std(data2) 
print("Standard Deviation : " , std_dev) 

# Variance 
variance = np.var(data2) 
print("Variance : ",variance)
print("Variance : ",variance)


Mean :  3.0
Median :  3.0
Standard Deviation :  1.4142135623730951
Variance :  2.0
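By default, `np.std()` and `np.var()` compute the population statistics (dividing by N), which is why the variance above is 2.0. When the data is a sample from a larger population, the `ddof=1` argument divides by N - 1 instead. A short sketch of the difference:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

population_var = np.var(data)      # divides by N
sample_var = np.var(data, ddof=1)  # divides by N - 1
sample_std = np.std(data, ddof=1)

print("Population variance:", population_var)
print("Sample variance:", sample_var)
print("Sample standard deviation:", sample_std)
```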


# Logical Operations (Boolean Indexing)

In [43]:
data3 = np.array([1,2,3,4,5,6,7,8,9,10]) 
mydata = data3[data3>5]
print(mydata)

# Multiple conditions 
data3[(data3>=5 ) & (data3<=8)]

[ 6  7  8  9 10]


array([5, 6, 7, 8])
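The cell above combines conditions with `&` (and). The sketch below shows the companion operators `|` (or) and `~` (not), plus `np.where()` for choosing values by condition. Each condition needs its own parentheses because these operators bind more tightly than comparisons.

```python
import numpy as np

data = np.arange(1, 11)  # [1 2 ... 10]

# OR: values below 3 or above 8.
outside = data[(data < 3) | (data > 8)]
print(outside)

# NOT: invert a condition to keep the odd values.
odds = data[~(data % 2 == 0)]
print(odds)

# np.where(condition, x, y) takes x where the condition holds, else y.
capped = np.where(data > 5, 5, data)
print(capped)
```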