<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Introduction 
</h1>

**Author: Samanyu**<br>
**Language: Python**<br>
**Accelerator: None**<br>
**Goal: To help beginners grasp the fundamentals of NumPy more easily.**


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Introduction to NumPy 🔢
</h1>

## What is NumPy?

**NumPy (Numerical Python) is a powerful open-source library used for numerical computations in Python. It provides high-performance multidimensional arrays and tools to work with them efficiently.**<br>

**Key features of NumPy:**

* Supports N-dimensional arrays (ndarray) for efficient storage and manipulation of large datasets.
* Offers vectorized operations, making calculations faster compared to Python lists.
* Includes mathematical, statistical, and linear algebra functions.
* Facilitates broadcasting, allowing operations on arrays of different shapes.
* Provides tools for integration with C, C++, and Fortran, making it highly efficient for performance-critical applications.

## Importance of NumPy in Data Science and Machine Learning

**NumPy is a fundamental library in Python for data science and machine learning for the following reasons:**


* Efficient Data Handling: Faster than Python lists for numerical work because of its optimized C-based backend.
* Core of Scientific Computing: Used in libraries like Pandas, Scikit-learn, TensorFlow, PyTorch, etc.
* Easy Mathematical Computation: Supports operations like matrix multiplication, eigenvalues, and Fourier transforms.
* Memory Efficiency: Uses contiguous memory blocks, reducing overhead.
* Data Preprocessing: Helps in feature scaling, normalization, and missing value handling in ML pipelines.

## Installation of NumPy


In [8]:
# To install NumPy, run this cell
!pip install numpy



In [9]:
# To verify the installation run this cell
import numpy as np
print(np.__version__)

1.26.4


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    NumPy Arrays
</h1>

## Creating a NumPy Array (np.array())

1. NumPy arrays are the core data structure in NumPy. They are faster, more efficient, and more powerful than Python lists for numerical computations. Let's dive deep into different aspects of NumPy arrays, step by step.<br>
2. A NumPy array is created using the np.array() function. Unlike Python lists, NumPy arrays support vectorized operations and consume less memory.
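
To make the contrast with Python lists concrete, here is a minimal sketch. The variable names (`prices_list`, `prices_arr`, and so on) are illustrative and not part of the original notebook. It shows that `*` repeats a list, while the same operator on a NumPy array acts on every element at once.

```python
import numpy as np

prices_list = [10, 20, 30]           # plain Python list (illustrative data)
prices_arr = np.array(prices_list)   # the same values as a NumPy array

# Multiplying a list by 2 repeats the list; it does not double the values
repeated = prices_list * 2

# Doubling each value in a list needs an explicit loop or comprehension
doubled_list = [p * 2 for p in prices_list]

# A NumPy array applies the operation to every element (vectorized)
doubled_arr = prices_arr * 2

print(repeated)
print(doubled_list)
print(doubled_arr)
```

The vectorized form is shorter, and the loop over the elements runs in compiled code rather than in the Python interpreter.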

In [10]:
# Creating a 1D Array
# Syntax -> arr1=np.array([Array elements separated by a comma])
arr1=np.array([1,2,3,4,5]) #This creates a one-dimensional array (1D array).
print(arr1) # printing arr1

[1 2 3 4 5]


In [11]:
# Creating a 2D Array (Matrix)
# Syntax -> arr2=np.array([[1st Row elements separated by a comma],[2nd Row elements separated by a comma],.....])
arr2 = np.array([[1, 2, 3], [4, 5, 6]]) # Creates a 2 X 3 Matrix
print(arr2)

[[1 2 3]
 [4 5 6]]


In [12]:
# Creating a 3D Array
arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]) #This creates a three-dimensional array (3D array), often used in deep learning and image processing.
print(arr3)

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]


## Array Data Types (dtype)
NumPy automatically assigns a data type to the array based on the values inside. We can check the data type using the .dtype attribute.

In [14]:
# Checking data types
arr = np.array([1, 2, 3])
print(arr.dtype)

int64


In [15]:
# Explicitly Defining Data Type
# We can specify a data type when creating an array
arr_float = np.array([1, 2, 3], dtype=np.float32)
print(arr_float)
print(arr_float.dtype)

[1. 2. 3.]
float32


In [17]:
# Converting floats to integers while creating the array
arr_int = np.array([1.5, 2.9, 3.2], dtype=int)
print(arr_int)
# Note: the decimal part is truncated, not rounded.

[1 2 3]
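
For an array that already exists, the usual way to change its data type is the `.astype()` method. The sketch below is an illustrative addition (the variable names are assumptions). It also shows how to round before converting, in case truncation is not what you want.

```python
import numpy as np

arr_float = np.array([1.5, 2.9, 3.2])

# astype() returns a new array; the original keeps its float64 dtype
truncated = arr_float.astype(int)

# Round first if nearest-integer behaviour is wanted instead of truncation
rounded = np.round(arr_float).astype(int)

print(truncated)
print(rounded)
print(arr_float.dtype)
```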


**Common NumPy Data Types**<br>
<table border="1" style="border-collapse: collapse; width: 100%; text-align: left;">
    <tr style="background-color: black; color: white;">
        <th>Data Type</th>
        <th>Description</th>
    </tr>
    <tr>
        <td>int32 / int64</td>
        <td>Integer values (32-bit or 64-bit)</td>
    </tr>
    <tr>
        <td>float32 / float64</td>
        <td>Floating-point values (decimal numbers)</td>
    </tr>
    <tr>
        <td>bool</td>
        <td>Boolean values (True, False)</td>
    </tr>
    <tr>
        <td>complex128</td>
        <td>Complex numbers (e.g., 1+2j)</td>
    </tr>
    <tr>
        <td>str / U</td>
        <td>Strings</td>
    </tr>
</table>


## Shape and Dimension (ndim, shape, size)
<table border="1" style="border-collapse: collapse; width: 100%; text-align: left;">
    <tr style="background-color: black; color: white;">
        <th>Attribute</th>
        <th>Purpose</th>
    </tr>
    <tr>
        <td>.ndim</td>
        <td>Returns the number of dimensions (1D, 2D, 3D, etc.)</td>
    </tr>
    <tr>
        <td>.shape</td>
        <td>Returns a tuple with the size of the array along each dimension, e.g. (rows, columns) for a 2D array</td>
    </tr>
    <tr>
        <td>.size</td>
        <td>Returns the total number of elements</td>
    </tr>
</table>


In [18]:
# Checking the dimensions of an array
arr1 = np.array([1, 2, 3]) # 1D Array
arr2 = np.array([[1, 2, 3], [4, 5, 6]]) # 2D Array
arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]) #3D Array

print(arr1.ndim)  
print(arr2.ndim)  
print(arr3.ndim)  

1
2
3


In [20]:
# Checking shape of the array
arr1 = np.array([1, 2, 3]) # 1D Array
arr2 = np.array([[1, 2, 3], [4, 5, 6]]) # 2D Array
arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]) #3D Array

print(arr1.shape)  # (3,) -> 1D array with 3 elements
print(arr2.shape)  # (2, 3) -> 2 rows, 3 columns
print(arr3.shape)  # (2, 2, 2) -> 2 matrices of size (2x2)

(3,)
(2, 3)
(2, 2, 2)


In [22]:
# Checking the size of the array
arr1 = np.array([1, 2, 3]) # 1D Array
arr2 = np.array([[1, 2, 3], [4, 5, 6]]) # 2D Array
arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]) #3D Array

print(arr1.size)  # shape[0] = 3
print(arr2.size)  # shape[0]*shape[1] = 2*3 = 6
print(arr3.size)  # shape[0]*shape[1]*shape[2] = 2*2*2 = 8

3
6
8


## Reshaping and Flattening Arrays
1. We can change the shape of an array using .reshape().
2. ⚠️Condition: The total number of elements must remain the same.
3. To convert a multi-dimensional array into a 1D array, use .flatten().

In [25]:
arr = np.array([1, 2, 3, 4, 5, 6]) # 1D Array of 6 elements
reshaped_arr = arr.reshape(2, 3)  # Reshaping to 2 rows, 3 columns
print(reshaped_arr)

[[1 2 3]
 [4 5 6]]


In [26]:
arr = np.array([[1, 2, 3], [4, 5, 6]])
flat_arr = arr.flatten() # Flattening the above 2 X 3 Array
print(flat_arr) 

[1 2 3 4 5 6]


**Other methods include .ravel() and .reshape(-1)**
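1. .ravel() is similar to .flatten(), but it returns a view of the original array when possible, whereas .flatten() always returns a copy.
2. .reshape(-1) is another way to flatten; the -1 tells NumPy to work out the length automatically.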
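
The difference between these methods can be checked with `np.shares_memory()`, which reports whether two arrays use the same underlying data. This is a small sketch with illustrative variable names. It assumes a freshly created array, which is contiguous in memory, so `.ravel()` is able to return a view.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()        # always a copy
flat_view = arr.ravel()          # a view here, because arr is contiguous
flat_reshaped = arr.reshape(-1)  # -1 lets NumPy infer the length (6)

print(np.shares_memory(arr, flat_copy))  # does flatten() share data with arr?
print(np.shares_memory(arr, flat_view))  # does ravel() share data with arr?
print(flat_reshaped)
```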

## Copy vs View in NumPy
**In NumPy, whether a modification affects the original array depends on whether it is a copy or a view.**<br>
1. Copy (New Memory Allocation): A copy creates a completely new object, independent of the original array.
2. View (Reference to Original Memory): A view shares the same memory as the original array.

In [27]:
# New Memory Allocation
arr = np.array([1, 2, 3])
copy_arr = arr.copy()

copy_arr[0] = 100
print(arr)       # original array unchanged
print(copy_arr)  # copy modified

[1 2 3]
[100   2   3]


In [29]:
# View
arr = np.array([1, 2, 3])
view_arr = arr.view()

view_arr[0] = 100
print(arr)       # original array modified
print(view_arr)  # view modified

[100   2   3]
[100   2   3]


In [30]:
# Checking if an Array is a View or Copy
print(copy_arr.base)  # None -> copy_arr owns its data (it is a copy)
print(view_arr.base)  # prints the original array -> view_arr is a view of it

None
[100   2   3]
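
Views also appear without calling `.view()` explicitly: basic slicing returns a view, while indexing with a list of positions returns a copy. The sketch below is an illustrative addition with assumed variable names, and it uses `np.shares_memory()` to tell the two cases apart.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

slice_view = arr[1:4]        # basic slicing -> view of arr
fancy_copy = arr[[1, 2, 3]]  # indexing with a list of positions -> copy

slice_view[0] = 999  # also changes arr[1]
fancy_copy[1] = -1   # arr is not affected

print(arr)
print(np.shares_memory(arr, slice_view))
print(np.shares_memory(arr, fancy_copy))
```

If you need to modify a slice without touching the original, call `.copy()` on it first.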


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Array Creation Techniques in NumPy 🪄🪄
</h1>

**NumPy provides multiple ways to create arrays efficiently. Instead of manually defining arrays with np.array(), we can use specialized functions that generate arrays with specific values or patterns.**

## Creating Arrays with Default Values

In [37]:
# np.zeros() – Array of Zeros
# Creates an array filled with zeroes. The default data type is float64.
arr_zeros = np.zeros((3, 4))  # 3 rows, 4 columns
print(arr_zeros)

# You can specify the dtype if needed
arr_zeros_int = np.zeros((3, 4), dtype=int)
print('\n',arr_zeros_int)

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

 [[0 0 0 0]
 [0 0 0 0]
 [0 0 0 0]]


In [38]:
# np.ones() – Array of Ones
# Creates an array filled with ones.
arr_ones = np.ones((2, 3))  # 2 rows, 3 columns
print(arr_ones)
#Changing dtype
arr_ones_int = np.ones((2, 2), dtype=int)
print('\n',arr_ones_int)

[[1. 1. 1.]
 [1. 1. 1.]]

 [[1 1]
 [1 1]]


In [42]:
# np.full() – Custom Constant Array
# Creates an array filled with a specific value.
arr_full = np.full((3, 3), 20)  # 3x3 array filled with 20
print(arr_full)

# Let us do one with float data
arr_full_float = np.full((2, 2), 20.28) # 2x2 array filled with 20.28
print('\n',arr_full_float)

[[20 20 20]
 [20 20 20]
 [20 20 20]]

 [[20.28 20.28]
 [20.28 20.28]]


## Creating Arrays with a Sequence of Numbers

In [49]:
# np.arange() – Similar to range() in Python
# Generates evenly spaced values within a given range.
# Works like range(start, stop, step), but returns an array.

arr_arange = np.arange(1, 10, 2)  # Start=1, Stop=10, Step=2
print(arr_arange)

# Even numbers from 0 up to (but not including) 10; the stop value is always excluded
arr_even_arange=np.arange(0,10,2)
print(arr_even_arange)

[1 3 5 7 9]
[0 2 4 6 8]


In [50]:
# np.linspace() – Evenly Spaced Numbers in a Range
# Creates an array with linearly spaced values between two numbers.
arr_linspace = np.linspace(0, 100, 5)  # 5 evenly spaced values between 0 and 100
print(arr_linspace)

[  0.  25.  50.  75. 100.]


## Identity and Diagonal Matrices

In [52]:
# np.eye() – Identity Matrix
# Creates a square matrix with 1s on the diagonal and 0s elsewhere.
identity_matrix = np.eye(4)  # 4x4 identity matrix
print(identity_matrix)

[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]


In [55]:
# np.diag() – Diagonal Matrix
# Creates a matrix with custom values on the diagonal.
diag_matrix = np.diag([20, 28, 8]) # Diagonal values
print(diag_matrix)

[[20  0  0]
 [ 0 28  0]
 [ 0  0  8]]


In [56]:
# Extracting the diagonal of an existing matrix
existing_matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(np.diag(existing_matrix))

[1 5 9]


## Generating Random Arrays (np.random Module)

In [57]:
# np.random.rand() – Uniform Distribution (0 to 1)
# Generates random numbers from a uniform distribution between 0 and 1.
random_uniform = np.random.rand(3, 3)  # 3x3 matrix with random values
print(random_uniform)

[[0.93549071 0.11923294 0.93419789]
 [0.71275239 0.53146358 0.98016409]
 [0.68993954 0.56558326 0.571778  ]]


In [59]:
# np.random.randint() – Random Integers
# Generates random integers within a given range.
random_ints = np.random.randint(1, 100, (3, 3))  # 3x3 matrix with values from 1 to 99 (the upper bound 100 is excluded)
print(random_ints)

[[79 21 10]
 [ 8 91  2]
 [55 47 81]]


In [60]:
# np.random.randn() – Normal Distribution (Mean = 0, Std = 1)
# Generates random numbers from a standard normal distribution (bell-shaped curve).
random_normal = np.random.randn(4, 4)  # 4x4 matrix
print(random_normal)

[[-1.74806459 -0.00775581  0.3739465  -0.2581857 ]
 [ 0.20171798  1.40572831 -1.10450465  0.31896326]
 [-0.39778142  0.4871744   0.78349397 -0.0093619 ]
 [ 0.95527542  0.29136886  0.42888506 -0.88772245]]


In [67]:
# np.random.choice() – Random Selection from a List
# Chooses random elements from a given list.
choices = np.random.choice([10, 20, 30, 40], size=(2, 2))
print(choices)
choices1 = np.random.choice([108, 203, 308, 50,100,120,150], size=(2, 2))
print('\n',choices1)

[[10 40]
 [20 30]]

 [[100 150]
 [150 108]]


In [69]:
# np.random.seed() – Fixing Randomness for Reproducibility
# Setting a seed ensures the same random numbers are generated every time.
np.random.seed(42)
print(np.random.rand(3))

[0.37454012 0.95071431 0.73199394]
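
Recent NumPy versions also offer a generator-based interface, `np.random.default_rng()`, which keeps the random state in an object instead of a global seed. The sketch below is an optional alternative to the functions used above, not a replacement the notebook depends on. The variable names are illustrative.

```python
import numpy as np

# Create a generator with a fixed seed for reproducibility
rng = np.random.default_rng(seed=42)

sample_a = rng.random(3)                   # uniform values in [0, 1)
ints = rng.integers(1, 100, size=(2, 2))   # integers from 1 to 99

# A second generator with the same seed reproduces the same values
rng_again = np.random.default_rng(seed=42)
sample_b = rng_again.random(3)

print(sample_a)
print(sample_b)
print(ints)
```

Note that the numbers produced by `default_rng(42)` differ from those produced after `np.random.seed(42)`, because the two interfaces use different underlying algorithms.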


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Indexing and Slicing in NumPy 📇🔪
</h1>

**Indexing and slicing allow us to access and manipulate specific elements, rows, and columns of a NumPy array efficiently. This section covers:**
 
1. Basic Indexing (1D, 2D, and 3D arrays)
2. Slicing Arrays (: operator)
3. Boolean Indexing
4. Fancy Indexing (Advanced Indexing)

## Basic Indexing in NumPy
Indexing is used to access elements in an array. NumPy follows zero-based indexing, meaning the first element has an index of 0.

In [70]:
# Indexing in 1D Arrays
arr = np.array([10, 20, 30, 40, 50])
print(arr[0])  # First element
print(arr[2])  # Third element
print(arr[-1]) # Last element

10
30
50


In [72]:
# Indexing in 2D Arrays (Rows and Columns)
# For 2D arrays, use row index and column index (arr[row, column]).
arr_2d = np.array([[10, 20, 30], 
                   [40, 50, 60], 
                   [70, 80, 90]])

print(arr_2d[0, 1])  # First row, second column (20)
print(arr_2d[2, 2])  # Third row, third column (90)
print(arr_2d[-1, -1]) # Last row, last column (90)

20
90
90


In [73]:
# Indexing in 3D Arrays
# A 3D array is structured as (depth, row, column).
arr_3d = np.array([[[1, 2, 3], [4, 5, 6]], 
                   [[7, 8, 9], [10, 11, 12]]])

print(arr_3d[0, 1, 2])  # First matrix, second row, third column (6)
print(arr_3d[1, 0, 1])  # Second matrix, first row, second column (8)

6
8


## Slicing Arrays (: operator)
**Slicing retrieves specific parts of an array using the syntax :**<br>
👉 arr[start:stop:step] → Extracts elements from start to stop-1 with a step size.

In [74]:
# Slicing 1D Arrays
arr = np.array([10, 20, 30, 40, 50, 60, 70])

print(arr[1:5])   # Elements from index 1 to 4 → [20, 30, 40, 50]
print(arr[:4])    # First 4 elements → [10, 20, 30, 40]
print(arr[3:])    # Elements from index 3 onwards → [40, 50, 60, 70]
print(arr[::2])   # Every second element → [10, 30, 50, 70]
print(arr[::-1])  # Reverse the array → [70, 60, 50, 40, 30, 20, 10]

[20 30 40 50]
[10 20 30 40]
[40 50 60 70]
[10 30 50 70]
[70 60 50 40 30 20 10]


In [77]:
# Slicing 2D Arrays
# For 2D arrays, slicing follows: arr[row_start:row_end, col_start:col_end]
arr_2d = np.array([[10, 20, 30], 
                   [40, 50, 60], 
                   [70, 80, 90]])

print(arr_2d[:2, :2])   # First two rows and first two columns
print('\n',arr_2d[1:, 1:])   # Last two rows and last two columns
print('\n',arr_2d[:, 1])     # All rows, second column

[[10 20]
 [40 50]]

 [[50 60]
 [80 90]]

 [20 50 80]


In [78]:
# Slicing 3D Arrays
arr_3d = np.array([[[1, 2, 3], [4, 5, 6]], 
                   [[7, 8, 9], [10, 11, 12]]])

print(arr_3d[:, 1, :])  # All matrices, second row
print('\n',arr_3d[:, :, 2])  # All matrices, last column

[[ 4  5  6]
 [10 11 12]]

 [[ 3  6]
 [ 9 12]]


## Boolean Indexing (Conditional Filtering) 

In [79]:
# Boolean indexing is used to filter elements based on a condition.
arr = np.array([10, 20, 30, 40, 50])

bool_mask = arr > 25  # Condition: Values greater than 25
print(bool_mask)       # Boolean array
print(arr[bool_mask])  # Extract elements that meet the condition

[False False  True  True  True]
[30 40 50]


In [81]:
# Let us do one with Multiple conditions using the above array
print(arr[(arr > 20) & (arr < 50)])  # Values between 20 and 50 → [30, 40]
print(arr[(arr == 10) | (arr == 50)]) # Values 10 OR 50 → [10, 50]

[30 40]
[10 50]
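
A boolean mask can also be used on the left side of an assignment, and summing a mask counts how many elements satisfy the condition. Here is a small sketch with illustrative names (`scores`, `capped`).

```python
import numpy as np

scores = np.array([10, 20, 30, 40, 50])

# Work on a copy so the original array stays unchanged
capped = scores.copy()
capped[capped > 25] = 25   # every value above 25 is replaced by 25

# True counts as 1 and False as 0, so the sum is the number of matches
count_above = np.sum(scores > 25)

print(capped)
print(count_above)
```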


## Fancy Indexing (Advanced Indexing)
Fancy indexing allows non-contiguous element selection using lists or arrays of indices.

In [82]:
# Fancy Indexing in 1D Arrays
arr = np.array([10, 20, 30, 40, 50])

indices = [0, 2, 4]  # Selecting elements at index 0, 2, and 4
print(arr[indices])


[10 30 50]


In [84]:
# Fancy Indexing in 2D Arrays
arr_2d = np.array([[10, 20, 30], 
                   [40, 50, 60], 
                   [70, 80, 90]])

row_indices = [0, 2]
col_indices = [1, 2]

print(arr_2d[row_indices, col_indices])  # (0,1) and (2,2)

# Fancy Indexing with Repetition for 2D Arrays
print('\n',arr_2d[[0, 0, 1], [1, 2, 0]])  # Select (0,1), (0,2), (1,0)

[20 90]

 [20 30 40]


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Array Operations in NumPy ⚙️🔦🔨
</h1>

**NumPy allows efficient and fast operations on arrays due to its vectorized computation. This section covers:**<br>
1. Element-wise Arithmetic Operations (+, -, *, /)
2. Broadcasting
3. Vectorized Operations (np.add(), np.multiply())
4. Mathematical Functions (np.sin(), np.log(), np.exp())

## Element-wise Arithmetic Operations

In [87]:
# Addition
import numpy as np

a = np.array([1, 2, 10])
b = np.array([4, 5, 6])

print(a + b)  # Element-wise addition

[ 5  7 16]


In [88]:
# Subtraction
print(a - b)  # Element-wise subtraction

[-3 -3  4]


In [89]:
# Multiplication
print(a * b)  # Element-wise multiplication

[ 4 10 60]


In [92]:
# Division 
print(b / a)  # Element-wise division

# ⚠️ If division by zero occurs, NumPy returns inf (infinity) or nan (not a number).
c = np.array([1, 0, 3])
print(a / c)  # Warning: Division by zero

[4.  2.5 0.6]
[1.                inf 3.33333333]




In [93]:
#  Modulus (%) and Power (**)
print(b % a)   # Element-wise modulus (remainder)
print(a ** 2)  # Element-wise exponentiation (square)

[0 1 6]
[  1   4 100]


##  Broadcasting (Operating on Different Sized Arrays)

In [94]:
# Broadcasting with Scalars
arr = np.array([1, 2, 3])

print(arr + 5)   # Add 5 to each element
print(arr * 2)   # Multiply each element by 2

[6 7 8]
[2 4 6]


In [96]:
# Broadcasting with Different Sized Arrays
# If two arrays have different but compatible shapes, NumPy automatically stretches the smaller one to match the larger one.
A = np.array([[1, 2, 3], [4, 5, 6]])  # Shape (2,3)
B = np.array([10, 20, 30])            # Shape (3,)

print(A + B)  # B is broadcast to match A's shape
# NumPy treats B as if its shape were (2,3) instead of (3,)

[[11 22 33]
 [14 25 36]]
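
Broadcasting only works when the shapes are compatible: NumPy compares the shapes from the last dimension backwards, and each pair of sizes must either be equal or contain a 1. The sketch below uses illustrative arrays to show one shape that works and one that raises an error.

```python
import numpy as np

A = np.ones((2, 3))             # shape (2, 3)
col = np.array([[10], [20]])    # shape (2, 1) -> compatible with (2, 3)
bad = np.array([1, 2])          # shape (2,)   -> last dims 3 vs 2 do not match

result = A + col   # col is stretched across the 3 columns
print(result)

try:
    A + bad
    failed = False
except ValueError as err:
    failed = True
    print("Broadcast error:", err)
```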


## Vectorized Operations (np.add(), np.multiply())

In [97]:
# Using np.add() and np.subtract()
print(np.add(a, b))    # Equivalent to a + b
print(np.subtract(b, a))  # Equivalent to b - a

[ 5  7 16]
[ 3  3 -4]


In [98]:
# Using np.multiply() and np.divide()
print(np.multiply(a, b))  # Equivalent to a * b
print(np.divide(b, a))    # Equivalent to b / a

[ 4 10 60]
[4.  2.5 0.6]


In [99]:
# Using np.power() and np.mod()
print(np.power(a, 2))  # Square each element
print(np.mod(b, a))    # Remainder when b is divided by a

[  1   4 100]
[0 1 6]


## Mathematical Functions in NumPy
**NumPy provides built-in mathematical functions that operate element-wise.**

In [101]:
# Trigonometric Functions (np.sin(), np.cos(), np.tan())
angles = np.array([0, np.pi/2, np.pi])

print(np.sin(angles))  # Sine values
print(np.cos(angles))  # Cosine values
print(np.tan(angles))  # Tangent values

# ⚠️Angles must be in Radians for the above functions

[0.0000000e+00 1.0000000e+00 1.2246468e-16]
[ 1.000000e+00  6.123234e-17 -1.000000e+00]
[ 0.00000000e+00  1.63312394e+16 -1.22464680e-16]


In [102]:
# Rounding and Absolute Value
arr = np.array([-3.7, 2.9, -1.2, 4.5])

print(np.abs(arr))      # Absolute value
print(np.round(arr))    # Round to nearest integer (halves go to the nearest even value, so 4.5 -> 4)
print(np.floor(arr))    # Round down
print(np.ceil(arr))     # Round up

[3.7 2.9 1.2 4.5]
[-4.  3. -1.  4.]
[-4.  2. -2.  4.]
[-3.  3. -1.  5.]


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Aggregations and Statistical Functions in NumPy 📊
</h1>

**NumPy provides efficient aggregation and statistical functions that allow us to summarize data easily. This section covers:**<br>

1. Summation (np.sum())
2. Mean (np.mean()) and Median (np.median())
3. Standard Deviation (np.std()) and Variance (np.var())
4. Minimum (np.min()) and Maximum (np.max())
5. Index of Min (np.argmin()) and Index of Max (np.argmax())


## Summation: np.sum()

In [103]:
# Summing a 1D Array
arr = np.array([1, 2, 3, 4, 5])
print(np.sum(arr))  # Sum of all elements

15


In [104]:
# Summing Along an Axis (2D Array)
# axis=0 → Operates vertically (column-wise)
# axis=1 → Operates horizontally (row-wise)
matrix = np.array([[1, 2, 3], 
                   [4, 5, 6]])

print(np.sum(matrix, axis=0))  # Sum along columns
print(np.sum(matrix, axis=1))  # Sum along rows

[5 7 9]
[ 6 15]


## Mean: np.mean()
**The mean (average) is the sum of elements divided by the number of elements.**

In [105]:
arr = np.array([1, 2, 3, 4, 5])
print(np.mean(arr))  # Average of array elements

3.0


In [106]:
# Mean along an axis using the previously defined 2D matrix
print(np.mean(matrix, axis=0))  # Column-wise mean
print(np.mean(matrix, axis=1))  # Row-wise mean

[2.5 3.5 4.5]
[2. 5.]


## Median: np.median()
**The median is the middle value when numbers are sorted.**

In [107]:
arr2 = np.array([1, 3, 2, 5, 4])
print(np.median(arr2))  # Middle value after sorting: [1, 2, 3, 4, 5] → 3

3.0


In [108]:
# Median along the axis
print(np.median(matrix, axis=0))  # Column-wise median
print(np.median(matrix, axis=1))  # Row-wise median

[2.5 3.5 4.5]
[2. 5.]


## Standard Deviation: np.std()
**The standard deviation measures how much values vary from the mean.**<br>
**Formula (the population form that np.std() computes by default):**<br>
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$$
where $\mu$ is the mean of the values and $N$ is the number of elements.<br>
1. Higher standard deviation = More spread out values
2. Lower standard deviation = Values are closer to the mean

In [109]:
print(np.std(arr))  # Standard deviation of the array

1.4142135623730951
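
By default `np.std()` divides by N (the population standard deviation). Passing `ddof=1` divides by N - 1 instead, which gives the sample standard deviation that many statistics courses and libraries such as Pandas use by default. This is a short sketch with an illustrative array.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5])

pop_std = np.std(data)             # divides by N
sample_std = np.std(data, ddof=1)  # divides by N - 1

print(pop_std)
print(sample_std)
```

The same `ddof` parameter is accepted by `np.var()`.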


## Variance: np.var()
**Variance measures the spread of data (square of standard deviation).**

In [110]:
print(np.var(arr))  # Variance of the array

2.0


In [111]:
# Variance along an axis
print(np.var(matrix, axis=0))  # Column-wise variance
print(np.var(matrix, axis=1))  # Row-wise variance

[2.25 2.25 2.25]
[0.66666667 0.66666667]


## Minimum and Maximum: np.min() and np.max()

In [112]:
print(np.min(arr))  # Minimum value
print(np.max(arr))  # Maximum value

1
5


In [113]:
# Min/Max Along an Axis
print(np.min(matrix, axis=0))  # Minimum values column-wise
print(np.max(matrix, axis=1))  # Maximum values row-wise

[1 2 3]
[3 6]


## Index of Minimum and Maximum: np.argmin() and np.argmax()

In [114]:
print(np.argmin(arr))  # Index of min value
print(np.argmax(arr))  # Index of max value

0
4


In [115]:
# Min/Max Index Along an Axis
print(np.argmin(matrix, axis=0))  # Indices of min values column-wise
print(np.argmax(matrix, axis=1))  # Indices of max values row-wise

[0 0 0]
[2 2]


## Summary of Aggregation and Statistical Functions
<table border="1" style="border-collapse: collapse; width: 100%; text-align: left;">
    <tr style="background-color: black; color: white;">
        <th>Function</th>
        <th>Description</th>
    </tr>
    <tr>
        <td>np.sum()</td>
        <td>Sum of all elements</td>
    </tr>
    <tr>
        <td>np.mean()</td>
        <td>Mean (average)</td>
    </tr>
    <tr>
        <td>np.median()</td>
        <td>Median (middle value)</td>
    </tr>
    <tr>
        <td>np.std()</td>
        <td>Standard deviation (spread of data)</td>
    </tr>
    <tr>
        <td>np.var()</td>
        <td>Variance (square of standard deviation)</td>
    </tr>
    <tr>
        <td>np.min()</td>
        <td>Minimum value in an array</td>
    </tr>
    <tr>
        <td>np.max()</td>
        <td>Maximum value in an array</td>
    </tr>
    <tr>
        <td>np.argmin()</td>
        <td>Index of the minimum value</td>
    </tr>
    <tr>
        <td>np.argmax()</td>
        <td>Index of the maximum value</td>
    </tr>
</table>


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Working with Missing Values in NumPy 🤕
</h1>

**Missing values are common in real-world datasets. NumPy provides powerful functions to detect, handle, and replace NaN (Not a Number) values efficiently.**

## Identifying Missing Values (np.isnan())
**The function np.isnan() checks for NaN (Not a Number) values and returns a boolean mask (True for NaN, False otherwise).**

In [116]:
arr = np.array([1, 2, np.nan, 4, np.nan, 6])
print(np.isnan(arr))  # Check for NaN values

[False False  True False  True False]


In [117]:
# An example use case: You can filter out missing values using this function.
print(arr[~np.isnan(arr)])  # Remove NaNs

[1. 2. 4. 6.]


## Calculating without NaN Values

In [118]:
# Replacing NaNs with a Default Value (np.nan_to_num())
# np.nan_to_num() replaces NaN values with a specified value or 0 by default.
arr_filled = np.nan_to_num(arr, nan=-1)  # Replace NaN with -1
print(arr_filled)

[ 1.  2. -1.  4. -1.  6.]


In [119]:
# Computing Mean Ignoring NaNs (np.nanmean())
# The function np.nanmean() calculates the mean ignoring NaN values.
print(np.nanmean(arr))  # Mean without NaNs

3.25


In [120]:
# Finding Max/Min Ignoring NaNs (np.nanmax(), np.nanmin())
print(np.nanmax(arr))  # Maximum ignoring NaNs
print(np.nanmin(arr))  # Minimum ignoring NaNs

6.0
1.0


## Filling Missing Values

**Why use mean?**<br>

1. Useful for normally distributed data.
2. Helps maintain the central tendency of data.


In [122]:
#  Filling with Mean (np.nanmean())
# Replacing NaN values with the mean keeps the array's mean unchanged, although it reduces the variance.

mean_value = np.nanmean(arr)  # Compute mean ignoring NaNs
arr_filled_mean = np.where(np.isnan(arr), mean_value, arr)  # Replace NaNs with mean
print(arr_filled_mean)

[1.   2.   3.25 4.   3.25 6.  ]


**Why use median?**


1. More robust to outliers than mean.
2. Works well for skewed datasets.



In [124]:
# Filling with Median (np.nanmedian())
# Replacing NaNs with the median is useful when the data contains outliers.

median_value = np.nanmedian(arr)  # Compute median ignoring NaNs
arr_filled_median = np.where(np.isnan(arr), median_value, arr)  # Replace NaNs with median

print(arr_filled_median)

[1. 2. 3. 4. 3. 6.]


**Why use mode?**

1. Useful for categorical data (e.g., replacing missing city names in a dataset).
2. Works well when one value is dominant in the dataset.

In [126]:
# Filling with Mode
# The mode is the most frequently occurring value in the dataset. NumPy does not have a direct np.nanmode(), but we can use SciPy to find the mode.
from scipy import stats

arr_with_nan = np.array([1, 2, 2, np.nan, 3, 2, np.nan, 4])
mode_value = stats.mode(arr_with_nan, nan_policy='omit').mode.item()  # Compute mode ignoring NaNs
arr_filled_mode = np.where(np.isnan(arr_with_nan), mode_value, arr_with_nan)

print(arr_filled_mode)

[1. 2. 2. 2. 3. 2. 2. 4.]
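
If SciPy is not available, the mode can also be found with NumPy alone by counting values with `np.unique(..., return_counts=True)`. The sketch below is one possible approach under that assumption, not the method used above. When several values tie for the highest count, it picks the smallest of them.

```python
import numpy as np

arr_with_nan = np.array([1, 2, 2, np.nan, 3, 2, np.nan, 4])

# Drop NaNs before counting
valid = arr_with_nan[~np.isnan(arr_with_nan)]

# values are sorted unique entries; counts[i] is how often values[i] occurs
values, counts = np.unique(valid, return_counts=True)
mode_value = values[np.argmax(counts)]

filled = np.where(np.isnan(arr_with_nan), mode_value, arr_with_nan)

print(mode_value)
print(filled)
```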


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Sorting and Searching in NumPy 🔍 
</h1>

**Sorting and searching are crucial for data organization, ranking, and retrieval. NumPy provides fast and efficient functions for sorting arrays and searching for elements.**

## Sorting Arrays (np.sort())

⚠️ Sorting an array means arranging its elements in ascending order by default.<br>
👉 Note: np.sort() does not modify the original array.

In [127]:
# Sorting 1D Array
arr = np.array([3, 1, 4, 1, 5, 9, 2, 6])
sorted_arr = np.sort(arr)
print(sorted_arr)


[1 1 2 3 4 5 6 9]


In [129]:
# Sorting a 2D Array
# Sorting can be done along rows (axis=1) or columns (axis=0).

matrix = np.array([[3, 2, 1], 
                   [9, 5, 7]])

print(np.sort(matrix, axis=0))  # Sort each column; unchanged here because every column is already in ascending order
print('\n',np.sort(matrix, axis=1))  # Sort each row


[[3 2 1]
 [9 5 7]]

 [[1 2 3]
 [5 7 9]]
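
The effect of `axis` is easier to see on a matrix whose columns are not already sorted. The sketch below uses a different, illustrative matrix and also shows one common way to get descending order, by reversing the sorted result with `[:, ::-1]`.

```python
import numpy as np

matrix = np.array([[9, 2, 7],
                   [3, 5, 1]])

by_column = np.sort(matrix, axis=0)             # each column sorted top to bottom
by_row = np.sort(matrix, axis=1)                # each row sorted left to right
descending = np.sort(matrix, axis=1)[:, ::-1]   # each row in descending order

print(by_column)
print(by_row)
print(descending)
```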


## Sorting with Index Positions (np.argsort())

np.argsort() returns the indices that would sort the array.

In [131]:
arr = np.array([3, 1, 4, 1, 5, 9, 2, 6])
indices = np.argsort(arr)
print(indices)  # Index positions of sorted elements

[1 3 6 0 2 4 7 5]


In [132]:
names = np.array(["John", "Alice", "Bob", "David"])
scores = np.array([85, 92, 78, 90])

sorted_indices = np.argsort(scores)
sorted_names = names[sorted_indices]

print(sorted_names)  # Names sorted by scores

['Bob' 'John' 'David' 'Alice']


## Searching for Elements in an Array
NumPy provides efficient ways to search for values within an array.

In [133]:
# Finding Specific Values (np.where())
# np.where(condition) returns the indices where the condition is True.
arr = np.array([10, 20, 30, 40, 50, 60])
indices = np.where(arr > 30)  # Find indices where values > 30
print(indices)

(array([3, 4, 5]),)


In [134]:
filtered_values = arr[np.where(arr > 30)] #Filtering the values based on the condition
print(filtered_values)

[40 50 60]


In [136]:
# Finding Non-Zero Elements (np.nonzero())
# Returns indices of non-zero elements.
arr = np.array([0, 3, 0, 4, 5, 0])
indices = np.nonzero(arr)
print(indices)
print(arr[indices])

(array([1, 3, 4]),)
[3 4 5]


## Summary
<table border="1" style="border-collapse: collapse; width: 100%; text-align: left;">
    <tr style="background-color: black; color: white;">
        <th>Function</th>
        <th>Purpose</th>
    </tr>
    <tr>
        <td>np.sort(arr)</td>
        <td>Sort array (ascending)</td>
    </tr>
    <tr>
        <td>np.sort(arr, axis=0/1)</td>
        <td>Sort 2D array along columns/rows</td>
    </tr>
    <tr>
        <td>np.argsort(arr)</td>
        <td>Get indices of sorted elements</td>
    </tr>
    <tr>
        <td>np.where(condition)</td>
        <td>Find indices where a condition is met</td>
    </tr>
    <tr>
        <td>np.nonzero(arr)</td>
        <td>Get indices of non-zero elements</td>
    </tr>
</table>


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Some Advanced NumPy Functions 👩‍💻
</h1>

**These functions help with specialized operations such as limiting values, finding unique elements, and performing cumulative computations.**

## Clipping Values (np.clip())
**Clipping means restricting array values within a specified range. If an element is lower than the minimum value, it's set to the min. If it's higher than the max, it's set to the max.**

In [137]:
arr = np.array([1, 5, 10, 15, 20])
clipped_arr = np.clip(arr, 5, 15)  # Values <5 become 5, values >15 become 15

print(clipped_arr)

[ 5  5 10 15 15]


## Finding Unique Elements (np.unique())
**Returns the unique values in an array, useful for removing duplicates.**

In [138]:
arr = np.array([1, 2, 2, 3, 3, 3, 4, 5, 5])
unique_values = np.unique(arr)

print(unique_values)

[1 2 3 4 5]


## Cumulative Operations (np.cumsum(), np.cumprod())
**Cumulative sum and cumulative product generate running totals and running products respectively.**

In [141]:
# Cumulative Sum (np.cumsum())
arr = np.array([1, 2, 3, 4])
cumsum_arr = np.cumsum(arr)  # Running sum
print(cumsum_arr)

[ 1  3  6 10]


In [142]:
# Cumulative Product (np.cumprod())
cumprod_arr = np.cumprod(arr)  # Running product
print(cumprod_arr)

[ 1  2  6 24]


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Linear Algebra with NumPy  👓
</h1>

**NumPy's np.linalg module and the @ operator cover common linear algebra tasks such as matrix multiplication, determinants, inverses, and eigenvalue decomposition.**

## Matrix Multiplication (np.dot(), @ operator)
**Matrix multiplication is different from element-wise multiplication (*). Use np.dot() or @ to perform dot product multiplication.**

In [143]:
A = np.array([[1, 2], 
              [3, 4]])
B = np.array([[5, 6], 
              [7, 8]])
result = np.dot(A, B)  # OR A @ B

print(result)

[[19 22]
 [43 50]]


## Determinant of a Matrix (np.linalg.det())
**The determinant helps in understanding invertibility and solving linear equations.**

In [144]:
A = np.array([[3, 4], 
              [2, 1]])
det_A = np.linalg.det(A)
print(det_A)

-5.000000000000001


## Inverse of a Matrix (np.linalg.inv())
**The inverse of a matrix is useful for solving linear equations (AX = B → X = A⁻¹B).**

In [145]:
A = np.array([[3, 4], 
              [2, 1]])
inv_A = np.linalg.inv(A)
print(inv_A)

[[-0.2  0.8]
 [ 0.4 -0.6]]
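
To actually solve AX = B, `np.linalg.solve()` is generally preferred over computing the inverse explicitly, since it is usually faster and numerically more stable. The sketch below reuses the matrix A from above with an illustrative right-hand side `B` (an assumption for this example) and compares both approaches.

```python
import numpy as np

A = np.array([[3, 4],
              [2, 1]])
B = np.array([10, 5])   # illustrative right-hand side: 3x + 4y = 10, 2x + y = 5

x_inv = np.linalg.inv(A) @ B     # X = A^-1 B, using the explicit inverse
x_solve = np.linalg.solve(A, B)  # solves the system directly

print(x_inv)
print(x_solve)
print(np.allclose(A @ x_solve, B))  # check that the solution satisfies AX = B
```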


## Eigenvalues & Eigenvectors (np.linalg.eig())
**Eigenvalues and eigenvectors are fundamental in dimensionality reduction (PCA) and data transformation.**

In [146]:
A = np.array([[4, -2], 
              [1, 1]])

eigenvalues, eigenvectors = np.linalg.eig(A)

print("Eigenvalues:", eigenvalues)
print("Eigenvectors:\n", eigenvectors)

Eigenvalues: [3. 2.]
Eigenvectors:
 [[0.89442719 0.70710678]
 [0.4472136  0.70710678]]
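
Each column of the returned `eigenvectors` array is the eigenvector for the eigenvalue at the same position. A quick way to convince yourself is to check the defining property A v = λ v for every pair, as in this small sketch (the loop variables `v` and `lam` are illustrative names).

```python
import numpy as np

A = np.array([[4, -2],
              [1, 1]])

eigenvalues, eigenvectors = np.linalg.eig(A)

checks = []
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]   # i-th eigenvector is the i-th COLUMN
    lam = eigenvalues[i]     # matching eigenvalue
    checks.append(np.allclose(A @ v, lam * v))

print(checks)
```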


## Summary of the above Functions
<table border="1" style="border-collapse: collapse; width: 100%; text-align: left;">
    <tr style="background-color: black; color: white;">
        <th>Function</th>
        <th>Description</th>
    </tr>
    <tr>
        <td>np.dot(A, B) or A @ B</td>
        <td>Matrix multiplication</td>
    </tr>
    <tr>
        <td>np.linalg.det(A)</td>
        <td>Compute determinant</td>
    </tr>
    <tr>
        <td>np.linalg.inv(A)</td>
        <td>Compute inverse</td>
    </tr>
    <tr>
        <td>np.linalg.eig(A)</td>
        <td>Compute eigenvalues & eigenvectors</td>
    </tr>
</table>


<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Conclusion: Mastering NumPy as a Beginner 🥳🪄
</h1>

**Congratulations on reaching the end of this NumPy Beginner’s Guide! 🎉 You have covered a wide range of essential concepts, from basic array creation to advanced operations like linear algebra and statistical functions.** <br>
**By now, you should be comfortable with:**
1. Creating and manipulating NumPy arrays
2. Performing mathematical and statistical computations efficiently
3. Indexing, slicing, and filtering data
4. Working with missing values and sorting/searching arrays
5. Understanding NumPy’s role in data science and machine learning
6. And many more...... 🥳🥳

<h1 style="background-color: black; color: white; padding: 40px; text-align: center;">
    Feedback and Suggestions 🥺
</h1>

**Kindly provide feedback and suggestions to improve this notebook**<br>
**If you liked this notebook or found it helpful, kindly upvote 👍 and share :)**<br>
**Please do point out if I have not covered an important beginner topic or if I have messed up somewhere**<br>
**ALL THE BEST 👍🥳**