<a href="https://colab.research.google.com/github/mirzanaeembeg/Python-Cheat-Sheet-ML-DL-AI/blob/main/2_NumPy_Numerical_Computing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Complete Python Cheat Sheet for Machine Learning, Deep Learning & AI

## Table of Contents
1. [Python Fundamentals](https://colab.research.google.com/drive/1linKYA8PHgnMb4ugYkClIWu0_7SdfLtk#1-python-fundamentals=)
2. [NumPy - Numerical Computing](https://colab.research.google.com/drive/1qZFirXOdQtbtfCdJPtT9RU-FshLo9qLH?usp=sharing)
3. [Pandas - Data Manipulation](https://colab.research.google.com/drive/18QZJEVNTCqfHAATjvYZZy4e-gcDmKpMk)
4. [Matplotlib & Seaborn - Data Visualization](#4-matplotlib--seaborn---data-visualization)
5. [Scikit-learn - Machine Learning](#5-scikit-learn---machine-learning)
6. [TensorFlow & Keras - Deep Learning](#6-tensorflow--keras---deep-learning)
7. [PyTorch - Deep Learning](#7-pytorch---deep-learning)
8. [Data Preprocessing](#8-data-preprocessing)
9. [Model Evaluation & Metrics](#9-model-evaluation--metrics)
10. [Advanced Topics](#10-advanced-topics)
11. [Best Practices](#11-best-practices)
12. [Resources & Further Learning](#12-resources--further-learning)

* * *


# 2. NumPy - Numerical Computing
---
## Table of Contents
2.  [NumPy - Numerical Computing](#2-numpy---numerical-computing)
    *   [2.1 Import NumPy](#21-import-numpy)
    *   [2.2 Array Creation](#22-array-creation)
        *   [Basic Array Creation](#basic-array-creation)
        *   [Built-in Array Creation Functions](#built-in-array-creation-functions)
        *   [Range and Random Arrays](#range-and-random-arrays)
    *   [2.3 Array Properties and Information](#23-array-properties-and-information)
    *   [2.4 Array Indexing and Slicing](#24-array-indexing-and-slicing)
        *   [Basic Indexing](#basic-indexing)
        *   [Multi-dimensional Indexing](#multi-dimensional-indexing)
        *   [Advanced Indexing](#advanced-indexing)
    *   [2.5 Array Operations](#25-array-operations)
        *   [Arithmetic Operations](#arithmetic-operations)
        *   [Mathematical Functions](#mathematical-functions)
    *   [2.6 Vectorization and Array Operations](#26-vectorization-and-array-operations)
    *   [2.7 Data Type Considerations](#27-data-type-considerations)
    *   [2.8 Statistical Operations](#28-statistical-operations)
    *   [2.9 Array Manipulation](#29-array-manipulation)
        *   [Reshaping](#reshaping)
        *   [Joining and Splitting](#joining-and-splitting)
        *   [Adding and Removing Elements](#adding-and-removing-elements)
    *   [2.10 Linear Algebra](#210-linear-algebra)
    *   [2.11 Broadcasting](#211-broadcasting)
    *   [2.12 Conditional Operations](#212-conditional-operations)
    *   [2.13 Sorting and Searching](#213-sorting-and-searching)
    *   [2.14 Set Operations](#214-set-operations)
    *   [2.15 File I/O](#215-file-io)
    *   [2.16 Memory and Performance Tips](#216-memory-and-performance-tips)
    *   [2.17 Common Patterns in ML/AI](#217-common-patterns-in-mlai)
    *   [2.18 Performance Optimization](#218-performance-optimization)

<div align="center">
<img src="https://drive.google.com/uc?id=1JOzNG8Sf9rXQ_nHqlzOvzOnpr4iSIkmc" width="250">
</div>

* * *
This section of the notebook introduces [NumPy](https://numpy.org/), the fundamental library for numerical computing in Python. It covers NumPy's core concepts and functionality, including:

- **Array Creation**: Learn various methods to create NumPy arrays, from basic lists to built-in functions for generating arrays filled with zeros, ones, or random values, as well as arrays with sequences using `arange` and `linspace`.
- **Array Properties and Information**: Understand how to access important attributes of NumPy arrays such as shape, size, dimension, data type, and memory usage.
- **Array Indexing and Slicing**: Master the techniques for accessing and manipulating elements within arrays using basic, multi-dimensional, and advanced indexing (boolean and fancy indexing) and slicing.
- **Array Operations**: Explore element-wise arithmetic operations, mathematical functions, and scalar operations that can be efficiently applied to NumPy arrays.
- **Vectorization and Array Operations**: Understand NumPy's ability to apply operations to entire arrays without explicit loops, including vectorized operations and broadcasting.
- **Data Type Considerations**: Learn about NumPy's automatic data type inference, explicit data type specification, and type conversion.
- **Statistical Operations**: Learn how to perform common statistical calculations on arrays, including mean, median, standard deviation, variance, min, max, sum, and operations along specific axes.
- **Array Manipulation**: Discover functions for reshaping, joining (concatenation, vstack, hstack), splitting, adding, and removing elements from arrays.
- **Linear Algebra**: Delve into essential linear algebra operations like matrix multiplication, transpose, inverse, determinant, trace, eigenvalues, eigenvectors, solving linear systems, and Singular Value Decomposition (SVD).
- **Broadcasting**: Understand NumPy's powerful broadcasting mechanism that allows operations between arrays of different shapes under certain rules.
- **Conditional Operations**: Learn how to use functions like `where`, `select`, and `clip` to perform operations based on conditions.
- **Sorting and Searching**: Explore methods for sorting arrays and finding specific elements or their indices using functions like `sort`, `argsort`, `searchsorted`, `argmax`, `argmin`, and `nonzero`.
- **Set Operations**: Discover how to perform set operations like finding unique elements, intersection, union, difference, and symmetric difference on NumPy arrays.
- **File I/O**: Learn how to save and load NumPy arrays to and from binary (`.npy`, `.npz`) and text (`.txt`, `.csv`) files.
- **Memory and Performance Tips**: Gain insights into optimizing NumPy code for better memory usage and performance by understanding views vs. copies, using vectorized operations, pre-allocating arrays, and choosing appropriate data types.
- **Common Patterns in ML/AI**: See examples of how NumPy is used to implement common patterns in machine learning and artificial intelligence, such as normalization, min-max scaling, one-hot encoding, train-test split, confusion matrix, softmax, and sigmoid functions.

This section serves as a fundamental guide to using NumPy for efficient and effective numerical computations in Python, essential for tasks in data analysis, scientific computing, and machine learning.

## 2.1 Import NumPy
---

In [None]:
import numpy as np

## 2.2 Array Creation
---

### Basic Array Creation

This section demonstrates how to create NumPy arrays from basic Python lists, nested lists for multi-dimensional arrays, and how to specify the data type during creation using the `dtype` argument. The code block below shows examples of each method.

In [None]:
# From lists
arr = np.array([1, 2, 3, 4, 5])
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
print("1D array from list:", arr)
print("\n2D array from list:\n", arr_2d)
print("\n")

# From nested lists (multi-dimensional)
arr_nested = np.array([
    [1, 2, 3],
    [2, 3, 5]
])
print("Array from nested lists:\n", arr_nested)
print("\n")

# Specify data type
arr_float = np.array([1, 2, 3], dtype=np.float32)
print("Array with specified dtype (float32):", arr_float)

1D array from list: [1 2 3 4 5]

2D array from list:
 [[1 2 3]
 [4 5 6]]


Array from nested lists:
 [[1 2 3]
 [2 3 5]]


Array with specified dtype (float32): [1. 2. 3.]
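
The examples above do not show whether a new array shares memory with its input. As a minimal sketch (the variable names `original`, `copied`, and `same` are illustrative and not part of the notebook), the block below contrasts `np.array`, which copies by default, with `np.asarray`, which reuses an existing array when no conversion is needed.

```python
import numpy as np

original = np.array([1, 2, 3])

# np.array copies its input by default, so edits to the copy stay local
copied = np.array(original)
copied[0] = 99

# np.asarray returns the same object when the input is already an ndarray
# and no dtype conversion is requested, so no data is duplicated
same = np.asarray(original)

print(original)            # [1 2 3] (unchanged by the edit to `copied`)
print(same is original)    # True
```

Using `np.asarray` at function boundaries is a common way to accept lists and arrays alike without paying for an extra copy.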


### Built-in Array Creation Functions

This section covers built-in NumPy functions for creating arrays with predefined values or structures. The code block below demonstrates the use of functions like `np.zeros()`, `np.ones()`, `np.eye()`, `np.identity()`, `np.full()`, `np.full_like()`, and `np.empty()` for creating arrays filled with zeros, ones, identity matrices, constant values, or uninitialized values.

In [None]:
# Zeros and ones
zeros_1d = np.zeros(5)                  # [0. 0. 0. 0. 0.]
zeros_2d = np.zeros((3, 4))             # 3x4 matrix of zeros
ones_1d = np.ones(5)                    # [1. 1. 1. 1. 1.]
ones_2d = np.ones((2, 3))               # 2x3 matrix of ones

# Identity matrix
identity_3x3 = np.eye(3)                # 3x3 identity matrix
identity_4x4 = np.identity(4)           # 4x4 identity matrix

# Fill with specific value
full_2x3 = np.full((2, 3), 7)           # 2x3 matrix filled with 7s
full_like_arr = np.full_like(arr, 9)    # Array with same shape and type as 'arr', filled with 9s

# Empty arrays
empty_1d = np.empty(5)                  # 1D array with uninitialized (arbitrary) values
empty_2x3 = np.empty((2, 3))            # 2x3 matrix with uninitialized (arbitrary) values

print("Zeros 1D:\n", zeros_1d)
print("\nZeros 2D:\n", zeros_2d)
print("\nOnes 1D:\n", ones_1d)
print("\nOnes 2D:\n", ones_2d)
print("\nIdentity 3x3:\n", identity_3x3)
print("\nIdentity 4x4:\n", identity_4x4)
print("\nFull 2x3 (filled with 7):\n", full_2x3)
print("\nFull like arr (filled with 9):\n", full_like_arr)
print("\nEmpty 1D:\n", empty_1d)
print("\nEmpty 2x3:\n", empty_2x3)

Zeros 1D:
 [0. 0. 0. 0. 0.]

Zeros 2D:
 [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]

Ones 1D:
 [1. 1. 1. 1. 1.]

Ones 2D:
 [[1. 1. 1.]
 [1. 1. 1.]]

Identity 3x3:
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Identity 4x4:
 [[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]

Full 2x3 (filled with 7):
 [[7 7 7]
 [7 7 7]]

Full like arr (filled with 9):
 [9 9 9 9 9]

Empty 1D:
 [1. 1. 1. 1. 1.]

Empty 2x3:
 [[3.5e-323 3.5e-323 3.5e-323]
 [3.5e-323 3.5e-323 3.5e-323]]
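
One point worth illustrating is that `np.full_like()` takes its dtype from the template array, which can silently change the fill value. The sketch below uses illustrative names (`int_arr`, `filled_int`, `filled_float`, `upper_diag`) that are not part of the notebook, and also shows the `k` offset that `np.eye()` accepts.

```python
import numpy as np

int_arr = np.array([1, 2, 3])

# full_like reuses the dtype of its template, so a float fill value
# is cast to the template's integer dtype
filled_int = np.full_like(int_arr, 2.7)

# Passing dtype explicitly keeps the fractional part
filled_float = np.full_like(int_arr, 2.7, dtype=np.float64)

# np.eye accepts a diagonal offset k (here: ones on the first superdiagonal)
upper_diag = np.eye(3, k=1)

print(filled_int)      # [2 2 2]
print(filled_float)    # [2.7 2.7 2.7]
print(upper_diag)
```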


### Range and Random Arrays

This section explores creating arrays with sequences of numbers and arrays with random values. The code block below illustrates the use of `np.arange()` and `np.linspace()` for generating arrays with specific ranges and intervals, and functions like `np.random.random()`, `np.random.randint()`, `np.random.randn()`, `np.random.normal()`, and `np.random.uniform()` for creating arrays with different types of random distributions. It also shows how to set a random seed for reproducibility.

In [None]:
# Range arrays
arange_10 = np.arange(10)                     # [0 1 2 3 4 5 6 7 8 9]
arange_step = np.arange(2, 10, 2)             # [2 4 6 8]
linspace_0_1 = np.linspace(0, 1, 5)           # [0. 0.25 0.5 0.75 1.]

# Random arrays
random_5 = np.random.random(5)                # Random floats [0, 1)
randint_0_10 = np.random.randint(0, 10, 5)    # Random integers [0, 10)
randint_shape = np.random.randint(1, 101, (2, 2, 4))  # Random integers with shape
randn_3x3 = np.random.randn(3, 3)              # Standard normal distribution
normal_0_1 = np.random.normal(0, 1, 5)         # Normal distribution (mean=0, std=1)
uniform_neg1_1 = np.random.uniform(-1, 1, 5)   # Uniform distribution [-1, 1)
random_scaled = np.random.random((3, 2)) * 10  # Random values scaled by 10

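# Set random seed for reproducibility.
# Note: seeding only affects draws made AFTER this call, so in practice
# call np.random.seed() before generating the arrays above.
np.random.seed(42)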

print("arange(10):\n", arange_10)
print("\narange(2, 10, 2):\n", arange_step)
print("\nlinspace(0, 1, 5):\n", linspace_0_1)
print("\nrandom(5):\n", random_5)
print("\nrandint(0, 10, 5):\n", randint_0_10)
print("\nrandn(3, 3):\n", randn_3x3)
print("\nnormal(0, 1, 5):\n", normal_0_1)
print("\nuniform(-1, 1, 5):\n", uniform_neg1_1)
print("\nrandom((3, 2)) * 10:\n", random_scaled)

arange(10):
 [0 1 2 3 4 5 6 7 8 9]

arange(2, 10, 2):
 [2 4 6 8]

linspace(0, 1, 5):
 [0.   0.25 0.5  0.75 1.  ]

random(5):
 [0.21166096 0.66188846 0.77461002 0.16320478 0.05844617]

randint(0, 10, 5):
 [9 3 7 7 5]

randn(3, 3):
 [[ 1.34951762  0.082762    1.38209688]
 [ 1.43487508  0.46452473  0.51660896]
 [-1.00594207  1.81924834  0.59422461]]

normal(0, 1, 5):
 [-0.6230445  -0.29667177 -0.53902113  0.66310652  0.27830356]

uniform(-1, 1, 5):
 [ 0.81585103 -0.67492526  0.01677472  0.14994315 -0.10663232]

random((3, 2)) * 10:
 [[4.0641013  5.19103991]
 [3.77796403 8.10860487]
 [9.62372254 1.66370371]]
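
Reproducibility depends on seeding before any numbers are drawn. The sketch below is one way to check this, first with the legacy global seed and then with NumPy's `Generator` API via `np.random.default_rng()`. The variable names are illustrative and not part of the notebook.

```python
import numpy as np

# Legacy global seeding: the seed must be set BEFORE drawing numbers
np.random.seed(42)
first_draw = np.random.random(3)
np.random.seed(42)
second_draw = np.random.random(3)        # same seed, same sequence

# Generator API: each generator object carries its own independent state
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
ints_a = rng_a.integers(0, 10, size=5)   # integers in [0, 10)
ints_b = rng_b.integers(0, 10, size=5)

print(np.array_equal(first_draw, second_draw))   # True
print(np.array_equal(ints_a, ints_b))            # True
```

A local generator avoids hidden global state, which tends to matter once several parts of a program draw random numbers.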


## 2.3 Array Properties and Information
* * *

This section details how to access and understand the properties and information of NumPy arrays. The code block below demonstrates how to retrieve attributes such as shape, size, dimension, data type, item size, total bytes, flags, and data buffer. It also shows examples of automatic data type inference and explicit type specification/conversion.

In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Basic properties
arr_shape = arr.shape          # (2, 3) - dimensions
arr_size = arr.size            # 6 - total number of elements
arr_ndim = arr.ndim            # 2 - number of dimensions
arr_dtype = arr.dtype          # data type (int64, float64, etc.)
arr_itemsize = arr.itemsize       # bytes per element
arr_nbytes = arr.nbytes        # total bytes consumed

# Memory layout
arr_flags = arr.flags          # memory layout information
arr_data = arr.data            # buffer containing actual data

# NumPy automatically chooses data types
x = np.array([1, 2])        # dtype: int64 (default integer; platform-dependent)
y = np.array([1, 2.0])      # dtype: float64
z = np.array([1, 2], dtype=np.int32)  # dtype: int32

# Type conversion during array creation
arr_float32 = np.array([1, 2.6], dtype=np.float32)  # Forces float32

print("arr.shape:", arr_shape)
print("arr.size:", arr_size)
print("arr.ndim:", arr_ndim)
print("arr.dtype:", arr_dtype)
print("arr.itemsize:", arr_itemsize)
print("arr.nbytes:", arr_nbytes)
print("\n")
print("arr.flags:", arr_flags)
print("arr.data:", arr_data)
print("\n")
print("x:", x, "dtype:", x.dtype)
print("y:", y, "dtype:", y.dtype)
print("z:", z, "dtype:", z.dtype)
print("\n")
print("arr_float32:", arr_float32)

arr.shape: (2, 3)
arr.size: 6
arr.ndim: 2
arr.dtype: int64
arr.itemsize: 8
arr.nbytes: 48


arr.flags:   C_CONTIGUOUS : True
  F_CONTIGUOUS : False
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  WRITEBACKIFCOPY : False

arr.data: <memory at 0x7cabb9772a80>


x: [1 2] dtype: int64
y: [1. 2.] dtype: float64
z: [1 2] dtype: int32


arr_float32: [1.  2.6]
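
The relationship between `size`, `itemsize`, and `nbytes` can be checked directly. In this small sketch the dtypes are set explicitly so the numbers do not depend on the platform; `arr64` and `arr8` are illustrative names.

```python
import numpy as np

arr64 = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)
arr8 = arr64.astype(np.int8)    # same values, 1 byte per element

# nbytes is always size * itemsize
print(arr64.nbytes, arr64.size * arr64.itemsize)   # 48 48
print(arr8.nbytes, arr8.size * arr8.itemsize)      # 6 6
```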


## 2.4 Array Indexing and Slicing
---

### Basic Indexing

This section introduces basic indexing techniques for accessing elements in 1D NumPy arrays. The code block below shows how to access single elements using positive and negative indices and how to use slicing to extract portions of the array, including stepping and reversing.

In [None]:
arr = np.array([0, 1, 2, 3, 4, 5])

# Single element
first_element = arr[0]            # 0
last_element = arr[-1]            # 5 (last element)

# Slicing
slice_1_to_4 = arr[1:4]           # [1 2 3]
every_2nd_element = arr[::2]      # [0 2 4] (every 2nd element)
reversed_arr = arr[::-1]          # [5 4 3 2 1 0] (reverse)

print("First element:", first_element)
print("Last element:", last_element)
print("Slice from index 1 to 4:", slice_1_to_4)
print("Every 2nd element:", every_2nd_element)
print("Reversed array:", reversed_arr)

First element: 0
Last element: 5
Slice from index 1 to 4: [1 2 3]
Every 2nd element: [0 2 4]
Reversed array: [5 4 3 2 1 0]
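
A detail that often surprises newcomers is that basic slices are views onto the original data rather than copies. The following sketch (with illustrative names `view` and `independent`) shows a write through a slice and how `.copy()` avoids it.

```python
import numpy as np

arr = np.array([0, 1, 2, 3, 4, 5])

view = arr[1:4]                  # basic slicing returns a view
view[0] = 100                    # writes through to arr

independent = arr[1:4].copy()    # explicit copy owns its own data
independent[1] = -1              # does not affect arr

print(arr)                                   # [  0 100   2   3   4   5]
print(np.shares_memory(arr, view))           # True
print(np.shares_memory(arr, independent))    # False
```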


### Multi-dimensional Indexing

This section explains indexing for multi-dimensional NumPy arrays. The code block below demonstrates how to access individual elements using row and column indices, slice rows and columns, and extract subarrays using slicing on both dimensions.

In [None]:
arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Element access
element_0_1 = arr_2d[0, 1]       # 2
element_1_2 = arr_2d[1][2]       # 6 (chained indexing; works, but arr_2d[1, 2] is preferred)

# Row and column slicing
first_row = arr_2d[0, :]         # [1 2 3] (first row)
second_column = arr_2d[:, 1]     # [2 5 8] (second column)
subarray = arr_2d[0:2, 1:3]      # [[2 3], [5 6]] (subarray)

print("Element at [0, 1]:", element_0_1)
print("Element at [1][2]:", element_1_2)
print("First row:", first_row)
print("Second column:", second_column)
print("Subarray from [0:2, 1:3]:\n", subarray)

Element at [0, 1]: 2
Element at [1][2]: 6
First row: [1 2 3]
Second column: [2 5 8]
Subarray from [0:2, 1:3]:
 [[2 3]
 [5 6]]
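
Whether an index is an integer or a slice determines how many dimensions the result keeps. This short sketch, using illustrative names, compares the shapes produced by selecting the same column in three ways.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

col_1d = arr_2d[:, 1]        # integer index drops the axis: shape (3,)
col_2d = arr_2d[:, 1:2]      # slice keeps the axis: shape (3, 1)
col_list = arr_2d[:, [1]]    # list index also keeps the axis: shape (3, 1)

print(col_1d.shape, col_2d.shape, col_list.shape)   # (3,) (3, 1) (3, 1)
```

Keeping the column as shape `(3, 1)` is convenient when the result must later broadcast against a 2D array.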


### Advanced Indexing

This section covers more advanced indexing methods in NumPy, including boolean indexing and fancy indexing. The code block below shows how to select elements based on a boolean condition (mask) and how to select elements using a list or array of indices. It also includes an example of multi-dimensional boolean indexing.

In [None]:
arr = np.array([10, 20, 30, 40, 50])

# Boolean indexing
mask = arr > 25
arr_mask = arr[mask]                 # [30 40 50]
arr_direct_mask = arr[arr > 25]      # [30 40 50] (direct)

# Fancy indexing
indices = [0, 2, 4]
arr_indices = arr[indices]           # [10 30 50]

# Multi-dimensional boolean indexing
arr_2d = np.random.randint(0, 10, (3, 3))
arr_2d_mask = arr_2d[arr_2d > 5]     # All elements > 5

print("Boolean indexing with mask:", arr_mask)
print("Boolean indexing direct:", arr_direct_mask)
print("Fancy indexing:", arr_indices)
print("Multi-dimensional boolean indexing (elements > 5):", arr_2d_mask)
print("\nOriginal 2D array for multi-dimensional boolean indexing:\n", arr_2d)

Boolean indexing with mask: [30 40 50]
Boolean indexing direct: [30 40 50]
Fancy indexing: [10 30 50]
Multi-dimensional boolean indexing (elements > 5): [7 7 7]

Original 2D array for multi-dimensional boolean indexing:
 [[4 3 7]
 [7 2 5]
 [4 1 7]]
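
Boolean masks can be combined and can also be used on the left-hand side of an assignment. The sketch below is a small illustration with names (`in_range`, `capped`, `picked`) that are not part of the notebook. It also shows that fancy indexing returns a copy, unlike basic slicing.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# Combine conditions with & and |; each condition needs its own parentheses
in_range = arr[(arr > 15) & (arr < 45)]

# Masked assignment modifies the selected elements in place
capped = arr.copy()
capped[capped > 25] = 25

# Fancy indexing returns a copy, so editing it leaves arr untouched
picked = arr[[0, 2, 4]]
picked[0] = -1

print(in_range)    # [20 30 40]
print(capped)      # [10 20 25 25 25]
print(arr)         # [10 20 30 40 50]
```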


## 2.5 Array Operations
* * *

This section explores various operations that can be performed on NumPy arrays.

### Arithmetic Operations

This section demonstrates element-wise and scalar arithmetic operations on NumPy arrays. The code block below shows examples of addition, subtraction, multiplication, division, exponentiation, and modulus operations between arrays and between an array and a scalar. It also shows the corresponding NumPy functions for these operations.

In [None]:
a = np.array([1, 2, 3, 4])
b = np.array([5, 6, 7, 8])

# Element-wise operations
sum_ab = a + b               # [6 8 10 12]
diff_ab = a - b              # [-4 -4 -4 -4]
prod_ab = a * b              # [5 12 21 32]
div_ab = a / b               # [0.2 0.33 0.43 0.5]
a_squared = a ** 2             # [1 4 9 16]
a_mod_3 = a % 3              # [1 2 0 1]

# Scalar operations
a_plus_10 = a + 10           # [11 12 13 14]
a_times_2 = a * 2            # [2 4 6 8]

# Also available as functions
add_func = np.add(a, b)            # Same as a + b
subtract_func = np.subtract(a, b)  # Same as a - b
multiply_func = np.multiply(a, b)  # Same as a * b
divide_func = np.divide(a, b)      # Same as a / b

print("a + b:", sum_ab)
print("a - b:", diff_ab)
print("a * b:", prod_ab)
print("a / b:", div_ab)
print("a ** 2:", a_squared)
print("a % 3:", a_mod_3)
print("\n")
print("a + 10:", a_plus_10)
print("a * 2:", a_times_2)
print("\n")
print("np.add(a, b):", add_func)
print("np.subtract(a, b):", subtract_func)
print("np.multiply(a, b):", multiply_func)
print("np.divide(a, b):", divide_func)

a + b: [ 6  8 10 12]
a - b: [-4 -4 -4 -4]
a * b: [ 5 12 21 32]
a / b: [0.2        0.33333333 0.42857143 0.5       ]
a ** 2: [ 1  4  9 16]
a % 3: [1 2 0 1]


a + 10: [11 12 13 14]
a * 2: [2 4 6 8]


np.add(a, b): [ 6  8 10 12]
np.subtract(a, b): [-4 -4 -4 -4]
np.multiply(a, b): [ 5 12 21 32]
np.divide(a, b): [0.2        0.33333333 0.42857143 0.5       ]
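
Division is the arithmetic operation most likely to need care. As a hedged sketch with illustrative inputs, the block below contrasts true division with floor division and uses the `out` and `where` arguments of `np.divide()` to skip zero denominators instead of producing `inf` or `nan`.

```python
import numpy as np

values = np.array([7, 8, 9])
true_div = values / 2       # always returns floats
floor_div = values // 2     # floor division keeps the integer dtype

numer = np.array([1, 2, 3, 4])
denom = np.array([2, 0, 2, 0])

# Divide only where the denominator is non-zero; other slots keep the
# value already stored in `out` (0.0 here)
safe = np.divide(numer, denom,
                 out=np.zeros(numer.shape, dtype=float),
                 where=denom != 0)

print(true_div)     # [3.5 4.  4.5]
print(floor_div)    # [3 4 4]
print(safe)         # [0.5 0.  1.5 0. ]
```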


### Mathematical Functions

This section covers a variety of mathematical functions available in NumPy that can be applied element-wise to arrays. The code block below illustrates the use of basic math functions like `sqrt`, `square`, `abs`, and `sign`, trigonometric functions like `sin`, `cos`, and `tan`, exponential and logarithmic functions like `exp`, `log`, `log10`, and `log2`, and rounding functions like `round`, `floor`, and `ceil`.

In [None]:
arr = np.array([1, 4, 9, 16])

# Basic math
sqrt_arr = np.sqrt(arr)         # [1. 2. 3. 4.]
square_arr = np.square(arr)     # [1 16 81 256]
abs_arr = np.abs(arr)           # Absolute value
sign_arr = np.sign(arr)         # Sign of elements

# Trigonometric
sin_arr = np.sin(arr)
cos_arr = np.cos(arr)
tan_arr = np.tan(arr)

# Exponential and logarithmic
exp_arr = np.exp(arr)           # e^x (exponential function applies element-wise)
log_arr = np.log(arr)           # Natural log
log10_arr = np.log10(arr)       # Base-10 log
log2_arr = np.log2(arr)         # Base-2 log

# Rounding
arr_float = np.array([1.23, 4.56, 7.89])
round_arr_float = np.round(arr_float, 1)    # [1.2 4.6 7.9]
floor_arr_float = np.floor(arr_float)       # [1. 4. 7.]
ceil_arr_float = np.ceil(arr_float)         # [2. 5. 8.]

print("np.sqrt(arr):", sqrt_arr)
print("np.square(arr):", square_arr)
print("np.abs(arr):", abs_arr)
print("np.sign(arr):", sign_arr)
print("\n")
print("np.sin(arr):", sin_arr)
print("np.cos(arr):", cos_arr)
print("np.tan(arr):", tan_arr)
print("\n")
print("np.exp(arr):", exp_arr)
print("np.log(arr):", log_arr)
print("np.log10(arr):", log10_arr)
print("np.log2(arr):", log2_arr)
print("\n")
print("np.round(arr_float, 1):", round_arr_float)
print("np.floor(arr_float):", floor_arr_float)
print("np.ceil(arr_float):", ceil_arr_float)

np.sqrt(arr): [1. 2. 3. 4.]
np.square(arr): [  1  16  81 256]
np.abs(arr): [ 1  4  9 16]
np.sign(arr): [1 1 1 1]


np.sin(arr): [ 0.84147098 -0.7568025   0.41211849 -0.28790332]
np.cos(arr): [ 0.54030231 -0.65364362 -0.91113026 -0.95765948]
np.tan(arr): [ 1.55740772  1.15782128 -0.45231566  0.30063224]


np.exp(arr): [2.71828183e+00 5.45981500e+01 8.10308393e+03 8.88611052e+06]
np.log(arr): [0.         1.38629436 2.19722458 2.77258872]
np.log10(arr): [0.         0.60205999 0.95424251 1.20411998]
np.log2(arr): [0.       2.       3.169925 4.      ]


np.round(arr_float, 1): [1.2 4.6 7.9]
np.floor(arr_float): [1. 4. 7.]
np.ceil(arr_float): [2. 5. 8.]
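
The rounding functions differ in ways that the positive examples above do not reveal. This sketch, with illustrative inputs, shows that `np.round()` rounds exact halves to the nearest even value, and compares `floor`, `ceil`, and `trunc` on a negative number.

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])
rounded = np.round(halves)      # halves go to the nearest even integer

values = np.array([-1.7, 1.7])
floored = np.floor(values)      # toward negative infinity
ceiled = np.ceil(values)        # toward positive infinity
truncated = np.trunc(values)    # toward zero

print(rounded)      # [0. 2. 2. 4.]
print(floored)      # [-2.  1.]
print(ceiled)       # [-1.  2.]
print(truncated)    # [-1.  1.]
```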


## 2.6 Vectorization and Array Operations
* * *
NumPy's key strength is vectorization: applying operations to entire arrays without explicit Python loops.

The code block below shows vector-scalar operations, element-wise functions, and an operation that relies on broadcasting. In each case NumPy processes the whole array in a single call rather than one element at a time.

In [None]:
# Vectorized operations are typically much faster than equivalent Python loops
x = np.array([1, 2, 3])

# Vector-scalar operations
result_plus = x + 3          # [4, 5, 6]
result_minus = x - 3         # [-2, -1, 0]
result_divide = 1 / x        # [1.0, 0.5, 0.33...]

# All operations are element-wise by default
# If x = (x1, x2, ..., xn), then:
# np.exp(x) = (e^x1, e^x2, ..., e^xn)
x_exp = np.exp(x)            # [2.718, 7.389, 20.086]

# Vector operations with broadcasting
x_2d = np.array([[1, 2], [3, 4]])
y_1d = np.array([10, 20])
result_broadcast = x_2d + y_1d      # Broadcasting: adds [10, 20] to each row

print("x + 3:", result_plus)
print("x - 3:", result_minus)
print("1 / x:", result_divide)
print("\n")
print("np.exp(x):", x_exp)
print("\n")
print("Broadcasting (x_2d + y_1d):\n", result_broadcast)

x + 3: [4 5 6]
x - 3: [-2 -1  0]
1 / x: [1.         0.5        0.33333333]


np.exp(x): [ 2.71828183  7.3890561  20.08553692]


Broadcasting (x_2d + y_1d):
 [[11 22]
 [13 24]]
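
To make the comparison with loops concrete, the sketch below computes the same result with a Python loop and with a vectorized expression, then times both using the standard-library `timeit` module. The function names and array size are assumptions chosen for illustration, and the measured times will vary by machine.

```python
import timeit
import numpy as np

values = np.arange(100_000, dtype=np.float64)

def loop_version(data):
    # Explicit Python loop: one interpreter step per element
    return [v * 2.0 + 1.0 for v in data]

def vectorized_version(data):
    # Single expression evaluated over the whole array
    return data * 2.0 + 1.0

loop_result = np.array(loop_version(values))
vec_result = vectorized_version(values)
print(np.allclose(loop_result, vec_result))   # True: both give the same values

loop_time = timeit.timeit(lambda: loop_version(values), number=5)
vec_time = timeit.timeit(lambda: vectorized_version(values), number=5)
print(f"loop: {loop_time:.4f}s  vectorized: {vec_time:.4f}s")
```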


## 2.7 Data Type Considerations
* * *

This section focuses on how NumPy handles data types, including automatic inference and explicit specification. The code block below shows how NumPy infers the data type based on the input data, how to explicitly specify the data type using the `dtype` argument during array creation, and how to convert the data type of an existing array using the `.astype()` method.

In [None]:
# NumPy automatically infers data types
arr_int = np.array([1, 2, 3])          # int64
arr_float = np.array([1.0, 2.0, 3.0])  # float64
arr_mixed = np.array([1, 2.0, 3])      # float64 (upcasts to float)

# Explicit data type specification
arr_int32 = np.array([1, 2, 3], dtype=np.int32)
arr_float32 = np.array([1, 2, 3], dtype=np.float32)

# Data type conversion
arr_for_conversion = np.array([1.6, 2.7, 3.8])
arr_int_converted = arr_for_conversion.astype(np.int32)  # [1, 2, 3] - truncates toward zero (no rounding)

print("arr_int:", arr_int, "dtype:", arr_int.dtype)
print("arr_float:", arr_float, "dtype:", arr_float.dtype)
print("arr_mixed:", arr_mixed, "dtype:", arr_mixed.dtype)
print("arr_int32:", arr_int32, "dtype:", arr_int32.dtype)
print("arr_float32:", arr_float32, "dtype:", arr_float32.dtype)
print("arr_for_conversion:", arr_for_conversion, "dtype:", arr_for_conversion.dtype)
print("arr_int_converted:", arr_int_converted, "dtype:", arr_int_converted.dtype)

arr_int: [1 2 3] dtype: int64
arr_float: [1. 2. 3.] dtype: float64
arr_mixed: [1. 2. 3.] dtype: float64
arr_int32: [1 2 3] dtype: int32
arr_float32: [1. 2. 3.] dtype: float32
arr_for_conversion: [1.6 2.7 3.8] dtype: float64
arr_int_converted: [1 2 3] dtype: int32
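
Two consequences of dtype choice are easy to miss: `astype` to an integer type truncates rather than rounds, and small integer types wrap around on overflow. The following sketch uses illustrative values to show both, along with one possible workaround for each.

```python
import numpy as np

floats = np.array([1.6, 2.7, -3.8])
truncated = floats.astype(np.int32)           # drops the fractional part
rounded = np.rint(floats).astype(np.int32)    # round first, then convert

small = np.array([200, 100], dtype=np.uint8)
wrapped = small + small                       # uint8 arithmetic wraps modulo 256
widened = small.astype(np.int32) + small      # widen first to avoid the wrap

print(truncated)    # [ 1  2 -3]
print(rounded)      # [ 2  3 -4]
print(wrapped)      # [144 200]
print(widened)      # [400 200]
```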


## 2.8 Statistical Operations
* * *

This section covers common statistical operations that can be performed on NumPy arrays. The code block below demonstrates how to calculate basic statistics such as mean, median, standard deviation, variance, minimum, maximum, and sum of array elements. It also shows how to perform these operations along specific axes (rows or columns) and introduces other statistical functions like percentile, quantile, correlation coefficient, and covariance.

In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Basic statistics
mean_arr = np.mean(arr)      # 5.0
median_arr = np.median(arr)  # 5.0
std_arr = np.std(arr)        # Standard deviation
var_arr = np.var(arr)        # Variance
min_arr = np.min(arr)        # 1
max_arr = np.max(arr)        # 9
sum_arr = np.sum(arr)        # 45

# Along axes
mean_axis0 = np.mean(arr, axis=0)    # [4. 5. 6.] (column means)
mean_axis1 = np.mean(arr, axis=1)    # [2. 5. 8.] (row means)
sum_axis0 = np.sum(arr, axis=0)      # [12 15 18] (column sums)
sum_axis1 = np.sum(arr, axis=1)      # [6 15 24] (row sums)


# Other statistical functions
percentile_50 = np.percentile(arr, 50) # 50th percentile (median)
quantile_75 = np.quantile(arr, 0.75)   # 75th percentile
corrcoef_arr = np.corrcoef(arr)        # Correlation matrix (each row is treated as a variable)
cov_arr = np.cov(arr)                  # Covariance matrix (each row is treated as a variable)

print("Mean of array:", mean_arr)
print("Median of array:", median_arr)
print("Standard deviation of array:", std_arr)
print("Variance of array:", var_arr)
print("Minimum element of array:", min_arr)
print("Maximum element of array:", max_arr)
print("Sum of array:", sum_arr)
print("\n")
print("Mean along axis 0:", mean_axis0)
print("Mean along axis 1:", mean_axis1)
print("Sum along axis 0:", sum_axis0)
print("Sum along axis 1:", sum_axis1)
print("\n")
print("50th percentile:", percentile_50)
print("75th percentile:", quantile_75)
print("Correlation matrix:\n", corrcoef_arr)
print("Covariance matrix:\n", cov_arr)

Mean of array: 5.0
Median of array: 5.0
Standard deviation of array: 2.581988897471611
Variance of array: 6.666666666666667
Minimum element of array: 1
Maximum element of array: 9
Sum of array: 45


Mean along axis 0: [4. 5. 6.]
Mean along axis 1: [2. 5. 8.]
Sum along axis 0: [12 15 18]
Sum along axis 1: [ 6 15 24]


50th percentile: 5.0
75th percentile: 7.0
Correlation matrix:
 [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
Covariance matrix:
 [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
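
A few arguments change what these statistics mean. As a sketch with made-up data, the block below shows `ddof=1` for the sample variance, `keepdims=True` for centering columns, and `rowvar=False` so that `np.corrcoef()` treats columns rather than rows as variables.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])
pop_var = np.var(data)               # divides by N
sample_var = np.var(data, ddof=1)    # divides by N - 1

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
col_means = arr.mean(axis=0, keepdims=True)   # shape (1, 3), broadcasts over rows
centered = arr - col_means

# 3 observations (rows) of 2 variables (columns)
samples = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 0.0]])
corr = np.corrcoef(samples, rowvar=False)     # 2x2 correlation matrix

print(pop_var, sample_var)
print(centered)
print(corr)
```

With the default `rowvar=True`, the same call would instead return a 3x3 matrix relating the three rows.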


## 2.9 Array Manipulation
* * *

This section explores various techniques for manipulating the structure and content of NumPy arrays.

### Reshaping

This section demonstrates how to change the shape or dimensions of a NumPy array. The code block below shows how to use the `.reshape()` method to create arrays with different dimensions (including using -1 for auto-calculation) and how to flatten a multi-dimensional array into a 1D array using `.flatten()` and `.ravel()`.

In [None]:
arr = np.arange(12)

# Reshape
reshape_3_4 = arr.reshape(3, 4)         # 3x4 matrix
reshape_neg1_3 = arr.reshape(-1, 3)     # Auto-calculate rows, 3 columns
reshape_2_2_3 = arr.reshape(2, 2, 3)    # 3D array

# Flatten
arr_2d = arr.reshape(3, 4)
flattened_arr = arr_2d.flatten()        # 1D array (copy)
raveled_arr = arr_2d.ravel()            # 1D array (view if possible)

print("Original array:", arr)
print("\nReshape to (3, 4):\n", reshape_3_4)
print("\nReshape to (-1, 3):\n", reshape_neg1_3)
print("\nReshape to (2, 2, 3):\n", reshape_2_2_3)
print("\nFlattened array:", flattened_arr)
print("\nRaveled array:", raveled_arr)

Original array: [ 0  1  2  3  4  5  6  7  8  9 10 11]

Reshape to (3, 4):
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

Reshape to (-1, 3):
 [[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]

Reshape to (2, 2, 3):
 [[[ 0  1  2]
  [ 3  4  5]]

 [[ 6  7  8]
  [ 9 10 11]]]

Flattened array: [ 0  1  2  3  4  5  6  7  8  9 10 11]

Raveled array: [ 0  1  2  3  4  5  6  7  8  9 10 11]
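
The difference between `.flatten()` and `.ravel()` is only visible when checking memory. This sketch uses `np.shares_memory()` to show that `flatten` always copies, while `ravel` returns a view when the data is contiguous and falls back to a copy otherwise (for example, on a transposed array).

```python
import numpy as np

arr_2d = np.arange(12).reshape(3, 4)

flat_copy = arr_2d.flatten()            # always a copy
flat_view = arr_2d.ravel()              # view, because arr_2d is C-contiguous
transposed_flat = arr_2d.T.ravel()      # copy, because arr_2d.T is not C-contiguous

print(np.shares_memory(arr_2d, flat_copy))          # False
print(np.shares_memory(arr_2d, flat_view))          # True
print(np.shares_memory(arr_2d, transposed_flat))    # False
```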


### Joining and Splitting

This section covers methods for combining multiple NumPy arrays and splitting a single array into multiple parts. The code block below illustrates the use of `np.concatenate()`, `np.vstack()`, and `np.hstack()` to join arrays vertically or horizontally, and `np.split()`, `np.hsplit()`, and `np.vsplit()` to split arrays along different axes.

In [None]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Concatenation
concatenated_axis0 = np.concatenate([a, b], axis=0)  # Vertical stack
concatenated_axis1 = np.concatenate([a, b], axis=1)  # Horizontal stack
vstack_ab = np.vstack([a, b])                        # Vertical stack
hstack_ab = np.hstack([a, b])                        # Horizontal stack

# Splitting
arr = np.arange(12).reshape(3, 4)
split_axis0 = np.split(arr, 3, axis=0)               # Split into 3 parts along rows
hsplit_arr = np.hsplit(arr, 2)                       # Horizontal split
vsplit_arr = np.vsplit(arr, 3)                       # Vertical split

print("Concatenated along axis 0:\n", concatenated_axis0)
print("\nConcatenated along axis 1:\n", concatenated_axis1)
print("\nvstack(a, b):\n", vstack_ab)
print("\nhstack(a, b):\n", hstack_ab)
print("\nSplit along axis 0 into 3 parts:\n", split_axis0)
print("\nhsplit(arr, 2):\n", hsplit_arr)
print("\nvsplit(arr, 3):\n", vsplit_arr)

Concatenated along axis 0:
 [[1 2]
 [3 4]
 [5 6]
 [7 8]]

Concatenated along axis 1:
 [[1 2 5 6]
 [3 4 7 8]]

vstack(a, b):
 [[1 2]
 [3 4]
 [5 6]
 [7 8]]

hstack(a, b):
 [[1 2 5 6]
 [3 4 7 8]]

Split along axis 0 into 3 parts:
 [array([[0, 1, 2, 3]]), array([[4, 5, 6, 7]]), array([[ 8,  9, 10, 11]])]

hsplit(arr, 2):
 [array([[0, 1],
       [4, 5],
       [8, 9]]), array([[ 2,  3],
       [ 6,  7],
       [10, 11]])]

vsplit(arr, 3):
 [array([[0, 1, 2, 3]]), array([[4, 5, 6, 7]]), array([[ 8,  9, 10, 11]])]
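
`np.split()` requires an equal division, which the examples above happen to satisfy. The sketch below, with illustrative names, shows the error raised for an uneven split, the more forgiving `np.array_split()`, and how `np.stack()` differs from `np.concatenate()` by adding a new axis.

```python
import numpy as np

arr = np.arange(10)

# np.split raises ValueError when the array cannot be divided equally
try:
    np.split(arr, 3)
    split_failed = False
except ValueError:
    split_failed = True

# np.array_split allows uneven parts (here sizes 4, 3, 3)
parts = np.array_split(arr, 3)

a = np.array([1, 2])
b = np.array([3, 4])
joined = np.concatenate([a, b])    # joins along the existing axis: shape (4,)
stacked = np.stack([a, b])         # adds a new leading axis: shape (2, 2)

print(split_failed)                # True
print([len(p) for p in parts])     # [4, 3, 3]
print(joined.shape, stacked.shape) # (4,) (2, 2)
```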


### Adding and Removing Elements

This section demonstrates how to add or remove elements from a NumPy array. The code block below shows the use of `np.append()` to add elements to the end of an array, `np.insert()` to insert elements at specific indices, and `np.delete()` to remove elements at specified indices.

In [None]:
arr = np.array([1, 2, 3, 4, 5])

# Append
append_6 = np.append(arr, 6)               # [1 2 3 4 5 6]
append_6_7 = np.append(arr, [6, 7])        # [1 2 3 4 5 6 7]

# Insert
insert_99 = np.insert(arr, 2, 99)          # [1 2 99 3 4 5]
insert_multi = np.insert(arr, [1, 3], [99, 88]) # Multiple insertions (indices refer to the original array)

# Delete
delete_2 = np.delete(arr, 2)               # [1 2 4 5] (remove index 2)
delete_multi = np.delete(arr, [1, 3])      # [1 3 5] (remove indices 1,3)

print("Original array:", arr)
print("\nAppend 6:", append_6)
print("\nAppend [6, 7]:", append_6_7)
print("\nInsert 99 at index 2:", insert_99)
print("\nInsert multiple values:", insert_multi)
print("\nDelete element at index 2:", delete_2)
print("\nDelete elements at indices 1 and 3:", delete_multi)

Original array: [1 2 3 4 5]

Append 6: [1 2 3 4 5 6]

Append [6, 7]: [1 2 3 4 5 6 7]

Insert 99 at index 2: [ 1  2 99  3  4  5]

Insert multiple values: [ 1 99  2  3 88  4  5]

Delete element at index 2: [1 2 4 5]

Delete elements at indices 1 and 3: [1 3 5]
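
One point the outputs above imply but do not state: `np.append()`, `np.insert()`, and `np.delete()` each return a new array and leave the original untouched, so calling them repeatedly in a loop copies the data every time. The sketch below illustrates this and shows the effect of the `axis` argument on a 2D array. The variable names are illustrative only.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Each call returns a new array; arr itself is never modified
appended = np.append(arr, 6)
print(arr)        # [1 2 3 4 5]
print(appended)   # [1 2 3 4 5 6]

# Without axis, np.append flattens its inputs first
grid = np.array([[1, 2], [3, 4]])
flat = np.append(grid, [[5, 6]])              # shape (6,)
rows = np.append(grid, [[5, 6]], axis=0)      # shape (3, 2)
no_col0 = np.delete(grid, 0, axis=1)          # drop the first column -> shape (2, 1)

# Growing an array element by element copies it on every step.
# Collecting values in a list and converting once is usually cheaper.
values = []
for i in range(5):
    values.append(i * i)
squares = np.array(values)
print(squares)    # [ 0  1  4  9 16]
```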


## 2.10 Linear Algebra
* * *

This section introduces essential linear algebra operations in NumPy. Some live at the top level (`np.dot()`, `np.matmul()`, `np.transpose()`, `np.trace()`), while others belong to the `np.linalg` module. The code block below demonstrates matrix multiplication using `np.dot()`, `@`, and `np.matmul()`, followed by transpose (`np.transpose()`, `.T`), inverse (`np.linalg.inv()`), determinant (`np.linalg.det()`), and trace (`np.trace()`). It also computes eigenvalues and eigenvectors (`np.linalg.eig()`), solves a linear system (`np.linalg.solve()`), and performs Singular Value Decomposition (SVD) with `np.linalg.svd()`. The closing comments highlight the difference between element-wise multiplication (`*`) and matrix multiplication.

In [None]:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Matrix multiplication
dot_ab = np.dot(a, b)                 # Matrix multiplication
matmul_ab_at = a @ b                  # Alternative syntax (Python 3.5+)
matmul_ab = np.matmul(a, b)           # Matrix multiplication

# Matrix operations
transpose_a = np.transpose(a)         # Transpose
a_T = a.T                             # Transpose (shorthand)
a_transpose_method = a.transpose()    # Transpose (method)
inv_a = np.linalg.inv(a)              # Inverse
det_a = np.linalg.det(a)              # Determinant
trace_a = np.trace(a)                 # Trace (sum of diagonal)

# Eigenvalues and eigenvectors
eigenvals, eigenvecs = np.linalg.eig(a)

# Solving linear systems (Ax = b)
A = np.array([[2, 1], [1, 3]])
b_vec = np.array([1, 2])
x = np.linalg.solve(A, b_vec)

# SVD (Singular Value Decomposition)
U, s, Vt = np.linalg.svd(a)

# Important: Element-wise vs Matrix operations
# * is element-wise multiplication
# Use np.dot(), @ or np.matmul() for matrix multiplication

print("\nMatrix multiplication (np.dot):\n", dot_ab)
print("\nMatrix multiplication (a @ b):\n", matmul_ab_at)
print("\nMatrix multiplication (np.matmul):\n", matmul_ab)
print("\nTranspose of a (np.transpose):\n", transpose_a)
print("\nTranspose of a (a.T):\n", a_T)
print("\nTranspose of a (a.transpose()):\n", a_transpose_method)
print("\nInverse of a:\n", inv_a)
print("\nDeterminant of a:", det_a)
print("\nTrace of a:", trace_a)
print("\nEigenvalues of a:", eigenvals)
print("\nEigenvectors of a:\n", eigenvecs)
print("\nSolution to Ax = b (x):\n", x)
print("\nSVD of a:\nU:\n", U)
print("\ns:\n", s)
print("\nVt:\n", Vt)


Matrix multiplication (np.dot):
 [[19 22]
 [43 50]]

Matrix multiplication (a @ b):
 [[19 22]
 [43 50]]

Matrix multiplication (np.matmul):
 [[19 22]
 [43 50]]

Transpose of a (np.transpose):
 [[1 3]
 [2 4]]

Transpose of a (a.T):
 [[1 3]
 [2 4]]

Transpose of a (a.transpose()):
 [[1 3]
 [2 4]]

Inverse of a:
 [[-2.   1. ]
 [ 1.5 -0.5]]

Determinant of a: -2.0000000000000004

Trace of a: 5

Eigenvalues of a: [-0.37228132  5.37228132]

Eigenvectors of a:
 [[-0.82456484 -0.41597356]
 [ 0.56576746 -0.90937671]]

Solution to Ax = b (x):
 [0.2 0.6]

SVD of a:
U:
 [[-0.40455358 -0.9145143 ]
 [-0.9145143   0.40455358]]

s:
 [5.4649857  0.36596619]

Vt:
 [[-0.57604844 -0.81741556]
 [ 0.81741556 -0.57604844]]
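
A quick way to sanity-check the results above is to substitute them back into their defining equations. The following sketch reuses the matrices from this section and compares with `np.allclose()`, since floating-point results are rarely exact (note the determinant printed as `-2.0000000000000004`). The `*_ok` variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
A = np.array([[2, 1], [1, 3]])
b_vec = np.array([1, 2])

# Inverse: a @ inv(a) should be the identity matrix
inv_a = np.linalg.inv(a)
inverse_ok = np.allclose(a @ inv_a, np.eye(2))

# Linear system: A @ x should reproduce b_vec
x = np.linalg.solve(A, b_vec)
solve_ok = np.allclose(A @ x, b_vec)

# Eigen-decomposition: a @ v = lambda * v for each column v of eigenvecs
eigenvals, eigenvecs = np.linalg.eig(a)
eig_ok = np.allclose(a @ eigenvecs, eigenvecs * eigenvals)

# SVD: U @ diag(s) @ Vt should rebuild a
U, s, Vt = np.linalg.svd(a)
svd_ok = np.allclose(U @ np.diag(s) @ Vt, a)

print(inverse_ok, solve_ok, eig_ok, svd_ok)
```

For solving `Ax = b`, `np.linalg.solve(A, b)` is generally preferable to `np.linalg.inv(A) @ b`, because it avoids forming the inverse explicitly.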


## 2.11 Broadcasting
* * *
Broadcasting lets NumPy apply operations to arrays of different shapes, and to arrays combined with scalars. The code block below adds a scalar to an array and adds a `(1, 3)` array to a `(3, 1)` array, showing how NumPy stretches size-1 dimensions so the shapes become compatible. It then lists the element-wise operators and their function equivalents, and contrasts them with matrix multiplication.

In [None]:
# Scalar and array
arr = np.array([1, 2, 3, 4])
arr_plus_5 = arr + 5                # [6 7 8 9]
print("Scalar and array addition (arr + 5):", arr_plus_5)

# Different shaped arrays
a = np.array([[1, 2, 3]])           # (1, 3)
b = np.array([[1], [2], [3]])       # (3, 1)
result_a_plus_b = a + b             # (3, 3) result
print("\nAddition of different shaped arrays (a + b):\n", result_a_plus_b)

# Broadcasting rules:
# 1. Shapes are compared from the rightmost dimension to the left
# 2. Missing leading dimensions are treated as size 1
# 3. Dimensions of size 1 are stretched to match the other array
# 4. Any other size mismatch raises a ValueError

# Element-wise operations on arrays of the same shape
x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])

# All these are element-wise operations
sum_xy = x + y
diff_xy = x - y
prod_xy = x * y
div_xy = x / y
pow_xy = x ** y
mod_xy = x % y
floor_div_xy = x // y

print("\nElement-wise addition (x + y):\n", sum_xy)
print("Element-wise subtraction (x - y):\n", diff_xy)
print("Element-wise multiplication (x * y):\n", prod_xy)
print("Element-wise division (x / y):\n", div_xy)
print("Element-wise exponentiation (x ** y):\n", pow_xy)
print("Element-wise modulus (x % y):\n", mod_xy)
print("Element-wise floor division (x // y):\n", floor_div_xy)
print("\nMatrix multiplication (x @ y):\n", x @ y)
print("Matrix multiplication (x.dot(y)):\n", x.dot(y))

print("\n\nAlso available as functions:\n")
print("np.add(x, y):\n", np.add(x, y))
print("np.subtract(x, y):\n", np.subtract(x, y))
print("np.multiply(x, y):\n", np.multiply(x, y))
print("np.divide(x, y):\n", np.divide(x, y))

Scalar and array addition (arr + 5): [6 7 8 9]

Addition of different shaped arrays (a + b):
 [[2 3 4]
 [3 4 5]
 [4 5 6]]

Element-wise addition (x + y):
 [[ 6  8]
 [10 12]]
Element-wise subtraction (x - y):
 [[-4 -4]
 [-4 -4]]
Element-wise multiplication (x * y):
 [[ 5 12]
 [21 32]]
Element-wise division (x / y):
 [[0.2        0.33333333]
 [0.42857143 0.5       ]]
Element-wise exponentiation (x ** y):
 [[    1    64]
 [ 2187 65536]]
Element-wise modulus (x % y):
 [[1 2]
 [3 4]]
Element-wise floor division (x // y):
 [[0 0]
 [0 0]]

Matrix multiplication (x @ y):
 [[19 22]
 [43 50]]
Matrix multiplication (x.dot(y)):
 [[19 22]
 [43 50]]


Also available as functions:

np.add(x, y):
 [[ 6  8]
 [10 12]]
np.subtract(x, y):
 [[-4 -4]
 [-4 -4]]
np.multiply(x, y):
 [[ 5 12]
 [21 32]]
np.divide(x, y):
 [[0.2        0.33333333]
 [0.42857143 0.5       ]]
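
The broadcasting rules listed in the code comments can be checked without doing any arithmetic. This sketch uses `np.broadcast_shapes()` (available in NumPy 1.20 and later) to predict result shapes, shows a shape pair that fails, and ends with a typical use: centering each column of a matrix. The array `X` is made-up illustration data.

```python
import numpy as np

# Shapes are compared right to left; a size of 1 (or a missing axis) stretches
print(np.broadcast_shapes((1, 3), (3, 1)))   # (3, 3)
print(np.broadcast_shapes((4, 3), (3,)))     # (4, 3)

# Sizes that differ and are not 1 cannot be broadcast
try:
    np.ones((4, 3)) + np.ones(4)
    compatible = True
except ValueError:
    compatible = False
print(compatible)                            # False

# Typical use: subtract the per-column mean from every row
X = np.array([[1.0, 10.0], [3.0, 20.0], [5.0, 30.0]])   # shape (3, 2)
col_means = X.mean(axis=0)                              # shape (2,)
X_centered = X - col_means                              # (3, 2) - (2,) -> (3, 2)

# To subtract per-row means instead, keep the reduced axis so shapes line up
row_means = X.mean(axis=1, keepdims=True)               # shape (3, 1)
X_row_centered = X - row_means                          # (3, 2) - (3, 1) -> (3, 2)
```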


## 2.12 Conditional Operations
* * *

This section covers performing operations on arrays based on conditions. The code block below demonstrates the use of `np.where()` to select elements based on a boolean condition, `np.select()` to apply different operations based on multiple conditions, and `np.clip()` to limit array values to a specified range.

In [None]:
arr = np.array([1, 2, 3, 4, 5])

# Where function
where_arr = np.where(arr > 3, arr, 0)          # [0 0 0 4 5] (replace ≤3 with 0)
where_indices = np.where(arr > 3)              # (array([3, 4]),) indices where True

# Select (the first matching condition wins; default fills the rest)
conditions = [arr < 2, arr > 4]
choices = [arr * 10, arr * 100]
select_arr = np.select(conditions, choices, default=arr)  # Apply different operations

# Clip values
clip_arr = np.clip(arr, 2, 4)                   # [2 2 3 4 4] (clamp between 2 and 4)

print("np.where(arr > 3, arr, 0):", where_arr)
print("np.where(arr > 3):", where_indices)
print("\n")
print("np.select(conditions, choices, default=arr):", select_arr)
print("\n")
print("np.clip(arr, 2, 4):", clip_arr)

np.where(arr > 3, arr, 0): [0 0 0 4 5]
np.where(arr > 3): (array([3, 4]),)


np.select(conditions, choices, default=arr): [ 10   2   3   4 500]


np.clip(arr, 2, 4): [2 2 3 4 4]
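
Two details of the functions above are easy to miss: `np.select()` uses the first condition that matches when several are true, and `np.where(cond, x, y)` evaluates both `x` and `y` in full before choosing between them. The sketch below illustrates both points. The `scores`, `numerators`, and `denominators` arrays are made-up data.

```python
import numpy as np

scores = np.array([35, 58, 72, 91, 100])

# np.select: conditions are checked in order, so 91 and 100 (which satisfy
# both conditions) take the value of the first one
conditions = [scores >= 90, scores >= 50]
choices = [2, 1]
levels = np.select(conditions, choices, default=0)
print(levels)   # [0 1 1 2 2]

# np.where would compute numerators / denominators for every element,
# including the zero denominator. The ufunc 'where' argument divides only
# where it is safe and leaves the 'out' value (0.0) elsewhere.
numerators = np.array([1.0, 1.0, 1.0])
denominators = np.array([2.0, 0.0, 4.0])
safe = np.divide(
    numerators,
    denominators,
    out=np.zeros_like(numerators),
    where=denominators != 0,
)
print(safe)     # [0.5  0.   0.25]
```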


## 2.13 Sorting and Searching
* * *

This section explores methods for sorting elements within NumPy arrays and searching for specific values or their indices. The code block below shows how to sort arrays using `np.sort()` (returning a sorted copy) and the `.sort()` method (sorting in-place), and how to get the indices of sorted elements using `np.argsort()`. It also demonstrates sorting in multi-dimensional arrays along specific axes. For searching, it covers `np.searchsorted()` to find insertion points, `np.argmax()` and `np.argmin()` to find indices of maximum/minimum values, and `np.nonzero()` to find indices of non-zero elements.

In [None]:
arr = np.array([3, 1, 4, 1, 5, 9, 2, 6])

# Sorting
sorted_arr = np.sort(arr)            # [1 1 2 3 4 5 6 9] (returns sorted copy)
arr.sort()                           # Sort in-place
argsort_arr = np.argsort(arr)        # [0 1 2 3 4 5 6 7] here, because arr was just sorted in-place

# Multi-dimensional sorting
arr_2d = np.array([[3, 1, 4], [1, 5, 9]])
sorted_arr_2d_axis1 = np.sort(arr_2d, axis=1)        # Sort each row
sorted_arr_2d_axis0 = np.sort(arr_2d, axis=0)        # Sort each column

# Searching
searchsorted_result = np.searchsorted(np.sort(arr), 4)  # Index where 4 should be inserted
argmax_arr = np.argmax(arr)          # Index of maximum value
argmin_arr = np.argmin(arr)          # Index of minimum value
nonzero_arr = np.nonzero(arr > 3)    # Indices where the condition arr > 3 is True

print("Original array:", arr) # Note: arr is sorted in-place by arr.sort()
print("\nSorted array (copy):", sorted_arr)
print("\nArgsort of array:", argsort_arr)
print("\nMulti-dimensional arrays:")
print("\nSorted 2D array along axis 1:\n", sorted_arr_2d_axis1)
print("\nSorted 2D array along axis 0:\n", sorted_arr_2d_axis0)
print("\n")
print("\nSearchsorted result for 4:", searchsorted_result)
print("\nIndex of maximum value:", argmax_arr)
print("\nIndex of minimum value:", argmin_arr)
print("\nIndices of non-zero elements (arr > 3):", nonzero_arr)

Original array: [1 1 2 3 4 5 6 9]

Sorted array (copy): [1 1 2 3 4 5 6 9]

Argsort of array: [0 1 2 3 4 5 6 7]

Multi-dimensional arrays:

Sorted 2D array along axis 1:
 [[1 3 4]
 [1 5 9]]

Sorted 2D array along axis 0:
 [[1 1 4]
 [3 5 9]]



Searchsorted result for 4: 4

Index of maximum value: 7

Index of minimum value: 0

Indices of non-zero elements (arr > 3): (array([4, 5, 6, 7]),)


## 2.14 Set Operations
* * *

This section introduces set operations that can be performed on NumPy arrays. The code block below demonstrates how to find unique elements in an array using `np.unique()` and how to perform set operations like intersection (`np.intersect1d()`), union (`np.union1d()`), set difference (`np.setdiff1d()`), and symmetric difference (`np.setxor1d()`) between two arrays. It also shows how to test for element-wise membership using `np.isin()`.

In [None]:
a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 4, 5, 6, 7])

# Unique values
unique_a = np.unique(a)                        # Remove duplicates
unique_repeated = np.unique([1, 1, 2, 2, 3])   # [1 2 3]

# Set operations
intersect_ab = np.intersect1d(a, b)            # [3 4 5] (intersection)
union_ab = np.union1d(a, b)                    # [1 2 3 4 5 6 7] (union)
setdiff_ab = np.setdiff1d(a, b)                # [1 2] (elements in a but not in b)
setxor_ab = np.setxor1d(a, b)                  # [1 2 6 7] (symmetric difference)

# Membership testing
in1d_ab = np.isin(a, b)                        # [False False True True True] (element-wise)

print("Unique values in a:", unique_a)
print("Unique values in [1, 1, 2, 2, 3]:", unique_repeated)
print("\nIntersection of a and b:", intersect_ab)
print("Union of a and b:", union_ab)
print("Set difference (a - b):", setdiff_ab)
print("Symmetric difference (a XOR b):", setxor_ab)
print("\nMembership testing (a in b):", in1d_ab)

Unique values in a: [1 2 3 4 5]
Unique values in [1, 1, 2, 2, 3]: [1 2 3]

Intersection of a and b: [3 4 5]
Union of a and b: [1 2 3 4 5 6 7]
Set difference (a - b): [1 2]
Symmetric difference (a XOR b): [1 2 6 7]

Membership testing (a in b): [False False  True  True  True]
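
`np.unique()` can return more than the unique values. As a small sketch with made-up `labels`, the optional `return_counts` and `return_inverse` arguments give class frequencies and a mapping back to the original array, which is handy when encoding categorical labels.

```python
import numpy as np

labels = np.array(["cat", "dog", "cat", "bird", "dog", "cat"])

# Unique values come back sorted; counts align with them
classes, counts = np.unique(labels, return_counts=True)
print(classes)   # ['bird' 'cat' 'dog']
print(counts)    # [1 3 2]

# inverse[i] is the position of labels[i] within classes (an integer encoding)
classes, inverse = np.unique(labels, return_inverse=True)
print(inverse)   # [1 2 1 0 2 1]

# Indexing classes with inverse rebuilds the original array
rebuilt = classes[inverse]

# The result of np.isin works directly as a boolean mask
pets = labels[np.isin(labels, ["cat", "dog"])]
```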


## 2.15 File I/O
* * *
This section covers how to save and load NumPy arrays in several file formats. The code block below saves and loads a single array in binary `.npy` format with `np.save()` and `np.load()`, bundles multiple arrays into an uncompressed `.npz` archive with `np.savez()` (use `np.savez_compressed()` for a compressed archive), and writes and reads text files (`.txt`, `.csv`) with `np.savetxt()` and `np.loadtxt()`. It also loads a CSV file with a header row using `np.genfromtxt()`, which returns a structured array with named fields.

In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

# Save and load binary format (.npy)
np.save('array.npy', arr)
loaded_arr_npy = np.load('array.npy')

# Save and load multiple arrays (.npz)
np.savez('arrays.npz', a=arr, b=arr*2)
data = np.load('arrays.npz')
arr_a_npz = data['a']
arr_b_npz = data['b']

# Text format
np.savetxt('array.txt', arr, delimiter=',')
loaded_arr_txt = np.loadtxt('array.txt', delimiter=',')

# CSV files
np.savetxt('data.csv', arr, delimiter=',', header='col1,col2,col3')
loaded_arr_csv = np.genfromtxt('data.csv', delimiter=',', names=True)

print("Original array:\n", arr)
print("\nLoaded from .npy:\n", loaded_arr_npy)
print("\nLoaded 'a' from .npz:\n", arr_a_npz)
print("\nLoaded 'b' from .npz:\n", arr_b_npz)
print("\nLoaded from .txt:\n", loaded_arr_txt)
print("\nLoaded from .csv:\n", loaded_arr_csv)

Original array:
 [[1 2 3]
 [4 5 6]]

Loaded from .npy:
 [[1 2 3]
 [4 5 6]]

Loaded 'a' from .npz:
 [[1 2 3]
 [4 5 6]]

Loaded 'b' from .npz:
 [[ 2  4  6]
 [ 8 10 12]]

Loaded from .txt:
 [[1. 2. 3.]
 [4. 5. 6.]]

Loaded from .csv:
 [(1., 2., 3.) (4., 5., 6.)]
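
`np.savez()` stores its arrays uncompressed, while `np.savez_compressed()` has the same interface and writes a compressed archive. The sketch below compares the two and shows how to read a column by name from the structured array that `np.genfromtxt(..., names=True)` returns. It writes into a temporary directory so no files are left behind; the file names and the all-zeros example array are arbitrary choices.

```python
import os
import tempfile

import numpy as np

arr = np.zeros((500, 500))   # highly compressible example data

with tempfile.TemporaryDirectory() as tmp:
    plain_path = os.path.join(tmp, "plain.npz")
    packed_path = os.path.join(tmp, "packed.npz")

    np.savez(plain_path, a=arr)                # uncompressed archive
    np.savez_compressed(packed_path, a=arr)    # compressed archive
    plain_size = os.path.getsize(plain_path)
    packed_size = os.path.getsize(packed_path)

    # Using np.load as a context manager closes the archive when done
    with np.load(packed_path) as data:
        restored = data["a"]

    # CSV with a header row: columns become named fields
    csv_path = os.path.join(tmp, "data.csv")
    table = np.array([[1, 2, 3], [4, 5, 6]])
    np.savetxt(csv_path, table, delimiter=",", header="col1,col2,col3")
    loaded = np.genfromtxt(csv_path, delimiter=",", names=True)
    first_column = loaded["col1"]              # values 1.0 and 4.0

print("uncompressed bytes:", plain_size)
print("compressed bytes:  ", packed_size)
print("field names:", loaded.dtype.names)
```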


## 2.16 Memory and Performance Tips
* * *
This section covers techniques for reducing the memory footprint and run time of NumPy code. The code block below checks an array's memory usage with `.nbytes`, contrasts views (which share memory with the original array) with copies, notes why vectorized operations beat Python loops, pre-allocates an array with `np.empty()`, and shows how the choice of `dtype` determines the number of bytes per element.

In [None]:
# Check memory usage
arr = np.random.random((1000, 1000))
memory_usage_mb = arr.nbytes / (1024**2)  # Size in MB
print("Memory usage of arr (MB):", memory_usage_mb)

# Views vs Copies
view = arr[::2]         # Creates a view (shares memory)
copy = arr.copy()       # Creates a copy (new memory)
print("\nView of arr:", view)
print("\nCopy of arr:", copy)

# Efficient operations
# Use vectorized operations instead of loops
# Bad: [func(x) for x in arr]
# Good: np.vectorize(func)(arr) or ufuncs

# Pre-allocate arrays when possible
result = np.empty((1000, 1000))  # Faster than growing arrays
print("\nExample of pre-allocated empty array:\n", result)

# Use appropriate data types
arr_int8 = np.array([1, 2, 3], dtype=np.int8)        # 1 byte per element
arr_float32 = np.array([1, 2, 3], dtype=np.float32)  # 4 bytes per element
print("\nArray with int8 dtype:", arr_int8)
print("Array with float32 dtype:", arr_float32)

Memory usage of arr (MB): 7.62939453125

View of arr: [[0.92190893 0.70724216 0.52847754 ... 0.09442034 0.98520057 0.51626479]
 [0.4088004  0.61111537 0.72345265 ... 0.97819973 0.55819151 0.6605068 ]
 [0.20360578 0.50155724 0.98770201 ... 0.24047394 0.87124846 0.08890864]
 ...
 [0.52942843 0.89494881 0.46173364 ... 0.8773214  0.98077745 0.47723822]
 [0.85089488 0.52020892 0.26660188 ... 0.21811735 0.54660188 0.9839324 ]
 [0.08523377 0.38215476 0.89299516 ... 0.62027585 0.40420734 0.06446211]]

Copy of arr: [[0.92190893 0.70724216 0.52847754 ... 0.09442034 0.98520057 0.51626479]
 [0.73143814 0.11538532 0.64219595 ... 0.31137888 0.87847722 0.03859866]
 [0.4088004  0.61111537 0.72345265 ... 0.97819973 0.55819151 0.6605068 ]
 ...
 [0.4812199  0.5665116  0.00458156 ... 0.4137686  0.71159447 0.53142082]
 [0.08523377 0.38215476 0.89299516 ... 0.62027585 0.40420734 0.06446211]
 [0.06812008 0.36743082 0.56128752 ... 0.35927188 0.72049318 0.5264901 ]]

Example of pre-allocated empty array:
 [[0.
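
To make the view/copy distinction concrete, the sketch below uses `np.shares_memory()` to test whether two arrays overlap, shows that writing through a view changes the original, and compares `.nbytes` for the same values stored with different dtypes. The small arrays and variable names are illustrative.

```python
import numpy as np

arr = np.arange(12, dtype=np.float64).reshape(3, 4)

view = arr[::2]               # basic slicing -> view
fancy = arr[[0, 2]]           # fancy (integer-list) indexing -> copy
copy = arr.copy()             # explicit copy

print(np.shares_memory(arr, view))    # True
print(np.shares_memory(arr, fancy))   # False
print(np.shares_memory(arr, copy))    # False

# Writing through the view changes the original array, but not the copies
view[0, 0] = 99.0
print(arr[0, 0])              # 99.0
print(fancy[0, 0])            # 0.0

# Same values, different dtypes: memory scales with itemsize
as_f64 = np.ones(1_000_000, dtype=np.float64)
as_f32 = as_f64.astype(np.float32)
print(as_f64.nbytes, as_f32.nbytes)   # 8000000 4000000
```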

## 2.17 Common Patterns in ML/AI
* * *

This section provides examples of how NumPy is used to implement common patterns and functions in the field of machine learning and artificial intelligence. The code block below defines and demonstrates functions for data normalization (zero mean, unit variance), min-max scaling, one-hot encoding of categorical labels, generating train-test split indices, computing a confusion matrix, and implementing the softmax and sigmoid activation functions. These examples showcase the practical application of NumPy in building fundamental components of ML/AI models.

In [None]:
# Normalize data (zero mean, unit variance)
def normalize(X):
    # Calculates the mean and standard deviation of the input array X along axis 0 (columns)
    # Subtracts the mean from X and divides by the standard deviation
    return (X - np.mean(X, axis=0)) / np.std(X, axis=0)

# Min-max scaling
def min_max_scale(X):
    # Calculates the minimum and maximum values of the input array X along axis 0 (columns)
    # Scales the data to a range between 0 and 1
    return (X - np.min(X, axis=0)) / (np.max(X, axis=0) - np.min(X, axis=0))

# One-hot encoding
def one_hot_encode(y, num_classes):
    # Creates a one-hot encoded representation of the input array y
    # np.eye(num_classes) creates an identity matrix, and indexing with y selects the appropriate rows
    return np.eye(num_classes)[y]

# Train-test split indices
def train_test_split_indices(n_samples, test_size=0.2, random_state=None):
    # Generates indices for splitting a dataset into training and testing sets
    # Randomly permutes the indices and splits them based on the test_size
    if random_state is not None:  # "is not None" so that a seed of 0 is honored
        np.random.seed(random_state)
    indices = np.random.permutation(n_samples)
    test_size = int(n_samples * test_size)
    return indices[test_size:], indices[:test_size]

# Confusion matrix
def confusion_matrix(y_true, y_pred, num_classes):
    # Computes a confusion matrix
    # Counts the occurrences of predicted classes versus true classes
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for i in range(len(y_true)):
        cm[y_true[i], y_pred[i]] += 1
    return cm

# Softmax function
def softmax(x):
    # Implements the softmax activation function
    # Converts a vector of numbers into a probability distribution
    exp_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return exp_x / np.sum(exp_x, axis=-1, keepdims=True)

# Sigmoid function
def sigmoid(x):
    # Implements the sigmoid activation function
    # Squashes values between 0 and 1
    return 1 / (1 + np.exp(-np.clip(x, -250, 250)))  # Clip to prevent overflow

# Example Usage and Print Statements
data_for_normalize = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
normalized_data = normalize(data_for_normalize)
print("Normalized Data:\n", normalized_data)

data_for_minmax = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
minmax_scaled_data = min_max_scale(data_for_minmax)
print("\nMin-Max Scaled Data:\n", minmax_scaled_data)

labels_for_onehot = np.array([0, 1, 2, 1])
num_classes = 3
onehot_encoded_labels = one_hot_encode(labels_for_onehot, num_classes)
print("\nOne-Hot Encoded Labels:\n", onehot_encoded_labels)

n_samples = 100
train_indices, test_indices = train_test_split_indices(n_samples, test_size=0.3, random_state=42)
print("\nTrain Indices (first 10):", train_indices[:10])
print("Test Indices (first 10):", test_indices[:10])

y_true = np.array([0, 1, 2, 1, 0, 2])
y_pred = np.array([0, 2, 1, 1, 0, 2])
num_classes_cm = 3
cm = confusion_matrix(y_true, y_pred, num_classes_cm)
print("\nConfusion Matrix:\n", cm)

data_for_softmax = np.array([1.0, 2.0, 3.0])
softmax_output = softmax(data_for_softmax)
print("\nSoftmax Output:", softmax_output)

data_for_sigmoid = np.array([-1.0, 0.0, 1.0])
sigmoid_output = sigmoid(data_for_sigmoid)
print("\nSigmoid Output:", sigmoid_output)

Normalized Data:
 [[-1.22474487 -1.22474487]
 [ 0.          0.        ]
 [ 1.22474487  1.22474487]]

Min-Max Scaled Data:
 [[0.  0. ]
 [0.5 0.5]
 [1.  1. ]]

One-Hot Encoded Labels:
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]
 [0. 1. 0.]]

Train Indices (first 10): [11 47 85 28 93  5 66 65 35 16]
Test Indices (first 10): [83 53 70 45 44 39 22 80 10  0]

Confusion Matrix:
 [[2 0 0]
 [0 1 1]
 [0 1 1]]

Softmax Output: [0.09003057 0.24472847 0.66524096]

Sigmoid Output: [0.26894142 0.5        0.73105858]
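
Two of the helpers above have natural variations. As a sketch under stated assumptions, `confusion_matrix_vectorized` replaces the Python loop with `np.add.at()`, and `safe_normalize` adds a small `eps` so that a constant column (standard deviation 0) does not cause a division by zero. Both function names and the `eps` default are illustrative choices, not part of the original notebook.

```python
import numpy as np

def confusion_matrix_vectorized(y_true, y_pred, num_classes):
    # np.add.at accumulates correctly even when an index pair repeats
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def safe_normalize(X, eps=1e-12):
    # eps (an assumed default) keeps the denominator non-zero for constant columns
    return (X - X.mean(axis=0)) / (X.std(axis=0) + eps)

y_true = np.array([0, 1, 2, 1, 0, 2])
y_pred = np.array([0, 2, 1, 1, 0, 2])
cm = confusion_matrix_vectorized(y_true, y_pred, 3)
print(cm)

# Per-class recall: correct predictions (diagonal) over true counts (row sums)
recall = np.diag(cm) / cm.sum(axis=1)
print(recall)

X = np.array([[1.0, 7.0], [3.0, 7.0], [5.0, 7.0]])   # second column is constant
Z = safe_normalize(X)
print(Z)
```

Rows of the confusion matrix correspond to true classes and columns to predicted classes, matching the loop-based version above.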


## 2.18 Performance Optimization
* * *

This section reiterates and expands on performance optimization strategies in NumPy. The code block below demonstrates how to use built-in vectorized functions instead of explicit Python loops for faster computations, advises against repeated array creation within loops, illustrates the concept of views for memory efficiency (avoiding unnecessary copying), reinforces the idea of pre-allocating arrays, and emphasizes the impact of choosing appropriate data types on memory usage and performance. It also includes an example of how to use the `time` module for basic code profiling to measure the execution time of NumPy operations.

In [None]:
# Use built-in functions instead of loops
# Slow
result = []
for i in range(len(arr)):
    result.append(arr[i] ** 2)

# Fast
result = np.square(arr) # Example of using vectorized operation

# Avoid repeated array creation
# Slow
for i in range(1000):
    temp = np.zeros(1000)
    # ... operations on temp

# Fast
temp = np.zeros(1000) # Pre-allocate
for i in range(1000):
    temp[:] = 0  # Reset existing array
    # ... operations on temp

# Use views when possible
large_arr = np.random.random((10000, 10000))  # Note: about 800 MB of float64 data
subset = large_arr[1000:2000, 1000:2000]  # View, no copying
processed = subset * 2  # Arithmetic returns a new array; large_arr is unchanged
print("Example of using views (a subset of a large array):\n", subset)
print("\nExample of operations on a view (subset * 2):\n", processed)


# Pre-allocate arrays when possible
result_empty = np.empty((1000, 1000))  # Faster than growing arrays
print("\nExample of pre-allocated empty array:\n", result_empty)

# Use appropriate data types
arr_int8 = np.array([1, 2, 3], dtype=np.int8)        # 1 byte per element
arr_float32 = np.array([1, 2, 3], dtype=np.float32)  # 4 bytes per element
print("\nArray with int8 dtype:", arr_int8)
print("Array with float32 dtype:", arr_float32)

# Profile your code
import time
start = time.perf_counter()  # High-resolution timer from the time module
# ... your numpy operations
end = time.perf_counter()
print(f"\nTime taken for placeholder operation: {end - start:.4f} seconds")

Example of using views (a subset of a large array):
 [[0.84401209 0.50725286 0.48937041 ... 0.84120925 0.30046148 0.2289248 ]
 [0.22344869 0.7731329  0.15963621 ... 0.24206448 0.85633473 0.48865508]
 [0.04501997 0.44642339 0.32832086 ... 0.28455701 0.068467   0.37749544]
 ...
 [0.95825424 0.76336171 0.60841453 ... 0.32137478 0.75872874 0.12262849]
 [0.98450749 0.6049895  0.78289335 ... 0.05524097 0.94277093 0.87823897]
 [0.26686602 0.92532791 0.2924576  ... 0.20935596 0.7205089  0.18297257]]

Example of operations on a view (subset * 2):
 [[1.68802418 1.01450572 0.97874082 ... 1.68241849 0.60092296 0.4578496 ]
 [0.44689739 1.5462658  0.31927242 ... 0.48412895 1.71266946 0.97731016]
 [0.09003994 0.89284678 0.65664173 ... 0.56911401 0.13693401 0.75499089]
 ...
 [1.91650847 1.52672341 1.21682905 ... 0.64274955 1.51745749 0.24525698]
 [1.96901498 1.20997899 1.5657867  ... 0.11048193 1.88554186 1.75647793]
 [0.53373205 1.85065581 0.58491519 ... 0.41871193 1.4410178  0.36594515]]

Example of
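
To fill in the profiling placeholder with something measurable, the sketch below times a Python list comprehension, `np.vectorize`, and a true vectorized call on the same data using `time.perf_counter()`. The `timed` helper and the array size are illustrative assumptions, and absolute timings depend on the machine, so treat the numbers as indicative only.

```python
import time

import numpy as np

data = np.random.default_rng(0).random(200_000)

def timed(fn):
    # Run a zero-argument callable and return (result, elapsed seconds)
    start = time.perf_counter()
    out = fn()
    return out, time.perf_counter() - start

# Python-level loop over every element
loop_result, t_loop = timed(lambda: np.array([v ** 2 for v in data]))

# np.vectorize is a convenience wrapper; it still calls the function per element
vec_wrapper = np.vectorize(lambda v: v ** 2)
wrapped_result, t_wrapped = timed(lambda: vec_wrapper(data))

# A ufunc runs the loop in compiled code
ufunc_result, t_ufunc = timed(lambda: np.square(data))

print(f"list comprehension: {t_loop:.4f} s")
print(f"np.vectorize:       {t_wrapped:.4f} s")
print(f"np.square (ufunc):  {t_ufunc:.4f} s")
```

For more reliable measurements, the standard-library `timeit` module repeats the call several times and reports the best run.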

---
---
## Quick Reference Summary

This notebook section demonstrates various functionalities of the NumPy library. Here's a summary of the key functions and concepts used:

**Array Creation:**

* `np.array()`: Creates NumPy arrays from Python lists or other array-like objects.
* `np.zeros()`, `np.ones()`: Create arrays filled with zeros or ones.
* `np.eye()`, `np.identity()`: Create identity matrices.
* `np.full()`: Create arrays filled with a specific value.
* `np.full_like()`: Create an array of the same shape and type as another array, filled with a specific value.
* `np.empty()`: Create arrays without initializing their contents (values are arbitrary until assigned).
* `np.arange()`: Create arrays with regularly spaced values within a given range.
* `np.linspace()`: Create arrays with a specified number of evenly spaced values within a given interval.
* `np.random.random()`: Create arrays with random floats in the range [0.0, 1.0).
* `np.random.randint()`: Create arrays with random integers within a given range.
* `np.random.randn()`: Create arrays with random samples from a standard normal distribution.
* `np.random.normal()`: Create arrays with random samples from a normal distribution.
* `np.random.uniform()`: Create arrays with random samples from a uniform distribution.
* `np.random.seed()`: Set the random seed for reproducibility.

**Array Properties and Information:**

* `.shape`: Tuple of array dimensions.
* `.size`: Total number of elements in the array.
* `.ndim`: Number of dimensions of the array.
* `.dtype`: Data type of the array elements.
* `.itemsize`: Size in bytes of each element.
* `.nbytes`: Total bytes consumed by the array data.
* `.flags`: Information about the memory layout of the array.
* `.data`: Buffer containing the array's actual data.

**Array Indexing and Slicing:**

* Basic indexing (`arr[index]`, `arr[start:stop:step]`): Accessing single elements or slices.
* Multi-dimensional indexing (`arr_2d[row, column]`, `arr_2d[row_slice, column_slice]`): Accessing elements or subarrays in multi-dimensional arrays.
* Boolean indexing (`arr[boolean_array]`): Selecting elements based on a boolean condition.
* Fancy indexing (`arr[list_of_indices]`): Selecting elements using a list or array of indices.

**Array Operations:**

* Element-wise arithmetic operations (`+`, `-`, `*`, `/`, `**`, `%`).
* Scalar operations (`arr + scalar`, `arr * scalar`).
* `np.sqrt()`, `np.square()`, `np.abs()`, `np.sign()`: Basic mathematical functions.
* `np.sin()`, `np.cos()`, `np.tan()`: Trigonometric functions.
* `np.exp()`, `np.log()`, `np.log10()`, `np.log2()`: Exponential and logarithmic functions.
* `np.round()`, `np.floor()`, `np.ceil()`: Rounding functions.

**Vectorization and Array Operations:**

* Vectorized operations for faster computations.
* Vector-scalar operations.
* Element-wise operations.
* Broadcasting: Operations between arrays of different shapes.

**Data Type Considerations:**

* Automatic data type inference.
* Explicit data type specification using `dtype`.
* Data type conversion using `.astype()`.

**Statistical Operations:**

* `np.mean()`, `np.median()`, `np.std()`, `np.var()`: Calculate mean, median, standard deviation, and variance.
* `np.min()`, `np.max()`, `np.sum()`: Find minimum, maximum, and sum of elements.
* Operations along axes (`axis=0` aggregates down the rows, giving one result per column; `axis=1` aggregates across the columns, giving one result per row).
* `np.percentile()`, `np.quantile()`: Calculate percentiles and quantiles.
* `np.corrcoef()`, `np.cov()`: Calculate correlation and covariance matrices.

**Array Manipulation:**

* `.reshape()`: Change the shape of an array.
* `.flatten()`, `.ravel()`: Convert a multi-dimensional array to a 1D array.
* `np.concatenate()`, `np.vstack()`, `np.hstack()`: Join arrays along different axes.
* `np.split()`, `np.hsplit()`, `np.vsplit()`: Split arrays into multiple subarrays.
* `np.append()`: Add elements to the end of an array.
* `np.insert()`: Insert elements at specific indices.
* `np.delete()`: Remove elements at specific indices.

**Linear Algebra:**

* `np.dot()`, `np.matmul()`, `@`: Matrix multiplication.
* `np.transpose()`, `.T`: Transpose a matrix.
* `np.linalg.inv()`: Calculate the inverse of a matrix.
* `np.linalg.det()`: Calculate the determinant of a matrix.
* `np.trace()`: Calculate the trace of a matrix.
* `np.linalg.eig()`: Compute the eigenvalues and eigenvectors of a square matrix.
* `np.linalg.solve()`: Solve a linear system of equations.
* `np.linalg.svd()`: Perform Singular Value Decomposition.

**Broadcasting:**

* Implicit mechanism that allows operations between arrays of different shapes under certain rules.

**Conditional Operations:**

* `np.where()`: Select elements from `x` or `y` depending on `condition`.
* `np.select()`: Perform operations based on multiple conditions and choices.
* `np.clip()`: Limit the values in an array to a specified range.

**Sorting and Searching:**

* `np.sort()`: Return a sorted copy of an array.
* `.sort()`: Sort an array in-place.
* `np.argsort()`: Return the indices that would sort an array.
* `np.searchsorted()`: Find indices where elements should be inserted to maintain order.
* `np.argmax()`, `np.argmin()`: Return the indices of the maximum or minimum values.
* `np.nonzero()`: Return the indices of the elements that are non-zero.

**Set Operations:**

* `np.unique()`: Find the unique elements of an array.
* `np.intersect1d()`: Find the intersection of two arrays.
* `np.union1d()`: Find the union of two arrays.
* `np.setdiff1d()`: Find the set difference of two arrays.
* `np.setxor1d()`: Find the symmetric difference of two arrays.
* `np.isin()`: Test element-wise for membership in another array (replaces the deprecated `np.in1d()`).

**File I/O:**

* `np.save()`, `np.load()`: Save and load arrays in binary format (.npy).
* `np.savez()`, `np.load()`: Save and load multiple arrays in an uncompressed binary archive (.npz); `np.savez_compressed()` writes a compressed one.
* `np.savetxt()`, `np.loadtxt()`: Save and load arrays to and from text files (.txt, .csv).
* `np.genfromtxt()`: Load data from a text file, with support for missing values.

**Memory and Performance Tips:**

* Understanding memory usage (`.nbytes`).
* Views vs. Copies.
* Vectorized operations.
* Pre-allocating arrays.
* Choosing appropriate data types (`dtype`).
* Profiling code (using the `time` module).

**Common Patterns in ML/AI:**

* Implementation of functions for normalization, min-max scaling, one-hot encoding, train-test split indices, confusion matrix, softmax, and sigmoid.

---
---
**`Mirza Naeem Beg`**<br>
`Final Year UG Student,`<br>
`Dept. of CSE,` [**`AUST`**](https://aust.edu/)

[`Learn more about me;`](https://mirzanaeembeg.github.io/)
---
---
---