
# NumPy Training Notebook



## 1. Introduction to NumPy
NumPy stands for **Numerical Python**. It is a core library for numerical and matrix operations in Python.

### Importance:
- Provides support for **multi-dimensional arrays**.
- Optimized for **speed**: its core routines are implemented in C.
- Essential for **data preprocessing, linear algebra, and ML model computations**.


In [2]:

import numpy as np

# Checking NumPy version
print(np.__version__)


2.1.3



## 2. Array Creation
Arrays are the foundation of NumPy. Each array stores elements of a single data type (it is homogeneous), which is what allows compact storage and fast operations.

### Common Methods to Create Arrays:
- `np.array()` – From Python list or tuple.
- `np.zeros()`, `np.ones()`, `np.full()` – For initialized arrays.
- `np.arange()`, `np.linspace()` – For numerical sequences.


In [3]:

a = np.array([1, 2, 3, 4])
b = np.zeros((2, 3))
c = np.ones((3, 3))
d = np.full((2, 2), 5)
e = np.arange(0, 10, 2)
f = np.linspace(0, 1, 5)

print("Array a:", a)
print("Zeros array:\n", b)
print("Ones array:\n", c)
print("Full array:\n", d)
print("Arange array:", e)
print("Linspace array:", f)


Array a: [1 2 3 4]
Zeros array:
 [[0. 0. 0.]
 [0. 0. 0.]]
Ones array:
 [[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
Full array:
 [[5 5]
 [5 5]]
Arange array: [0 2 4 6 8]
Linspace array: [0.   0.25 0.5  0.75 1.  ]
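
The creation functions above differ in two details that are easy to miss: how the data type is chosen, and whether the end value is included. The following is a minimal sketch of both; the variable names `mixed` and `int_zeros` are illustrative and not part of the original notebook.

```python
import numpy as np

# Mixing integers and a float: NumPy upcasts everything to one common dtype.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)        # float64

# np.zeros defaults to float64; an explicit dtype overrides that.
int_zeros = np.zeros((2, 3), dtype=np.int32)
print(int_zeros.dtype)    # int32

# np.arange excludes the stop value, so the last element here is 8.
print(np.arange(0, 10, 2))

# np.linspace includes the stop value by default, so the last element is 10.0.
print(np.linspace(0, 10, 6))
```

This is why `b` and `c` in the cell above print with trailing dots (floats), while `d`, built from the integer `5`, prints as integers.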



## 3. Array Attributes and Shape Manipulation
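Every array exposes its shape (`shape`), number of dimensions (`ndim`), total element count (`size`), and element type (`dtype`). Checking these is a routine first step in AI data preprocessing, because most models expect inputs of a specific shape and type.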


In [4]:

arr = np.array([[1,2,3],[4,5,6]])
print("Shape:", arr.shape)
print("Dimensions:", arr.ndim)
print("Size:", arr.size)
print("Datatype:", arr.dtype)

# Reshape the array
reshaped = arr.reshape(3,2)
print("Reshaped array:\n", reshaped)


Shape: (2, 3)
Dimensions: 2
Size: 6
Datatype: int64
Reshaped array:
 [[1 2]
 [3 4]
 [5 6]]
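
As a small sketch of how `reshape` behaves at its edges, the example below lets NumPy infer one dimension with `-1`, shows what happens when the element counts do not match, and flattens the array again. The names `inferred` and `flat` are illustrative only.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

# -1 asks NumPy to infer that dimension from the total size (6 / 2 = 3 rows).
inferred = arr.reshape(-1, 2)
print(inferred.shape)     # (3, 2)

# The element count must match; otherwise reshape raises ValueError.
try:
    arr.reshape(4, 2)
except ValueError as err:
    print("Cannot reshape:", err)

# ravel() flattens back to one dimension in row-major order.
flat = arr.ravel()
print(flat)               # [1 2 3 4 5 6]
```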



## 4. Indexing and Slicing
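NumPy supports selecting elements by position, by slice ranges along each axis, and by Boolean masks. These operations make it practical to extract subsets of large datasets during preprocessing.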


In [5]:

arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])
print("Original array:\n", arr)

# Basic slicing
print("First row:", arr[0])
# Indices are zero-based: arr[1, 2] is row 2, column 3 in 1-based terms
print("Element (2,3):", arr[1,2])
print("Subarray:\n", arr[0:2, 1:3])

# Boolean indexing
mask = arr > 50
print("Elements greater than 50:", arr[mask])


Original array:
 [[10 20 30]
 [40 50 60]
 [70 80 90]]
First row: [10 20 30]
Element (2,3): 60
Subarray:
 [[20 30]
 [50 60]]
Elements greater than 50: [60 70 80 90]
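
One point worth illustrating is that basic slices are views onto the original data, whereas Boolean indexing returns a copy. The sketch below reuses the array from the cell above; `sub` and `big` are illustrative names.

```python
import numpy as np

arr = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

# A basic slice is a view: writing through it changes the original array.
sub = arr[0:2, 1:3]
sub[0, 0] = -1
print(arr[0, 1])          # -1

# Boolean indexing returns a copy: changing it leaves the original untouched.
big = arr[arr > 50]
big[0] = 0
print(arr[1, 2])          # 60

# Assigning through a mask, however, modifies the original in place.
arr[arr > 50] = 50
print(arr.max())          # 50
```

When an independent subarray is needed, calling `.copy()` on the slice avoids accidental changes to the source data.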



## 5. Mathematical Operations
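Arithmetic operators and mathematical functions in NumPy are applied element by element. The loops run in compiled code, which is much faster than an equivalent Python loop on large arrays.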


In [6]:

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

print("Addition:", x + y)
print("Subtraction:", x - y)
print("Multiplication:", x * y)
print("Division:", y / x)
print("Square root:", np.sqrt(x))
print("Exponent:", np.exp(x))
print("Logarithm:", np.log(x))


Addition: [5 7 9]
Subtraction: [-3 -3 -3]
Multiplication: [ 4 10 18]
Division: [4.  2.5 2. ]
Square root: [1.         1.41421356 1.73205081]
Exponent: [ 2.71828183  7.3890561  20.08553692]
Logarithm: [0.         0.69314718 1.09861229]
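
To make "element-wise" concrete, the following sketch contrasts `*` with the dot product `@`, and true division with floor division, using the same `x` and `y` as above. The names `elementwise`, `dot`, `floor_div`, and `looped` are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# * multiplies matching positions; it is not a dot product.
elementwise = x * y
print(elementwise)        # [ 4 10 18]

# @ computes the dot product of two 1-D arrays: 4 + 10 + 18.
dot = x @ y
print(dot)                # 32

# / always returns floats; // keeps integer results for integer inputs.
floor_div = y // x
print(floor_div)          # [4 2 2]

# The vectorized product matches an explicit Python loop.
looped = np.array([a * b for a, b in zip(x, y)])
print(np.array_equal(elementwise, looped))   # True
```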


In [7]:

data = np.array([[1, 2, 3], [4, 5, 6]])
print("Sum:", np.sum(data))
print("Mean:", np.mean(data))
print("Standard Deviation:", np.std(data))
print("Max:", np.max(data))
print("Min:", np.min(data))
print("Column-wise sum:", np.sum(data, axis=0))


Sum: 21
Mean: 3.5
Standard Deviation: 1.707825127659933
Max: 6
Min: 1
Column-wise sum: [5 7 9]


| Operation              | Mathematical Formula                                                            | Result    | Description                       |
| ---------------------- | ------------------------------------------------------------------------------- | --------- | --------------------------------- |
| `np.sum(data)`         | $\sum_{i,j} A_{ij}$                                                             | 21        | Total of all elements             |
| `np.mean(data)`        | $\mu = \frac{1}{N}\sum_{i,j} A_{ij}$                                            | 3.5       | Arithmetic mean of all elements   |
| `np.std(data)`         | $\sqrt{\frac{1}{N}\sum_{i,j}(A_{ij} - \mu)^2}$                                  | 1.7078    | Population standard deviation     |
| `np.max(data)`         | $\max_{i,j} A_{ij}$                                                             | 6         | Largest value                     |
| `np.min(data)`         | $\min_{i,j} A_{ij}$                                                             | 1         | Smallest value                    |
| `np.sum(data, axis=0)` | $\left[\sum_{i} A_{i1},\ \sum_{i} A_{i2},\ \sum_{i} A_{i3}\right]$              | [5, 7, 9] | Column-wise sum                   |


## 6. Aggregation Functions
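Aggregation functions such as `np.sum`, `np.mean`, `np.std`, `np.max`, and `np.min` reduce an array to summary values, either over all elements or along a chosen axis. They are used to summarize or analyze numerical data (e.g., computing per-feature statistics during dataset normalization).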
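
The sketch below shows aggregation along each axis and checks `np.std` against the population formula written out by hand. It uses the same 2×3 `data` array as the earlier cell; the other names are illustrative. The `ddof=1` variant is included only to show how the sample standard deviation differs.

```python
import numpy as np

data = np.array([[1, 2, 3], [4, 5, 6]])

# axis=1 collapses the columns, giving one value per row.
row_sums = data.sum(axis=1)
print(row_sums)           # [ 6 15]

# axis=0 collapses the rows, giving one value per column.
col_means = data.mean(axis=0)
print(col_means)          # [2.5 3.5 4.5]

# np.std defaults to the population formula: sqrt(mean((x - mean)^2)).
manual_std = np.sqrt(np.mean((data - data.mean()) ** 2))
print(np.isclose(manual_std, np.std(data)))   # True

# ddof=1 divides by N - 1 instead of N (sample standard deviation).
sample_std = data.std(ddof=1)
print(sample_std > np.std(data))              # True
```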



## 7. Broadcasting
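Broadcasting lets NumPy perform arithmetic between arrays of different but compatible shapes by virtually stretching the smaller array, without copying its data. Shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1. This is very important in AI feature scaling and batch operations.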


In [9]:

A = np.array([[1, 2, 3], [4, 5, 6]])
B = np.array([1, 0, 1])
print("Broadcasted result:\n", A + B)


Broadcasted result:
 [[2 2 4]
 [5 5 7]]
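
As a rough illustration of the compatibility rule, the sketch below broadcasts a column vector across the columns of `A`, shows a shape pair that fails, and centers each column by subtracting its mean. The names `col` and `centered` are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])      # shape (2, 3)

# Shape (2, 1) stretches across the 3 columns.
col = np.array([[10], [20]])
print(A + col)
# [[11 12 13]
#  [24 25 26]]

# Shape (2,) does not match the trailing dimension 3, so this fails.
try:
    A + np.array([10, 20])
except ValueError as err:
    print("Not broadcastable:", err)

# Per-column centering: A.mean(axis=0) has shape (3,) and is
# subtracted from every row.
centered = A - A.mean(axis=0)
print(centered)
# [[-1.5 -1.5 -1.5]
#  [ 1.5  1.5  1.5]]
```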



## 8. Linear Algebra Operations
Linear algebra is the mathematical backbone of AI and ML.


In [11]:

from numpy import linalg as LA

M = np.array([[1, 2], [3, 4]])
print("Transpose:\n", M.T)
print("Determinant:", LA.det(M))
print("Inverse:\n", LA.inv(M))


Transpose:
 [[1 3]
 [2 4]]
Determinant: -2.0000000000000004
Inverse:
 [[-2.   1. ]
 [ 1.5 -0.5]]
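
The determinant above prints as `-2.0000000000000004` rather than `-2` because the computation uses floating-point arithmetic. The sketch below compares such results with a tolerance and solves a linear system with `LA.solve`; the right-hand side `b` is an arbitrary example value, not taken from the notebook.

```python
import numpy as np
from numpy import linalg as LA

M = np.array([[1, 2], [3, 4]])
b = np.array([5, 6])          # arbitrary right-hand side for illustration

# Compare floating-point results with a tolerance, not with ==.
print(np.isclose(LA.det(M), -2.0))               # True

# M times its inverse should be (numerically) the identity matrix.
print(np.allclose(M @ LA.inv(M), np.eye(2)))     # True

# Solve M x = b directly; this avoids forming the inverse explicitly.
x = LA.solve(M, b)
print(np.allclose(M @ x, b))                     # True
```

For solving systems, `LA.solve` is generally preferred over multiplying by `LA.inv(M)`, since it avoids forming the inverse and tends to be more accurate numerically.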



## 9. Random Module
Used for random sampling, dataset shuffling, and initializing weights in neural networks.


In [9]:

rand_arr = np.random.rand(3, 3)
rand_ints = np.random.randint(0, 10, (2, 3))
normal_dist = np.random.normal(0, 1, 5)

print("Uniform random array:\n", rand_arr)
print("Random integers:\n", rand_ints)
print("Normal distribution samples:", normal_dist)


Uniform random array:
 [[0.7714236  0.64259213 0.29454607]
 [0.47503245 0.33310906 0.23238796]
 [0.07399558 0.88653836 0.39256915]]
Random integers:
 [[1 6 3]
 [5 2 4]]
Normal distribution samples: [-0.73702389 -1.94234703  0.02354877 -0.15559653  0.03609249]
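
The outputs above change on every run because no seed is set. A minimal sketch of reproducible sampling follows, using NumPy's `Generator` interface (`np.random.default_rng`) rather than the legacy functions in the cell above. The seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# A seeded generator produces the same sequence on every run.
rng = np.random.default_rng(seed=42)

uniform = rng.random((3, 3))                 # uniform samples in [0, 1)
ints = rng.integers(0, 10, size=(2, 3))      # integers in [0, 10)
normal = rng.normal(0, 1, size=5)            # mean 0, standard deviation 1
shuffled = rng.permutation(10)               # shuffled indices 0..9

# A second generator with the same seed reproduces the first draw.
rng_again = np.random.default_rng(seed=42)
print(np.array_equal(uniform, rng_again.random((3, 3))))   # True
```

Shuffled indices such as `shuffled` can be used to reorder a dataset and its labels consistently.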



## 10. Real-world Use Case: Data Normalization in ML
Before training ML models, features are usually scaled to a common range; this typically helps gradient-based optimizers converge faster.


In [10]:

data = np.random.randint(0, 256, (3,3))  # upper bound is exclusive, so values span 0..255
print("Original Data:\n", data)

# Normalize between 0 and 1
normalized = data / 255.0
print("Normalized Data:\n", normalized)


Original Data:
 [[135  18 192]
 [114 141  32]
 [  2  58  96]]
Normalized Data:
 [[0.52941176 0.07058824 0.75294118]
 [0.44705882 0.55294118 0.1254902 ]
 [0.00784314 0.22745098 0.37647059]]
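
Dividing by 255 works when the value range is known in advance, as with 8-bit pixel data. When it is not, a common alternative is min-max scaling per feature. The sketch below applies it to the sample values printed above and assumes that no column is constant, since that would cause a division by zero. The names `features`, `col_min`, `col_max`, and `scaled` are illustrative.

```python
import numpy as np

# Sample values copied from the output above; each column is one feature.
features = np.array([[135.0,  18.0, 192.0],
                     [114.0, 141.0,  32.0],
                     [  2.0,  58.0,  96.0]])

# Per-column minimum and maximum, each of shape (3,).
col_min = features.min(axis=0)
col_max = features.max(axis=0)

# Broadcasting applies the column statistics to every row.
# Assumes col_max > col_min for every column.
scaled = (features - col_min) / (col_max - col_min)

print(scaled.min(axis=0))   # [0. 0. 0.]
print(scaled.max(axis=0))   # [1. 1. 1.]
```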



## 11. Mini Project: Image Data Simulation
Here we simulate an image matrix and perform preprocessing using NumPy.


In [11]:

image = np.random.randint(0, 256, (5,5))
print("Original Image:\n", image)

# Normalize and apply thresholding
norm_img = image / 255.0
thresholded = np.where(norm_img > 0.5, 1, 0)

print("Normalized Image:\n", norm_img)
print("Thresholded Image:\n", thresholded)


Original Image:
 [[186 173  90   5 186]
 [ 73 127 124  23 133]
 [184  83 106  65  66]
 [246  72   2 203 254]
 [  2 110 108 252 193]]
Normalized Image:
 [[0.72941176 0.67843137 0.35294118 0.01960784 0.72941176]
 [0.28627451 0.49803922 0.48627451 0.09019608 0.52156863]
 [0.72156863 0.3254902  0.41568627 0.25490196 0.25882353]
 [0.96470588 0.28235294 0.00784314 0.79607843 0.99607843]
 [0.00784314 0.43137255 0.42352941 0.98823529 0.75686275]]
Thresholded Image:
 [[1 1 0 0 1]
 [0 0 0 0 1]
 [1 0 0 0 0]
 [1 0 0 1 1]
 [0 0 0 1 1]]
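
The thresholding step can also be written as a Boolean comparison followed by a type conversion. The sketch below checks that both forms agree on the top-left 3×3 corner of the image printed above; `small`, `via_where`, and `via_mask` are illustrative names.

```python
import numpy as np

# Top-left 3x3 corner of the image printed above.
small = np.array([[186, 173,  90],
                  [ 73, 127, 124],
                  [184,  83, 106]])

norm = small / 255.0

# np.where picks 1 where the condition holds and 0 elsewhere.
via_where = np.where(norm > 0.5, 1, 0)

# A Boolean mask converted to integers gives the same result.
via_mask = (norm > 0.5).astype(int)

print(np.array_equal(via_where, via_mask))   # True
print(via_where)
# [[1 1 0]
#  [0 0 0]
#  [1 0 0]]

# The mean of a 0/1 array is the fraction of pixels above the threshold.
print(via_where.mean())
```

Note that the value 127 normalizes to about 0.498, so it falls just below the 0.5 threshold.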



## 12. Summary
In this training, we covered:
- Array creation and manipulation.
- Mathematical and linear algebra operations.
- Broadcasting and random number generation.
- Real-world ML preprocessing using NumPy.

NumPy underpins **pandas** and interoperates closely with **TensorFlow** and **PyTorch**, whose tensor APIs follow a similar array model, making it indispensable for any data or AI professional.
