# NumPy – Advanced End-to-End Tutorial (With Detailed Explanations)

This notebook is designed as a **complete teaching and reference resource** for NumPy.

### How to use this notebook
- Every concept is explained **in detail** using Markdown
- Each explanation is followed by **one focused code cell**
- The code remains **simple, clean, and reusable**
- Ideal for **classroom teaching, self-study, and revision**

---


## 1. Introduction to NumPy

### Explanation
NumPy (Numerical Python) is the **core library for numerical computation in Python**.
It introduces the powerful `ndarray` object, which allows:
- Fast mathematical computation
- Vectorized operations (no explicit Python loops)
- Efficient memory usage
- Multi-dimensional data handling

NumPy is the **foundation** for libraries like Pandas, Matplotlib, Scikit-learn, TensorFlow, and PyTorch.


In [1]:
import time
import sys
import numpy as np

SIZE = 1_000_000

# Create data structures
py_list = list(range(SIZE))
py_tuple = tuple(range(SIZE))
np_array = np.arange(SIZE)

print("====== MEMORY SIZE COMPARISON ======")

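# Memory size: for list/tuple, container overhead plus every int object;
# for NumPy, nbytes counts the raw data buffer only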
list_size = sys.getsizeof(py_list) + sum(sys.getsizeof(i) for i in py_list)
tuple_size = sys.getsizeof(py_tuple) + sum(sys.getsizeof(i) for i in py_tuple)
numpy_size = np_array.nbytes

print(f"List memory size   : {list_size / (1024*1024):.2f} MB")
print(f"Tuple memory size  : {tuple_size / (1024*1024):.2f} MB")
print(f"NumPy memory size  : {numpy_size / (1024*1024):.2f} MB")

print("\n====== SPEED COMPARISON ======")

# -------- List --------
start = time.perf_counter()
list_result = [x * 2 for x in py_list]
end = time.perf_counter()
print(f"List time  : {end - start:.6f} seconds")

# -------- Tuple --------
start = time.perf_counter()
tuple_result = tuple(x * 2 for x in py_tuple)
end = time.perf_counter()
print(f"Tuple time : {end - start:.6f} seconds")

# -------- NumPy --------
start = time.perf_counter()
numpy_result = np_array * 2
end = time.perf_counter()
print(f"NumPy time : {end - start:.6f} seconds")


List memory size   : 34.33 MB
Tuple memory size  : 34.33 MB
NumPy memory size  : 7.63 MB

List time  : 0.015069 seconds
Tuple time : 0.026419 seconds
NumPy time : 0.001038 seconds
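
A single `perf_counter` measurement like the one above can be noisy. As a hedged alternative, the sketch below repeats each statement with the standard-library `timeit` module and keeps the fastest run. The smaller `SIZE` and the repeat counts are arbitrary choices for illustration, not values from the original cell.

```python
import timeit

import numpy as np

SIZE = 100_000  # assumption: smaller than the cell above so repeats stay quick
py_list = list(range(SIZE))
np_array = np.arange(SIZE)

# Run each statement 10 times per measurement, take 3 measurements,
# and keep the fastest one to reduce noise from other processes.
list_time = min(timeit.repeat(lambda: [x * 2 for x in py_list], number=10, repeat=3))
numpy_time = min(timeit.repeat(lambda: np_array * 2, number=10, repeat=3))

print(f"List  : {list_time / 10:.6f} seconds per run")
print(f"NumPy : {numpy_time / 10:.6f} seconds per run")
```

Absolute timings depend on the machine and NumPy build, so only the relative difference is meaningful.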


In [2]:
import numpy as np
print("NumPy Version:", np.__version__)

NumPy Version: 2.1.3


## 2. Creating NumPy Arrays – Multiple Approaches

### Explanation
NumPy provides multiple ways to create arrays depending on the use case:
- From Python lists
- Pre-filled arrays (zeros, ones)
- Ranged data
- Evenly spaced values

Understanding array creation is critical for **data preparation and modeling**.


In [None]:

a1 = np.array([1, 2, 3])
a2 = np.array([[1, 2], [3, 4]])
a3 = np.zeros((2, 3))
a4 = np.ones((3, 3))
a5 = np.arange(0, 20, 4)
a6 = np.linspace(1, 10, 5)

print(a1)
print(a2)
print(a3)
print(a4)
print(a5)
print(a6)
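
One detail worth checking when choosing between the ranged constructors: `np.arange` excludes its stop value, while `np.linspace` includes the endpoint by default. The short sketch below illustrates this, along with two other common constructors (`np.full` and `np.eye`) that the cell above does not use.

```python
import numpy as np

stepped = np.arange(0, 20, 4)    # stop value 20 is excluded
spaced = np.linspace(1, 10, 5)   # endpoint 10 is included by default

print(stepped)  # [ 0  4  8 12 16]
print(spaced)   # [ 1.    3.25  5.5   7.75 10.  ]

filled = np.full((2, 2), 7)      # 2x2 array where every element is 7
identity = np.eye(3)             # 3x3 identity matrix (float64)
print(filled)
print(identity)
```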


## 3. Array Attributes & Structure

### Explanation
Every NumPy array has metadata that describes:
- **Dimensions (`ndim`)** – how many axes
- **Shape** – size of each dimension
- **Size** – total number of elements
- **Data type (`dtype`)** – type of data stored

These attributes are heavily used in **machine learning pipelines**, for example to check that the shape and dtype of the input data match what a model expects.


In [None]:

arr = np.array([[10, 20, 30], [40, 50, 60]])
print("ndim:", arr.ndim)
print("shape:", arr.shape)
print("size:", arr.size)
print("dtype:", arr.dtype)
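
The attributes above are related: the memory used by the data buffer is the element count times the size of one element. The sketch below fixes the dtype to `np.int32` explicitly (an assumption made so the numbers are the same on every platform) and shows how `astype` changes the footprint.

```python
import numpy as np

arr = np.array([[10, 20, 30], [40, 50, 60]], dtype=np.int32)

print("itemsize:", arr.itemsize)  # bytes per element: 4 for int32
print("nbytes:", arr.nbytes)      # size * itemsize = 6 * 4 = 24

# astype returns a new array with the requested dtype
as_float = arr.astype(np.float64)
print("float nbytes:", as_float.nbytes)  # 6 * 8 = 48
```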


## 4. Reshaping & Flattening Arrays

### Explanation
Reshaping changes the **structure of data without changing the values**, so the total number of elements must stay the same.
Flattening converts a multi-dimensional array into a single dimension.

This is extremely common while:
- Preparing ML datasets
- Converting matrices to vectors


In [None]:

arr = np.arange(1, 13)
reshaped = arr.reshape(3, 4)
flattened = reshaped.flatten()

print("Reshaped Array:\n", reshaped)
print("Flattened Array:", flattened)
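
Two related details are sketched below: passing `-1` lets NumPy infer one dimension, and `flatten()` always returns a copy while `ravel()` returns a view when the memory layout allows it (as it does for this contiguous array). The variable names are illustrative only.

```python
import numpy as np

arr = np.arange(1, 13)
grid = arr.reshape(3, -1)      # -1 is inferred as 4, giving shape (3, 4)

flat_copy = grid.flatten()     # independent copy
flat_view = grid.ravel()       # view of the same data for contiguous input

flat_copy[0] = 99              # does not affect grid
flat_view[0] = -1              # writes through to grid (and to arr)

print(grid.shape)              # (3, 4)
print(grid[0, 0])              # -1
```

If later code must not modify the original data, `flatten()` (or an explicit `.copy()`) is the safer choice.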


## 5. Basic & Vectorized Operations

### Explanation
NumPy performs **element-wise operations automatically**, known as vectorization.
This eliminates explicit loops and makes the code:
- Shorter
- Faster
- More readable


In [None]:

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

print(a + b)
print(a * b)
print(a / b)
print(a ** 2)


## 6. Mathematical & Statistical Functions

### Explanation
NumPy includes built-in mathematical and statistical functions that:
- Operate on entire arrays
- Can be applied row-wise or column-wise using `axis`
- Are optimized for performance


In [None]:

data = np.array([[10, 20, 30], [40, 50, 60]])

print("Total Sum:", np.sum(data))
print("Row-wise Sum:", np.sum(data, axis=1))
print("Column-wise Mean:", np.mean(data, axis=0))
print("Standard Deviation:", np.std(data))
print("Minimum:", np.min(data))
print("Maximum:", np.max(data))
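
The `axis` argument names the dimension that is collapsed: `axis=1` collapses the columns and leaves one value per row, while `axis=0` leaves one value per column. The sketch below also uses `keepdims=True`, which keeps the reduced axis with length 1 so the result can be combined with the original array.

```python
import numpy as np

data = np.array([[10, 20, 30], [40, 50, 60]])

row_sums = data.sum(axis=1)                    # one value per row
col_means = data.mean(axis=0)                  # one value per column
print(row_sums)                                # [ 60 150]
print(col_means)                               # [25. 35. 45.]

# keepdims=True gives shape (2, 1) instead of (2,)
row_sums_2d = data.sum(axis=1, keepdims=True)
row_shares = data / row_sums_2d                # each row now sums to 1
print(row_sums_2d.shape)                       # (2, 1)
```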


## 7. Universal Functions (ufuncs)

### Explanation
Universal functions (ufuncs) apply **element-wise mathematical operations**.
They work much faster than traditional Python loops.

Examples include square root, logarithm, trigonometric functions, etc.


In [None]:

x = np.array([1, 4, 9, 16])

print(np.sqrt(x))
print(np.log(x))
print(np.exp(x))
print(np.sin(x))
print(np.cos(x))
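
Beyond element-wise calls, ufuncs share a few extra features. The sketch below shows the `out` argument, which writes into an existing array, and the `reduce` and `accumulate` methods available on binary ufuncs such as `np.add`. The float dtype is chosen here so that the `out` array can hold the square roots.

```python
import numpy as np

x = np.array([1, 4, 9, 16], dtype=np.float64)

# Write the result into a preallocated array instead of creating a new one
roots = np.empty_like(x)
np.sqrt(x, out=roots)
print(roots)                      # [1. 2. 3. 4.]

# Binary ufuncs expose reduce (fold to one value) and accumulate (running result)
total = np.add.reduce(x)          # same result as x.sum()
running = np.add.accumulate(x)    # same result as x.cumsum()
print(total)                      # 30.0
print(running)                    # [ 1.  5. 14. 30.]
```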


## 8. Indexing & Slicing (1D and 2D)

### Explanation
Indexing allows accessing individual elements.
Slicing allows extracting **subarrays**.

These operations are essential for **data selection and preprocessing**.


In [None]:

arr1 = np.array([10, 20, 30, 40, 50])
arr2 = np.array([[1, 2, 3], [4, 5, 6]])

print(arr1[1:4])
print(arr2[0, 1])
print(arr2[:, 1])
print(arr2[1, :])
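
One behavior that often surprises newcomers is that basic slices are views, not copies. The sketch below shows that writing to a slice changes the original array, and that `.copy()` avoids this.

```python
import numpy as np

arr2 = np.array([[1, 2, 3], [4, 5, 6]])

col = arr2[:, 1]           # view of the second column
col[0] = 99                # also changes arr2[0, 1]
print(arr2[0, 1])          # 99

safe = arr2[:, 1].copy()   # independent copy
safe[1] = -1               # arr2 is not affected
print(arr2[1, 1])          # 5
```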


## 9. Advanced Indexing – Boolean Indexing

### Explanation
Boolean indexing allows filtering arrays using conditions.
This is widely used in:
- Data analysis
- Cleaning datasets
- Feature selection


In [7]:

arr = np.array([10, 25, 30, 45, 60, 75])

print(arr[arr > 30])
print(arr[(arr > 20) & (arr < 60)])
print(arr[arr % 2 == 0])


[45 60 75]
[25 30 45]
[10 30 60]
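
A boolean mask can also be used to count matches, to assign values, or to build a new array with `np.where(condition, a, b)`. The sketch below reuses the array from the cell above; the thresholds are arbitrary illustration values.

```python
import numpy as np

arr = np.array([10, 25, 30, 45, 60, 75])

mask = arr > 30
print(mask.sum())                      # 3 (True counts as 1)

# Three-argument np.where: pick 50 where the condition holds, else the original value
capped = np.where(arr > 50, 50, arr)
print(capped)                          # [10 25 30 45 50 50]

# Mask assignment modifies the array in place, so work on a copy here
zeroed = arr.copy()
zeroed[zeroed < 30] = 0
print(zeroed)                          # [ 0  0 30 45 60 75]
```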


## 10. Advanced Indexing – Fancy Indexing

### Explanation
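Fancy indexing uses **arrays or lists of indices** to extract elements.
It supports both 1D and 2D arrays. With two index lists on a 2D array, the indices are paired element by element, so rows `[0, 1]` and columns `[1, 2]` select the elements at `(0, 1)` and `(1, 2)`.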


In [None]:

arr = np.array([100, 200, 300, 400, 500])
indices = [0, 2, 4]
print(arr[indices])

matrix = np.array([[10, 20, 30], [40, 50, 60]])
row_indices = [0, 1]
col_indices = [1, 2]
print(matrix[row_indices, col_indices])
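
Because row and column index lists are paired, selecting a rectangular block needs a different approach. The sketch below contrasts paired indexing with `np.ix_`, which builds an open mesh so that every listed row is combined with every listed column.

```python
import numpy as np

matrix = np.array([[10, 20, 30], [40, 50, 60]])

# Paired: picks elements (0, 1) and (1, 2)
paired = matrix[[0, 1], [1, 2]]
print(paired)                    # [20 60]

# Block: rows 0 and 1 crossed with columns 1 and 2
block = matrix[np.ix_([0, 1], [1, 2])]
print(block)                     # [[20 30]
                                 #  [50 60]]
```

Unlike basic slicing, fancy indexing returns a copy, so modifying `paired` or `block` leaves `matrix` unchanged.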


## 11. Iterating Over Arrays

### Explanation
Iteration allows traversing elements of an array.
Although NumPy prefers vectorized operations, iteration is sometimes needed.


In [None]:

arr = np.array([[1, 2], [3, 4]])

for row in arr:
    print(row)

# order='F' visits elements column by column: 1, 3, 2, 4
for x in np.nditer(arr, order='F'):
    print(x)
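
The sketch below compares the two traversal orders of `np.nditer` and shows `np.ndenumerate`, which yields each element together with its index tuple. Results are collected into lists here only to make the order easy to inspect.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])

row_major = [int(x) for x in np.nditer(arr, order='C')]  # row by row
col_major = [int(x) for x in np.nditer(arr, order='F')]  # column by column
print(row_major)   # [1, 2, 3, 4]
print(col_major)   # [1, 3, 2, 4]

# ndenumerate yields (index_tuple, value) pairs in row-major order
indexed = [(idx, int(value)) for idx, value in np.ndenumerate(arr)]
print(indexed[1])  # ((0, 1), 2)
```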


## 12. Joining / Concatenating Arrays

### Explanation
Concatenation combines multiple arrays along a specified axis.
Understanding `axis` is critical for matrix operations.


In [None]:

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

print(np.concatenate((a, b), axis=0))
print(np.concatenate((a, b), axis=1))
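
A quick way to reason about `axis` is to look at the resulting shapes: only the concatenation axis grows, and all other dimensions must match. The sketch below checks this and shows the error raised for a mismatch; the array `c` is a made-up example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

rows = np.concatenate((a, b), axis=0)   # stacks rows
cols = np.concatenate((a, b), axis=1)   # stacks columns
print(rows.shape)                       # (4, 2)
print(cols.shape)                       # (2, 4)

c = np.array([[9, 9, 9]])               # shape (1, 3): column count differs from a
try:
    np.concatenate((a, c), axis=0)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("Cannot concatenate:", err)
```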


## 13. Splitting Arrays

### Explanation
Splitting divides an array into multiple sub-arrays.
It is useful for batching and cross-validation. `np.split` requires an equal division, while `np.array_split` allows parts of unequal size.


In [8]:

arr = np.arange(1, 13)

print(np.split(arr, 3))
print(np.array_split(arr, 5))


[array([1, 2, 3, 4]), array([5, 6, 7, 8]), array([ 9, 10, 11, 12])]
[array([1, 2, 3]), array([4, 5, 6]), array([7, 8]), array([ 9, 10]), array([11, 12])]
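
`np.split` also accepts a list of positions at which to cut, and it raises an error when asked for an equal division that is not possible. The sketch below illustrates both cases; the cut positions are arbitrary.

```python
import numpy as np

arr = np.arange(1, 13)

# Cut before index 2 and before index 7
head, middle, tail = np.split(arr, [2, 7])
print(head)     # [1 2]
print(middle)   # [3 4 5 6 7]
print(tail)     # [ 8  9 10 11 12]

# 12 elements cannot be divided into 5 equal parts
try:
    np.split(arr, 5)
    unequal_raised = False
except ValueError as err:
    unequal_raised = True
    print("np.split failed:", err)
```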


## 14. Searching in Arrays

### Explanation
Searching is used to find indices based on conditions or sorted positions.


In [None]:

arr = np.array([10, 20, 30, 40, 50])

print(np.where(arr > 25))
print(np.where(arr % 20 == 0))
print(np.searchsorted(arr, [15, 35]))
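
Two details of these functions are easy to miss: single-argument `np.where` returns a tuple with one index array per dimension, and `np.searchsorted` assumes sorted input and takes a `side` argument that decides where equal values are placed. A minimal sketch:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

hits = np.where(arr > 25)            # tuple containing one index array
indices = hits[0]
print(indices)                       # [2 3 4]

flat = np.flatnonzero(arr > 25)      # same indices as a plain array
print(flat)                          # [2 3 4]

# Insertion position for a value that already exists in the array
left = np.searchsorted(arr, 30)                  # before the existing 30
right = np.searchsorted(arr, 30, side='right')   # after the existing 30
print(left, right)                   # 2 3
```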


## 15. Sorting Arrays

### Explanation
Sorting arranges elements in a specific order.
It can be applied row-wise or column-wise in 2D arrays.


In [None]:

arr1 = np.array([3, 1, 4, 2])
arr2 = np.array([[3, 2, 1], [6, 5, 4]])

print(np.sort(arr1))
print(np.sort(arr2, axis=1))
print(np.sort(arr2, axis=0))
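
When a second array must be reordered the same way, `np.argsort` is usually more useful than `np.sort` because it returns the indices that would sort the data. The sketch below uses made-up `scores` and `labels` arrays and also shows one common way to get descending order.

```python
import numpy as np

scores = np.array([3, 1, 4, 2])
labels = np.array(['c', 'a', 'd', 'b'])   # hypothetical labels aligned with scores

order = np.argsort(scores)                # indices that sort scores ascending
print(order)                              # [1 3 0 2]
print(labels[order])                      # ['a' 'b' 'c' 'd']

descending = np.sort(scores)[::-1]        # sort ascending, then reverse
print(descending)                         # [4 3 2 1]
```

`np.sort` returns a sorted copy, whereas the `scores.sort()` method sorts the array in place.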


## 16. Filtering Using Masks

### Explanation
Filtering extracts values using boolean masks.
This is a core concept in **data wrangling**.


In [None]:

arr = np.array([5, 10, 15, 20, 25, 30])
mask = (arr >= 10) & (arr <= 25)
print(arr[mask])


## 17. Stacking Arrays

### Explanation
Stacking combines arrays along different dimensions:
- vstack → vertical
- hstack → horizontal
- dstack → depth


In [None]:

a = np.array([[1, 2]])
b = np.array([[3, 4]])

print(np.vstack((a, b)))
print(np.hstack((a, b)))
print(np.dstack((a, b)))
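
Comparing output shapes makes the three directions easier to tell apart. The sketch below uses the same `(1, 2)` inputs and adds `np.stack`, which always creates a new axis rather than extending an existing one.

```python
import numpy as np

a = np.array([[1, 2]])   # shape (1, 2)
b = np.array([[3, 4]])   # shape (1, 2)

print(np.vstack((a, b)).shape)   # (2, 2): rows added
print(np.hstack((a, b)).shape)   # (1, 4): columns added
print(np.dstack((a, b)).shape)   # (1, 2, 2): third axis added

stacked = np.stack((a, b), axis=0)   # new leading axis
print(stacked.shape)                 # (2, 1, 2)
```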


## 18. Broadcasting

### Explanation
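Broadcasting allows NumPy to perform operations on arrays of **different but compatible shapes**
without making explicit copies of the smaller array. Shapes are compared from the last dimension backwards, and two dimensions are compatible when they are equal or one of them is 1.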


In [None]:

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([10, 20, 30])

print(a + b)

c = np.array([[1], [2], [3]])
print(c + np.array([10, 20, 30]))
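
The shape rule can be checked without doing any arithmetic. The sketch below uses `np.broadcast_shapes` (available in NumPy 1.20 and later) to predict result shapes, `np.broadcast_to` to show that the stretched array is a zero-stride view rather than a copy, and an incompatible pair to show the error.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
b = np.array([10, 20, 30])             # shape (3,)

print(np.broadcast_shapes(a.shape, b.shape))   # (2, 3)
print(np.broadcast_shapes((3, 1), (3,)))       # (3, 3)

# The "stretched" b repeats rows via a stride of 0, so no data is duplicated
stretched = np.broadcast_to(b, a.shape)
print(stretched.strides[0])                    # 0

# Trailing dimensions 3 and 2 are neither equal nor 1, so this fails
try:
    a + np.array([1, 2])
    incompatible_raised = False
except ValueError as err:
    incompatible_raised = True
    print("Broadcast failed:", err)
```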


## 19. Final Summary

You have now learned:
- NumPy fundamentals
- Advanced indexing techniques
- Array manipulation
- Broadcasting logic

This knowledge is **essential** for Data Science, ML, and scientific programming.
