### **Understanding NumPy**

- NumPy (Numerical Python) is the foundation library for scientific computing in Python. It provides a powerful N-dimensional array object and tools for working with these arrays.

- Think of NumPy as the engine that powers most data science libraries - pandas uses NumPy arrays internally, scikit-learn expects NumPy arrays for machine learning, and matplotlib uses NumPy for plotting.

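- NumPy's core operations are implemented in C, which often makes them roughly 10-100x faster than equivalent pure-Python loops. The exact speedup depends on the operation and the array size.

- NumPy arrays store data more compactly than Python lists: every element has the same fixed-size dtype, and the values sit in one block of memory instead of as separate Python objects.

- Vectorization: perform operations on entire arrays without writing explicit loops.

- Broadcasting: combine arrays of different but compatible shapes without reshaping or copying them by hand.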

- Foundation for pandas, scikit-learn, matplotlib, and more
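
The points above about speed, memory, and broadcasting can be checked with a small experiment. This is a rough sketch, not a benchmark: the array size `n` and all variable names are chosen for illustration, and the timings depend on the machine, so no expected numbers are shown for them.

```python
import sys
import time

import numpy as np

n = 1_000_000
py_list = list(range(n))
np_arr = np.arange(n, dtype=np.int64)

# Pure-Python loop: square every element one at a time
start = time.perf_counter()
squares_list = [x * x for x in py_list]
loop_seconds = time.perf_counter() - start

# Vectorized: one expression, the loop runs in compiled code
start = time.perf_counter()
squares_arr = np_arr * np_arr
vec_seconds = time.perf_counter() - start

print(f"Loop: {loop_seconds:.4f}s, vectorized: {vec_seconds:.4f}s")

# Memory: the list stores pointers to separate int objects,
# the array stores raw 8-byte integers in one buffer
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)
print(f"List: ~{list_bytes / 1e6:.1f} MB, array: {np_arr.nbytes / 1e6:.1f} MB")

# Broadcasting: a (3, 1) column and a (4,) row combine into a (3, 4) grid
col = np.arange(3).reshape(3, 1)
row = np.arange(4)
grid = col * 10 + row
print(grid)
```

The list-size estimate counts each integer object once, so treat it as an approximation rather than an exact measurement.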

In [39]:
# import all necessary libraries

import numpy as np
import matplotlib.pyplot as plt
import time

# Check NumPy version
print(f"NumPy version: {np.__version__}")

# Display settings for cleaner output
np.set_printoptions(precision=3, suppress=True)

NumPy version: 2.3.3


**Creating NumPy Arrays**

In [40]:
# Creating arrays from Python lists
# 1D array: A simple sequence of numbers
arr1d = np.array([1, 2, 3, 4, 5])

# 2D array: Think of this as a matrix or table with rows and columns
arr2d = np.array([[1, 2, 3],
                  [4, 5, 6]])

# 3D array: Like a stack of 2D arrays - useful for images, time series, etc.
arr3d = np.array([[[1, 2], [3, 4]],
                  [[5, 6], [7, 8]]])

print("1D array:", arr1d)
print("2D array:\n", arr2d)
print("3D array:\n", arr3d)

1D array: [1 2 3 4 5]
2D array:
 [[1 2 3]
 [4 5 6]]
3D array:
 [[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]
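
As a quick check on the arrays just created, the sketch below prints the number of dimensions and the shape of each one, and shows how `np.array` picks a dtype when a list mixes integers and floats. The name `mixed` is illustrative only.

```python
import numpy as np

arr1d = np.array([1, 2, 3, 4, 5])
arr2d = np.array([[1, 2, 3], [4, 5, 6]])
arr3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

for name, a in [("arr1d", arr1d), ("arr2d", arr2d), ("arr3d", arr3d)]:
    print(name, "ndim:", a.ndim, "shape:", a.shape)

# Mixing ints and floats upcasts every element to float
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)  # float64
```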


**Creating Special Arrays in NumPy**

In [41]:
# Creating arrays filled with zeroes - useful for initializing arrays
# Shape (3, 4) means 3 rows and 4 columns
zeroes = np.zeros((3, 4))

# Creating arrays filled with ones - often used as starting points
ones = np.ones((2, 3, 4))     # 3D array: 2 layers, 3 rows, 4 columns

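# Empty array - skips initialization, so it is faster than zeros/ones,
# but its contents are arbitrary leftover memory (not random numbers, and not guaranteed zeros)
# Use it only when you'll immediately fill the array with real data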
empty = np.empty((2, 2))

print("Zeros array(3X4):\n", zeroes)
# print("Ones array shape:", ones)
print("Ones array shape:", ones.shape)
print("Empty array (contains random values):\n", empty)

Zeros array(3X4):
 [[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
Ones array shape: (2, 3, 4)
Empty array (contains random values):
 [[0. 0.]
 [0. 0.]]
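
The zeros in the output above are incidental. `np.empty` makes no promise about its contents, so the usual pattern is to overwrite every element before reading any of them. A minimal sketch of that pattern follows; the names `table` and `template` and the fill rule are made up for illustration.

```python
import numpy as np

n_rows, n_cols = 3, 4
table = np.empty((n_rows, n_cols))  # contents are arbitrary at this point

# Overwrite every row before reading anything back
for i in range(n_rows):
    table[i, :] = np.arange(n_cols) * (i + 1)

print(table)

# zeros_like copies the shape and dtype of an existing array
template = np.zeros_like(table)
print(template.shape, template.dtype)  # (3, 4) float64
```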


In [42]:
# Range arrays - like Python's range() but more powerful
range_arr = np.arange(0, 10, 2)   # Start, stop, step: [0, 2, 4, 6, 8]
print("Range array:", range_arr)

# Linearly spaced arrays - divide a range into equal parts
# From 0 to 1 with exactly 5 points (including endpoints)
linspace_arr = np.linspace(0, 1, 5)
print("Linspace array: ", linspace_arr)

# Logarithmically spaced arrays - useful for scientific data
# From 10^0 to 10^2 (1 to 100) with 5 points
logspace_arr = np.logspace(0, 2, 5)
print("Logspace array: ", logspace_arr)


Range array: [0 2 4 6 8]
Linspace array:  [0.   0.25 0.5  0.75 1.  ]
Logspace array:  [  1.      3.162  10.     31.623 100.   ]
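
`np.arange` and `np.linspace` treat the endpoint differently, which is easy to miss. The sketch below compares them and shows two optional `np.linspace` parameters, `endpoint` and `retstep`. Variable names are illustrative.

```python
import numpy as np

# arange excludes the stop value; linspace includes it by default
print(np.arange(0, 1, 0.25))      # [0.   0.25 0.5  0.75]
print(np.linspace(0, 1, 5))       # [0.   0.25 0.5  0.75 1.  ]

# endpoint=False drops the final point; retstep=True also returns the spacing
points, step = np.linspace(0, 1, 5, endpoint=False, retstep=True)
print(points, step)

# logspace(a, b, n) is 10 ** linspace(a, b, n)
log_pts = np.logspace(0, 2, 5)
same = np.allclose(log_pts, 10 ** np.linspace(0, 2, 5))
print(same)                       # True
```

As a rule of thumb, `np.linspace` is the safer choice when the number of points matters, since a non-integer step in `np.arange` can produce an unexpected element count through floating-point rounding.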


In [43]:
# Identity matrix - diagonal of ones, zeroes elsewhere
# Essential for linear algebra operations
identity = np.eye(4)        # 4x4 identity matrix

# Diagonal matrix - put values on the diagonal
diagonal = np.diag([1, 2, 3, 4])

# Array filled with a specific value
full_arr = np.full((3, 3), 7)       # 3x3 array filled with 7

print("Identity matrix:\n", identity)
print("Diagonal matrix:\n", diagonal)
print("Full array (filled with 7):\n", full_arr)

Identity matrix:
 [[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
Diagonal matrix:
 [[1 0 0 0]
 [0 2 0 0]
 [0 0 3 0]
 [0 0 0 4]]
Full array (filled with 7):
 [[7 7 7]
 [7 7 7]
 [7 7 7]]
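
A short sketch of why these constructors are useful: multiplying by the identity matrix leaves a matrix unchanged, `np.diag` also works in the other direction (extracting a diagonal), and `np.full` infers its dtype from the fill value. The matrix `A` is an arbitrary example.

```python
import numpy as np

A = np.arange(1, 10).reshape(3, 3)
I = np.eye(3)

# Multiplying by the identity leaves a matrix unchanged
print(np.array_equal(A @ I, A))   # True

# np.diag is two-way: 1D input builds a matrix, 2D input extracts the diagonal
print(np.diag(A))                 # [1 5 9]

# np.full takes its dtype from the fill value unless one is given
int_full = np.full((2, 2), 7)
float_full = np.full((2, 2), 7.0)
print(int_full.dtype.kind)        # i (integer)
print(float_full.dtype)           # float64
```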


**NumPy Data Types (dtypes)**

In [44]:
# Explicit data types - control memory usage and precision
int_arr = np.array([1, 2, 3], dtype=np.int32)       # 32-bit integers
float_arr = np.array([1, 2, 3], dtype=np.float64)   # 64-bit floats (double precision)
bool_arr = np.array([True, False, True], dtype=np.bool_)    # Boolean values

# Type conversion - change dtype of existing array
converted = int_arr.astype(np.float32)  # Convert to 32-bit float

print("Integer array dtype:", int_arr.dtype)
print("Float array dtype:", float_arr.dtype)
print("Boolean array dtype:", bool_arr.dtype)
print("Converted array dtype:", converted.dtype)

# Memory usage comparison
print(f"int32 uses {int_arr.itemsize} bytes per element")
print(f"float64 uses {float_arr.itemsize} bytes per element")

Integer array dtype: int32
Float array dtype: float64
Boolean array dtype: bool
Converted array dtype: float32
int32 uses 4 bytes per element
float64 uses 8 bytes per element


**N.B.: Choose smaller types to save memory.**

**N.B.: Choose larger types for a wider value range and more precision.**

int8 - 1 byte (values from -128 to 127)

int32 - 4 bytes

float32 - 4 bytes, about 7 significant decimal digits

float64 - 8 bytes, about 15-16 significant decimal digits
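
The trade-off between memory and precision can be seen directly. The sketch below uses `np.iinfo` to read the integer range, shows a small increment that float32 loses but float64 keeps, and compares memory use for a million elements. The values are picked only to make the effect visible.

```python
import numpy as np

# Integer ranges come from np.iinfo
info8 = np.iinfo(np.int8)
print(info8.min, info8.max)       # -128 127

# Float resolution: float32 cannot represent 1 + 1e-8, float64 can
one32 = np.float32(1.0) + np.float32(1e-8)
one64 = np.float64(1.0) + np.float64(1e-8)
print(one32 == np.float32(1.0))   # True
print(one64 == np.float64(1.0))   # False

# Same element count, very different footprint
small = np.zeros(1_000_000, dtype=np.int8)
large = np.zeros(1_000_000, dtype=np.float64)
print(small.nbytes, large.nbytes) # 1000000 8000000
```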

**Array Properties & Attributes**

In [45]:
# Array of 3 layers, each with 4 rows and 5 columns
arr = np.random.randn(3, 4, 5)

# Shape: the dimensions of the array (layers, rows, columns)
print("Shape:", arr.shape)

# Size: Total number of elements (3 x 4 x 5 = 60)
print("Size:", arr.size)

# Ndim: Number of dimensions (3D in this case)
print("Ndim:", arr.ndim)

# Dtype: Data type of elements
print("Dtype:", arr.dtype)

# Itemsize: Memory size of each element in bytes
print("Itemsize:", arr.itemsize)    # 8 bytes for float64

# Total memory usage in bytes
print("Memory usage:", arr.nbytes, "bytes")  # size x itemsize
print("Memory usage:", arr.nbytes / 1024, "KB")   # Convert to KB

arr

Shape: (3, 4, 5)
Size: 60
Ndim: 3
Dtype: float64
Itemsize: 8
Memory usage: 480 bytes
Memory usage: 0.46875 KB


array([[[ 0.336, -1.173, -0.058,  1.612,  0.002],
        [-0.339,  0.999,  1.196,  0.172, -0.36 ],
        [ 0.887,  0.895, -1.81 , -0.033,  0.26 ],
        [ 1.198,  1.68 , -1.877, -2.358,  0.392]],

       [[ 0.131,  0.843,  0.989,  2.017,  0.65 ],
        [ 0.215,  1.292,  0.105,  0.442, -1.396],
        [ 0.179,  0.599,  0.18 ,  1.752,  0.807],
        [-2.145,  1.652, -0.291,  0.321,  0.758]],

       [[ 1.553, -0.149, -0.114,  1.913, -0.675],
        [ 0.223,  0.083, -0.05 , -0.464,  0.538],
        [-0.238,  0.478, -0.482, -0.466, -0.118],
        [ 1.637, -0.932, -0.861,  0.204, -0.636]]])
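
The attributes above are tied together by simple arithmetic, sketched below. This version uses a seeded `np.random.default_rng` generator instead of `np.random.randn` so that reruns give the same values; that substitution is for illustration only. The `strides` attribute, which the cell above does not cover, gives the byte offset between neighbours along each axis.

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seeded so reruns give the same values
arr = rng.standard_normal((3, 4, 5))

# size is the product of the shape; nbytes is size times itemsize
print(arr.size == 3 * 4 * 5)                   # True
print(arr.nbytes == arr.size * arr.itemsize)   # True

# Bytes to step along each axis: 4*5*8, 5*8, 8
print(arr.strides)                             # (160, 40, 8)

# Halving the element width halves the memory
half = arr.astype(np.float32)
print(half.nbytes)                             # 240
```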

**Array Indexing & Slicing**

**Basic Indexing - Accessing Individual Elements**

In [46]:
arr1d = np.array([10, 20, 30, 40, 50])

print("First element:", arr1d[0])   # Index 0: 10
print("Last element:", arr1d[-1])   # Negative indexing: 50
print("Slice [1:4]:", arr1d[1:4])   # Elements 1, 2, 3: [20, 30, 40]
print("Every 2nd element:", arr1d[::2]) # Step of 2: [10, 30, 50]

First element: 10
Last element: 50
Slice [1:4]: [20 30 40]
Every 2nd element: [10 30 50]


In [47]:
# 2D array indexing - row and column access
arr2d = np.array([[1, 2, 3, 4],
                  [5, 6, 7, 8],
                  [9, 10, 11, 12]])

# Access specific element: [row, column]
print("Element at row 1, column 2:", arr2d[1, 2])

# Access entire rows or columns
print("First row:", arr2d[0, :])        # All columns of row 0
print("Second column:", arr2d[:, 1])    # All rows of column 1

# Subarray slicing: [row_start:row_end, col_start:col_end]
print("Subarray (rows 1-2, cols 1-2):\n", arr2d[1:3, 1:3])

arr2d

Element at row 1, column 2: 7
First row: [1 2 3 4]
Second column: [ 2  6 10]
Subarray (rows 1-2, cols 1-2):
 [[ 6  7]
 [10 11]]


array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])
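
One consequence of basic slicing is worth seeing in code: a slice is a view onto the original data, so writing to it changes the original array. A minimal sketch follows; the names `sub` and `safe` are illustrative.

```python
import numpy as np

arr2d = np.arange(1, 13).reshape(3, 4)

sub = arr2d[1:3, 1:3]      # basic slice: a view onto arr2d
sub[0, 0] = -1             # writes through to the original
print(arr2d[1, 1])         # -1

safe = arr2d[1:3, 1:3].copy()   # explicit copy: independent data
safe[0, 0] = 100
print(arr2d[1, 1])         # -1
```

Calling `.copy()` is the usual way to get an independent subarray when the original must stay untouched.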

**Advanced Indexing - Powerful Selection Methods**

In [48]:
# Fancy indexing - use arrays of indices to select elements
arr = np.array([10, 20, 30, 40, 50])
indices = np.array([0, 2, 4])   # Select elements at positions 0, 2, 4
print("Fancy indexing:", arr[indices])

# This is much more flexible than simple slicing
random_indices = np.array([4, 1, 3, 1])     # Can repeat and reorder
print("Random order:", arr[random_indices])

Fancy indexing: [10 30 50]
Random order: [50 20 40 20]


In [49]:
# 2D fancy indexing 
arr2d = np.arange(12).reshape(3, 4) # 3x4 array holding 0 through 11
print("Original 2D array:\n", arr2d)

# Select elements at (row, col) pairs: (0, 1) and (2, 3)
rows = np.array([0, 2])
cols = np.array([1, 3])
print("Elements at (0,1) and (2,3):", arr2d[rows, cols])

# Select entire rows using fancy indexing
selected_rows = arr2d[[0, 2], :]
print("Selected rows:\n", selected_rows)

Original 2D array:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Elements at (0,1) and (2,3): [ 1 11]
Selected rows:
 [[ 0  1  2  3]
 [ 8  9 10 11]]
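
Fancy indexing differs from slicing in two ways that the cells above do not show: the result is a copy, and paired index arrays pick individual elements rather than a rectangular block. The sketch below illustrates both, along with `np.ix_` for selecting a block and a boolean mask for selecting by condition.

```python
import numpy as np

arr2d = np.arange(12).reshape(3, 4)

# Fancy indexing returns a copy, so edits do not reach the original
picked = arr2d[[0, 2], :]
picked[0, 0] = 99
print(arr2d[0, 0])            # 0

# Paired index arrays select individual elements...
pairs = arr2d[[0, 2], [1, 3]]
print(pairs)                  # [ 1 11]

# ...while np.ix_ selects the full row/column cross product
block = arr2d[np.ix_([0, 2], [1, 3])]
print(block)

# Boolean masks select by condition
evens = arr2d[arr2d % 2 == 0]
print(evens)                  # [ 0  2  4  6  8 10]
```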


**Array Reshaping & Manipulation**

In [57]:
# Start with a 1D array
arr = np.arange(12)     # [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]
print("Original 1D array:", arr)

# Reshape to 2D: 3 rows x 4 columns
reshaped_2d = arr.reshape(3, 4)
print("Reshaped to 3x4:\n", reshaped_2d)

# Reshape to 3D: 2 layers x 2 rows x 3 columns
reshaped_3d = arr.reshape(2, 2, 3)
print("Reshaped to 2x2x3:\n", reshaped_3d)

# Use -1 to let NumPy calculate one dimension automatically
auto_reshape = arr.reshape(4, -1)   # 4 rows, NumPy calculates columns
print("Auto-reshaped to 4x?\n", auto_reshape)

Original 1D array: [ 0  1  2  3  4  5  6  7  8  9 10 11]
Reshaped to 3x4:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Reshaped to 2x2x3:
 [[[ 0  1  2]
  [ 3  4  5]]

 [[ 6  7  8]
  [ 9 10 11]]]
Auto-reshaped to 4x?
 [[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
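
Reshaping only works when the new shape has the same total number of elements. The sketch below shows `-1` used in a middle position and what happens when the sizes do not match. The flag `reshape_failed` is a made-up name for illustration.

```python
import numpy as np

arr = np.arange(12)

# -1 is inferred from the total size: 12 / 2 / 3 = 2
print(arr.reshape(2, -1, 3).shape)   # (2, 2, 3)

# A shape whose product cannot equal 12 is rejected
try:
    arr.reshape(5, -1)
    reshape_failed = False
except ValueError as exc:
    reshape_failed = True
    print("ValueError:", exc)
```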


In [62]:
# reshape() returns a view when the memory layout allows it, otherwise a copy
# Flattening - convert multi-dimensional array to 1D
arr2d = np.array([[1, 2, 3], [4, 5, 6]])

# flatten() always returns a copy
flattened = arr2d.flatten()
print("Flattened (copy):", flattened)

# ravel() returns a view if possible (faster, memory efficient)
ravel = arr2d.ravel()
print("Ravel (view if possible):", ravel)

# Demonstrate the difference
arr2d[0, 0] = 999
print("After modifying original:")
print("Flattened (unchanged):", flattened)
print("Ravel (changed):", ravel)

Flattened (copy): [1 2 3 4 5 6]
Ravel (view if possible): [1 2 3 4 5 6]
After modifying original:
Flattened (unchanged): [1 2 3 4 5 6]
Ravel (changed): [999   2   3   4   5   6]
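
Whether an operation returned a view or a copy can be checked with `np.shares_memory`. The sketch below shows that `reshape` and `ravel` reuse the buffer for contiguous data, that `ravel` has to copy when the array has been transposed, and that `flatten` always copies.

```python
import numpy as np

arr = np.arange(12)

# Contiguous data: reshape and ravel can reuse the same buffer
grid = arr.reshape(3, 4)
print(np.shares_memory(arr, grid))            # True
print(np.shares_memory(grid, grid.ravel()))   # True

# A transposed array is not C-contiguous, so flattening it must copy
flat_t = grid.T.ravel()
print(np.shares_memory(grid, flat_t))         # False

# flatten() copies regardless
print(np.shares_memory(grid, grid.flatten())) # False
```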


**Transposing and Swapping Axes**

In [None]:
# 2D transposition - flip rows and columns
arr2d = np.array([[1, 2, 3],
                  [4, 5, 6]])
print("Original shape:", arr2d.shape)   # (2, 3)
print("Original:\n", arr2d) 


Original shape: (2, 3)
Original:
 [[1 2 3]
 [4 5 6]]
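
The cell above stops after printing the original array. As a sketch of where this topic usually goes (not the notebook's original code), the example below transposes the same 2D array with `.T`, then uses `np.swapaxes` and `np.transpose` on a 3D array. The 3D array and the names `swapped` and `permuted` are assumptions added for illustration.

```python
import numpy as np

arr2d = np.array([[1, 2, 3],
                  [4, 5, 6]])

# .T flips rows and columns: (2, 3) becomes (3, 2)
transposed = arr2d.T
print(transposed.shape)            # (3, 2)
print(transposed)

# np.swapaxes exchanges two axes of a higher-dimensional array
arr3d = np.arange(24).reshape(2, 3, 4)
swapped = np.swapaxes(arr3d, 0, 2)
print(swapped.shape)               # (4, 3, 2)

# np.transpose accepts a full axis permutation
permuted = np.transpose(arr3d, (1, 0, 2))
print(permuted.shape)              # (3, 2, 4)

# Element [i, j, k] of arr3d is element [k, j, i] of swapped
print(arr3d[1, 2, 3] == swapped[3, 2, 1])   # True
```

All three operations return views, so they are cheap even for large arrays, but writing to the result also changes the original.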
