Starting with NumPy, let's understand what it is.

NumPy is a Python package for scientific computing. When a Python list is large, operating on it element by element becomes very costly. NumPy provides arrays whose operations are vectorized: an operation is applied to all elements by compiled code, rather than by an explicit Python loop as with lists.


**What is the difference between NumPy arrays and Python lists?**

1. NumPy arrays only store **homogeneous elements**. Thus, they store these elements in contiguous memory and do not require storing type information for each element.

2. Due to their **vectorized** nature, NumPy arrays are much faster.

3. The NumPy package provides many **mathematical functions** that can be easily applied to NumPy arrays.
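
To make the first point concrete, here is a small sketch of what "homogeneous" means in practice. The variable names are mine, chosen only for illustration: when the inputs have mixed types, NumPy picks one common type for the whole array.

```python
import numpy as np

# Mixing ints and floats: every element is stored as float64
mixed_numbers = np.array([1, 2.5, 3])

# Mixing numbers and strings: every element is stored as a Unicode string
mixed_text = np.array([1, "two", 3])

print(mixed_numbers, mixed_numbers.dtype)
print(mixed_text, mixed_text.dtype.kind)  # kind 'U' means Unicode string
```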

To understand the performance difference between NumPy arrays and Python lists, let's take this example of adding 2 to each element.

In NumPy:

In [1]:
import numpy as np
a = np.array([1,2,3])
b = a+2
print(b)

[3 4 5]


In Python:

In [2]:
a = [1,2,3]
b = [i+2 for i in a]
print(b)

[3, 4, 5]


NumPy array operations are vectorized, i.e., NumPy applies an operation to the whole array in compiled code, without an explicit Python loop. For example, adding 2 to a NumPy array adds 2 to every element. Doing the same with a Python list requires a loop (or list comprehension) that runs in the interpreter, which makes it slower.
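
The speed claim can be checked with a rough timing sketch like the one below. The array size and repetition count are arbitrary choices of mine, and the absolute numbers will differ from machine to machine, so treat this as an illustration rather than a benchmark.

```python
import timeit
import numpy as np

n = 100_000  # arbitrary size, large enough for the gap to show
py_list = list(range(n))
np_arr = np.arange(n)

# Same computation, two ways
loop_result = [i + 2 for i in py_list]
vec_result = np_arr + 2
assert loop_result == vec_result.tolist()

# Time 20 repetitions of each; absolute numbers depend on the machine
loop_time = timeit.timeit(lambda: [i + 2 for i in py_list], number=20)
vec_time = timeit.timeit(lambda: np_arr + 2, number=20)
print(f"list comprehension: {loop_time:.4f} s")
print(f"NumPy vectorized:   {vec_time:.4f} s")
```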

**So, is NumPy all about arrays?**

Well… somewhat. We can say that `ndarray` (the core data structure of NumPy) is the backbone — most of NumPy revolves around creating, manipulating, and computing with these arrays.

However, NumPy also offers several other useful features:

1. **Mathematical functions** – Access to a wide range of functions such as trigonometry, statistics, and linear algebra.  
2. **Random number generation** – Using `np.random` to generate random data for simulations and testing.  
3. **Fourier transforms and signal processing** – Not heavily used in standard data science or ML workflows, but available if needed.  
4. **File I/O** – Load and save data in formats like `.npy`, `.csv`, etc.  
5. **Broadcasting** – A powerful way to perform operations on arrays of different shapes (very important in real-world use cases).
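
Since broadcasting is the least obvious item in this list, here is a minimal sketch of it. The array names are illustrative only: a smaller array is stretched (without copying) to match the shape of the larger one.

```python
import numpy as np

matrix = np.arange(6).reshape(2, 3)   # shape (2, 3): [[0, 1, 2], [3, 4, 5]]
row = np.array([10, 20, 30])          # shape (3,)
col = np.array([[100], [200]])        # shape (2, 1)

# row is stretched across both rows of matrix
print(matrix + row)   # rows: [10 21 32], [13 24 35]
# col is stretched across all three columns
print(matrix + col)   # rows: [100 101 102], [203 204 205]
```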

Before using NumPy, it’s important to understand how it works and how to handle its arrays.  

I will start by defining NumPy arrays.


### 1. Defining NumPy arrays

To define a NumPy array, I can use either a **Python list/tuple** or **built-in NumPy functions.**

In [3]:
import numpy as np
np.random.seed(0)  # Set seed for reproducibility

1. ndarray from a Python list/tuple

In [4]:
# From a Python list or tuple
a_from_list = np.array([1,2,3])
a_from_list

array([1, 2, 3])

In [5]:
a_from_tuple = np.array((1,2,3))
a_from_tuple

array([1, 2, 3])

2. ndarray using NumPy functions. We can create:
   1. arrays of all ones or zeros
   2. arrays filled with a given value
   3. an identity matrix
   4. a range
   5. `n` evenly spaced numbers between `a` and `b` (linspace)
   6. random numbers

In [6]:
np.ones((2,3)) # shape 2x3 array all filled with 1


array([[1., 1., 1.],
       [1., 1., 1.]])

In [7]:
np.zeros((2,3)) # 2x3 array of zeros

array([[0., 0., 0.],
       [0., 0., 0.]])

In [8]:
np.full((2,3), 7) # 2x3 array all filled with 7

array([[7, 7, 7],
       [7, 7, 7]])

In [9]:
np.eye(3) # identity matrix of shape 3x3

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [10]:
np.arange(0,10,2)  # values from 0 up to (but not including) 10, with step 2

array([0, 2, 4, 6, 8])

In [11]:
np.linspace(0,10,2) # 2 evenly spaced values from 0 to 10, both endpoints included

array([ 0., 10.])

In [12]:
np.random.rand(2,3) # 2x3 array of random floats in [0, 1)

array([[0.5488135 , 0.71518937, 0.60276338],
       [0.54488318, 0.4236548 , 0.64589411]])

In [13]:
np.random.randint(0,10,(2,5)) # (2,5) array of random integers from 0 to 9 (10 is excluded)

array([[4, 7, 6, 8, 8],
       [1, 6, 7, 7, 8]])

We can create a 1D array and reshape it to 2D.
For example, here I create an array using arange:


In [14]:
np.arange(0,10) # 1D

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [15]:
np.arange(0,10).reshape(5,2) # convert 1d to 2d

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

Alternatively, we can directly define multidimensional arrays filled with random values, zeros, or ones.

In [16]:
# 1D array
x1 = np.random.randint(10, size=6)  # 1D array/tensor of 6 integers from 0 to 9

# 2D array
x2 = np.random.randint(10, size=(3,4))  # 2D array/tensor of shape (3,4)

# 3D array
x3 = np.random.randint(10, size=(3,4,5))  # 3D array/tensor of shape (3,4,5)

In [17]:
print(f"{x1=}")
print(f"{x2=}")
print(f"{x3=}")

x1=array([1, 5, 9, 8, 9, 4])
x2=array([[3, 0, 3, 5],
       [0, 2, 3, 8],
       [1, 3, 3, 3]])
x3=array([[[7, 0, 1, 9, 9],
        [0, 4, 7, 3, 2],
        [7, 2, 0, 0, 4],
        [5, 5, 6, 8, 4]],

       [[1, 4, 9, 8, 1],
        [1, 7, 9, 9, 3],
        [6, 7, 2, 0, 3],
        [5, 9, 4, 4, 6]],

       [[4, 4, 3, 4, 4],
        [8, 4, 3, 7, 5],
        [5, 0, 1, 5, 9],
        [3, 0, 5, 0, 1]]])
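
The cells above use the legacy `np.random.seed` / `np.random.randint` interface. NumPy also offers a `Generator` object created with `np.random.default_rng`. The sketch below shows roughly equivalent calls; note that it produces different numbers than the legacy functions, even with the same seed.

```python
import numpy as np

rng = np.random.default_rng(0)            # seeded generator, independent of np.random.seed

floats = rng.random((2, 3))               # floats in [0, 1), shape (2, 3)
ints = rng.integers(0, 10, size=(3, 4))   # integers from 0 to 9, shape (3, 4)

print(floats.shape, ints.shape)
```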


### 2. NumPy array attributes

What are attributes?

In OOP terms, attributes are the variables associated with a class or object, which represent the object's state/data.

In NumPy, when I say attribute, I mean a property of the array.
For example: its dimensions, size, shape, data type, and the byte size of a single element or of the complete array.

To access these properties of a NumPy array, there are some built-in attributes:

1. `.ndim` (number of dimensions of the array, e.g. 1 for 1D, 2 for 2D)
2. `.shape` (tuple giving the size of the array along each dimension, like the shape of a tensor)
3. `.size` (total number of elements in the array)
4. `.dtype` (data type of the array elements; remember, an ndarray is always homogeneous)
5. `.itemsize` (number of bytes used by one element of the array)
6. `.nbytes` (total number of bytes used by the entire array)

This is how these attributes are accessed:

In [18]:
# example ndarray

x = np.random.randint(0,10, size=(3,10))
x

array([[2, 4, 2, 0, 3, 2, 0, 7, 5, 9],
       [0, 2, 7, 2, 9, 2, 3, 3, 2, 3],
       [4, 1, 2, 9, 1, 4, 6, 8, 2, 3]])

In [19]:
x.ndim # 2 dimensional

2

In [20]:
x.shape  # array shape

(3, 10)

In [21]:
x.size # number of elements

30

In [22]:
x.dtype # element type

dtype('int64')

In [23]:
x.itemsize # byte size of each element

8

In [24]:
x.nbytes  # total bytes used by x

240
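
These attributes are related to each other, which the short sketch below checks. I set `dtype=np.int64` explicitly here (an assumption of mine, since the default integer type can vary by platform and NumPy version) so that the byte counts match the outputs above.

```python
import numpy as np

x = np.zeros((3, 10), dtype=np.int64)   # dtype given explicitly so the result is platform-independent

assert x.ndim == len(x.shape)               # 2
assert x.size == x.shape[0] * x.shape[1]    # 30
assert x.nbytes == x.size * x.itemsize      # 30 * 8 = 240

# A smaller dtype shrinks itemsize and nbytes, not size
x_small = x.astype(np.int8)
print(x_small.size, x_small.itemsize, x_small.nbytes)   # 30 1 30
```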

### 3. Access Array Elements
We have created arrays and inspected their properties. Now I will show you how to access the elements of an ndarray.

This works much like indexing a Python list with square brackets `[]`, but the syntax differs slightly for multidimensional arrays:

| **Operation** | **Python List** | **NumPy Array**         |
|---------------|------------------|--------------------------|
| Access 1D     | `a[2]`           | `a[2]`                   |
| Access 2D     | `a[1][2]`        | `a[1, 2]` *(cleaner)*    |
| Slicing       | `a[1:4]`         | `a[1:4]`                 |


In [25]:
# 1D array elements

x1 = np.array([5, 0, 3, 3, 7, 9])
assert x1[0] == 5
assert x1[-1] == 9
assert x1[-2] == 7

In [26]:
# Multidimensional array elements
x2 = np.array([[3, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

assert x2[-1,0] == 1
assert x2[-2,3] == 8


- We can modify the values of an array by assigning to an index.

In [27]:
x3[2,2,3] = 1000
x3

array([[[   7,    0,    1,    9,    9],
        [   0,    4,    7,    3,    2],
        [   7,    2,    0,    0,    4],
        [   5,    5,    6,    8,    4]],

       [[   1,    4,    9,    8,    1],
        [   1,    7,    9,    9,    3],
        [   6,    7,    2,    0,    3],
        [   5,    9,    4,    4,    6]],

       [[   4,    4,    3,    4,    4],
        [   8,    4,    3,    7,    5],
        [   5,    0,    1, 1000,    9],
        [   3,    0,    5,    0,    1]]])

**NOTE: Unlike a Python list, a NumPy array has a fixed data type. If we assign a float value into an int64 array, the value is silently truncated to an integer, without any WARNING.**

See Example:

In [28]:
x3[2,2,3] = 1000.123
x3

array([[[   7,    0,    1,    9,    9],
        [   0,    4,    7,    3,    2],
        [   7,    2,    0,    0,    4],
        [   5,    5,    6,    8,    4]],

       [[   1,    4,    9,    8,    1],
        [   1,    7,    9,    9,    3],
        [   6,    7,    2,    0,    3],
        [   5,    9,    4,    4,    6]],

       [[   4,    4,    3,    4,    4],
        [   8,    4,    3,    7,    5],
        [   5,    0,    1, 1000,    9],
        [   3,    0,    5,    0,    1]]])
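
If the fractional part matters, one way around the truncation is to work with a float array instead. The sketch below shows two options; the variable names are mine, and which option fits depends on where the array comes from.

```python
import numpy as np

ints = np.array([1, 2, 3])          # integer dtype
ints[0] = 1000.987                  # fractional part is dropped silently
assert ints[0] == 1000

# Option 1: convert to a float array first (astype returns a new array)
floats = ints.astype(np.float64)
floats[0] = 1000.987
assert floats[0] == 1000.987

# Option 2: declare a float dtype when creating the array
floats2 = np.array([1, 2, 3], dtype=np.float64)
floats2[1] = 2.5
print(ints, floats, floats2)
```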

### 4. Array Slicing
- Similar to Python lists, with the syntax `x[start:stop:step]`.

In [29]:
x = np.arange(10)  # A single argument is the stop value: start defaults to 0 and step to 1
x

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [30]:
# 1. 1D Subarray

assert list(x[:5]) == [0, 1, 2, 3, 4]
assert list(x[5:]) == [5, 6, 7, 8, 9]
assert list(x[4:7]) == [4,5,6]
assert list(x[1:9:2]) == [1,3,5,7]
assert list(x[1::3]) == [1,4,7]


In [31]:
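# 2. Negative indices and negative step values

assert list(x[:-1]) == [0, 1, 2, 3, 4, 5, 6, 7, 8]  # negative stop: everything except the last element
assert list(x[-3:1:-2]) == [7,5,3]  # negative step: walk backwards from index -3 down to (not including) index 1

# With start=-3 (index 7) and stop=1, only a negative step can reach the stop;
# a positive step gives an empty result
print(x[-3:1:-1])
print(x[-3:1:1])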


[7 6 5 4 3 2]
[]


In [34]:
# 3. Multidim Subarrays - Similar, just separated by commas
x2 = np.random.randint(12, size=(3,4))
x2


array([[ 3,  2, 11,  0],
       [ 8,  8,  3,  8],
       [10,  2,  8,  4]])

In [35]:
x2[0, :] # first row

array([ 3,  2, 11,  0])

In [36]:
x2[:, 2] # third column (index 2)

array([11,  3,  8])

In [37]:
x2[0:2, 1:3] # row 0,1 and col 1,2

array([[ 2, 11],
       [ 8,  3]])

### 5. SubArray copies - Important to note

- Just like with 1D arrays, updating a subarray in a multi-dimensional NumPy array also updates the original array.  
- This happens because slicing in NumPy returns a **view**, not a **copy**.

- ✅  To modify the subarray **without affecting** the original array, create an explicit copy using:

  ```python
  sub = array[i:j, m:n].copy()
  ```

In [38]:
x2

array([[ 3,  2, 11,  0],
       [ 8,  8,  3,  8],
       [10,  2,  8,  4]])

In [39]:
# No-Copy views
x2_sub = x2[:2, :2]  
x2_sub[0,0] = 1000
x2  # Original array changed through subarray

array([[1000,    2,   11,    0],
       [   8,    8,    3,    8],
       [  10,    2,    8,    4]])

In [40]:
# If we don't want to change the original array
np.random.seed(0)
x3 = np.random.randint(12, size=(3,4)) # a fresh 3x4 array
x3_sub = x3[:2,:2].copy()
x3_sub[0,0] = 1000  
x3   # The original array is unchanged

array([[ 5,  0,  3, 11],
       [ 3,  7,  9,  3],
       [ 5,  2,  4,  7]])
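
To check whether a subarray is a view or a copy without modifying anything, `np.shares_memory` can help. This is a small sketch with illustrative names, not the only way to test it.

```python
import numpy as np

x = np.arange(12).reshape(3, 4)

view = x[:2, :2]            # basic slicing: a view onto x's data
copy = x[:2, :2].copy()     # explicit copy: independent data

assert np.shares_memory(x, view)
assert not np.shares_memory(x, copy)

view[0, 0] = -1             # visible through x
copy[0, 1] = -2             # not visible through x
assert x[0, 0] == -1
assert x[0, 1] == 1
```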

### 6. Reshaping of Arrays
In section 1 we already saw how to create a multidimensional array from a one-dimensional array using `reshape()`.
There are several ways to change the shape of an ndarray.

1. using `reshape()`

- The new shape must hold exactly as many elements as the original array; otherwise `reshape()` raises a `ValueError`.



In [41]:
# Create a 1D array
grid = np.arange(1,10)
assert grid.ndim == 1

grid

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

In [42]:
# Reshape to 2D
grid = grid.reshape(3,3)
assert grid.ndim == 2

grid

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [43]:
# Change to 1D
grid = grid.reshape(9)
assert grid.ndim == 1

grid

array([1, 2, 3, 4, 5, 6, 7, 8, 9])
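
Two details of `reshape()` are worth a quick sketch: passing `-1` lets NumPy infer one dimension from the total size, and a shape that does not match the number of elements raises a `ValueError`. The `raised` flag below is only there for illustration.

```python
import numpy as np

grid = np.arange(1, 10)          # 9 elements

# -1 asks NumPy to infer that dimension from the total size
assert grid.reshape(3, -1).shape == (3, 3)
assert grid.reshape(-1).shape == (9,)

# 9 elements cannot fill a 2x4 array, so reshape raises ValueError
try:
    grid.reshape(2, 4)
    raised = False
except ValueError as err:
    raised = True
    print(err)
```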

2. `squeeze()` - removes axes of length 1.

In [44]:
a = np.array([1,2,3]).reshape(3,1,1)
a

array([[[1]],

       [[2]],

       [[3]]])

In [45]:
a_sq = a.squeeze() # removed all axes with dimension 1

a_sq.shape

(3,)

3. use `np.expand_dims` to add a new axis (the equivalent of `unsqueeze` in PyTorch)

In [46]:
a = np.array([1,2,3])
a.shape

(3,)

In [47]:
b = np.expand_dims(a, axis=1) # insert a new axis at position 1
b.shape 

(3, 1)

4. add a dimension using `np.newaxis`

- Similar to `np.expand_dims`, we can use `np.newaxis` to add an axis as a row or column dimension.

In [48]:
# Start from a 1D array

x= np.array([1,2,3])
x.shape # 3 elements

(3,)

In [49]:
# row vector: 1 row of 3 elements
x[np.newaxis, :].shape


(1, 3)

In [50]:
# column vector: 3 rows of 1 element each
x[:, np.newaxis].shape

(3, 1)
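
The three ways of adding an axis shown so far give the same result, which the sketch below verifies. It also shows one reason to add an axis in the first place: combining a column with a row through broadcasting. The names are illustrative.

```python
import numpy as np

x = np.array([1, 2, 3])

col_a = x[:, np.newaxis]
col_b = np.expand_dims(x, axis=1)
col_c = x.reshape(3, 1)
assert col_a.shape == col_b.shape == col_c.shape == (3, 1)
assert np.array_equal(col_a, col_b) and np.array_equal(col_b, col_c)

# np.newaxis is an alias for None
assert np.newaxis is None

# A (3, 1) column plus a (3,) row broadcasts to a (3, 3) table of sums
table = x[:, np.newaxis] + x
print(table)   # rows: [2 3 4], [3 4 5], [4 5 6]
```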

### 7. Array Concatenation

So far we have learned how to define an ndarray, check its properties (using attributes), access and modify its elements, and slice it. Now we will learn how to combine multiple ndarrays.

- For 1D arrays, concatenation results in another 1D array.
- For multidimensional arrays, concatenation can be performed in two ways:
  1. row-wise (i.e. vertical concatenation)
  2. column-wise (i.e. horizontal concatenation)

This table summarizes the concatenation functions.

| Function            | Description                          | Adds Axis? |
|---------------------|--------------------------------------|------------|
| `np.concatenate()`  | Join along existing axis             | ❌         |
| `np.vstack()`       | Stack vertically (row-wise)          | ❌ for 2D inputs (axis=0); 1D inputs become rows of a 2D array |
| `np.hstack()`       | Stack horizontally (column-wise)     | ❌ (axis=1 for 2D inputs, axis=0 for 1D inputs) |
| `np.stack()`        | Stack along a **new axis**           | ✅         |
| `np.column_stack()` | Stack 1D arrays as columns in 2D     | ✅         |




**1D array**
1. `np.concatenate`: joins arrays along an existing axis.

In [51]:
a = np.array([1,2])
b = np.array([3,4])
np.concatenate([a,b])

array([1, 2, 3, 4])

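**Multidimensional array**  
2. `vstack` and `hstack`

- vstack: stacks arrays as rows (axis=0); 1D inputs are first promoted to rows of a 2D array.
- hstack: stacks arrays as columns (axis=1 for 2D inputs); 1D inputs are simply joined end to end.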

In [52]:
np.vstack([a,b]) 

array([[1, 2],
       [3, 4]])

In [53]:
np.hstack([a,b])

array([1, 2, 3, 4])
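
The cells above only use 1D inputs, so here is a sketch with 2D inputs, plus `np.stack` and `np.column_stack` from the table. The arrays `m1` and `m2` are made up for illustration.

```python
import numpy as np

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[5, 6], [7, 8]])

rows = np.concatenate([m1, m2], axis=0)   # same result as np.vstack([m1, m2])
cols = np.concatenate([m1, m2], axis=1)   # same result as np.hstack([m1, m2])
assert rows.shape == (4, 2) and cols.shape == (2, 4)
assert np.array_equal(rows, np.vstack([m1, m2]))
assert np.array_equal(cols, np.hstack([m1, m2]))

a = np.array([1, 2])
b = np.array([3, 4])
stacked = np.stack([a, b])            # new leading axis: [[1, 2], [3, 4]]
columns = np.column_stack([a, b])     # 1D inputs become columns: [[1, 3], [2, 4]]
assert stacked.shape == (2, 2)
assert columns.tolist() == [[1, 3], [2, 4]]
```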

### 8. Array Splitting

- `np.split`
- `np.vsplit`
- `np.hsplit`
- `np.dsplit`

**1D array**

`np.split`

In [54]:
# 1. numpy.split(ary, indices_or_sections, axis=0)

np.random.seed(0)
x = np.array([5, 0, 3, 3, 7, 9, 3, 5, 2, 4])
x1, x2, x3 = np.split(x,[2,6])              # Split at index 2 and 6
assert list(x1) == [5,0]
assert list(x2) == [3,3,7,9]
assert list(x3) == [3,5,2,4]

**Multidimensional Array**

- `np.vsplit` is `np.split` with axis=0.
- `np.hsplit` is `np.split` with axis=1 (for arrays with two or more dimensions).
- `np.dsplit` splits an array with three or more dimensions along the third axis (axis=2).

In [55]:
grid = np.arange(5,21).reshape(4,4)
grid

array([[ 5,  6,  7,  8],
       [ 9, 10, 11, 12],
       [13, 14, 15, 16],
       [17, 18, 19, 20]])

In [56]:
np.vsplit(grid, [3])  # Split at row index 3

[array([[ 5,  6,  7,  8],
        [ 9, 10, 11, 12],
        [13, 14, 15, 16]]),
 array([[17, 18, 19, 20]])]

In [57]:
np.hsplit(grid,[3])  # Split at column index 3

[array([[ 5,  6,  7],
        [ 9, 10, 11],
        [13, 14, 15],
        [17, 18, 19]]),
 array([[ 8],
        [12],
        [16],
        [20]])]