# NumPy Essentials

### Basic Array Creation:

**Note:** NumPy's own name for its array type is `ndarray` (N-dimensional array). In machine-learning contexts, arrays of any dimension are often called "tensors": a 1D array is a vector, a 2D array is a matrix, and higher-dimensional arrays are tensors.

In [2]:
import numpy as np

a = np.array([1,2,3])               # 1D array/tensor  [1 2 3]

b = np.array([[1,2,3], [4,5,6]])    # 2D array/tensor  [[1 2 3]
                                    #                   [4 5 6]]

c = np.zeros((3,3,3))               # 3D tensor of zeroes
                                    # [[[0. 0. 0.]
                                    #   [0. 0. 0.]
                                    #   [0. 0. 0.]]

                                    #  [[0. 0. 0.]
                                    #   [0. 0. 0.]
                                    #   [0. 0. 0.]]

                                    #  [[0. 0. 0.]
                                    #   [0. 0. 0.]
                                    #   [0. 0. 0.]]]


d = np.ones((2, 4))                 # 2D tensor of ones
                                    # [[1. 1. 1. 1.]
                                    #  [1. 1. 1. 1.]]

e = np.eye(3)                       # identity matrix
                                    # [[1. 0. 0.]
                                    #  [0. 1. 0.]
                                    #  [0. 0. 1.]]


f = np.random.random((2, 2))         # random values
                                    # [[0.37699864 0.25810566]
                                    #  [0.037246   0.45232324]]



# Array attributes

print(a.ndim)                       # number of dimensions 
                                        # prints 1

print(b.shape)                      # shape of array
                                        # prints (2, 3): 2 rows, 3 columns per row

print(c.size)                       # total number of elements
                                        # prints 27

print(d.dtype)                      # data type
                                        # prints float64

print(e.itemsize)                   # size in bytes of each element
                                        # prints 8 (recall: 1 byte = 8 bits. so 8 bytes = 64 bits)

print(f.nbytes)                     # total bytes consumed by array
                                        # prints 32 (4 elements x 8 bytes each)

1
(2, 3)
27
float64
8
32
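
The attributes above are related: `nbytes` equals `size * itemsize`, so choosing a smaller `dtype` shrinks the array in memory. A minimal sketch of that relationship (the names `f64` and `f32` are illustrative, not part of the cell above):

```python
import numpy as np

f64 = np.zeros((2, 2))                      # float64 by default: 8 bytes per element
f32 = np.zeros((2, 2), dtype=np.float32)    # same shape, 4 bytes per element

print(f64.size * f64.itemsize == f64.nbytes)   # True
print(f64.nbytes, f32.nbytes)                  # 32 16
```
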


### Memory Layout

Memory layout describes how array data is physically arranged in computer memory.

In [3]:
x = np.array([[1,2,3], [4,5,6]], order='C')
print(x.flags.c_contiguous)     # Is it C-contiguous? True


True


In [4]:
y = np.array([[1,2,3], [4,5,6]], order='F')
print(y.flags.f_contiguous)     # Is it Fortran-contiguous? True

True


**C order** means rows are stored one after another (row-major). **Fortran order** means columns are stored one after another (column-major). This affects how data is laid out in memory and can impact performance for some operations.
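
To make the difference concrete, the sketch below builds the same 2x3 values in both layouts and compares their strides and the order in which elements sit in memory. The names `c_arr` and `f_arr` are illustrative, and `int32` is chosen only so the byte counts are the same on every platform.

```python
import numpy as np

c_arr = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32, order='C')
f_arr = np.asfortranarray(c_arr)   # same values, column-major layout

print(c_arr.strides)   # (12, 4): next row is 12 bytes away, next column 4
print(f_arr.strides)   # (4, 8):  next row is 4 bytes away, next column 8

# order='K' reads the elements in the order they occur in memory
print(c_arr.ravel(order='K'))   # [1 2 3 4 5 6]
print(f_arr.ravel(order='K'))   # [1 4 2 5 3 6]
```

Both arrays compare equal element by element; only the physical ordering differs, which is why looping along the contiguous axis tends to be more cache-friendly.
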

### Basic Operations

In [5]:
# Element-wise operations
a = np.array([1,2,3])
b = np.array([4,5,6])

print(a+b)                  # addition [5, 7, 9]
print(a * b)                # element-wise multiplication [4, 10, 18]
print(np.dot(a,b))          # dot product/sum(a*b) : 32

[5 7 9]
[ 4 10 18]
32


In [6]:
# Matrix operations
A = np.array([[1,2], [3,4]])
B = np.array([[5,6], [7,8]])

print(A @ B)                # matrix multiplication
print("\n")
print(np.matmul(A,B))       # same as above

[[19 22]
 [43 50]]


[[19 22]
 [43 50]]
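
A common source of confusion is that `*` on 2D arrays is still element-wise, not matrix multiplication. A short sketch contrasting the two on the same matrices (the result names are illustrative):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise = A * B    # multiplies matching entries
matrix_prod = A @ B    # rows of A dotted with columns of B

print(elementwise)     # [[ 5 12]
                       #  [21 32]]
print(matrix_prod)     # [[19 22]
                       #  [43 50]]
```
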


#### Vectorized Operations

In [7]:
a = np.array([1,2,3])
print(a + 5)            # adds 5 to each element in the array

[6 7 8]


Vectorized operations in NumPy are fast because they are implemented in compiled C code under the hood, allowing operations on entire arrays without explicit Python loops. This leads to significant speedups for large data.
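
One way to see the difference is to time a Python-level loop against the equivalent vectorized expression. This is a rough sketch, not a benchmark: the array size is arbitrary and the measured times depend on the machine, so no specific speedup is claimed.

```python
import time
import numpy as np

values = np.arange(200_000, dtype=np.float64)

start = time.perf_counter()
loop_result = [v + 5 for v in values]      # one Python-level operation per element
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
vec_result = values + 5                    # a single call into compiled code
vec_seconds = time.perf_counter() - start

print(np.array_equal(loop_result, vec_result))   # True: same values either way
print(f"loop: {loop_seconds:.4f}s  vectorized: {vec_seconds:.4f}s")
```
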

In [8]:
batch = np.random.randn(32, 10, 10)     # batch of 32 10x10 matrices
result = np.sum(batch, axis=0)          # sum across batch dimension

print(result.shape)                     # (10, 10): the batch axis is summed away

(10, 10)
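
The `axis` argument names the dimension that gets collapsed. A small sketch with deterministic values (a 2x3x4 array instead of random data, chosen here so the results are reproducible) shows how the output shape changes:

```python
import numpy as np

small_batch = np.arange(24).reshape(2, 3, 4)   # 2 matrices of shape 3x4

across_batch = np.sum(small_batch, axis=0)          # collapse the batch axis
per_matrix = np.sum(small_batch, axis=(1, 2))       # one total per matrix
kept = np.sum(small_batch, axis=0, keepdims=True)   # keep the axis with length 1

print(across_batch.shape)   # (3, 4)
print(per_matrix)           # [ 66 210]
print(kept.shape)           # (1, 3, 4)
```
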

#### Memory Views and Slicing


- A **view** is an array that refers to the same memory as another array instead of owning its own data. Changes made through the view appear in the original, and vice versa.

For example:

```python
my_array = np.arange(10)
print(my_array)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Take a portion of the array from index 2 to index 5 (not including 5)
slice_of_array = my_array[2:5]
print(slice_of_array)  # [2, 3, 4]
```

This creates a view: `slice_of_array` looks at the same data as `my_array`, just a smaller portion of it.

In [9]:
# Slicing creates views, not copies

a = np.arange(10)           # an array 0-9 inclusive. [0 1 2 3 4 5 6 7 8 9]
b = a[2:5]                  # [2 3 4]
b[0] = 99                   # [99 3 4]
print(a)                    # [ 0  1 99  3  4  5  6  7  8  9]


# Explicit copy

c = a[2:5].copy()
c[0] = 42                   # This doesn't affect a
print(a)                    # [0, 1, 99, 3, 4, 5, 6, 7, 8, 9]

[ 0  1 99  3  4  5  6  7  8  9]
[ 0  1 99  3  4  5  6  7  8  9]


In [10]:
# EXAMPLE OF VIEWS

# Original array
original = np.array([10, 20, 30, 40, 50])

# Create two different views of the same array
view1 = original[1:4]      # [20, 30, 40]
view2 = original[::2]      # [10, 30, 50]

# Modify through the first view
view1[0] = 999             # Changes original[1] to 999

print("After modifying through view1:")
print("original:", original)  # [10, 999, 30, 40, 50]
print("view1:", view1)        # [999, 30, 40]
print("view2:", view2)        # [10, 30, 50]  (view2[1] not affected as it's original[2])

# Modify through the original
original[2] = 888

print("\nAfter modifying through original:")
print("original:", original)  # [10, 999, 888, 40, 50]
print("view1:", view1)        # [999, 888, 40]
print("view2:", view2)        # [10, 888, 50]  (view2[1] is affected as it's original[2])

After modifying through view1:
original: [ 10 999  30  40  50]
view1: [999  30  40]
view2: [10 30 50]

After modifying through original:
original: [ 10 999 888  40  50]
view1: [999 888  40]
view2: [ 10 888  50]
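
When it is not obvious whether an array is a view or a copy, NumPy can report it directly. The sketch below uses `np.shares_memory` and the `.base` attribute; the variable names are illustrative.

```python
import numpy as np

original = np.array([10, 20, 30, 40, 50])
a_view = original[1:4]           # basic slicing: a view
a_copy = original[1:4].copy()    # explicit copy: owns its data

print(np.shares_memory(original, a_view))   # True
print(np.shares_memory(original, a_copy))   # False
print(a_view.base is original)              # True: the view points back to its source
print(a_copy.base is None)                  # True: a copy has no base
```
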


#### Advanced Indexing

- Advanced indexing occurs when you index an array with a list, an integer array, or a boolean array.
- **Warning:** Reading with advanced indexing returns a copy of the data, so modifying the result will not affect the original array.

For example:

```python
my_array = np.arange(10)
print('my_array:', my_array)  # [0 1 2 3 4 5 6 7 8 9]

# Using a list for advanced indexing
advanced_indexed_array = my_array[[1, 2, 8]]
print('advanced_indexed_array:', advanced_indexed_array)  # [1 2 8]

# Modify the advanced indexed array
advanced_indexed_array[0] = 99
print('After modifying advanced_indexed_array:')
print('my_array:', my_array)  # [0 1 2 3 4 5 6 7 8 9]
print('advanced_indexed_array:', advanced_indexed_array)  # [99 2 8]
```

Even though `advanced_indexed_array` was created from `my_array`, it is a separate copy, and changes to it do not affect the original `my_array`.
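
Two related cases are worth seeing side by side: a boolean mask also produces a copy when read, while assigning *through* an advanced index writes into the original array. A minimal sketch (the names `mask` and `evens` are illustrative):

```python
import numpy as np

my_array = np.arange(10)

mask = my_array % 2 == 0          # boolean array, True where the value is even
evens = my_array[mask]            # a copy: [0 2 4 6 8]
evens[0] = -1                     # changes only the copy
print(my_array[0])                # 0

my_array[[1, 2, 8]] = 99          # assignment through the index writes in place
print(my_array)                   # [ 0 99 99  3  4  5  6  7 99  9]
```
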


### NumPy Internals

#### Strides

In [11]:
# strides
x = np.array([[1,2,3], 
              [4,5,6]], 
              dtype=np.int32)

print(x.strides)        # bytes to step in each dimension

(12, 4)


The output is (12, 4).

Each element’s size in memory = number of bytes needed to store its value. <br>
`dtype=np.int32` means each element is a **32-bit** integer → **4 bytes** <br>
(32 bits ÷ 8 bits per byte = 4 bytes)

- First number (12) → bytes to move **to the next row** (axis 0)
    - each row has 3 elements x 4 bytes = 12 bytes
- Second number (4) → bytes to move **to the next column** (axis 1)
    - each step within a row moves by 1 element → 4 bytes


##### Intuition:

Think of the array as **flattened memory**:

`[1, 2, 3, 4, 5, 6]  ← all stored in a contiguous memory block`

- from `1 → 2` → +4 bytes (column move)
- from `1 → 4` → +12 bytes (row move)
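
The same arithmetic generalizes: the byte offset of element `[i, j]` is `i * strides[0] + j * strides[1]`. The sketch below checks this against the raw buffer; `byte_offset` is a hypothetical helper written for this example, not a NumPy function.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]], dtype=np.int32)

def byte_offset(arr, i, j):
    """Hypothetical helper: bytes from the start of the buffer to arr[i, j]."""
    return i * arr.strides[0] + j * arr.strides[1]

print(byte_offset(x, 0, 1))   # 4  (one column step)
print(byte_offset(x, 1, 0))   # 12 (one row step)
print(byte_offset(x, 1, 2))   # 20 (one row step + two column steps)

# Read one int32 straight from the raw bytes at that offset
raw = x.tobytes()
value = np.frombuffer(raw, dtype=np.int32, count=1, offset=byte_offset(x, 1, 2))[0]
print(value)                  # 6, the same as x[1, 2]

# Transposing swaps the strides without moving any data
print(x.T.strides)            # (4, 12)
```
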


##### Using `numpy.lib.stride_tricks`

In [None]:
from numpy.lib.stride_tricks import as_strided

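# Create a sliding window view (important pattern for convolutions)
# Strides are given in bytes, so read the element size from the array
# instead of hard-coding 8 (the default integer dtype is platform-dependent)
# Caution: as_strided does no bounds checking; a wrong shape or stride
# can read memory outside the array

a = np.arange(10)
step = a.strides[0]         # bytes between consecutive elements
windows = as_strided(a, shape=(6, 3), strides=(step, step))
print(windows)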

# [[0 1 2]
#  [1 2 3]
#  [2 3 4]
#  [3 4 5]
#  [4 5 6]
#  [5 6 7]]

[[0 1 2]
 [1 2 3]
 [2 3 4]
 [3 4 5]
 [4 5 6]
 [5 6 7]]


Each row is one "window" into our original array.
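
Because `as_strided` performs no bounds checking, a safer route to the same pattern is `sliding_window_view`, assuming NumPy 1.20 or newer is available. A brief sketch; note that it returns every full window (8 here), not just the first 6:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(10)
safe_windows = sliding_window_view(a, window_shape=3)

print(safe_windows.shape)             # (8, 3): every full window of length 3
print(safe_windows[:2])               # [[0 1 2]
                                      #  [1 2 3]]
print(safe_windows.flags.writeable)   # False: read-only by default
print(np.shares_memory(a, safe_windows))  # True: still a view, no data copied
```

The read-only default guards against accidentally writing to overlapping windows, where one write would show up in several rows at once.
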