
***NumPy*** is the foundation for numerical computing in Python, so building mastery here pays off in data science, ML, scientific computing, and optimization tasks.

***Difference between Python lists and NumPy arrays:***
**1. Data type:**
Python List → can store elements of different types (e.g., int, float, string in the same list).

NumPy Array → stores elements of the same type (all ints, or all floats).


In [None]:
import numpy as np

py_list = [1, "hello", 3.5]     # mixed types allowed
np_array = np.array([1, 2, 3])  # all elements share one type (here: integers)
print("Python List:", py_list)
print("NumPy Array:", np_array)


Python List: [1, 'hello', 3.5]
NumPy Array: [1 2 3]
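
To see the "same type" rule in action, the sketch below passes mixed inputs to np.array() and inspects the resulting dtype. The variable names are illustrative and not part of the original notebook. NumPy does not reject mixed inputs; it converts every element to one common type.

```python
import numpy as np

mixed_numbers = np.array([1, 2.5])      # int + float -> all floats
mixed_text = np.array([1, "hello"])     # int + str   -> all strings

print(mixed_numbers, mixed_numbers.dtype)   # dtype is float64
print(mixed_text, mixed_text.dtype.kind)    # kind 'U' means Unicode string
```

This silent conversion is worth keeping in mind: a single string in numeric data turns the whole array into text.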


**🔹 2. Memory Efficiency**

**List** → each element is a full Python object → uses more memory.

**NumPy Array** → stores data in a contiguous block of memory, making it much more compact.

In [None]:
import sys

lst = [1,2,3,4,5]
arr = np.array([1,2,3,4,5])

print("List size:", sys.getsizeof(lst))
print("NumPy array size:", arr.nbytes)


List size: 104
NumPy array size: 40
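
Note that sys.getsizeof(lst) counts only the list's own table of pointers, not the integer objects it refers to. A rough sketch of a fuller comparison follows, assuming CPython. Exact byte counts vary by platform and Python version, and small integers are shared objects, so treat the list total as an estimate rather than a precise figure.

```python
import sys
import numpy as np

lst = [1, 2, 3, 4, 5]
arr = np.array([1, 2, 3, 4, 5])

# List container plus each referenced int object (estimate)
list_total = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

print("List + elements:", list_total)
print("Array data buffer:", arr.nbytes)
print("Array object incl. header:", sys.getsizeof(arr))
```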


**3. Speed**

**List** → slower for numerical operations (loops in Python).

**NumPy Array** → very fast because it uses C-optimized vectorized operations.

In [None]:
lst = [1,2,3,4,5]
arr = np.array([1,2,3,4,5])

# Multiply each element by 2
print([x*2 for x in lst])   # Python list → loop
print(arr * 2)              # NumPy array → vectorized, super fast


[2, 4, 6, 8, 10]
[ 2  4  6  8 10]


**4. Functionality**

**List** → general-purpose container.

**NumPy Array** → specialized for mathematical, statistical, and matrix operations.

**5. Operations:**

**List** → supports basic operations (append, insert, remove, slicing).

**NumPy Array** → supports mathematical & matrix operations (addition, multiplication, dot product, transpose, etc.).


In [None]:
import numpy as np

arr1 = np.array([1,2,3])
arr2 = np.array([4,5,6])

print(arr1 + arr2)   # element-wise addition
print(arr1 * arr2)   # element-wise multiplication
print(np.dot(arr1, arr2))   # dot product

# Transpose (2D case to see effect)
arr1_2d = np.array([[1,2,3]])
print(arr1_2d.T)     # transpose


[5 7 9]
[ 4 10 18]
32
[[1]
 [2]
 [3]]


**Creating arrays:**


In [None]:
import numpy as np
arr = np.array([1,2,3])
print(arr)


[1 2 3]


**np.zeros((3,3)), np.ones((2,2))**

**np.zeros((m,n))** → matrix filled with 0s.

**np.ones((m,n))** → matrix filled with 1s.

In [None]:
print(np.zeros((3,3)))
print(np.ones((2,2)))


[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
[[1. 1.]
 [1. 1.]]


**np.arange():**

Works like Python’s range(), but returns a NumPy array; the stop value is excluded.

In [None]:
print(np.arange(5))        # 0 to 4
print(np.arange(2,10,2))   # start=2, stop=10, step=2


[0 1 2 3 4]
[2 4 6 8]


**np.eye()** → Identity matrix

Square matrix with 1s on the diagonal and 0s elsewhere.

In [None]:
print(np.eye(3))


[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


**np.diag()**

Creates a diagonal matrix from a list.

Or extracts the diagonal of a matrix.

In [None]:
print(np.diag([1,2,3]))   # create diagonal matrix
print(np.diag(np.array([[1,2],[3,4]])))  # extract diagonal




[[1 0 0]
 [0 2 0]
 [0 0 3]]
[1 4]


**np.linspace()**

Generates num evenly spaced numbers between start and stop (stop is included by default).

Syntax: np.linspace(start, stop, num)

In [None]:
print(np.linspace(0,5,2))

[0. 5.]


**Array Attributes**:

***.shape***

Tells you the dimensions of the array (rows, columns, etc.)

In [None]:
import numpy as np
arr=np.array([[1,2,3],[4,5,6]])
print(arr.shape)

(2, 3)


***.ndim***

Gives the number of dimensions.

In [None]:
print(arr.ndim)    # 2 (2D array)


2


***.dtype***

Shows the data type of elements.

In [None]:
print(arr.dtype)   # int64 (depends on your system)


int64


***.size***

Total number of elements.

In [None]:
print(arr.size)

6


***.itemsize***

Size in bytes of each element.

In [None]:
print(arr.itemsize)   # e.g. 8 bytes for int64


8


***Reshaping Arrays:***

***.reshape()***
Changes the shape of an array without changing data.

In [None]:
arr=np.arange(6)
print(arr.reshape(2,3))


[[0 1 2]
 [3 4 5]]
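
A short sketch of two reshape details not shown above: passing -1 lets NumPy infer one dimension from the element count, and a shape whose total size does not match raises an error. The shapes used here are chosen only for illustration.

```python
import numpy as np

arr = np.arange(6)

print(arr.reshape(3, -1).shape)   # -1 is inferred as 2
print(arr.reshape(-1, 6).shape)   # -1 is inferred as 1

try:
    arr.reshape(4, 2)             # 6 elements cannot fill 4 x 2 = 8 slots
except ValueError as err:
    print("Reshape failed:", err)
```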


***.ravel()***

Returns a flattened (1D) array. Whenever possible it is a view of the original (so modifying it changes the original); otherwise NumPy returns a copy.

In [None]:
arr2d=np.array([[1,2,3],[4,5,6]])
flat=arr2d.ravel()
print(flat)

[1 2 3 4 5 6]


***.flatten()***

Also flattens the array, but returns a copy (not a view).

In [None]:
flat_copy=arr2d.flatten()
print(flat_copy)

[1 2 3 4 5 6]


**Difference between .ravel() and .flatten()**

**.ravel()** → view when possible (changes in the flat array affect the original)

**.flatten()** → copy (independent of original)

In [None]:
arr = np.array([[1,2],[3,4]])
r = arr.ravel()
f = arr.flatten()

r[0] = 99
print(arr)   # [[99  2]  [ 3  4]] → original changed

f[0] = 77
print(arr)   # [[99  2]  [ 3  4]] → original unchanged


[[99  2]
 [ 3  4]]
[[99  2]
 [ 3  4]]


**Notice:**
.shape, .ndim, .dtype, .size, .itemsize → info about array

.reshape() → change shape

.ravel(), .flatten() → convert to 1D

***Indexing & Slicing***:

🔹 1. Indexing
(a) 1D Indexing

In [None]:
import numpy as np
arr=np.array([10,20,30,40,50])
print(arr[0])
print(arr[3])

10
40


***(b) 2D Indexing***:

In [None]:
arr2d=np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12]])
print(arr2d[0])


[1 2 3]


2. Slicing
(a) 1D Slicing:

In [None]:
print(arr[1:4])


[20 30 40]


**(b) 2D Slicing:**

In [None]:
print(arr2d[0:2, 1:3])


[[2 3]
 [5 6]]


**3.Boolean Indexing**

**Select elements that satisfy a condition:**

In [None]:
print(arr[arr > 25])


[30 40 50]


**4. Fancy Indexing**

**Select multiple specific indices at once.**

In [None]:
print(arr[[0, 2, 4]])



[10 30 50]


***Works with 2D too***

In [None]:
print(arr2d[[0,2],[1,2]])

[2 9]


**2. Core Operations (Intermediate)**

**1. Element-wise Operations**

**Operations are applied to each element**.

In [None]:
import numpy as np

arr1 = np.array([1, 2, 3, 4])
arr2 = np.array([5, 6, 7, 8])

print(arr1 + arr2)
print(arr1 - arr2)
print(arr1 * arr2)
print(arr2 / arr1)
print(arr1 ** 2)


[ 6  8 10 12]
[-4 -4 -4 -4]
[ 5 12 21 32]
[5.         3.         2.33333333 2.        ]
[ 1  4  9 16]


**2. Universal Functions (ufuncs)**

**Pre-built fast functions applied element-wise.**

In [None]:
arr=np.array([1,2,3,4,5])
print(np.sqrt(arr))
print(np.exp(arr))
print(np.log(arr))

[1.         1.41421356 1.73205081 2.         2.23606798]
[  2.71828183   7.3890561   20.08553692  54.59815003 148.4131591 ]
[0.         0.69314718 1.09861229 1.38629436 1.60943791]


**3. Aggregations**

***Work on entire array (or along axis in 2D).***

In [None]:
arr = np.array([1, 2, 3, 4, 5])

print(np.sum(arr))
print(np.mean(arr))
print(np.std(arr))
print(np.var(arr))


15
3.0
1.4142135623730951
2.0


***4. Min / Max Functions***

In [None]:
arr = np.array([10, 20, 5, 40, 15])

print(np.min(arr))
print(np.max(arr))
print(np.argmin(arr))
print(np.argmax(arr))


5
40
2
3


***What is Broadcasting?***

Broadcasting means NumPy automatically expands arrays of different shapes so arithmetic operations can work without making explicit copies.
It follows broadcasting rules:

If arrays have different number of dimensions, the smaller one is padded with 1s on the left.

If the shape dimensions are not equal, one of them must be 1 → then it can be stretched (broadcasted).

If dimensions are not equal and neither is 1 → ❌ error.

***Example 1: Scalar + Matrix***

A scalar is "broadcasted" to every element.

In [None]:
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])

print(arr + 10)


[[11 12 13]
 [14 15 16]]


***Example 2: Vector + Matrix (Row-wise broadcasting)***

In [None]:
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

vec = np.array([10, 20, 30])

print(arr + vec)


[[11 22 33]
 [14 25 36]]


***Example 3: Column Vector + Matrix (Column-wise broadcasting)***

In [None]:
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

col_vec = np.array([[10],
                    [20]])

print(arr + col_vec)


[[11 12 13]
 [24 25 26]]


***Example 4: Mismatched shapes (Error case)***

In [None]:
arr = np.array([[1, 2, 3],
                [4, 5, 6]])

bad_vec = np.array([10, 20])  # shape mismatch

print(arr + bad_vec)  # ❌ Error


ValueError: operands could not be broadcast together with shapes (2,3) (2,) 

***1. np.random.rand()***

Generates random numbers between 0 and 1 (uniform distribution).
Takes dimensions as arguments

In [None]:
import numpy as np

print(np.random.rand())       # single random number
print(np.random.rand(2, 3))   # 2x3 matrix of random numbers


0.6871874132968593
[[0.40163793 0.71563888 0.18524886]
 [0.76398553 0.83444703 0.6284558 ]]


***2. np.random.randn()***

Generates random numbers from a standard normal distribution (mean=0, std=1).
Useful for simulations/statistics.

In [None]:
print(np.random.randn())
print(np.random.randn(2, 3))


0.1461009480964727
[[-0.58275786 -0.8888242  -1.43270608]
 [ 1.727902    0.23562496 -1.30785388]]


***3. np.random.randint()***

Generates random integers within a given range.

In [None]:
print(np.random.randint(1, 10))        # one integer between 1 and 9
print(np.random.randint(1, 10, 5))     # 5 random integers
print(np.random.randint(10, 50, (2,3)))  # 2x3 matrix of random integers


5
[2 7 7 7 9]
[[18 48 23]
 [41 21 20]]


***4. Seeding with np.random.seed()***

👉 Random numbers are generated differently each run.
👉 If you want reproducibility, set a seed.

***Without a seed:***

***You’ll get different numbers every time (because they are random).***

In [None]:
import numpy as np
print(np.random.rand(3))


[0.8799965  0.25740443 0.07914931]


**With a seed:**
**You always get the same result.**

In [None]:
np.random.seed(42)
print(np.random.rand(3))


[0.37454012 0.95071431 0.73199394]


In [None]:
np.random.seed(42)

print(np.random.rand(3))   # always same output if seed=42


[0.37454012 0.95071431 0.73199394]
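
Newer NumPy code often uses a Generator object from np.random.default_rng() instead of the global np.random.seed(). This is not part of the original notebook; the sketch below only shows that the same reproducibility idea carries over. The generator names are illustrative.

```python
import numpy as np

rng_a = np.random.default_rng(42)   # independent generator seeded with 42
rng_b = np.random.default_rng(42)   # second generator, same seed

sample_a = rng_a.random(3)          # uniform floats in [0, 1)
sample_b = rng_b.random(3)

print(sample_a)
print("Same seed, same numbers:", np.array_equal(sample_a, sample_b))
print(rng_a.integers(1, 10, size=5))   # integers from 1 to 9
```

A generator keeps its own state, so seeding one part of a program does not affect random numbers drawn elsewhere.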


***Linear Algebra Basics***

***1. Dot Product (np.dot())***

**Dot product of two vectors:**

In [None]:
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

print("Dot Product:", np.dot(a, b))


Dot Product: 32


***2. Matrix Multiplication (@ or np.matmul())***

In [None]:
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print("Matrix Multiplication (using @):\n", A @ B)
print("Matrix Multiplication (using matmul):\n", np.matmul(A, B))


Matrix Multiplication (using @):
 [[19 22]
 [43 50]]
Matrix Multiplication (using matmul):
 [[19 22]
 [43 50]]


Calculation:

Row1·Col1 = 1*5 + 2*7 = 19

Row1·Col2 = 1*6 + 2*8 = 22

Row2·Col1 = 3*5 + 4*7 = 43

Row2·Col2 = 3*6 + 4*8 = 50

***3. Transpose (.T):***

In [None]:
print("Original A:\n", A)
print("Transpose of A:\n", A.T)


Original A:
 [[1 2]
 [3 4]]
Transpose of A:
 [[1 3]
 [2 4]]


***4. Determinant (np.linalg.det()):***
The determinant is a single number that can be calculated from a square matrix (2×2, 3×3, …).
It tells us things like whether the matrix is invertible or not.

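1. Determinant of a 2×2 matrix

Formula: for A = [[a, b], [c, d]], det(A) = a∗d − b∗c. For the matrix below:
det(A)=(1∗4)−(2∗3)=4−6=−2

NumPy computes the determinant numerically, so the printed value can differ from −2 by a tiny floating-point error.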

In [None]:
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

det_A = np.linalg.det(A)
print("Determinant of A:", det_A)


Determinant of A: -2.0000000000000004


***5. Inverse (np.linalg.inv())***

In [None]:
inv_A = np.linalg.inv(A)
print("Inverse of A:\n", inv_A)


Inverse of A:
 [[-2.   1. ]
 [ 1.5 -0.5]]


***6. Eigenvalues & Eigenvectors (np.linalg.eig()):***
What are Eigenvalues & Eigenvectors?

For a square matrix $A$, an eigenvector $v$ and an eigenvalue $\lambda$ satisfy:

$$Av = \lambda v$$

Meaning: when the matrix $A$ acts on the vector $v$, the result is just a scaled version of $v$ (not rotated).

$\lambda$ = eigenvalue (scalar, tells the stretching factor)

$v$ = eigenvector (direction that doesn’t change, only stretched)

In [None]:
import numpy as np

A = np.array([[4, -2],
              [1,  1]])

eigenvalues, eigenvectors = np.linalg.eig(A)

print("Matrix A:\n", A)
print("\nEigenvalues:\n", eigenvalues)
print("\nEigenvectors:\n", eigenvectors)


Matrix A:
 [[ 4 -2]
 [ 1  1]]

Eigenvalues:
 [3. 2.]

Eigenvectors:
 [[0.89442719 0.70710678]
 [0.4472136  0.70710678]]
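
The defining relation can be checked numerically. In the sketch below, each column of the array returned by np.linalg.eig() is treated as one eigenvector, paired with the eigenvalue at the same index. The order in which NumPy returns the pairs is not something to rely on.

```python
import numpy as np

A = np.array([[4, -2],
              [1,  1]])

eigenvalues, eigenvectors = np.linalg.eig(A)

# Each COLUMN of `eigenvectors` pairs with the eigenvalue at the same index
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    lam = eigenvalues[i]
    print(f"lambda = {lam:.1f}:", np.allclose(A @ v, lam * v))
```

np.allclose() is used instead of == because the results are floating-point values.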


***1. Stacking Arrays***

Stacking means joining arrays together.

***Vertical Stacking (np.vstack())***

Stacks arrays row-wise (one on top of the other)

In [None]:
import numpy as np

a = np.array([1, 2, 3])   # shape (3,)
b = np.array([4, 5, 6])   # shape (3,)

print("Vertical Stack:\n", np.vstack([a, b]))


Vertical Stack:
 [[1 2 3]
 [4 5 6]]


***Horizontal Stacking (np.hstack())***

Stacks arrays column-wise (side by side).

In [None]:
print("Horizontal Stack:\n", np.hstack([a, b]))


Horizontal Stack:
 [1 2 3 4 5 6]


***2. Splitting Arrays***

Splitting means dividing an array into smaller sub-arrays.

***Equal Splits (np.split())***

In [None]:
x=np.array([10,20,30,40,50,60])
print(np.split(x,3))

[array([10, 20]), array([30, 40]), array([50, 60])]


***Unequal Splits (np.array_split()):***

 Unlike np.split(), it allows uneven splitting.

In [None]:
import numpy as np
x=np.array([10,20,30,40,50,60])
print(np.array_split(x,4))

[array([10, 20]), array([30, 40]), array([50]), array([60])]


In [None]:
#code 2:
import numpy as np
x=np.array([10,20,30,40,50,60])
print(np.array_split(x,3))

[array([10, 20]), array([30, 40]), array([50, 60])]


**Summary**

**np.vstack() → stack vertically (row-wise)**

**np.hstack() → stack horizontally (column-wise)**

**np.split() → splits equally (error if not possible)**

**np.array_split() → allows uneven splits**
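
The "error if not possible" case can be seen directly. The sketch below also shows a second way to call np.split(): with a list of split positions instead of a number of parts. The positions are picked only for illustration.

```python
import numpy as np

x = np.array([10, 20, 30, 40, 50, 60])

try:
    np.split(x, 4)                 # 6 elements cannot be divided into 4 equal parts
except ValueError as err:
    print("np.split failed:", err)

# Passing a list of indices instead of a count splits at those positions
parts = np.split(x, [1, 4])
print(parts)                       # pieces x[:1], x[1:4], x[4:]
```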

***1. Splitting a 2D Array Row-wise (by rows)***

In [None]:
import numpy as np

arr = np.array([[1, 2, 3, 4],
                [5, 6, 7, 8],
                [9,10,11,12],
                [13,14,15,16]])

print("Original Array:\n", arr)

# Split into 2 equal parts row-wise
rows = np.split(arr, 2, axis=0)
print("\nRow-wise split into 2 parts:")
for r in rows:
    print(r)


Original Array:
 [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]
 [13 14 15 16]]

Row-wise split into 2 parts:
[[1 2 3 4]
 [5 6 7 8]]
[[ 9 10 11 12]
 [13 14 15 16]]


**2. Splitting a 2D Array Column-wise (by columns)**

In [None]:
# Split into 2 equal parts column-wise
cols = np.split(arr, 2, axis=1)
print("\nColumn-wise split into 2 parts:")
for c in cols:
    print(c)



Column-wise split into 2 parts:
[[ 1  2]
 [ 5  6]
 [ 9 10]
 [13 14]]
[[ 3  4]
 [ 7  8]
 [11 12]
 [15 16]]


***3. Unequal Splits with np.array_split()***

If equal split is not possible, use np.array_split()

In [None]:
# Split row-wise into 3 uneven parts
rows_uneven = np.array_split(arr, 3, axis=0)
print("\nRow-wise split into 3 uneven parts:")
for r in rows_uneven:
    print(r)



Row-wise split into 3 uneven parts:
[[1 2 3 4]
 [5 6 7 8]]
[[ 9 10 11 12]]
[[13 14 15 16]]


Summary

axis=0 → row-wise split

axis=1 → column-wise split

np.split() → equal splits only

np.array_split() → allows uneven splits

***Sorting & Searching:***

***1. Sorting with np.sort()***

👉 Returns a sorted copy of the array.

In [None]:
import numpy as np
arr=np.array([5,2,9,1,7])
print(arr)
print("sorted:",np.sort(arr))
print(arr)

[5 2 9 1 7]
sorted: [1 2 5 7 9]
[5 2 9 1 7]


***2. Indices of sorted array with np.argsort()***

👉 Returns indices that would sort the array.

arr[np.argsort(arr)]

Here we use fancy indexing.

For arr = [5, 2, 9, 1, 7], np.argsort(arr) = [3, 1, 0, 4, 2]

So arr[[3, 1, 0, 4, 2]] means:

arr[3] = 1

arr[1] = 2

arr[0] = 5

arr[4] = 7

arr[2] = 9

In [None]:
print("Argsort:", np.argsort(arr))
print("Sorted using indices:", arr[np.argsort(arr)])


Argsort: [3 1 0 4 2]
Sorted using indices: [1 2 5 7 9]


**3. Searching with np.where()**

👉 Returns indices where condition is True.

In [None]:
arr = np.array([10, 20, 30, 40, 50])
print("Where > 25:", np.where(arr > 25))
print("Values > 25:", arr[np.where(arr > 25)])


Where > 25: (array([2, 3, 4]),)
Values > 25: [30 40 50]


**4. Finding non-zero elements with np.nonzero()**

👉 Useful for sparse arrays

In [None]:
arr = np.array([0, 2, 0, 5, 0, 7])
print("Non-zero indices:", np.nonzero(arr))
print("Non-zero values:", arr[np.nonzero(arr)])


Non-zero indices: (array([1, 3, 5]),)
Non-zero values: [2 5 7]


**5. Unique elements with np.unique()**

👉 Finds unique values (optionally return counts and indices)

In [None]:
arr=np.array([1,2,2,3,3,3,3,4,4])
print(np.unique(arr))
unique,counts=np.unique(arr,return_counts=True)
print(unique)
print(counts)

[1 2 3 4]
[1 2 3 4]
[1 2 4 2]


Summary

np.sort() → sorted array (copy)

np.argsort() → indices of sorted order

np.where() → indices matching condition

np.nonzero() → indices of non-zero elements

np.unique() → distinct values (+ counts if needed)

**Advanced Indexing**

**Index with arrays & conditions**

**Masking & filtering data**

***1. Indexing with Arrays (Fancy Indexing)***

👉 You can pass a list/array of indices to access multiple elements at once.

In [None]:
import numpy as np
arr=np.array([10,20,30,40,50,60])
print(arr)
print(arr[[0,2,4]])

[10 20 30 40 50 60]
[10 30 50]


***2. Boolean Indexing (Masking)***

👉 Create a Boolean mask and filter values.

In [None]:
arr=np.array([15,10,20,25,30,40])
print(arr)
mask=arr>20
print(mask)
print(arr[mask])

[15 10 20 25 30 40]
[False False False  True  True  True]
[25 30 40]


***3. Indexing in 2D Arrays with Arrays***

In [None]:
arr2d=np.array([[1,2,3],[4,5,6],[7,8,9]])
print(arr2d)
rows = [0, 1, 2]
cols = [1, 2, 0]
print(arr2d[rows,cols])


[[1 2 3]
 [4 5 6]
 [7 8 9]]
[2 6 7]


***4. Masking in 2D Arrays***

In [None]:
print("Values > 4:\n", arr2d[arr2d > 4])


Values > 4:
 [5 6 7 8 9]


***5. Combining Conditions***

👉 Use logical operators (&, |, ~) for multiple conditions.

In [None]:
import numpy as np
arr=np.array([10,20,25,30,35,40])
print(arr)
print(arr[(arr>=20) & (arr<=30)])
print(arr[(arr<20) | (arr>40)])
print(arr[~(arr>30)])

[10 20 25 30 35 40]
[20 25 30]
[10]
[10 20 25 30]


***Summary:***

& → AND (both conditions must be True)

| → OR (at least one condition True)

~ → NOT (flips True ↔ False)

***Handling Missing Values:***
Detection and replacement: np.isnan(), np.nan_to_num()

Aggregations with NaN: np.nanmean(), np.nanstd()

**Detect Missing Values → np.isnan()**

👉 Checks if a value is NaN (Not a Number).

In [None]:
import numpy as np
arr=np.array([1,2,np.nan,4,np.nan])
print(arr)
print(np.isnan(arr))

[ 1.  2. nan  4. nan]
[False False  True False  True]


***Replace Missing Values → np.nan_to_num()***

👉 Replaces NaN with 0 (or any chosen value).

In [None]:
print(np.nan_to_num(arr))
print(np.nan_to_num(arr,nan=-1))

[1. 2. 0. 4. 0.]
[ 1.  2. -1.  4. -1.]


***Aggregations that Ignore NaN***

👉 Normal np.mean(), np.std() return NaN if any element is NaN.
👉 Use NaN-safe versions:

np.nanmean() → mean ignoring NaN

np.nanstd() → standard deviation ignoring NaN

In [None]:
print("Normal mean:", np.mean(arr))
print("NaN-safe mean:", np.nanmean(arr))
print("NaN-safe std:", np.nanstd(arr))

Normal mean: nan
NaN-safe mean: 2.3333333333333335
NaN-safe std: 1.247219128924647


***Summary:***

np.isnan(arr) → finds missing values (True where NaN).

np.nan_to_num(arr, nan=value) → replaces NaN with 0 or custom value.

np.nanmean(), np.nanstd(), np.nanmin(), np.nanmax() → safely compute ignoring NaN.
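
np.nanmin() and np.nanmax() are listed in the summary but not demonstrated. A minimal sketch follows, together with an alternative that removes the NaN entries with a Boolean mask before using the ordinary functions.

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, np.nan])

print("nanmin:", np.nanmin(arr))
print("nanmax:", np.nanmax(arr))

# Alternative: drop NaN entries with a Boolean mask, then use normal functions
clean = arr[~np.isnan(arr)]
print("Without NaN:", clean)
print("Mean of clean values:", clean.mean())
```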

 **Broadcasting in Depth**
 **Shape alignment rules**
 **Efficient vectorization instead of loops**

***1. Broadcasting Rules (Shape Alignment)***

Broadcasting works if, comparing the two shapes from right to left, each pair of dimensions is either equal or one of them is 1.

Example cases:
| Shape A | Shape B | Works? | Result shape |
| ------- | ------- | ------ | ------------ |
| (3,)    | (3,)    | ✅ Yes  | (3,)         |
| (3,1)   | (1,3)   | ✅ Yes  | (3,3)        |
| (4,3)   | (3,)    | ✅ Yes  | (4,3)        |
| (2,3)   | (2,1)   | ✅ Yes  | (2,3)        |
| (2,3)   | (4,3)   | ❌ No   | —            |
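
The table rows can be checked without building any arrays. The sketch below assumes NumPy 1.20 or newer, where np.broadcast_shapes() is available; the cases list simply repeats the table.

```python
import numpy as np

cases = [((3,), (3,)),
         ((3, 1), (1, 3)),
         ((4, 3), (3,)),
         ((2, 3), (2, 1)),
         ((2, 3), (4, 3))]

results = []
for shape_a, shape_b in cases:
    try:
        results.append(np.broadcast_shapes(shape_a, shape_b))
    except ValueError:
        results.append(None)       # shapes are incompatible
    print(shape_a, shape_b, "->", results[-1])
```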


***2. Adding a Scalar to an Array***

Scalar is treated as if it has the same shape as the array.

In [None]:
arr=np.array([1,2,3])
print(arr+10)

[11 12 13]


***3. Vector to Matrix (Row-wise Broadcasting)***

In [None]:
mat=np.array([[1,2,3],
              [4,5,6]])
vec=np.array([10,20,30])
print(mat+vec)

[[11 22 33]
 [14 25 36]]


***4. Column-wise Broadcasting (Using Reshape)***

In [None]:
import numpy as np

mat = np.array([[1, 2, 3],
                [4, 5, 6]])
vec_col=np.array([10,20]).reshape(2,1)
print(vec_col)
print(mat+vec_col)

[[10]
 [20]]
[[11 12 13]
 [24 25 26]]


***5. Efficient Vectorization (Replacing Loops)***

Instead of looping element by element, NumPy broadcasts automatically.

**With Loop (slow ❌):**

In [None]:
result = []
for i in [1,2,3]:
    result.append(i * 10)
print("With loop:", result)


With loop: [10, 20, 30]


**With Broadcasting (fast ✅):**

In [None]:
arr = np.array([1,2,3])
print("With broadcasting:", arr * 10)


With broadcasting: [10 20 30]


NumPy is optimized in C, so broadcasting is far faster than Python loops.
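
One way to check the speed claim on your own machine is a small timing sketch like the one below. The array size and repeat count are arbitrary choices, and the measured times depend on hardware and NumPy version, so no specific numbers are claimed here.

```python
import timeit
import numpy as np

n = 100_000
values = list(range(n))
arr = np.arange(n)

loop_time = timeit.timeit(lambda: [v * 10 for v in values], number=20)
vec_time = timeit.timeit(lambda: arr * 10, number=20)

print(f"Python loop: {loop_time:.4f} s")
print(f"NumPy:       {vec_time:.4f} s")
```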

***Summary***

Broadcasting lets arrays with different shapes interact.

Rules: dimensions must match or be 1.

Works for scalar, row, column operations.

Avoids slow Python loops → vectorized operations.

***Memory & Performance***

**Views vs copies (arr.view(), arr.copy())**

**In-place operations for performance**

**Strides: np.lib.stride_tricks**

**1. Views vs Copies**

👉 NumPy tries to avoid copying data for performance reasons.

.view() → creates a view (no new data, just a new "window" into the same memory).

.copy() → creates a deep copy (new memory, changes won’t affect original).

In [None]:
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Create a view
view_arr = arr.view()
view_arr[0] = 99   # modifies original too!

# Create a copy
copy_arr = arr.copy()
copy_arr[1] = 100  # does NOT affect original

print("Original:", arr)
print("View:", view_arr)
print("Copy:", copy_arr)


Original: [99  2  3  4  5]
View: [99  2  3  4  5]
Copy: [ 99 100   3   4   5]
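
Views also appear without calling .view() explicitly. The sketch below uses np.shares_memory() to check two common cases: basic slicing gives a view, while fancy indexing gives a copy. The variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

sliced = arr[1:4]          # basic slicing returns a view
picked = arr[[1, 2, 3]]    # fancy indexing returns a copy

print("Slice shares memory:", np.shares_memory(arr, sliced))
print("Fancy index shares memory:", np.shares_memory(arr, picked))
print("Slice base is arr:", sliced.base is arr)
```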


**2. In-place Operations for Performance**

Instead of creating new arrays, you can update existing arrays in-place.
This saves memory and improves performance.

In [None]:
arr = np.array([1, 2, 3, 4, 5])

# Regular operation (creates a new array in memory)
new_arr = arr * 2
print("New Array:", new_arr)

# In-place operation (modifies arr directly)
arr *= 2
print("In-place Modified Array:", arr)


New Array: [ 2  4  6  8 10]
In-place Modified Array: [ 2  4  6  8 10]
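
Besides operators such as *=, ufuncs accept an out= argument that writes the result into an existing array. The sketch below shows this, plus one caveat: an in-place operation cannot change the array's dtype, so storing a float result in an integer array raises an error.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
same_object = arr                 # second name for the same array

np.multiply(arr, 2, out=arr)      # result is written into arr's own buffer
print(arr)
print("Still the same array:", same_object is arr)

try:
    arr *= 0.5                    # float result cannot be stored in an int array
except TypeError as err:
    print("In-place failed:", type(err).__name__)
```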


***3. Strides (Advanced Memory Layout)***

👉 Strides tell NumPy how many bytes to step in memory when moving along each axis.
This is what allows slicing without copying data.

In [None]:
arr = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.int32)

print("Array:\n", arr)
print("Shape:", arr.shape)
print("Strides:", arr.strides)


Array:
 [[1 2 3]
 [4 5 6]]
Shape: (2, 3)
Strides: (12, 4)


Explanation:

dtype=np.int32 → each element takes 4 bytes.

Strides (12,4) means:

Move 12 bytes to go to the next row (3 ints × 4 bytes).

Move 4 bytes to go to the next column
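
Strides also explain why a transpose or a stepped slice needs no copy: NumPy keeps the same memory and only changes the step sizes. A small sketch using the same int32 array:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]], dtype=np.int32)

transposed = arr.T
print("Original strides:", arr.strides)           # (12, 4)
print("Transposed strides:", transposed.strides)  # (4, 12)
print("Same memory:", np.shares_memory(arr, transposed))

every_other = arr[:, ::2]     # take columns 0 and 2
print("Strides of arr[:, ::2]:", every_other.strides)  # (12, 8)
```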

**4. Using stride_tricks**

np.lib.stride_tricks lets you create special views without copying data.

In [None]:
from numpy.lib.stride_tricks import sliding_window_view
arr = np.arange(10)
print("Original:", arr)
windows = sliding_window_view(arr, window_shape=3)
print("Sliding windows:\n", windows)
print(windows.shape)


Original: [0 1 2 3 4 5 6 7 8 9]
Sliding windows:
 [[0 1 2]
 [1 2 3]
 [2 3 4]
 [3 4 5]
 [4 5 6]
 [5 6 7]
 [6 7 8]
 [7 8 9]]
(8, 3)
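
A typical use of these windows is a moving average. The sketch below averages each window of three values; it also prints the view's read-only flag, which is the default for sliding_window_view() because the windows overlap in memory.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.arange(10)
windows = sliding_window_view(arr, window_shape=3)

moving_avg = windows.mean(axis=1)   # one average per window
print(moving_avg)
print("Windows are read-only:", not windows.flags.writeable)
```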


Summary

view() = no copy, same memory

copy() = independent memory

in-place ops = faster, saves memory

strides = how NumPy walks through memory

stride_tricks = advanced way to make overlapping windows / reshapes without copies

Structured Arrays

Working with heterogeneous data types

Example: employee records with names + ages

1. What are Structured Arrays?

Normally, NumPy arrays are homogeneous (all elements same dtype).
But structured arrays let you store heterogeneous data (like strings, ints, floats in one row).

2. Defining a Structured Array

We define it by specifying a dtype with field names & types.

In [None]:
import numpy as np

# Define structured data type
employee_dtype = np.dtype([('name', 'U10'),   # Unicode string (max 10 chars)
                           ('age', 'i4'),    # 32-bit integer
                           ('salary', 'f8')]) # 64-bit float

# Create structured array
employees = np.array([('Alice', 25, 50000.0),
                      ('Bob',   30, 60000.5),
                      ('Charlie', 28, 55000.75)],
                      dtype=employee_dtype)

print("Employees array:\n", employees)
print("Dtype:", employees.dtype)


Employees array:
 [('Alice', 25, 50000.  ) ('Bob', 30, 60000.5 ) ('Charlie', 28, 55000.75)]
Dtype: [('name', '<U10'), ('age', '<i4'), ('salary', '<f8')]


3. Accessing Fields (Column-wise)

Structured arrays let you access columns by field name.

In [None]:
print("Names:", employees['name'])
print("Ages:", employees['age'])
print("Salaries:", employees['salary'])


Names: ['Alice' 'Bob' 'Charlie']
Ages: [25 30 28]
Salaries: [50000.   60000.5  55000.75]


4. Row-wise Access

Each row is like a "record".

In [None]:
print("First record:", employees[0])
print("Second record name:", employees[1]['name'])
print("Third record salary:", employees[2]['salary'])


First record: ('Alice', 25, 50000.0)
Second record name: Bob
Third record salary: 55000.75


5. Filtering & Conditions

We can filter like with normal arrays:

In [None]:
older=employees[employees['age']>26]
print(older)
high_salary=employees[employees['salary']>55000]
print(high_salary)

[('Bob', 30, 60000.5 ) ('Charlie', 28, 55000.75)]
[('Bob', 30, 60000.5 ) ('Charlie', 28, 55000.75)]


6. Sorting Structured Arrays

We can sort by fields:

In [None]:

sorted_by_age = np.sort(employees, order='age')
print("Sorted by Age:\n", sorted_by_age)

sorted_by_salary = np.sort(employees, order='salary')
print("Sorted by Salary:\n", sorted_by_salary)


7. Updating Values

You can directly update fields:

In [None]:
employees[0]['salary'] = 52000
employees[1]['age'] = 31

print("Updated Employees:\n", employees)


Updated Employees:
 [('Alice', 25, 52000.  ) ('Bob', 31, 60000.5 ) ('Charlie', 28, 55000.75)]


***Summary***

Structured arrays = tables inside NumPy

Fields can have different data types

Access columns by name or rows like records

Support filtering, sorting, updating efficiently

Expert-Level Usage ✅

Vectorization & Optimization

Replace Python loops with NumPy vectorized ops

Use np.vectorize()

***1. Why Vectorization?***

Python loops are slow because every iteration is executed by the interpreter, with type checks and object handling for each element.

NumPy uses C-optimized code, so operations on whole arrays are much faster.

Instead of looping element by element, we apply the operation on the entire array at once.

**2. Replacing Loops with Vectorized Ops**

👉 Example: Square each element in an array

In [1]:
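import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Loop version: one Python-level operation per element
squares_loop = []
for x in arr:
    squares_loop.append(x ** 2)

print("Using loop:", squares_loop)

# Vectorized version: one operation on the whole array
squares_vec = arr ** 2
print("Vectorized:", squares_vec)


Using loop: [np.int64(1), np.int64(4), np.int64(9), np.int64(16), np.int64(25)]
Vectorized: [ 1  4  9 16 25]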


3. Vectorized Mathematical Operations

In [2]:
arr = np.array([1, 2, 3, 4, 5])

print("Original:", arr)

# Element-wise operations
print("Squared:", arr ** 2)
print("Reciprocal:", 1 / arr)
print("Sine:", np.sin(arr))
print("Logarithm:", np.log(arr))


Original: [1 2 3 4 5]
Squared: [ 1  4  9 16 25]
Reciprocal: [1.         0.5        0.33333333 0.25       0.2       ]
Sine: [ 0.84147098  0.90929743  0.14112001 -0.7568025  -0.95892427]
Logarithm: [0.         0.69314718 1.09861229 1.38629436 1.60943791]


***4. Vectorization with Conditions (Masking)***

👉 Example: Replace all negative numbers with 0


In [None]:
arr = np.array([10, -3, 7, -1, 5])

# Loop way
result_loop = []
for x in arr:
    if x < 0:
        result_loop.append(0)
    else:
        result_loop.append(x)

print("Loop result:", result_loop)

# Vectorized way
result_vec = np.where(arr < 0, 0, arr)
print("Vectorized result:", result_vec)

***5. Using np.vectorize()***

np.vectorize() is a wrapper that applies a Python function elementwise to arrays.

⚠️ It does NOT make things faster (it’s just for convenience).

In [None]:
def custom_func(x):
    return x**2 if x % 2 == 0 else -x

arr = np.array([1, 2, 3, 4, 5])

# Without vectorize (loop)
loop_res = [custom_func(x) for x in arr]
print("Loop:", loop_res)

# With np.vectorize
vec_func = np.vectorize(custom_func)
vec_res = vec_func(arr)
print("np.vectorize:", vec_res)
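
Because np.vectorize() still calls the Python function once per element, a rule this simple can usually be written with np.where() instead. The sketch below applies the same rule as custom_func (square even numbers, negate odd ones) without any Python-level function call per element.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Same rule as custom_func: square even numbers, negate odd ones
result = np.where(arr % 2 == 0, arr ** 2, -arr)
print("np.where:", result)
```

Note that np.where() evaluates both branches for the whole array and then picks element by element, which is fine here since neither branch can fail.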


6. Big Example: Distance Calculation

👉 Compute the Euclidean distance of each point from the origin.

In [3]:
import numpy as np

# Generate 5 random points in 2D (small example to understand)
points = np.random.rand(5, 2)
print("Points:\n", points)

# ---------------- Loop version ----------------
dist_loop = []
for i in range(len(points)):
    d = np.sqrt(points[i,0]**2 + points[i,1]**2)
    dist_loop.append(d)
print("\nDistances (Loop):", dist_loop)

# ---------------- Vectorized version ----------------
dist_vec = np.sqrt(np.sum(points**2, axis=1))
print("Distances (Vectorized):", dist_vec)

# ---------------- Check if same ----------------
print("\nAre both methods equal? ->", np.allclose(dist_loop, dist_vec))


Points:
 [[0.46343852 0.15050248]
 [0.50153172 0.24891681]
 [0.52022936 0.15513233]
 [0.01051633 0.84905702]
 [0.78499049 0.43692497]]

Distances (Loop): [np.float64(0.4872640558821775), np.float64(0.5599050332195719), np.float64(0.5428670426941187), np.float64(0.849122143683896), np.float64(0.8983949523843334)]
Distances (Vectorized): [0.48726406 0.55990503 0.54286704 0.84912214 0.89839495]

Are both methods equal? -> True
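
Two possible extensions, sketched under the assumption that the points are stored as rows of an (n, 2) array as above: np.linalg.norm() gives the same distances from the origin in one call, and broadcasting gives all pairwise distances between points without a loop. The seed value is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)      # seed chosen arbitrarily
points = rng.random((5, 2))

# Distance of each point from the origin
dist_origin = np.linalg.norm(points, axis=1)

# Pairwise distances: (5, 1, 2) - (1, 5, 2) broadcasts to (5, 5, 2)
diff = points[:, None, :] - points[None, :, :]
dist_pairwise = np.sqrt((diff ** 2).sum(axis=-1))

print(dist_origin)
print(dist_pairwise.shape)
```

The pairwise version builds an intermediate array of shape (n, n, 2), so memory use grows quadratically with the number of points.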


***Summary***

Vectorization = replace Python loops with NumPy operations.

np.vectorize() = convenience wrapper (not true optimization).

Use ufuncs (np.sin, np.exp, np.log) and masking (np.where) for efficient code.

Huge performance boost for large datasets.