In [1]:
import numpy as np

__Exercise__: Create a small 5×5 array to represent a simple image (maybe a diagonal line of 255s on a background of 0s). Print out the array and verify the positions of high values form the diagonal. This simulates creating a simple image pattern with numpy.

In [2]:
a = np.diag([255] * 5)

In [3]:
a

array([[255,   0,   0,   0,   0],
       [  0, 255,   0,   0,   0],
       [  0,   0, 255,   0,   0],
       [  0,   0,   0, 255,   0],
       [  0,   0,   0,   0, 255]])

In [4]:
# Verify positions of 255

for i in range(a.shape[0]):
    if a[i, i] == 255:
        print(f'255 found at position {i}, {i}')

255 found at position 0, 0
255 found at position 1, 1
255 found at position 2, 2
255 found at position 3, 3
255 found at position 4, 4
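
The loop above only confirms that the diagonal entries are 255; it does not check that every other entry is 0. A minimal sketch of a stricter, vectorized check follows. The name `is_diagonal_image` is a hypothetical helper variable, not part of the exercise.

```python
import numpy as np

a = np.diag([255] * 5)

# Row and column indices of every entry equal to 255
rows, cols = np.nonzero(a == 255)
print(list(zip(rows.tolist(), cols.tolist())))
# [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]

# True only if the 255s sit exactly on the diagonal and all other entries are 0
is_diagonal_image = np.array_equal(a, 255 * np.eye(5, dtype=int))
print(is_diagonal_image)
```

Comparing against `255 * np.eye(5, dtype=int)` checks both conditions in one step.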


---

In [5]:
b = np.zeros((5, 5), dtype=int)
np.fill_diagonal(b, 255)        # modifies b in place and returns None
b

array([[255,   0,   0,   0,   0],
       [  0, 255,   0,   0,   0],
       [  0,   0, 255,   0,   0],
       [  0,   0,   0, 255,   0],
       [  0,   0,   0,   0, 255]])

---

__NumPy arrays__ are _homogeneous_: every element of an array has the same data type (its `dtype`).
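
As a small illustration of what a single shared type means in practice, the sketch below builds arrays from mixed Python values and inspects the resulting `dtype`. The variable names are illustrative only.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])      # int and float together: all stored as float
texty = np.array([1, 'two', 3])    # int and str together: all stored as strings

print(ints.dtype)        # an integer type (exact width is platform-dependent)
print(mixed.dtype)       # float64
print(texty.dtype.kind)  # 'U' (unicode string)
```

NumPy picks one type that can hold every value, so mixing numbers with text silently turns the numbers into strings.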

---

In [6]:
students = np.array([
    [1, 85, 78],
    [2, 90, 88],
    [3, 75, 85]
], dtype=np.int32)

students

array([[ 1, 85, 78],
       [ 2, 90, 88],
       [ 3, 75, 85]])

In [7]:
students[:, 1].mean()   # mean of second column (Math scores)

83.33333333333333

---

__Real-World Example (Loading CSV data):__ You might have a CSV file with rows of data. While the pandas library is often used for tabular data, NumPy can also load simple numeric data. For instance, if data.csv contains:

In [8]:
# height,weight,age
# 170,65,25
# 160,50,30
# 180,80,22

We can load it with NumPy (using `genfromtxt` or `loadtxt`):

In [9]:
data = np.loadtxt('data.csv', delimiter=',', skiprows=1)
data

array([[170.,  65.,  25.],
       [160.,  50.,  30.],
       [180.,  80.,  22.]])

In [10]:
# Notice by default loadtxt gave floats; we can specify dtype=int if we want integers.
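
A minimal sketch of the `dtype=int` variant mentioned above. To keep it self-contained, the file contents are supplied through `io.StringIO` instead of a real `data.csv`; that substitution is an assumption made for illustration.

```python
import io
import numpy as np

# Stand-in for data.csv so the example runs without a file on disk
csv_text = """height,weight,age
170,65,25
160,50,30
180,80,22
"""

data_int = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1, dtype=int)
print(data_int)
print(data_int.dtype)          # an integer type (platform-dependent width)
print(data_int[:, 0].mean())   # mean height: 170.0
```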

---
---

### Key Linear Algebra Concepts in Machine Learning
---

#### 1. Vectors

Description: In mathematics, __a vector__ is an ordered list of numbers. Geometrically, you can think of a vector as a point in space (like a coordinate) or an arrow from the origin to that point. For example, [3, 5] in 2D represents a point 3 units along the x-axis and 5 units along the y-axis. Vectors have a _magnitude_ (length) and _direction_. In linear algebra, vectors are often written as column vectors (like a column of numbers), but in NumPy we usually use 1D arrays to represent them.

In data science, a vector is a convenient way to represent a single data instance or a set of features. A __feature vector__ is a list of features describing one sample. For example, if we describe a patient by [height, weight, age], that is a feature vector in 3-dimensional space. Vectors are also used to represent words in NLP (word embeddings), the pixel values of an image (flattened into one long vector), a time series of sensor readings, and so on.
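
Two of the cases above, sketched with made-up values: a patient described by three features, and a 5×5 image flattened into one long vector. The numbers are illustrative, not real data.

```python
import numpy as np

# Feature vector for one sample: [height, weight, age]
patient = np.array([170, 65, 25])
print(patient.shape)   # (3,)

# A 5x5 image flattened row by row into a 25-element vector
image = np.diag([255] * 5)
pixels = image.flatten()
print(pixels.shape)    # (25,)
```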

In [11]:
# create a vector (as a 1D numpy array)

v = np.array([2, 5, 1])
w = np.array([3, 4, 1])

print('Vector v:', v)
print('Vector w:', w)
print('Shape of v:', v.shape)

Vector v: [2 5 1]
Vector w: [3 4 1]
Shape of v: (3,)


In [14]:
# addition - add corresponding elements

print("v + w =", v + w)

v + w = [5 9 2]


In [16]:
# subtraction

print("v - w =", v - w)

v - w = [-1  1  0]


NumPy will perform __element-wise__ addition/subtraction automatically since v and w have the same shape.
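
For contrast, here is a sketch of what happens when the shapes do not match and cannot be broadcast. The vector `u` is a hypothetical addition used only for this illustration.

```python
import numpy as np

v = np.array([2, 5, 1])
u = np.array([1, 2])   # hypothetical vector with a different length

try:
    result = v + u
except ValueError as err:
    # Shapes (3,) and (2,) are incompatible for element-wise addition
    result = None
    print("Cannot add v and u:", err)
```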

In [17]:
# Scalar Multiplication: Multiply each element by a number.

print("2 * v =", 2 * v)

2 * v = [ 4 10  2]


In [19]:
# Magnitude (Length): ||v|| = sqrt(v_1^2 + v_2^2 + ... ).

mag_v = np.linalg.norm(v)    # Euclidean norm (length) of v

print("||v|| =", mag_v)

||v|| = 5.477225575051661


In [20]:
# the same result computed manually from the formula

np.sqrt((v**2).sum())

5.477225575051661
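
Since a vector has both a magnitude and a direction, dividing it by its magnitude leaves only the direction. A minimal sketch, where `unit_v` is an illustrative name:

```python
import numpy as np

v = np.array([2, 5, 1])

# Dividing by the Euclidean norm gives a vector of length 1 in the same direction
unit_v = v / np.linalg.norm(v)

print(unit_v)
print(np.linalg.norm(unit_v))   # 1.0, up to floating-point rounding
```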

---
---

#### 2. Matrices 

In linear algebra, matrices are used to solve systems of linear equations, to represent linear transformations (like rotating or scaling coordinates), and much more.

In [21]:
# create a 2 x 3 matrix
M = np.array([[1, 2, 3],
             [4, 5, 6]])

In [22]:
# first row
M[0, :]

array([1, 2, 3])

In [23]:
# 3rd column
M[:, 2]

array([3, 6])

We can do __operations__ on matrices __element-wise__ similar to vectors (addition, subtraction, scalar multiply, etc.), __as long as shapes align__ or __via broadcasting__.
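
A short sketch of broadcasting: adding a length-3 vector to a 2×3 matrix applies the vector to every row. The array `row` is a hypothetical example value.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,)

# row is stretched across both rows of M before the element-wise addition
print(M + row)
# [[11 22 33]
#  [14 25 36]]
```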

In [24]:
M * 2

array([[ 2,  4,  6],
       [ 8, 10, 12]])

In [25]:
N = np.array([[7, 8, 9],
             [1, 2, 3]])

M + N

array([[ 8, 10, 12],
       [ 5,  7,  9]])

---

__Use in Data/ML__:

As mentioned, treating the whole dataset as a matrix allows vectorized computations. For instance, if X is an (N×M) matrix of data and w is an (M×1) weight vector, then X @ w yields an N×1 vector of predictions (one per data point). This is how we express making predictions for multiple data points in one go.
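
A minimal sketch of this idea with N = 3 samples and M = 3 features. The weights in `w` are made up for illustration and are not fitted to anything.

```python
import numpy as np

# Each row is one sample: [height, weight, age]
X = np.array([[170, 65, 25],
              [160, 50, 30],
              [180, 80, 22]])    # shape (3, 3)

# Hypothetical weight vector as an (M x 1) column
w = np.array([[1],
              [2],
              [0]])              # shape (3, 1)

predictions = X @ w              # shape (3, 1), one value per sample
print(predictions)
# [[300]
#  [260]
#  [340]]
```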

----
----

#### 3. Matrix Multiplication

This is __not__ done element-wise, but follows a specific rule: if A is of shape (p×q) and B is of shape (q×r), then their product C = A × B is of shape (p×r). Each element of C is computed by taking a row of A and a column of B and computing their dot product (multiply corresponding elements and sum them up).

For matrix multiplication to be valid, the __inner dimensions__ must match (the number of columns of the first matrix must equal the number of rows of the second matrix). If you have incompatible shapes, you cannot multiply them in the standard linear algebra sense.

(Note: pay attention to order – matrix multiplication is _not commutative_, meaning AB ≠ BA in general.)

__Python/NumPy Example__: We can use `np.dot()` or the `@` operator to do matrix multiplication in NumPy.

In [26]:
A = np.array([[1, 2, 3],
             [4, 5, 6]])   # shape (2, 3)
B = np.array([[7, 8],
             [9, 10],
             [11, 12]])    # shape (3, 2)

In [28]:
C = A.dot(B)
C

array([[ 58,  64],
       [139, 154]])

In [29]:
A @ B

array([[ 58,  64],
       [139, 154]])

In [30]:
np.dot(A, B)

array([[ 58,  64],
       [139, 154]])
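
The row-times-column rule, the effect of order, and the inner-dimension requirement can each be checked directly. This is a sketch using the same `A` and `B`; `c00` and `mismatch_error` are illustrative names.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
B = np.array([[7, 8],
              [9, 10],
              [11, 12]])         # shape (3, 2)

# Top-left entry of A @ B: first row of A dotted with first column of B
c00 = A[0, :] @ B[:, 0]          # 1*7 + 2*9 + 3*11 = 58
print(c00)

# Order matters: the two products do not even have the same shape here
print((A @ B).shape)             # (2, 2)
print((B @ A).shape)             # (3, 3)

# Inner dimensions must match: (2, 3) @ (2, 3) is not defined
try:
    A @ A
    mismatch_error = False
except ValueError:
    mismatch_error = True
print(mismatch_error)            # True
```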

---

__Use in Data/ML__: Matrix multiplication is everywhere in machine learning:

- __Linear Regression/Linear Models__: If X is your data matrix (N samples × M features) and β is a parameter vector (M × 1), then the predictions $\hat{y}$ for all N samples can be computed as the matrix product X × β (result is N × 1). This is essentially performing N dot products (one for each sample).
- __Neural Networks__: The computation in each layer of a neural network is often a matrix multiply: if you have an input vector, it’s multiplied by a weight matrix to produce an output vector for the next layer. When you process multiple inputs at once (batch processing), you actually use matrix multiplication between a batch matrix and the weight matrix.
- __Word Embeddings__: In NLP, if you represent the vocabulary as vectors (one-hot encodings), multiplying a one-hot vector (which is mostly zeros and a 1 for the target word index) by an __embedding matrix__ yields the vector for that word. That’s matrix multiplication under the hood: one-hot (1×V) times embedding matrix (V×D) = word embedding (1×D).
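
The word-embedding case can be sketched with a toy vocabulary of V = 4 words and D = 3 dimensions. The embedding values below are placeholders, not trained vectors.

```python
import numpy as np

# Toy embedding matrix: one row per word (V = 4), one column per dimension (D = 3)
E = np.arange(12).reshape(4, 3)

# One-hot row vector (1 x V) selecting the word at index 2
one_hot = np.array([[0, 0, 1, 0]])

word_vec = one_hot @ E           # shape (1, 3)
print(word_vec)                  # [[6 7 8]]

# The product simply picks out row 2 of E
print(np.array_equal(word_vec[0], E[2]))   # True
```

In practice, libraries usually index the row directly rather than multiplying, since the result is the same and the lookup is cheaper.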


---
---

#### 4. Dot Product

Description: The __dot product__ (also called scalar product or inner product) is an operation that takes two vectors of the same length and returns a single number (a scalar). If $a$ and $b$ are vectors $(a_1, a_2, ..., a_n)$ and $(b_1, b_2, ..., b_n)$, then

$$
 a \cdot b = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n. 
$$

It's essentially multiplying corresponding components and summing them. We saw this concept inside matrix multiplication (each entry was a row dotted with a column). The dot product has a geometric interpretation: $a \cdot b = \|a\|\|b\|\cos\theta$, where $\theta$ is the angle between the two vectors. So if two vectors point in similar directions, their dot product is positive (and large relative to their lengths); if they are orthogonal (90° apart), the dot product is 0; if they point in opposite directions, the dot product is negative.

__Python/NumPy Example__: Dot product of two vectors.

In [31]:
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
dot = np.dot(a, b)  # equivalently a.dot(b) or a @ b; for 1D arrays all three compute the dot product
dot

32

In [32]:
# We can also confirm this by breaking it down:

elementwise = a * b
elementwise.sum()

32
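
The geometric interpretation $a \cdot b = \|a\|\|b\|\cos\theta$ can be rearranged to recover the angle between two vectors. A minimal sketch with simple 2D vectors chosen for illustration:

```python
import numpy as np

p = np.array([1, 0])
q = np.array([1, 1])

# cos(theta) = (p . q) / (||p|| ||q||)
cos_theta = (p @ q) / (np.linalg.norm(p) * np.linalg.norm(q))
theta_deg = np.degrees(np.arccos(cos_theta))
print(theta_deg)                  # approximately 45 degrees

# Orthogonal vectors give 0; opposite vectors give a negative value
print(p @ np.array([0, 1]))       # 0
print(p @ np.array([-1, 0]))      # -1
```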

---

**Use in Data/ML:**
- **Feature Weights:** If you have a feature vector and a weight vector, the prediction of a linear model is a dot product $w \cdot x$ (plus maybe a bias). For instance, in linear regression or in a single neuron of a neural net, you compute weighted sum of inputs – that’s a dot product.
- **Similarity:** In information retrieval or recommender systems, you might compute how similar two users are by taking the dot product of their preference vectors. Cosine similarity between two vectors is basically $\frac{a \cdot b}{\|a\|\|b\|}$. If vectors are normalized to length 1, cosine similarity is exactly the dot product. Word embeddings are often compared via dot product to find similar words ([Linear Algebra Required for Data Science | GeeksforGeeks](https://www.geeksforgeeks.org/linear-algebra-required-for-data-science/#:~:text=%2A%20NLP%20,dot%20products%20alongside%20matrix%20multiplication)).
- **Orthogonality:** As noted, if $a \cdot b = 0$, the vectors are orthogonal (loosely, "uncorrelated"; strictly, that holds when the vectors are mean-centered). In ML, this concept appears in orthogonal feature vectors and in orthogonal weight initialization in neural networks, where it describes components that capture independent information.
- **Matrix multiplication connection:** When we do $X @ w$ for predictions, each output is a dot product of a data row with the weight vector. So dot product is the elemental operation inside matrix multiplication.
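
The similarity formula above can be sketched as a small function. `cosine_similarity` here is a hypothetical helper written for this example, not a NumPy function.

```python
import numpy as np

def cosine_similarity(x, y):
    """Cosine of the angle between x and y: (x . y) / (||x|| ||y||)."""
    return (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

sim = cosine_similarity(a, b)
print(sim)                         # roughly 0.975

# After scaling both vectors to length 1, the plain dot product gives the same value
a_unit = a / np.linalg.norm(a)
b_unit = b / np.linalg.norm(b)
print(a_unit @ b_unit)
```

Scaling a vector does not change its cosine similarity with another vector, which is why the measure is useful when only direction matters.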


---
---