### Matrix Addition

#### Set-1

In [None]:
'''
SET–1 : Matrix Addition (Foundations & Intuition)

Q1. What is matrix addition in simple words?
Ans. Matrix addition means adding corresponding elements of two matrices.
'''
# Example
A = [
    [1, 2],
    [3, 4]
]
B = [
    [5, 6],
    [7, 8]
]
# A + B = [[6, 8], [10, 12]]


'''
Q2. What is the most important rule for matrix addition?
Ans.
Two matrices can be added only if they have the same shape,
which means the same number of rows AND the same number of columns.

'''
# Example 1 (Allowed)
# Shape of A = (2, 2)
# Shape of B = (2, 2)
# Addition is allowed

# Example 2 (Not Allowed)
# Shape of A = (2, 2)
# Shape of B = (1, 2)
# Addition is NOT allowed
# Reason: Number of rows is different

# Key Rule (Very Important)
# Same rows + same columns → Addition possible
# If either rows or columns differ → Addition NOT possible




'''
Q3. What happens if matrix shapes do not match?
Ans. Matrix addition is not defined if shapes are different.
'''
# Example
A = [[1, 2, 3]]      # shape (1, 3)
B = [[4, 5], [6, 7]] # shape (2, 2)
# A + B → ❌ not allowed


'''
Q4. How does matrix addition work element-wise?
Ans. Each element at position (i, j) is added with the element at the same position.
'''
# Example
# (A + B)[0][1] = A[0][1] + B[0][1]
# 2 + 6 = 8


'''
Q5. How can matrix addition be understood using row vectors?
Ans. Matrix addition adds corresponding row vectors one by one.
'''
# Example
row1_A = [1, 2]
row1_B = [5, 6]
# row1_A + row1_B = [6, 8]


q = '''
Q6. Why is matrix addition useful in AI?
Ans. It is used to add bias, combine signals, and build residual connections.
'''
# Example
# Z = XW + b
# b is added element-wise to every row of XW
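
The Set-1 rules (same shape, then add position by position) can be sketched in plain Python. The helper name `add_matrices` is hypothetical and introduced only for illustration; it assumes each matrix is a rectangular list of lists.

```python
def add_matrices(A, B):
    """Add two matrices stored as lists of equal-length rows (illustrative helper)."""
    shape_a = (len(A), len(A[0]) if A else 0)
    shape_b = (len(B), len(B[0]) if B else 0)
    # Q2/Q3: addition is only defined when both shapes are identical.
    if shape_a != shape_b:
        raise ValueError(f"shape mismatch: {shape_a} vs {shape_b}")
    # Q5: pair up corresponding rows; Q4: add elements at the same (i, j).
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = add_matrices(A, B)
print(C)  # [[6, 8], [10, 12]]

try:
    add_matrices([[1, 2, 3]], [[4, 5], [6, 7]])
except ValueError as err:
    print(err)  # shape mismatch: (1, 3) vs (2, 2)
```

Unlike NumPy, this sketch does no broadcasting, so it follows the pure math rule strictly.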


#### Set-2

In [None]:
'''
SET–2 : Matrix Addition (AI Usage & Practical Meaning)

Q1. How is matrix addition used to add bias in neural networks?
Ans. Bias is added element-wise to every row of the output matrix.
'''
# Example
# XW shape = (100, 10)
# b shape  = (1, 10)
# XW + b → bias added to each sample

# At first glance this looks like a shape mismatch, but it works because of broadcasting.
# NumPy (and most array libraries) compare shapes dimension by dimension, starting from the last:
# two dimensions are compatible when they are equal or when one of them is 1.
# Here b has shape (1, 10) and XW has shape (100, 10): the column counts match (10 and 10),
# and b has a single row, so b is broadcast to shape (100, 10).
# In effect b behaves as if it were repeated 100 times, once for each row of XW,
# so the same bias is added to every sample.


'''
Q2. Why must bias have compatible shape for matrix addition?
Ans. Because matrix addition requires matching shapes (or broadcastable ones).
'''
# Example
# Correct: (100, 10) + (1, 10)   → columns match (10 and 10), bias has 1 row
# Wrong:   (100, 10) + (10, 1)   → rows are 100 vs 10 and neither is 1


'''
Q3. How does matrix addition relate to residual (skip) connections?
Ans. Residual connections add the input matrix back to the output matrix.
'''
# Example
# Output = F(X) + X
# F(X) and X must have the same shape


'''
Q4. Why is matrix addition considered a parallel operation?
Ans. Because each output element depends only on the two input elements at the same position, so all additions can run at the same time.
'''
# Example
# Each (i, j) element is added without depending on others


'''
Q5. What mistake do beginners often make with matrix addition?
Ans. Confusing matrix addition with matrix multiplication.
'''
# Example
# A + B → element-wise
# A · B → row-column multiplication (different operation)


'''
Q6. How does matrix addition help combine signals in AI models?
Ans. It merges multiple information sources by adding their matrices.
'''
# Example
# signal1 = model_output
# signal2 = residual
# combined = signal1 + signal2


q = '''
Q7. What is the key mental rule for matrix addition?
Ans. Same shape, element-wise addition.
'''
# Example
# Shape (m, n) + Shape (m, n) → valid
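
The bias and residual examples from Set-2 can be tried out in NumPy. This is a minimal sketch under assumed shapes: `X`, `W`, `b`, and `bad_b` are random placeholder arrays, and `W` is chosen square so that `F(X)` keeps the shape of `X`.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
X = rng.random((100, 10))       # 100 samples, 10 features
W = rng.random((10, 10))        # square weights keep F(X) the same shape as X
b = rng.random((1, 10))         # one bias value per output column

F_X = X @ W + b                 # Q1: b is broadcast across all 100 rows
out = F_X + X                   # Q3: residual connection, shapes match exactly
print(out.shape)                # (100, 10)

bad_b = rng.random((10, 1))     # Q2: rows are 100 vs 10 and neither is 1
try:
    X @ W + bad_b
    failed = False
except ValueError:
    failed = True
print(failed)                   # True
```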


### Simple Matrix Addition

In [5]:
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

B = np.array([[5, 6],
              [7, 8]])

print(A + B)
# [[ 6  8]
#  [10 12]]

# Element-wise, position by position.


[[ 6  8]
 [10 12]]
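
For contrast with the element-wise sum above, the sketch below puts `A + B` next to the matrix product `A @ B` on the same two matrices. It is only meant to show that the two operations give different results.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

added = A + B        # element-wise: added[i, j] = A[i, j] + B[i, j]
multiplied = A @ B   # matrix product: rows of A dotted with columns of B

print(added)
# [[ 6  8]
#  [10 12]]
print(multiplied)
# [[19 22]
#  [43 50]]
```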


### When is addition allowed?

In [None]:
A = np.zeros((2, 2))
B = np.ones((2, 2))

print(A + B)   # allowed

# Allowed because both have the same shape (2, 2)

### When is addition not allowed?

In [7]:
A = np.zeros((2, 2))
B = np.ones((1, 2))

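# A + B  ❌ (mathematically undefined)
# Shapes differ: (2, 2) vs (1, 2)

# The mathematical definition of matrix addition requires identical shapes.
# Note: NumPy would NOT raise an error here; it would broadcast B to (2, 2).
# Broadcasting is a programming convenience, not part of the math rule.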
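
To make the gap between the math rule and NumPy's behavior concrete, the sketch below adds the same two arrays twice: once with plain `+`, and once through a strict wrapper. `strict_add` is a hypothetical helper written for this example, not a NumPy function.

```python
import numpy as np

def strict_add(A, B):
    """Add only when shapes are identical (illustrative helper, not part of NumPy)."""
    A, B = np.asarray(A), np.asarray(B)
    if A.shape != B.shape:
        raise ValueError(f"shape mismatch: {A.shape} vs {B.shape}")
    return A + B

A = np.zeros((2, 2))
B = np.ones((1, 2))

print((A + B).shape)   # (2, 2): NumPy broadcasts B without complaint

try:
    strict_add(A, B)
    rejected = False
except ValueError:
    rejected = True
print(rejected)        # True: the strict version enforces the math rule
```

A check like this can be useful when a silent broadcast would hide a shape bug.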

### Different shapes, but addition still works (broadcasting)

In [12]:
# Example: bias addition via broadcasting

XW = np.random.rand(100, 10)
b  = np.random.rand(1, 10)

Z = XW + b   # b is broadcast to (100, 10)
Z[:1]  # first row of Z, shape (1, 10)

array([[1.65108531, 0.98442258, 1.37145973, 1.32800342, 0.63694873,
        1.32999735, 1.02286444, 1.45448701, 0.85665915, 0.80531374]])
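
One way to check that broadcasting behaves like "repeat `b` for every row" is to build the repeated matrix explicitly and compare the results. This is a sketch with seeded random data, so the values differ from the output above; `np.tile` is used here only to make the repetition visible.

```python
import numpy as np

rng = np.random.default_rng(0)
XW = rng.random((100, 10))
b = rng.random((1, 10))

Z = XW + b                            # implicit broadcasting
b_repeated = np.tile(b, (100, 1))     # explicit copy of b stacked 100 times
Z_explicit = XW + b_repeated          # ordinary same-shape addition

print(b_repeated.shape)               # (100, 10)
print(np.array_equal(Z, Z_explicit))  # True
```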

### Concise Summary

In [13]:
cs = '''
Matrix Addition:

• Element-wise operation
• Requires same shape (or broadcast-compatible in code)

Pure math rule:
(m, n) + (m, n)

AI usage:
- Bias addition
- Residual connections
- Signal merging

Mental model:
Same shape → add element-wise
'''
