# W6 - Batch

This notebook contains personal notes and Python, NumPy, and LaTeX solutions to exercises from the "AI by Hand ✍️ Workbook" by Prof. Tom Yeh.


Reference: [AI by Hand - W6: Batch](https://www.byhand.ai/t/workbook)

**Contents:**
- Multiple input vectors
- NumPy and the transposed input matrix
- Batch 
    - Sum
    - Mean
    - Add 
    - Subtract
    - Multiply
    - Batch Center at Zero

## Multiple input vectors

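$X$ is the input matrix. In the code below, each row of $X$ is one input vector, so $X$ is transposed before the product.

$X = (x_1, x_2)$

$y = \mathrm{ReLU}(W X^T + b)$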

In [14]:
# Exercise 1

import numpy as np

def ReLU(x):
    # Element-wise max(0, x); defined once and reused in later cells (DRY)
    return np.maximum(0,x)


X = np.matrix([[1, 3],[2 ,0]])
W = np.matrix([2,1])
b = 0
result = ReLU(np.dot(W,np.transpose(X))+b)
print(result)


[[5 4]]


## NumPy and the transposed input matrix

Can NumPy automatically handle matrix dimensions for dot products, or do we always need to explicitly transpose the input matrix?

In [20]:
# Exercise 3


X = np.matrix([[1, 2],[2 ,1], [1,1]])
W = np.matrix([1,2])
b = -1
result = ReLU(np.dot(W,X)+b)
print(result)

ValueError: shapes (1,2) and (3,2) not aligned: 2 (dim 1) != 3 (dim 0)

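No. `np.dot` does not transpose its arguments: the inner dimensions must already match. Here $W$ has shape (1, 2) and $X$ has shape (3, 2), so we need $X^T$ with shape (2, 3).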

In [None]:
result = ReLU(np.dot(W,np.transpose(X))+b)
print(result)

[[4 3 2]]
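
As a side note, the same computation can be sketched with plain `ndarray` objects and the `@` operator, since the NumPy documentation no longer recommends `np.matrix`. The helper name `relu` and the printed shape check are additions for illustration and are not part of the workbook.

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x)
    return np.maximum(0, x)

# Exercise 3 values: 3 input vectors (rows), 2 features each
X = np.array([[1, 2], [2, 1], [1, 1]])   # shape (3, 2)
W = np.array([[1, 2]])                   # shape (1, 2): one neuron, two weights
b = -1

# Inner dimensions must match: (1, 2) @ (2, 3) -> (1, 3)
Y = relu(W @ X.T + b)
print(Y.shape)   # (1, 3)
print(Y)         # [[4 3 2]]
```

The explicit transpose is still needed; `@` follows the same shape rules as `np.dot` for 2-D arrays.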


## Batch

### Batch Sum 

Batch sum reduces the matrix $Y$, which has one row $\mathbf{y}_j$ per neuron and one column per input ($N$ columns), to a single value per neuron.

We have the outputs $y_{j,i}$, where $j$ is the neuron index and $i$ is the input index (e.g., user).

$Y =  \mathrm{ReLU}(W X^T + b) $

**Batch Sum in LaTeX:**
$$
S_j = \sum_{i=1}^N y_{j,i}
$$
where $S_j$ is the batch sum for neuron $j$.

- In the code below, the bias array is reshaped into a column vector so that it broadcasts across all inputs.
- Why is batch sum useful? It condenses many inputs into one value per neuron, e.g. 40 inputs into just 3 outputs. Some uses:
    - Mini-batch gradient descent: computing the total loss over the batch.
    - Image classification: with one neuron per class, summing gives the total activation per class.
    - Recommendations: with 100 users and 10 neurons (one neuron per product), the batch sum shows which products are most activated for the group of users.
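
The recommendation use case can be sketched on a toy scale. The activation values below are made up for illustration (3 products, 4 users), and `ranking` is a hypothetical name; only `sum` and `argsort` are standard NumPy calls.

```python
import numpy as np

# Hypothetical post-activation matrix: 3 neurons (products) x 4 inputs (users)
Y = np.array([[0, 2, 1, 0],
              [3, 1, 0, 2],
              [1, 0, 0, 1]])

batch_sum = Y.sum(axis=1)          # one total per neuron, shape (3,)
ranking = np.argsort(-batch_sum)   # neuron indices, most activated first

print(batch_sum)   # [3 6 2]
print(ranking)     # [1 0 2]
```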


In [90]:
# Exercise 11
X = np.matrix([[3, 1],[2 ,2], [3,1], [2,-2]])
W = np.matrix([[1,1],[1,-1],[0,2]])
b = np.array([0,0,-1])

pre_activation = np.dot(W,X.T) + b[:, np.newaxis]
print("Pre Activation")

print(pre_activation)

post_activation = ReLU(pre_activation) # size 3x4

print("Post Activation")

print(post_activation)

Pre Activation
[[ 4  4  4  0]
 [ 2  0  2  4]
 [ 1  3  1 -5]]
Post Activation
[[4 4 4 0]
 [2 0 2 4]
 [1 3 1 0]]


In [91]:
# New concept: reshape the 1-D array into a column vector by adding an axis (expanding its dimensions).
# The column vector can then be broadcast, i.e. added to every column of the matrix.
print(b[:, np.newaxis])
print(b[np.newaxis, :])

[[ 0]
 [ 0]
 [-1]]
[[ 0  0 -1]]
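
To see why the column shape matters, here is a minimal sketch with plain `ndarray` objects. The zero matrix `Z` is a stand-in for the pre-activation values and is an assumption made only to keep the example short.

```python
import numpy as np

Z = np.zeros((3, 4))          # stand-in: 3 neurons x 4 inputs
b = np.array([0, 0, -1])      # one bias per neuron, shape (3,)

# A column vector of shape (3, 1) broadcasts across the 4 columns
print((Z + b[:, np.newaxis]).shape)   # (3, 4)

# Shape (3,) is aligned with the last axis (length 4), so this fails
try:
    Z + b
except ValueError as err:
    print("ValueError:", err)
```

`b.reshape(-1, 1)` produces the same column vector as `b[:, np.newaxis]`.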


In [92]:
# Exercise 11 + Batch Sum

# Batch Sum
ones = np.ones(4) # 1-D array of shape (4,), one entry per input
print(ones)
result = np.dot(post_activation,ones.T)
print(result)

[1. 1. 1. 1.]
[[12.  8.  5.]]


In [93]:
# Batch sum the numpy way
post_activation.sum(axis=1)

matrix([[12],
        [ 8],
        [ 5]])

### Batch Mean
$\bar{S}_j = \frac{1}{N} \sum_{i=1}^N y_{j,i}  $

In [94]:
post_activation.mean(axis=1)

matrix([[3.  ],
        [2.  ],
        [1.25]])
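
The batch mean is the batch sum divided by $N$. The sketch below checks this on the Exercise 11 post-activation values, using `ndarray` instead of `np.matrix`; the variable names `mean_from_sum` and `mean_from_dot` are illustrative.

```python
import numpy as np

# Post-activation matrix from Exercise 11: 3 neurons x 4 inputs
Y = np.array([[4, 4, 4, 0],
              [2, 0, 2, 4],
              [1, 3, 1, 0]])
N = Y.shape[1]

mean_from_sum = Y.sum(axis=1) / N       # batch sum divided by N
mean_from_dot = Y @ (np.ones(N) / N)    # product with a vector of 1/N

print(mean_from_sum)                                  # [3.   2.   1.25]
print(np.allclose(mean_from_sum, mean_from_dot))      # True
print(np.allclose(mean_from_sum, Y.mean(axis=1)))     # True
```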

### Batch Add
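$z_{j,i} = y_{j,i} + v_j$, for each neuron $j$ and input $i$

Where:
- $v$ is a vector with one entry per neuron; $v_j$ is added to every output of neuron $j$
- $i = 1, \dots, N$, where $N$ is the number of inputs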

In [82]:
add_vector = np.array([2,1,2])

add_vector= add_vector[:, np.newaxis]
print(post_activation)
print(add_vector)
print("Result")
print(post_activation + add_vector)

[[4 4 4 0]
 [2 0 2 4]
 [1 3 1 0]]
[[2]
 [1]
 [2]]
Result
[[6 6 6 2]
 [3 1 3 5]
 [3 5 3 2]]


### Batch Subtract
$z_{j,i} = y_{j,i} - v_j$, for each neuron $j$ and input $i$

In [83]:
sub_vector = np.array([2,1,2])

# Column vector, so the same value is subtracted from every output of a neuron
sub_vector = sub_vector[:, np.newaxis]
print(post_activation)
print(sub_vector)
print("Result")
print(post_activation - sub_vector)

[[4 4 4 0]
 [2 0 2 4]
 [1 3 1 0]]
[[2]
 [1]
 [2]]
Result
[[ 2  2  2 -2]
 [ 1 -1  1  3]
 [-1  1 -1 -2]]


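### Batch Center at Zero

Subtract each neuron's batch mean from its outputs, so that every row has mean zero: $c_{j,i} = y_{j,i} - \bar{S}_j$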

In [89]:

print(post_activation)
print("Mean")
print(np.mean(post_activation, axis=1))

print("Result")
print(post_activation - np.mean(post_activation, axis=1))

[[4 4 4 0]
 [2 0 2 4]
 [1 3 1 0]]
Mean
[[3.  ]
 [2.  ]
 [1.25]]
Result
[[ 1.    1.    1.   -3.  ]
 [ 0.   -2.    0.    2.  ]
 [-0.25  1.75 -0.25 -1.25]]
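
The cell above relies on `np.matrix` keeping the mean two-dimensional. With a plain `ndarray`, `mean(axis=1)` returns shape `(3,)`, which does not broadcast against a `(3, 4)` array, so a sketch of the same step needs `keepdims=True`. The names `row_mean` and `centered` are illustrative.

```python
import numpy as np

# Post-activation matrix from Exercise 11: 3 neurons x 4 inputs
Y = np.array([[4, 4, 4, 0],
              [2, 0, 2, 4],
              [1, 3, 1, 0]])

row_mean = Y.mean(axis=1, keepdims=True)   # shape (3, 1), broadcasts over columns
centered = Y - row_mean

print(centered)
print(centered.mean(axis=1))   # each neuron's mean is now 0
```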
