## Why batch?
- Batches let us compute many samples in parallel.
  - The bigger the batch, the more operations can run in parallel.
- Training on batches instead of single samples helps with generalization.
###### Note: Neural networks are usually run on GPUs rather than CPUs, because GPUs can run far more operations in parallel.

<div style="display: inline-flex; align-items: center;">
  <!-- Video Thumbnail -->
  <a href="https://www.youtube.com/watch?v=s164HyJuL94" target="_blank" style="display: inline-block;">
    <img src="https://img.youtube.com/vi/s164HyJuL94/0.jpg" style="width: 100%; display: block;">
  </a>

  <!-- Play Button -->
  <a href="https://www.youtube.com/watch?v=s164HyJuL94" target="_blank" style="display: inline-block;">
    <img src="https://upload.wikimedia.org/wikipedia/commons/b/b8/YouTube_play_button_icon_%282013%E2%80%932017%29.svg" 
         style="width: 50px; height: auto; margin-left: 5px;">
  </a>
</div>
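
As a rough illustration of the first point, the sketch below computes the same layer output twice: once sample by sample in a Python loop, and once for the whole batch with a single matrix product. The array values and the names `batch`, `looped`, and `batched` are assumptions chosen for this example.

```python
import numpy as np

# Example values (assumed): 3 samples with 4 features each, and 2 neurons.
batch = np.array([[1.0, 2.0, 3.0, 2.5],
                  [2.0, 5.0, -1.0, 2.0],
                  [-1.5, 2.7, 3.3, -0.8]])
weights = np.array([[0.2, 0.8, -0.5, 1.0],
                    [0.5, -0.91, 0.26, -0.5]])

# One sample at a time: a Python loop with one small product per sample.
looped = np.array([np.dot(weights, sample) for sample in batch])

# Whole batch at once: a single matrix product that NumPy (or a GPU
# library) is free to parallelize internally.
batched = np.dot(batch, weights.T)

print(looped.shape, batched.shape)
print(np.allclose(looped, batched))
```

Both approaches give the same `(3, 2)` result; the batched form simply hands all the work over in one call.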

## How does the matrix product work?

<div style="display: inline-flex; align-items: center;">
  <!-- Video Thumbnail -->
  <a href="https://www.youtube.com/watch?v=KBPvlUp-m5Y" target="_blank" style="display: inline-block;">
    <img src="https://img.youtube.com/vi/KBPvlUp-m5Y/0.jpg" style="width: 100%; display: block;">
  </a>

  <!-- Play Button -->
  <a href="https://www.youtube.com/watch?v=KBPvlUp-m5Y" target="_blank" style="display: inline-block;">
    <img src="https://upload.wikimedia.org/wikipedia/commons/b/b8/YouTube_play_button_icon_%282013%E2%80%932017%29.svg" 
         style="width: 50px; height: auto; margin-left: 5px;">
  </a>
</div>

In [2]:
import numpy as np

inputs = [[1, 2, 3, 2.5],
          [2.0, 5.0, -1.0, 2.0],
          [-1.5, 2.7, 3.3, -0.8]]

weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]
biases = [2, 3, 0.5]

# Both matrices have shape (3, 4), so this line raises a ValueError.
output = np.dot(weights, inputs) + biases
print(output)

ValueError: shapes (3,4) and (3,4) not aligned: 4 (dim 1) != 3 (dim 0)

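This error is expected. `np.dot(weights, inputs)` tries to multiply a `(3, 4)` matrix by another `(3, 4)` matrix, but the matrix product requires the number of columns in the first matrix (4) to equal the number of rows in the second (3).

What we actually want is to multiply each row of **`inputs`** (one sample) by each column of the **`weights`** matrix, which only works if each of those columns holds one neuron's 4 weights.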

#### To fix this, we use the **transpose**
**Transpose:** swap the rows and columns of a matrix, so a `(3, 4)` matrix becomes `(4, 3)`.
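
A minimal sketch of what the transpose does, using a small made-up matrix `m` (an assumption for illustration, not part of the network above):

```python
import numpy as np

# Assumed toy matrix: 2 rows, 3 columns.
m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.shape)    # (2, 3)
print(m.T.shape)  # (3, 2)

# Row i of m becomes column i of m.T.
print(m.T)
```

Transposing twice returns the original matrix, so nothing is lost by swapping the axes.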

<div style="display: inline-flex; align-items: center;">
  <!-- Video Thumbnail -->
  <a href="https://www.youtube.com/watch?v=ZN60jdWk8aM" target="_blank" style="display: inline-block;">
    <img src="https://img.youtube.com/vi/ZN60jdWk8aM/0.jpg" style="width: 100%; display: block;">
  </a>

  <!-- Play Button -->
  <a href="https://www.youtube.com/watch?v=ZN60jdWk8aM" target="_blank" style="display: inline-block;">
    <img src="https://upload.wikimedia.org/wikipedia/commons/b/b8/YouTube_play_button_icon_%282013%E2%80%932017%29.svg" 
         style="width: 50px; height: auto; margin-left: 5px;">
  </a>
</div>

In [4]:
import numpy as np

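inputs = [[1, 2, 3, 2.5],
          [2.0, 5.0, -1.0, 2.0],
          [-1.5, 2.7, 3.3, -0.8]]

weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]
biases = [2, 3, 0.5]

# Transpose weights from (3, 4) to (4, 3) so the inner dimensions match:
# (3, 4) x (4, 3) -> (3, 3), one row per sample and one column per neuron.
output = np.dot(inputs, np.array(weights).T) + biases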
print(output)

[[ 4.8    1.21   2.385]
 [ 8.9   -1.81   0.2  ]
 [ 1.41   1.051  0.026]]


### Why use the **transpose**, and why do the **weights** go on the right?

Both questions are explained in this video:

<div style="display: inline-flex; align-items: center;">
  <!-- Video Thumbnail -->
  <a href="https://www.youtube.com/watch?v=2c9CJ_7YT8w" target="_blank" style="display: inline-block;">
    <img src="https://img.youtube.com/vi/2c9CJ_7YT8w/0.jpg" style="width: 100%; display: block;">
  </a>

  <!-- Play Button -->
  <a href="https://www.youtube.com/watch?v=2c9CJ_7YT8w" target="_blank" style="display: inline-block;">
    <img src="https://upload.wikimedia.org/wikipedia/commons/b/b8/YouTube_play_button_icon_%282013%E2%80%932017%29.svg" 
         style="width: 50px; height: auto; margin-left: 5px;">
  </a>
</div>
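
One way to see the effect of the ordering is to compare the two possible products. The sketch below uses only the first two samples of `inputs` (an assumption made so that the two result shapes differ); the names `samples_first` and `neurons_first` are likewise made up for this example.

```python
import numpy as np

# First two samples only (assumed), so the batch size (2) differs
# from the number of neurons (3).
inputs = np.array([[1, 2, 3, 2.5],
                   [2.0, 5.0, -1.0, 2.0]])
weights = np.array([[0.2, 0.8, -0.5, 1.0],
                    [0.5, -0.91, 0.26, -0.5],
                    [-0.26, -0.27, 0.17, 0.87]])

# Inputs on the left: one row per sample, one column per neuron.
samples_first = np.dot(inputs, weights.T)
# Weights on the left: one row per neuron, one column per sample.
neurons_first = np.dot(weights, inputs.T)

print(samples_first.shape)  # (2, 3)
print(neurons_first.shape)  # (3, 2)
print(np.allclose(samples_first, neurons_first.T))
```

Both orderings hold the same numbers. Putting the inputs first keeps each sample in its own row, so the result can be passed to the next layer as a batch in the same layout.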

In [5]:
# Add a second layer:

inputs = [[1, 2, 3, 2.5],
          [2.0, 5.0, -1.0, 2.0],
          [-1.5, 2.7, 3.3, -0.8]]

weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]
biases = [2, 3, 0.5]

# Layer 2 takes the 3 outputs of layer 1 as its inputs and has 3 neurons.
weights2 = [[0.1, -0.14, 0.5],
            [-0.5, 0.12, -0.33],
            [-0.44, 0.73, -0.13]]
biases2 = [-1, 2, -0.5]

layer1_outputs = np.dot(inputs, np.array(weights).T) + biases

# The outputs of layer 1 become the inputs of layer 2.
layer2_outputs = np.dot(layer1_outputs, np.array(weights2).T) + biases2
print(layer2_outputs)

[[ 0.5031  -1.04185 -2.03875]
 [ 0.2434  -2.7332  -5.7633 ]
 [-0.99314  1.41254 -0.35655]]
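
The same two-layer computation can also be written as a loop over layers, which makes the chaining explicit. This is only a sketch: the helper `dense_forward` and the `layers` list are hypothetical names introduced here, not part of the notebook.

```python
import numpy as np

inputs = [[1, 2, 3, 2.5],
          [2.0, 5.0, -1.0, 2.0],
          [-1.5, 2.7, 3.3, -0.8]]

weights = [[0.2, 0.8, -0.5, 1.0],
           [0.5, -0.91, 0.26, -0.5],
           [-0.26, -0.27, 0.17, 0.87]]
biases = [2, 3, 0.5]

weights2 = [[0.1, -0.14, 0.5],
            [-0.5, 0.12, -0.33],
            [-0.44, 0.73, -0.13]]
biases2 = [-1, 2, -0.5]


def dense_forward(x, w, b):
    """Hypothetical helper: one dense layer with weights stored as (neurons, inputs)."""
    return np.dot(x, np.array(w).T) + b


# Each layer is a (weights, biases) pair; the output of one feeds the next.
layers = [(weights, biases), (weights2, biases2)]

out = inputs
for w, b in layers:
    out = dense_forward(out, w, b)

print(out)
```

Adding a third layer would then only mean appending another `(weights, biases)` pair to `layers`.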


## We can turn these layers into objects instead of writing them out by hand:

In [41]:
np.random.seed(0)

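X = [[1, 2, 3, 2.5],
     [2.0, 5.0, -1.0, 2.0],
     [-1.5, 2.7, 3.3, -0.8]]


class Layer_Dense:
    def __init__(self, n_inputs, n_neurons):
        # randn samples a Gaussian distribution centered on zero; scaling by 0.10 keeps the weights small.
        # The shape is (n_inputs, n_neurons), so forward() needs no transpose.
        self.weights = 0.10 * np.random.randn(n_inputs, n_neurons)
        # np.zeros creates an array of zeros: one bias per neuron.
        self.biases = np.zeros((1, n_neurons))

    def forward(self, inputs):
        # (samples, n_inputs) x (n_inputs, n_neurons) -> (samples, n_neurons)
        self.output = np.dot(inputs, self.weights) + self.biases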

layer1 = Layer_Dense(4,5)
layer2 = Layer_Dense(5,2)

layer1.forward(X)
print(layer1.output)
print("-------------------")
layer2.forward(layer1.output)
print(layer2.output)


[[ 0.10758131  1.03983522  0.24462411  0.31821498  0.18851053]
 [-0.08349796  0.70846411  0.00293357  0.44701525  0.36360538]
 [-0.50763245  0.55688422  0.07987797 -0.34889573  0.04553042]]
-------------------
[[ 0.148296   -0.08397602]
 [ 0.14100315 -0.01340469]
 [ 0.20124979 -0.07290616]]


### A step-by-step explanation from ChatGPT:
Let's break down how the output shape `(3, 5)` is determined when you call `layer1.forward(X)`.

1. **Input Matrix (`X`)**:
   - `X` is the input matrix with shape `(3, 4)`.
   - This shape represents 3 samples (rows) and 4 features (columns).

   ```python
   X = [[1, 2, 3, 2.5],
        [2.0, 5.0, -1.0, 2.0],
        [-1.5, 2.7, 3.3, -0.8]]
   ```
   
2. **Weights Matrix (`self.weights`)**:
   - `self.weights` is initialized with shape `(4, 5)`, meaning each of the 4 input features will connect to each of the 5 neurons in this layer.
   - The weights matrix shape `(4, 5)` is created based on `n_inputs=4` and `n_neurons=5`.

3. **Dot Product Operation**:
   - In the `forward` method, the output is calculated using:
     ```python
     self.output = np.dot(inputs, self.weights) + self.biases
     ```
   - Here, `np.dot(inputs, self.weights)` computes the dot product between `X` (shape `(3, 4)`) and `self.weights` (shape `(4, 5)`).

4. **Matrix Multiplication Shape Rules**:
   - When you multiply two matrices, the rule is that the **number of columns of the first matrix** must equal the **number of rows of the second matrix**.
   - Since `X` has 4 columns (features) and `self.weights` has 4 rows, this multiplication is valid.
   - The resulting shape after the dot product is `(3, 5)`:
     - **3**: From `X`'s 3 rows (samples).
     - **5**: From `self.weights`'s 5 columns (neurons).

5. **Adding Biases**:
   - The biases, which have shape `(1, 5)`, are added to each row of the resulting `(3, 5)` matrix from the dot product, keeping the final shape `(3, 5)`.

### Final Output Shape
So, `layer1.output` has shape `(3, 5)`, which means:
- There are **3 samples** in the output (one for each input sample in `X`).
- Each sample has **5 values**, representing the output from each of the 5 neurons in `layer1`.
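
The shape reasoning above can be checked directly in NumPy. The sketch below rebuilds the same shapes without the `Layer_Dense` class; the fixed seed and the non-zero `demo_biases` are assumptions added only to make the run repeatable and the bias broadcast visible.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed only so the run is repeatable

X = np.array([[1, 2, 3, 2.5],
              [2.0, 5.0, -1.0, 2.0],
              [-1.5, 2.7, 3.3, -0.8]])
weights = 0.10 * np.random.randn(4, 5)  # (n_inputs, n_neurons)
biases = np.zeros((1, 5))               # one bias per neuron

product = np.dot(X, weights)            # (3, 4) x (4, 5) -> (3, 5)
output = product + biases               # (1, 5) is broadcast over the 3 rows

print(X.shape, weights.shape, product.shape, output.shape)

# Hypothetical non-zero biases, to show that the same row is added to every sample.
demo_biases = np.array([[1.0, 2.0, 3.0, 4.0, 5.0]])
shifted = product + demo_biases
print(np.allclose(shifted - product, np.tile(demo_biases, (3, 1))))
```

The first `print` shows the four shapes in order, and the second confirms that every row of the result was shifted by the same bias vector.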

In [25]:
print(0.10 * np.random.randn(4, 5))

[[ 0.01549474  0.03781625 -0.08877857 -0.19807965 -0.03479121]
 [ 0.0156349   0.12302907  0.12023798 -0.03873268 -0.03023028]
 [-0.1048553  -0.14200179 -0.17062702  0.19507754 -0.05096522]
 [-0.04380743 -0.12527954  0.07774904 -0.16138978 -0.02127403]]


In [38]:
# The biases are all zeros, so adding them leaves the weights unchanged.
koko = layer1.weights + layer1.biases
print(koko)

[[ 0.17640523  0.04001572  0.0978738   0.22408932  0.1867558 ]
 [-0.09772779  0.09500884 -0.01513572 -0.01032189  0.04105985]
 [ 0.01440436  0.14542735  0.07610377  0.0121675   0.04438632]
 [ 0.03336743  0.14940791 -0.02051583  0.03130677 -0.08540957]]
