# Neural Networks 0-1


## Let's start by computing the output of a single neuron. The formula is:

$$ \text{neuron output} = \mathbf{input} \cdot \mathbf{weight} + bias $$


In [None]:
X = 0.67378
w = 0.54623
b = 0.64272
nop = w * X + b
nop

1.0107588494

## Now, let's compute a neuron with multiple inputs, the building block of a layer of neurons.

$$ \text{output} = \sum_{i=1}^{n} (\text{input}_i \cdot \text{weight}_i) + \text{bias} $$

Here, input and weight are arrays of $n$ float values, and the sum is their dot product. A single scalar bias is added to the result.

In [None]:
import numpy as np

In [None]:
X = [0.34 , 0.45 , 0.69]
w = [0.24 , 0.53 , 0.78]
b = 0.72
op = np.dot(X,w)+b
print(op)


1.5783
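
The cell above computes one neuron with three inputs. A full layer stacks several such neurons, which can be written as a matrix-vector product. The sketch below assumes a hypothetical layer of two neurons: the first row of `W` reuses the weights above, while the second row and the second bias are made-up values for illustration.

```python
import numpy as np

# One input sample with three features (same values as the cell above).
X = np.array([0.34, 0.45, 0.69])

# Hypothetical layer of two neurons: one row of weights per neuron.
# Row 0 reuses the weights above; row 1 is made up for illustration.
W = np.array([
    [0.24, 0.53, 0.78],
    [0.10, -0.20, 0.30],
])

# One bias per neuron (the second value is also made up).
b = np.array([0.72, 0.05])

# Each output entry k is dot(W[k], X) + b[k].
layer_output = W @ X + b
print(layer_output)
```

The first entry of `layer_output` matches the single-neuron result, since it uses the same weights and bias.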


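## ReLU Activation Function

$$ \text{ReLU}(x) = \max(0, x) $$

ReLU is most commonly used as the activation function for hidden layers. It passes positive values through unchanged and maps negative values to zero.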

In [None]:
ReLu = np.maximum(0, op)
ReLu

1.5783
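
Because `op` is positive, the cell above returns it unchanged. A small sketch with hypothetical pre-activation values (not taken from the notebook) shows the clipping behaviour on a whole array:

```python
import numpy as np

# Hypothetical pre-activation values for a layer of five neurons.
z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

# np.maximum compares element-wise against 0, so negative
# entries become 0 and positive entries pass through unchanged.
activated = np.maximum(0, z)
print(activated)
```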

## Softmax Activation

$$ \text{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{j} e^{z_j}} $$

In [None]:
inputs = np.random.randn(1,4)
exp_value = np.exp(inputs - np.max(inputs,axis=1,keepdims = True))
probabilities = exp_value / np.sum(exp_value,axis=1,keepdims= True)
probabilities

array([[0.13949236, 0.49890119, 0.28791965, 0.07368679]])
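
The cell above subtracts the row maximum before exponentiating. A minimal sketch of why that helps, wrapping the same steps in a hypothetical `softmax` helper and using made-up logits: shifting every input by the same constant does not change the softmax result, but it keeps `np.exp` from overflowing on large values.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax for a 2-D array of shape (batch, classes)."""
    # Subtracting the row maximum leaves the result unchanged
    # mathematically but keeps np.exp from overflowing.
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp_values = np.exp(shifted)
    return exp_values / np.sum(exp_values, axis=1, keepdims=True)

# Hypothetical logits: the second row is the first row plus 999,
# which would overflow a naive np.exp(logits).
logits = np.array([
    [1.0, 2.0, 3.0],
    [1000.0, 1001.0, 1002.0],
])
probs = softmax(logits)
print(probs)
print(probs.sum(axis=1))  # each row sums to 1 (up to rounding)
```

Both rows produce the same probabilities, because they differ only by a constant offset.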

## Loss Calculation (Categorical Cross-Entropy)

$$ \text{Loss} = -\sum_{i} y_i \log(p_i) $$

In [None]:
true_value = np.array([0,1,0,0])
probabilities = np.clip(probabilities, 1e-15, 1 -1e-15)
loss = -np.sum(true_value * np.log(probabilities))
loss

0.6953472154376993
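
With a one-hot target, only the predicted probability of the true class contributes to the sum, so the loss above is simply $-\log(0.4989\ldots)$. For more than one sample, a common choice is to average the per-sample losses. The sketch below assumes a hypothetical helper `categorical_cross_entropy` and made-up predictions for two samples:

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-15):
    """Mean cross-entropy for one-hot y_true and row-wise probabilities y_pred."""
    # Clip to avoid log(0), as in the cell above.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    # Sum over classes gives one loss value per sample.
    per_sample = -np.sum(y_true * np.log(y_pred), axis=1)
    return per_sample.mean()

# Hypothetical predictions for two samples over three classes.
y_pred = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
])
# One-hot targets: class 0 for the first sample, class 1 for the second.
y_true = np.array([
    [1, 0, 0],
    [0, 1, 0],
])
batch_loss = categorical_cross_entropy(y_true, y_pred)
print(batch_loss)
```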

## Adam Optimizer

### Compute the biased first moment estimate $m_t$:

$$
m_t = \beta_1 \cdot m_{t-1} + (1 - \beta_1) \cdot g_t
$$

- where $g_t$ is the gradient at time step $t$.

### Compute the biased second raw moment estimate $v_t$:

$$
v_t = \beta_2 \cdot v_{t-1} + (1 - \beta_2) \cdot g_t^2
$$
### Compute bias-corrected first moment estimate $\hat{m}_t$:

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}$$

### Compute bias-corrected second raw moment estimate $\hat{v}_t$:

$$\hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$

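### Update the parameters $\theta$:

$$\theta_t = \theta_{t-1} - \frac{\alpha \cdot \hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$

- where $\alpha$ is the learning rate and $\epsilon$ is a small constant that prevents division by zero.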
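
A minimal sketch of one Adam step in NumPy, following the standard formulation: exponential moving averages of the gradient and the squared gradient, bias correction of both, then a step of size $\alpha \cdot \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon)$. The function name `adam_step`, the default hyperparameters, and the toy objective $f(\theta) = \theta^2$ are assumptions for illustration, not part of the notebook.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count."""
    m = beta1 * m + (1 - beta1) * grad        # biased first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # biased second raw moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)              # bias-corrected second moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem (an assumption for illustration): minimise f(theta) = theta^2,
# whose gradient is 2 * theta.
theta = np.array([1.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 501):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, alpha=0.05)
print(theta)  # should end up close to the minimum at 0
```

On the very first step the bias correction makes $\hat{m}_1 = g_1$ and $\hat{v}_1 = g_1^2$, so the parameter moves by almost exactly $\alpha$ in the direction opposite to the gradient, regardless of the gradient's magnitude.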