# Neural Networks 0-1


## Let's start by computing the output of a single neuron. The formula is:

$$ \text{neuron output} = \text{input} \cdot \text{weight} + \text{bias} $$


In [1]:
w = 0.56473        # weight
X = 0.25675        # input
z = 0.56475        # bias
nop = w * X + z    # neuron output
print(nop)

0.7097444275


## Now, let's compute a neuron with multiple inputs, the building block of a layer of neurons.

$$ \text{output} = \sum_{i=1}^{n} (\text{input}_i \cdot \text{weight}_i) + \text{bias} $$

Here, input and weight are arrays of floats of equal length $n$, and the sum is their dot product.

In [2]:
import numpy as np

In [3]:
X = [0.123, 0.586, 0.256, 0.846, 0.564]   # inputs
w = [0.25, 0.36, 0.25, 0.56, 0.96]        # one weight per input
b = 0.04                                  # bias
op = np.dot(X, w) + b                     # weighted sum of the inputs plus bias
print(op)

1.36091
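The cell above computes one neuron. A full layer can be sketched by stacking one weight vector per neuron into a matrix. In this minimal sketch the names `W` and `layer_output`, the choice of three neurons, and the weights and biases of neurons 1 and 2 are made-up values for illustration; only neuron 0 reuses the numbers from the cell above.

```python
import numpy as np

# Same five inputs as in the cell above.
X = np.array([0.123, 0.586, 0.256, 0.846, 0.564])

# Hypothetical layer of 3 neurons: one row of weights per neuron.
W = np.array([
    [0.25, 0.36, 0.25, 0.56, 0.96],   # neuron 0: weights from the cell above
    [0.10, 0.20, 0.30, 0.40, 0.50],   # neuron 1: made-up values
    [-0.5, 0.0, 0.5, 0.0, -0.5],      # neuron 2: made-up values
])
b = np.array([0.04, 0.0, 0.1])        # one bias per neuron (made-up except the first)

# Each row of W is dotted with X, then the matching bias is added.
layer_output = W @ X + b
print(layer_output)
```

The first entry of `layer_output` matches the single-neuron result, since that row uses the same weights and bias.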


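## ReLU Activation Function
$$ \text{ReLU}(x) = \max(0, x) $$

ReLU is most commonly used as the activation function for hidden layers. It passes positive values through unchanged and replaces negative values with 0.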

In [4]:
ReLU=np.maximum(0,op)
print(ReLU)

1.36091
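Because `op` is positive, ReLU leaves it unchanged in the cell above. A small sketch with a made-up array (the name `z` and its values are illustrative assumptions) shows the effect on negative inputs:

```python
import numpy as np

# Made-up pre-activation values, including negatives and zero.
z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

# np.maximum compares element-wise against 0.
relu_out = np.maximum(0, z)
print(relu_out)  # negatives become 0, non-negative values pass through
```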


## SoftMax Activation

$$ \text{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{j} e^{z_j}} $$

In [5]:
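# `logits` instead of `input`, which would shadow Python's built-in input()
logits = np.random.randn(1, 4)
# Subtracting the row maximum avoids overflow in exp() without changing the result
exp_values = np.exp(logits - np.max(logits, axis=1, keepdims=True))
probabilities = exp_values / np.sum(exp_values, axis=1, keepdims=True)
probabilities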


array([[0.34420316, 0.57468881, 0.0578728 , 0.02323523]])
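The same computation can be wrapped in a function and checked on fixed values. This is a minimal sketch: the function name `softmax` and the example logits are assumptions chosen for illustration. The second row uses very large values to show why the row maximum is subtracted before exponentiating.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax for a 2-D array of shape (batch, classes)."""
    # Shift each row so its largest value is 0; exp() then cannot overflow.
    shifted = logits - np.max(logits, axis=1, keepdims=True)
    exp_values = np.exp(shifted)
    return exp_values / np.sum(exp_values, axis=1, keepdims=True)

# Made-up logits; the second row is the first row shifted by a large constant.
logits = np.array([[1.0, 2.0, 3.0],
                   [1000.0, 1001.0, 1002.0]])
probs = softmax(logits)
print(probs)
print(probs.sum(axis=1))  # each row sums to 1
```

Both rows give the same probabilities, because softmax depends only on the differences between the values in a row.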

## Loss Calculation (Categorical Cross-Entropy)

$$ \text{Loss} = -\sum_{i} y_i \log(p_i) $$

In [6]:
true_value = np.array([0, 1, 0, 0])   # one-hot label: class 1 is correct
# Clip so that log() never receives exactly 0 or 1
probabilities = np.clip(probabilities, 1e-15, 1 - 1e-15)
loss = -np.sum(true_value * np.log(probabilities))
loss

0.5539265833774983
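With a one-hot label, the sum reduces to the negative log of the probability assigned to the correct class. For several samples, one common convention (an assumption here, not something stated above) is to average the per-sample losses. The function name `categorical_cross_entropy` and the second sample are made up for illustration; the first sample reuses the probabilities printed earlier.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-15):
    """Mean cross-entropy over a batch of one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1 - eps)                   # avoid log(0)
    per_sample = -np.sum(y_true * np.log(y_pred), axis=1)    # one loss per row
    return per_sample.mean()

y_true = np.array([[0, 1, 0, 0],
                   [1, 0, 0, 0]])
y_pred = np.array([[0.34420316, 0.57468881, 0.0578728, 0.02323523],  # from the softmax output above
                   [0.7, 0.1, 0.1, 0.1]])                            # made-up second sample
batch_loss = categorical_cross_entropy(y_true, y_pred)
print(batch_loss)
```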

## Adam Optimizer

### Compute the biased first moment estimate $m_t$
$$
m_t = \beta_1 \cdot m_{t-1} + (1 - \beta_1) \cdot g_t
$$

- where $g_t$ is the gradient at time step $t$.

### Compute the biased second raw moment estimate $v_t$:

$$
v_t = \beta_2 \cdot v_{t-1} + (1 - \beta_2) \cdot g_t^2
$$
### Compute bias-corrected first moment estimate $\hat{m}_t$:

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}$$
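
A short worked step shows why the correction matters. Assuming the usual initialization $m_0 = 0$ (not stated above), the first step gives:

$$
m_1 = (1 - \beta_1) \cdot g_1, \qquad \hat{m}_1 = \frac{m_1}{1 - \beta_1^1} = g_1
$$

Without the correction, $m_1$ would be only $(1 - \beta_1)$ times the gradient, for example one tenth of it when $\beta_1 = 0.9$. As $t$ grows, $\beta_1^t$ approaches 0 and the correction fades out.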

### Compute bias-corrected second raw moment estimate $\hat{v}_t$:
$$\hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$

### Update the parameters $\theta$:
$$\theta_t = \theta_{t-1} - \frac{\alpha \cdot \hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$
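
The steps can be sketched in NumPy as a single update function. This is a minimal sketch of the standard Adam update, in which the bias-corrected first moment is divided by the square root of the bias-corrected second moment. The function name `adam_step`, the default hyperparameters (the commonly cited `alpha=0.001`, `beta1=0.9`, `beta2=0.999`, `eps=1e-8`), and the toy objective $f(\theta) = \theta^2$ are assumptions for illustration.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update. t is the 1-based step count; m and v start at zero."""
    m = beta1 * m + (1 - beta1) * grad         # biased first moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2    # biased second raw moment estimate
    m_hat = m / (1 - beta1 ** t)               # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)               # bias-corrected second moment
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem (assumption): minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta = np.array([1.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 501):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, alpha=0.05)
print(theta)  # should end up close to the minimum at 0
```

On the first step the bias-corrected ratio is close to $g_1 / |g_1|$, so the parameter moves by roughly $\alpha$ regardless of the gradient's magnitude.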