# Activation Functions

<p>An activation function is a function that maps a node's inputs to its corresponding output.
    
<p>For each node in a layer, we take the weighted sum of its incoming connections and pass that weighted sum to an activation function.</p>

<p style="text-align:center; border: 2px solid red;">node output = activation(weighted sum of inputs)</p>
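As a minimal sketch of the formula above, the NumPy snippet below computes the output of a single node. The input values, the weights, and the choice of sigmoid as the activation are illustrative assumptions, not values taken from this notebook.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def node_output(inputs, weights, activation):
    """Return activation(weighted sum of inputs) for a single node."""
    weighted_sum = np.dot(inputs, weights)
    return activation(weighted_sum)

# Hypothetical values for three incoming connections.
inputs = np.array([0.5, -1.0, 2.0])
weights = np.array([0.4, 0.3, -0.2])

output = node_output(inputs, weights, sigmoid)
print(output)  # a value between 0 and 1
```

Any other activation function could be passed in place of `sigmoid` without changing `node_output`.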

The activation function transforms the sum into a number that is often between some lower limit and some upper limit. This transformation is often non-linear. Some common activation functions are:


</p>

<li>Sigmoid</li>
<li>tanh</li>
<li>ReLU</li>

<h4>ReLU</h4>
<p>ReLU, which is short for rectified linear unit, transforms the input to the maximum of either zero or the input itself.

<p style="text-align:center; border: 2px solid red;">ReLU(x) = max(0, x)</p>

<img src="relu-graph.png" style="height:300px;width:600px;align:center" title="ReLU Graph"/>

If the input is less than or equal to zero, ReLU outputs zero. If the input is greater than zero, ReLU outputs the input unchanged.<br>
The idea is that the more positive the input to a neuron, the more activated the neuron is.
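The behaviour described above can be sketched in a few lines of NumPy. The function name `relu` and the sample values are chosen here for illustration only.

```python
import numpy as np

def relu(x):
    """Return max(0, x) element-wise."""
    return np.maximum(0, x)

# Hypothetical inputs covering negative, zero, and positive values.
sample = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

# Negative inputs and zero map to 0; 0.5 and 2.0 pass through unchanged.
print(relu(sample))
```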

An important feature of linear functions is that the composition of two linear functions is also a linear function. This means that, even in very deep neural networks, if we only had linear transformations of our data values during a forward pass, the learned mapping in our network from input to output would also be linear.

Typically, the types of mappings that we are aiming to learn with our deep neural networks are more complex than simple linear mappings.

This is where activation functions come in. Most activation functions are non-linear, and they are chosen in this way on purpose. Having non-linear activation functions allows our neural networks to compute arbitrarily complex functions.
</p>
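To make the point about composing linear functions concrete, here is a small sketch with two hand-picked weight matrices (hypothetical values, with no bias terms). Without an activation, the two layers collapse into a single linear layer; placing ReLU between them breaks that equivalence.

```python
import numpy as np

# Hypothetical weights for two layers, chosen by hand for illustration.
W1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])   # first layer: 2 inputs -> 2 hidden values
W2 = np.array([[1.0, 1.0]])    # second layer: 2 hidden values -> 1 output
x = np.array([2.0, 1.0])

# Two linear layers applied in sequence...
two_linear_layers = W2 @ (W1 @ x)
# ...equal one linear layer whose weights are the product W2 @ W1.
single_linear_layer = (W2 @ W1) @ x
print(two_linear_layers, single_linear_layer)

# Inserting ReLU between the layers gives a different result here,
# so this network is no longer equivalent to the single matrix W2 @ W1.
with_relu = W2 @ np.maximum(0, W1 @ x)
print(with_relu)
```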




# Loss Functions

<p>
The loss function is what SGD is attempting to minimize by iteratively updating the weights in the network.

During training, the loss is calculated from the network’s output predictions and the true labels for the corresponding inputs. It is typically computed on every batch and reported at the end of each epoch.
</p>

<p>Some of the loss functions are: 
<li>mean_squared_error</li>
<li>mean_absolute_error</li>
<li>categorical_hinge</li>
<li>logcosh</li>
<li>categorical_crossentropy</li>
<li>sparse_categorical_crossentropy</li>
<li>binary_crossentropy</li>
    
</p>

<h4>categorical_crossentropy</h4>
<p>
Mathematically,<br>
<img src="crossEntropy.png" style="align:center" title="Categorical cross-entropy"/><br>
where p are the predictions, t are the targets, i denotes the data point, and j denotes the class.
</p>
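The sketch below computes categorical cross-entropy with NumPy, using the same symbols (p, t, i, j). It assumes the common form in which the loss is the negative sum over classes of t times log p, averaged over data points; whether the formula in the image averages or sums over data points is an assumption here. The function name, the `eps` clipping constant, and the sample arrays are illustrative.

```python
import numpy as np

def categorical_crossentropy(t, p, eps=1e-12):
    """Mean over data points i of -sum_j t[i, j] * log(p[i, j])."""
    p = np.clip(p, eps, 1.0)  # avoid taking log(0)
    return -np.mean(np.sum(t * np.log(p), axis=1))

# Hypothetical one-hot targets and predicted probabilities:
# 2 data points (rows, index i) and 3 classes (columns, index j).
t = np.array([[1, 0, 0],
              [0, 1, 0]])
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1]])

loss = categorical_crossentropy(t, p)
print(loss)  # smaller when p puts more probability on the true class
```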


# Gradient Descent Optimizers

<p>
Gradient descent is probably the most popular and widely used of all optimizers.

It is a simple and effective method for finding the optimal weight values for a neural network. The objective of every optimizer is to reach the global minimum, where the cost function attains its lowest possible value. If you visualize the cost function in three dimensions, it looks something like the figure shown below.
    
<img src="gd.jpg" style="align:center" title="Gradient Descent"/><br>
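As a rough illustration of the idea, the sketch below runs plain gradient descent on a one-dimensional cost function. The cost function, the starting point, the learning rate, and the number of steps are all hypothetical choices made for this example.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

# Hypothetical cost: cost(w) = (w - 3)^2, whose minimum is at w = 3.
def cost_gradient(w):
    return 2 * (w - 3)

w_min = gradient_descent(cost_gradient, start=0.0)
print(w_min)  # close to 3
```

In a real network, `w` would be the full set of weights and the gradient would come from backpropagation rather than a hand-written formula.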
    
Some of the Gradient Descent Optimizers are:
<li>Momentum Optimization</li>
<li>RMSProp</li>
<li>Adam</li>
    
<h4>Adam</h4>
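<p>Adam combines Momentum Optimization and RMSProp: it tracks a running average of past gradients, as momentum does, and a running average of past squared gradients, as RMSProp does.</p>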
</p>
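A minimal sketch of a single Adam update, written with NumPy, shows how the momentum-style and RMSProp-style terms fit together. The function name `adam_step`, the toy cost function, and the learning rate of 0.05 are assumptions for this example; the defaults for `beta1`, `beta2`, and `eps` are the commonly quoted ones.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new weights and moment estimates."""
    m = beta1 * m + (1 - beta1) * grad        # momentum-style average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # RMSProp-style average of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Hypothetical cost: sum((w - 3)^2), minimized at w = 3.
w = np.array([0.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 2001):
    grad = 2 * (w - 3)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)
print(w)  # close to 3
```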

<h3>Sources</h3>
<p>
   
<li>Siraj Raval (YouTube)</li>
<a href="https://www.youtube.com/watch?v=FTr3n7uBIuE&t=1s" style="margin-left:30px;">Convolution Neural Network</a><br>
<a href="https://www.youtube.com/watch?v=-7scQpJT7uo&t=1s" style="margin-left:30px;">Activation Functions</a> <br>
<a href="https://www.youtube.com/watch?v=IVVVjBSk9N0" style="margin-left:30px;">Loss Functions</a>
    
<li><a href="https://www.youtube.com/watch?v=JXQT_vxqwIs&t=1s">Deeplearning.ai (YouTube)</a></li>
    
<li><a href="https://www.youtube.com/watch?v=umGJ30-15_A&t=1s">edureka!</a></li>
    
<li><a href="https://www.youtube.com/playlist?list=PLZbbT5o_s2xq7LwI2y8_QtvuXZedL6tQU">deeplizard</a></li>


    
</p>