Rectified Linear Unit (ReLU) activation function

The Rectified Linear Unit (ReLU) activation function is one of the most widely used non-linear activation functions in artificial neural networks. It is known for its simplicity and computational efficiency, and it has proven effective in deep learning models.

Function: f(x) = max(0, x)

  • x is the input to the neuron.

  • f(x) is the output of the neuron after applying the ReLU activation function.

The ReLU function is piecewise linear: it sets every negative value in the input to 0 and leaves positive values unchanged. Mathematically, it is the identity function for x >= 0 and returns 0 for x < 0.
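The definition above translates directly into a few lines of code. Below is a minimal sketch using NumPy (the choice of library is an assumption; this page does not specify a framework):

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: keep positive values, replace negative values with 0."""
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
print(relu(x))  # [0.  0.  0.  1.5 3. ]
```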

Key characteristics:

Non-Linearity: Although ReLU is a linear function for positive inputs, it introduces non-linearity into the neural network. This non-linearity allows neural networks to learn and approximate complex, nonlinear relationships in the data.
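To see why the non-linearity matters, note that stacking purely linear layers collapses into a single linear map, while inserting ReLU between them does not. The snippet below is an illustrative sketch; the weights W1, W2 and the input x are hypothetical random values, not anything defined on this page:

```python
import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
x = rng.normal(size=3)

# Two stacked linear layers are equivalent to one linear layer (W2 @ W1).
linear_only = W2 @ (W1 @ x)
collapsed = (W2 @ W1) @ x
print(np.allclose(linear_only, collapsed))  # True

# Inserting ReLU between the layers breaks that collapse, which is what
# lets the network approximate non-linear relationships.
with_relu = W2 @ np.maximum(0, W1 @ x)
print(np.allclose(with_relu, collapsed))  # False unless W1 @ x is all non-negative
```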

Sparsity: ReLU introduces sparsity in the network, as it sets negative values to 0. Sparse representations can be beneficial in reducing the computational load during training and inference.
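The sparsity effect is easy to observe by counting exact zeros in the output. This sketch feeds hypothetical zero-mean random pre-activations through ReLU, so roughly half of them are zeroed out:

```python
import numpy as np

rng = np.random.default_rng(0)
pre_activations = rng.normal(size=10_000)   # hypothetical zero-mean inputs
activations = np.maximum(0, pre_activations)

# Fraction of outputs that are exactly zero (inactive) for this input.
sparsity = np.mean(activations == 0)
print(f"Fraction of zero activations: {sparsity:.2f}")  # ~0.50
```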

Dying ReLU Problem: Although ReLU helps mitigate the vanishing gradient problem, one issue is the potential for "dead" neurons during training. If a large gradient flows through a ReLU neuron and pushes its weights into a region where the pre-activation is always negative, the neuron outputs 0 for every subsequent data point and stops learning, since its gradient is also 0. To address this problem, several variants of ReLU have been proposed, such as Leaky ReLU, Parametric ReLU (PReLU), and the Exponential Linear Unit (ELU).
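For completeness, here is a minimal sketch of the variants mentioned above, again assuming NumPy; the slope and alpha values are common defaults, not something prescribed by this page:

```python
import numpy as np

def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: a small fixed slope for x < 0, so the gradient is never exactly 0."""
    return np.where(x > 0, x, negative_slope * x)

def parametric_relu(x, alpha):
    """Parametric ReLU (PReLU): like Leaky ReLU, but the slope alpha is a learned parameter."""
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    """ELU: smooth exponential curve for x < 0 that saturates at -alpha."""
    return np.where(x > 0, x, alpha * (np.exp(x) - 1))

x = np.array([-3.0, -1.0, 0.0, 2.0])
print(leaky_relu(x))  # [-0.03 -0.01  0.    2.  ]
print(elu(x))         # [-0.95 -0.63  0.    2.  ] (approximately)
```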