In the simplest terms, an activation function is the decision-maker for a neuron in a neural network: it determines what the neuron outputs for a given input. Let's break it down.

Imagine a neural network as a series of interconnected nodes (neurons) organized in layers. Each connection between nodes has a weight. A neuron computes a weighted sum of the inputs it receives, applies an activation function to that sum, and passes the result on as its output.
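
As a rough illustration of that description, here is a minimal sketch of a single neuron. The name `neuron_output`, the example weights, and the `bias` term are assumptions made for this sketch (the text above mentions only weights), and `math.tanh` stands in for whichever activation function is chosen.

```python
import math

def neuron_output(inputs, weights, bias, activation):
    # Weighted sum of the inputs; the bias term is an assumption of this sketch
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # The activation function turns the weighted sum into the neuron's output
    return activation(z)

# Hypothetical inputs and weights: 0.5 * 1.0 + 0.25 * 2.0 = 1.0
out = neuron_output([1.0, 2.0], [0.5, 0.25], 0.0, math.tanh)
print(out)  # tanh(1.0), roughly 0.76
```

Passing the activation in as an argument keeps the weighted sum separate from the choice of function, so any of the functions defined in the cells below could be used in its place.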

Now, without an activation function, the network would be a linear model no matter how many layers it has, because a composition of linear layers is itself linear. It would be limited in its capacity to learn complex patterns and relationships in data. The activation function introduces non-linearity, allowing the network to learn more intricate patterns.
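
The collapse to a linear model can be checked on a toy case. The sketch below assumes one-dimensional "layers" of the form w * x + b with made-up weights. Two of them stacked directly give the same values as a single layer with weight w2 * w1 and bias w2 * b1 + b2, while putting a ReLU between them breaks that equivalence.

```python
def layer(x, w, b):
    # A one-dimensional linear layer: weight times input plus bias
    return w * x + b

# Hypothetical weights and biases, chosen only for illustration
w1, b1 = 2.0, 1.0
w2, b2 = -3.0, 0.5

def two_linear_layers(x):
    # Second layer applied directly to the first, with no activation in between
    return layer(layer(x, w1, b1), w2, b2)

def single_equivalent_layer(x):
    # The same mapping written as one linear layer
    return layer(x, w2 * w1, w2 * b1 + b2)

def two_layers_with_relu(x):
    # ReLU (max(0, .)) between the layers makes the overall mapping non-linear
    return layer(max(0.0, layer(x, w1, b1)), w2, b2)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
collapsed = all(two_linear_layers(x) == single_equivalent_layer(x) for x in xs)
differs = any(two_layers_with_relu(x) != single_equivalent_layer(x) for x in xs)
print(collapsed, differs)
```

With real networks the weights are matrices rather than scalars, but the same argument applies: multiplying two weight matrices gives another weight matrix.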

Here's an analogy: Think of the activation function as the decision-making process of a neuron. If the input is above a certain threshold, it "fires" (produces an output), and if it's below, it doesn't. Functions such as sigmoid and tanh soften this hard threshold into a smooth curve, but the idea is the same. This non-linear behavior enables the neural network to capture complex patterns, making it more powerful and adaptable.
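
The analogy corresponds most directly to a step function. The sketch below is only an illustration of the "fires or doesn't" idea; the name `step` and the default threshold of 0.0 are assumptions of this sketch, not something defined elsewhere in this notebook.

```python
def step(x, threshold=0.0):
    # Fires (1) only when the input is strictly above the threshold, otherwise stays silent (0)
    return 1 if x > threshold else 0

fired = [step(x) for x in (-0.3, 0.0, 0.3)]
print(fired)  # [0, 0, 1]
```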

In summary, activation functions give neural networks the ability to learn and represent complex relationships in data, which is crucial for their effectiveness in tasks like image recognition, natural language processing, and many other applications.

In [1]:
import math

In [2]:
def sigmoid(x):
    # Logistic sigmoid: squashes any real x into the range 0 to 1
    return 1 / (1 + math.exp(-x))

In [3]:
sigmoid(2)

0.8807970779778823

In [4]:
sigmoid(-2)

0.11920292202211755

In [5]:
sigmoid(-56)

4.780892883885469e-25
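
One caveat with the formula above: `math.exp(-x)` raises `OverflowError` once `-x` is larger than roughly 709, so a call such as `sigmoid(-1000)` fails instead of returning a value near 0. A common workaround is to branch on the sign of `x`. The sketch below is one possible version, and the name `stable_sigmoid` is an assumption of this sketch.

```python
import math

def stable_sigmoid(x):
    if x >= 0:
        # -x is non-positive here, so math.exp(-x) is at most 1 and cannot overflow
        return 1 / (1 + math.exp(-x))
    # For negative x, use the equivalent form exp(x) / (1 + exp(x));
    # math.exp(x) is below 1 here and underflows to 0.0 for very negative x
    z = math.exp(x)
    return z / (1 + z)

print(stable_sigmoid(-1000))  # 0.0
print(stable_sigmoid(1000))   # 1.0
```

For moderate inputs such as 2 or -2 this should agree with `sigmoid` up to floating-point rounding.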

In [6]:
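def tanh(x):
    # Hyperbolic tangent: squashes any real x into the range -1 to 1.
    # Note: math.exp overflows when |x| is above roughly 709.
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))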

In [7]:
tanh(-56)

-1.0

In [8]:
tanh(50)

1.0

In [9]:
tanh(1)

0.7615941559557649
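
Two side notes on tanh, sketched under the assumption that only the standard library is available. First, it is a rescaled sigmoid: tanh(x) = 2 * sigmoid(2x) - 1. Second, the built-in `math.tanh` handles large inputs that would overflow the hand-written formula. The helper name `tanh_via_sigmoid` is an assumption of this sketch.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def tanh_via_sigmoid(x):
    # Identity: tanh(x) = 2 * sigmoid(2x) - 1
    return 2 * sigmoid(2 * x) - 1

xs = [-3.0, -0.5, 0.0, 1.0, 2.5]
# Largest disagreement with the built-in; expected to be rounding error only
max_gap = max(abs(tanh_via_sigmoid(x) - math.tanh(x)) for x in xs)
print(max_gap)

# The built-in saturates cleanly where exp(1000) would overflow
print(math.tanh(1000))  # 1.0
```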

In [10]:
def relu(x):
    # ReLU: passes positive inputs through unchanged and clips negatives to 0
    return max(0, x)

In [11]:
relu(5)

5

In [12]:
relu(-1)

0

In [13]:
def leaky_relu(x):
    # Leaky ReLU with slope 0.1 for negative inputs (positive inputs pass through unchanged)
    return max(0.1 * x, x)

In [14]:
leaky_relu(-100)

-10.0

In [15]:
leaky_relu(8)

8
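
To see the four functions side by side, the sketch below evaluates each one on the same few inputs. It redefines the functions so the cell runs on its own, uses `math.tanh` in place of the hand-written version, and exposes the leaky slope as a hypothetical `alpha` parameter defaulting to the 0.1 used above. The input values are arbitrary.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def leaky_relu(x, alpha=0.1):
    # alpha is the slope for negative inputs; this max() form assumes 0 <= alpha <= 1
    return max(alpha * x, x)

activations = {
    "sigmoid": sigmoid,
    "tanh": math.tanh,
    "relu": relu,
    "leaky_relu": leaky_relu,
}

inputs = [-2.0, 0.0, 2.0]
# One row per activation function, one column per input value
table = {name: [fn(x) for x in inputs] for name, fn in activations.items()}

for name, values in table.items():
    print(f"{name:>10}: " + ", ".join(f"{v:7.4f}" for v in values))
```

Reading across a row shows the characteristic shape of each function: sigmoid stays between 0 and 1, tanh between -1 and 1, ReLU zeroes out negatives, and leaky ReLU keeps a small negative slope.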