# Activation Functions in Neural Networks

Activation functions transform the weighted input of a neuron into its output. They introduce **non-linearity** into the network; without them, any stack of layers reduces to a single linear transformation, so the network could not learn complex patterns.

In this notebook, we will cover:
- Sigmoid
- Hyperbolic Tangent (Tanh)
- ReLU (Rectified Linear Unit)
- Leaky ReLU
- Softmax
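
To make the non-linearity point concrete, here is a minimal sketch with hand-picked weights (the matrices `W1`, `W2` and the input `x_in` are illustrative assumptions, not part of the original notebook). It shows two linear layers collapsing into one, and a ReLU between them breaking that equivalence.

```python
import numpy as np

# Hypothetical two-layer network, weights chosen by hand for illustration.
W1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])   # first layer: 2 inputs -> 2 hidden units
W2 = np.array([[1.0, 1.0]])    # second layer: 2 hidden units -> 1 output
x_in = np.array([1.0, 2.0])

# Two stacked linear layers without an activation ...
two_layers = W2 @ (W1 @ x_in)
# ... give the same result as one linear layer with weights W2 @ W1.
one_layer = (W2 @ W1) @ x_in
print(np.allclose(two_layers, one_layer))   # True

# A non-linearity (ReLU here) between the layers breaks the equivalence.
with_relu = W2 @ np.maximum(0, W1 @ x_in)
print(np.allclose(with_relu, one_layer))    # False
```

The hidden pre-activations are `[-1, 1]`; ReLU zeroes the negative one, so the network output changes from 0 to 1, which no single weight matrix `W2 @ W1` reproduces.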


## 1. Import Libraries

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="whitegrid")

## 2. Sigmoid Function

$$
\sigma(x) = \frac{1}{1 + e^{-x}}
$$

- Output range: (0, 1)
- Output can be read as a probability, which suits binary classification
- Problem: vanishing gradients, because the curve saturates (flattens) for large |x|

In [None]:
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.linspace(-10, 10, 200)
y = sigmoid(x)

plt.plot(x, y)
plt.title("Sigmoid Function")
plt.xlabel("x")
plt.ylabel("σ(x)")
plt.show()
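
The vanishing-gradient issue can be checked numerically. The sketch below uses the standard derivative σ'(x) = σ(x)(1 − σ(x)); the helper name `sigmoid_grad` is an assumption introduced here for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: sigma'(x) = sigma(x) * (1 - sigma(x))
    s = sigmoid(x)
    return s * (1 - s)

# The gradient peaks at x = 0 and shrinks quickly as |x| grows.
for value in [0.0, 2.0, 5.0, 10.0]:
    print(f"x = {value:5.1f}  ->  gradient = {sigmoid_grad(value):.6f}")
```

The gradient never exceeds 0.25 (its value at x = 0), so multiplying many such factors during backpropagation through deep stacks of sigmoid layers tends to shrink the signal.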

## 3. Tanh Function

$$
\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}
$$

- Output range: (-1, 1)
- Zero-centered, which tends to make optimization easier than with sigmoid
- Still saturates for large |x|, so it also suffers from vanishing gradients

In [None]:
y = np.tanh(x)

plt.plot(x, y, color="orange")
plt.title("Tanh Function")
plt.xlabel("x")
plt.ylabel("tanh(x)")
plt.show()
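
Tanh is a rescaled and shifted sigmoid, which is why it shares the saturation behaviour. A small sketch, assuming the grid `xs` and the variable names below purely for illustration, checks the identity tanh(x) = 2σ(2x) − 1 and evaluates the derivative 1 − tanh²(x).

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

xs = np.linspace(-5, 5, 101)

# Identity: tanh(x) = 2 * sigmoid(2x) - 1
identity_holds = np.allclose(np.tanh(xs), 2 * sigmoid(2 * xs) - 1)
print(identity_holds)   # True

# Derivative: 1 - tanh(x)^2, largest at x = 0 and decaying towards 0 for large |x|
tanh_grad = 1 - np.tanh(xs) ** 2
print(tanh_grad.max(), tanh_grad.min())
```

The maximum slope is 1 (versus 0.25 for sigmoid), so gradients shrink less near the origin, but they still vanish in the saturated tails.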

## 4. ReLU (Rectified Linear Unit)

$$
\mathrm{ReLU}(x) = \max(0, x)
$$

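- Output range: [0, ∞)
- Very popular in deep learning
- Gradient is 1 for x > 0, so it does not saturate on the positive side, which largely avoids vanishing gradients
- Problem: dying ReLU (a neuron whose input stays negative outputs 0 and receives zero gradient, so it stops learning)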

In [None]:
def relu(x):
    return np.maximum(0, x)

y = relu(x)

plt.plot(x, y, color="green")
plt.title("ReLU Function")
plt.xlabel("x")
plt.ylabel("ReLU(x)")
plt.show()
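
The dying ReLU problem can be illustrated with a single neuron. In the sketch below, the `weight`, `bias`, and `inputs` values are hypothetical, and `relu_grad` uses the common convention of a zero gradient at x = 0 (an assumption, since ReLU is not differentiable there).

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def relu_grad(x):
    # Convention assumed here: gradient is 0 at x = 0
    return (x > 0).astype(float)

inputs = np.linspace(-3, 3, 7)   # sample inputs to a single neuron

# Hypothetical neuron whose bias has drifted strongly negative.
weight, bias = 1.0, -10.0
pre_activation = weight * inputs + bias

print(relu(pre_activation))       # all zeros: the neuron never fires
print(relu_grad(pre_activation))  # all zeros: no gradient flows back through it
```

Because the gradient is zero for every input in this range, gradient descent has no signal to move the weight or bias back, and the neuron stays inactive.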

## 5. Leaky ReLU

$$
\mathrm{LeakyReLU}(x) = \begin{cases} x & \text{if } x > 0 \\ 0.01x & \text{if } x \leq 0 \end{cases}
$$

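- Allows a small, non-zero gradient (slope 0.01, the `alpha` parameter in the code below) when x < 0
- Mitigates the dying ReLU problem, since negative inputs still pass some gradient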

In [None]:
def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

y = leaky_relu(x)

plt.plot(x, y, color="red")
plt.title("Leaky ReLU Function")
plt.xlabel("x")
plt.ylabel("LeakyReLU(x)")
plt.show()
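
For comparison with plain ReLU, the sketch below evaluates the Leaky ReLU gradient on a few negative and positive pre-activations. The helper `leaky_relu_grad` and the sample values are assumptions introduced for illustration.

```python
import numpy as np

def leaky_relu_grad(x, alpha=0.01):
    # Slope is 1 for x > 0 and alpha otherwise
    return np.where(x > 0, 1.0, alpha)

pre_activation = np.array([-12.0, -5.0, -0.5, 3.0])
print(leaky_relu_grad(pre_activation))   # [0.01 0.01 0.01 1.  ]
```

Even for strongly negative inputs the gradient stays at `alpha` rather than 0, so the neuron can still be updated, although slowly.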

## 6. Softmax

$$
\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_j e^{z_j}}
$$

- Converts a vector of real-valued scores (logits) into a probability distribution: non-negative values that sum to 1
- Commonly used as the output layer in **multi-class classification**

In [None]:
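def softmax(z):
    # Subtract the max for numerical stability; this does not change the result
    exp_z = np.exp(z - np.max(z, axis=-1, keepdims=True))
    # Normalize over the last axis so batched (2-D) inputs also work row by row
    return exp_z / exp_z.sum(axis=-1, keepdims=True)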

z = np.array([2.0, 1.0, 0.1])
probs = softmax(z)
print("Softmax Probabilities:", probs)

plt.bar(["Class 1", "Class 2", "Class 3"], probs, color=["blue", "orange", "green"])
plt.title("Softmax Output")
plt.ylabel("Probability")
plt.show()
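
The `softmax` function above subtracts the maximum before exponentiating. The sketch below suggests why: with large logits a naive implementation overflows, while the shifted version does not. The names `softmax_naive` and `softmax_stable`, and the `axis` parameter, are assumptions introduced here and not part of the original notebook.

```python
import numpy as np

def softmax_naive(z):
    # Direct translation of the formula; overflows for large logits
    return np.exp(z) / np.exp(z).sum()

def softmax_stable(z, axis=-1):
    # Shifting by the max leaves the result unchanged but keeps exp() bounded
    shifted = z - np.max(z, axis=axis, keepdims=True)
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum(axis=axis, keepdims=True)

large = np.array([1000.0, 1001.0, 1002.0])

# Silence the overflow warnings that the naive version triggers.
with np.errstate(over="ignore", invalid="ignore"):
    naive = softmax_naive(large)   # exp(1000) overflows to inf, and inf / inf is nan

stable = softmax_stable(large)
print(naive)    # [nan nan nan]
print(stable)

# Batched input: each row is normalized independently.
batch = np.array([[2.0, 1.0, 0.1],
                  [0.0, 0.0, 0.0]])
print(softmax_stable(batch).sum(axis=-1))
```

Softmax only depends on differences between logits, so `[1000, 1001, 1002]` yields the same probabilities as `[0, 1, 2]`, and a row of equal logits yields a uniform distribution.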

## ✅ Summary
- **Sigmoid** → outputs (0,1), good for binary classification
- **Tanh** → outputs (-1,1), zero-centered, but still saturates for large |x|
- **ReLU** → default in deep learning, fast & simple
- **Leaky ReLU** → mitigates dying ReLU by keeping a small slope for x < 0
- **Softmax** → multi-class probability output