<a href="https://colab.research.google.com/github/Natural-Language-Processing-YU/Exercises/blob/main/M8_Activation_Functions.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Activation Functions Overview

Activation functions introduce non-linearity into neural networks. Without them, a stack of layers would collapse into a single linear transformation, and the network could not learn complex relationships between input and output. The exercises below define five common activation functions in Python, visualize each one, and give a real-world example of where it is used.



## 1.0 Sigmoid Activation Function

The sigmoid function maps any real input to a value strictly between 0 and 1, so its output can be read as a probability. This makes it suitable for the output of binary classification models.

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

Range: (0, 1)  
Used in: Binary classification problems

### Definition and Visualization


In [None]:

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Plot the sigmoid function over the interval [-10, 10]
x = np.linspace(-10, 10, 100)
y = sigmoid(x)

plt.plot(x, y)
plt.title('Sigmoid Activation Function')
plt.xlabel('Input')
plt.ylabel('Output')
plt.grid(True)
plt.show()



### Real-World Example
The sigmoid function is commonly used in logistic regression models for binary classification problems, such as determining whether an email is spam or not.
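
To make the spam example concrete, here is a minimal sketch of how a logistic regression model turns a raw score into a probability. The feature names, weights, and bias are hypothetical values chosen for illustration; a real model would learn them from labeled emails.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Hypothetical features for one email:
# [number of links, contains the word "free" (0/1), number of exclamation marks]
features = np.array([3.0, 1.0, 5.0])

# Hypothetical weights and bias (illustrative only, not learned from data)
weights = np.array([0.4, 1.2, 0.3])
bias = -2.5

score = np.dot(weights, features) + bias  # raw score (logit)
p_spam = sigmoid(score)                   # probability in (0, 1)
is_spam = p_spam >= 0.5                   # threshold at 0.5

print(f"score={score:.2f}, P(spam)={p_spam:.3f}, spam={is_spam}")
```

The 0.5 threshold is a common default, not a requirement; it can be raised or lowered depending on how costly false positives are.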



## 2.0 ReLU (Rectified Linear Unit) Activation Function

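ReLU passes positive inputs through unchanged and sets negative inputs to zero. Because its gradient is exactly 1 for positive inputs, it helps alleviate the vanishing gradient problem that affects saturating functions such as sigmoid and tanh.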

$$\text{ReLU}(x) = \max(0, x)$$

Range: [0, +∞)  
Used in: Hidden layers of most neural networks due to its simplicity and efficiency

### Definition and Visualization


In [None]:

def relu(x):
    return np.maximum(0, x)

# Plot the ReLU function over the interval [-10, 10]
x = np.linspace(-10, 10, 100)
y = relu(x)

plt.plot(x, y)
plt.title('ReLU Activation Function')
plt.xlabel('Input')
plt.ylabel('Output')
plt.grid(True)
plt.show()



### Real-World Example
ReLU is extensively used in convolutional neural networks (CNNs) for image recognition tasks, such as classifying objects in images.
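
As a rough numerical check of the vanishing-gradient point made for ReLU in this section, the sketch below compares the local gradients of sigmoid and ReLU. It deliberately ignores weight matrices and only multiplies activation derivatives across a hypothetical depth of 10 layers, so it is an illustration rather than a full backpropagation.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x)), at most 0.25
    s = sigmoid(x)
    return s * (1 - s)

def relu_grad(x):
    # 1 for positive inputs, 0 otherwise
    return np.where(np.asarray(x) > 0, 1.0, 0.0)

x = np.array([-5.0, -1.0, 0.5, 5.0])
sg = sigmoid_grad(x)
rg = relu_grad(x)

depth = 10  # hypothetical number of stacked layers
sig_chain = sigmoid_grad(0.0) ** depth   # best case for sigmoid: 0.25 ** 10
relu_chain = relu_grad(1.0) ** depth     # active ReLU units: 1 ** 10

print("sigmoid gradients:", sg)
print("ReLU gradients:   ", rg)
print("product over", depth, "layers:", sig_chain, "vs", float(relu_chain))
```

Even in its best case the sigmoid derivative shrinks the signal by a factor of four per layer, while an active ReLU unit leaves it unchanged.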



## 3.0 Leaky ReLU Activation Function

Leaky ReLU is a variation of ReLU that allows a small, non-zero gradient when the input is negative. The slope on the negative side is set by a small positive constant $\alpha$ (for example, 0.01).

$$\text{Leaky ReLU}(x) = \begin{cases}
x, & \text{if } x > 0 \\
\alpha x, & \text{otherwise}
\end{cases}$$

Range: (-∞, +∞)  
Used in: Hidden layers, as an alternative to ReLU that mitigates the "dying ReLU" problem

### Definition and Visualization


In [None]:

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

# Plot the Leaky ReLU function over the interval [-10, 10]
x = np.linspace(-10, 10, 100)
y = leaky_relu(x)

plt.plot(x, y)
plt.title('Leaky ReLU Activation Function')
plt.xlabel('Input')
plt.ylabel('Output')
plt.grid(True)
plt.show()



### Real-World Example
Leaky ReLU is commonly used in generative adversarial networks (GANs), particularly in the discriminator, to avoid the dying ReLU problem and improve training stability.
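
The "dying ReLU" problem comes down to gradients: a unit whose input stays negative receives a zero gradient under ReLU and stops updating. The sketch below compares the two derivatives on a few sample inputs. The helper names `relu_grad` and `leaky_relu_grad` are introduced here for illustration, and the gradient at exactly 0 follows the same convention as the `leaky_relu` definition in this section.

```python
import numpy as np

def relu_grad(x):
    # ReLU derivative: 1 for x > 0, otherwise 0
    return np.where(x > 0, 1.0, 0.0)

def leaky_relu_grad(x, alpha=0.01):
    # Leaky ReLU derivative: 1 for x > 0, otherwise alpha
    return np.where(x > 0, 1.0, alpha)

x = np.array([-3.0, -0.5, 2.0])

print("ReLU gradient:      ", relu_grad(x))
print("Leaky ReLU gradient:", leaky_relu_grad(x))
```

For the negative inputs the ReLU gradient is exactly zero, while Leaky ReLU still passes back a small signal proportional to `alpha`.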



## 4.0 Tanh (Hyperbolic Tangent) Activation Function

The tanh function maps input values to the range (-1, 1). Because its output is zero-centered, it is often preferred over sigmoid in hidden layers.

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

Range: (-1, 1)  
Used in: Hidden layers (notably in recurrent networks) and outputs that must lie in (-1, 1)

### Definition and Visualization


In [None]:

def tanh(x):
    return np.tanh(x)

# Plot the tanh function over the interval [-10, 10]
x = np.linspace(-10, 10, 100)
y = tanh(x)

plt.plot(x, y)
plt.title('Tanh Activation Function')
plt.xlabel('Input')
plt.ylabel('Output')
plt.grid(True)
plt.show()



### Real-World Example
The tanh function is often used in recurrent neural networks (RNNs) for natural language processing tasks, such as sentiment analysis.
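
Tanh is closely related to sigmoid: it is a rescaled and shifted version, $\tanh(x) = 2\sigma(2x) - 1$. The short check below verifies this identity numerically and shows that, on a symmetric set of inputs, tanh outputs average to zero while sigmoid outputs average to 0.5. The sample points are arbitrary and chosen only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.linspace(-5, 5, 11)  # symmetric sample points around 0

lhs = np.tanh(x)
rhs = 2 * sigmoid(2 * x) - 1
identity_holds = np.allclose(lhs, rhs)

mean_tanh = np.tanh(x).mean()     # close to 0: zero-centered output
mean_sigmoid = sigmoid(x).mean()  # close to 0.5: always-positive output

print("tanh(x) == 2*sigmoid(2x) - 1:", identity_holds)
print("mean tanh:", mean_tanh, "mean sigmoid:", mean_sigmoid)
```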



## 5.0 Softmax Activation Function

Softmax is used primarily in the output layer for multiclass classification problems. It converts a vector of raw scores (logits) into a probability distribution over the classes.

$$\text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}$$

Range: (0, 1) for each element; the elements of the output vector sum to 1  
Used in: Multiclass classification problems (output layer)

### Definition and Visualization


In [None]:

def softmax(x):
    exp_x = np.exp(x - np.max(x))  # For numerical stability
    return exp_x / exp_x.sum()

# Plot the softmax probabilities for five example scores
x = np.array([1, 2, 3, 4, 5])
y = softmax(x)

plt.bar(range(len(x)), y)
plt.title('Softmax Activation Function')
plt.xlabel('Class')
plt.ylabel('Probability')
plt.show()



### Real-World Example
Softmax is widely used in neural networks for multiclass classification tasks, such as classifying handwritten digits in the MNIST dataset.
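
In practice a classifier scores a whole batch of examples at once, so softmax is applied row by row. The sketch below is one possible batched variant with an `axis` parameter (an addition for illustration, not part of the definition in this section). It also shows why subtracting the maximum matters: the second row of hypothetical logits would overflow `np.exp` without the shift, yet it yields the same probabilities as the first row because softmax is unchanged by adding a constant to every score.

```python
import numpy as np

def softmax(x, axis=-1):
    x = np.asarray(x, dtype=float)
    # Subtract the per-row maximum so the largest exponent is exp(0) = 1
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp_x = np.exp(shifted)
    return exp_x / exp_x.sum(axis=axis, keepdims=True)

# Hypothetical logits: two examples, three classes each
logits = np.array([[1.0, 2.0, 3.0],
                   [1000.0, 1001.0, 1002.0]])

probs = softmax(logits)           # each row sums to 1
predicted = probs.argmax(axis=1)  # index of the most probable class per row

print(probs)
print("predicted classes:", predicted)
```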
