## Activation functions in PyTorch

PyTorch provides several activation functions as `torch.nn` modules, each suited to different layers and network architectures. Here are some commonly used ones:

### **1. ReLU (Rectified Linear Unit)**
   - `nn.ReLU()`
   - Outputs **0** for negative inputs and keeps positive values unchanged.
   - Helps mitigate the **vanishing gradient problem**, because its gradient is exactly 1 for every positive input and does not saturate.
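
To illustrate the gradient claim, here is a minimal sketch that compares the gradient of ReLU with that of sigmoid at a large positive input. The input value 10.0 is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn

# Same large positive input for both activations (value chosen for illustration)
x_sigmoid = torch.tensor([10.0], requires_grad=True)
x_relu = torch.tensor([10.0], requires_grad=True)

# Backpropagate through each activation to obtain d(output)/d(input)
nn.Sigmoid()(x_sigmoid).sum().backward()
nn.ReLU()(x_relu).sum().backward()

sigmoid_grad = x_sigmoid.grad.item()  # tiny: sigmoid is saturated here
relu_grad = x_relu.grad.item()        # 1.0 for any positive input
print(sigmoid_grad, relu_grad)
```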

### **2. Leaky ReLU**
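   - `nn.LeakyReLU(negative_slope)`
   - Similar to ReLU, but multiplies negative inputs by a small slope (`negative_slope`, default `0.01`) instead of zeroing them.
   - Mitigates the **dying ReLU** problem, where units that only output 0 stop receiving gradient.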
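
The difference shows up in the gradients. The sketch below, using an arbitrary negative input of -2.0, checks that ReLU passes no gradient for a negative input while Leaky ReLU passes `negative_slope`.

```python
import torch
import torch.nn as nn

# One negative input per activation so the gradients do not accumulate together
x_relu = torch.tensor([-2.0], requires_grad=True)
x_leaky = torch.tensor([-2.0], requires_grad=True)

nn.ReLU()(x_relu).sum().backward()
nn.LeakyReLU(negative_slope=0.2)(x_leaky).sum().backward()

relu_neg_grad = x_relu.grad.item()    # 0.0: no learning signal
leaky_neg_grad = x_leaky.grad.item()  # equals negative_slope
print(relu_neg_grad, leaky_neg_grad)
```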

### **3. Sigmoid**
   - `nn.Sigmoid()`
   - Maps any real input into the range **(0, 1)**.
   - Often used on the output layer for **binary classification**, where the result is read as a probability.
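
As a rough illustration of the binary-classification use, the sketch below applies sigmoid to some made-up logits and computes a binary cross-entropy loss against hypothetical labels. It also shows that `nn.BCEWithLogitsLoss`, which takes raw logits, gives the same value as sigmoid followed by `nn.BCELoss`.

```python
import torch
import torch.nn as nn

logits = torch.tensor([1.0, -2.0, 3.0, -5.0])  # raw model outputs (made up)
targets = torch.tensor([1.0, 0.0, 1.0, 0.0])   # hypothetical binary labels

# Sigmoid turns each logit into a probability in (0, 1)
probs = nn.Sigmoid()(logits)

# Two equivalent ways to compute the binary cross-entropy
loss_from_probs = nn.BCELoss()(probs, targets)
loss_from_logits = nn.BCEWithLogitsLoss()(logits, targets)
print(loss_from_probs.item(), loss_from_logits.item())
```

In practice the logits version is usually preferred because it is more numerically stable.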

### **4. Tanh (Hyperbolic Tangent)**
   - `nn.Tanh()`
   - Maps any real input into the range **(-1, 1)**.
   - Produces **zero-centered activations**, unlike sigmoid, whose outputs are always positive.
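
A small sketch of what "zero-centered" means here: for inputs that are symmetric around 0 (an assumption made for this example), tanh outputs average to about 0, while sigmoid outputs average to about 0.5.

```python
import torch
import torch.nn as nn

# Symmetric sample inputs: -3, -2, -1, 0, 1, 2, 3
x = torch.linspace(-3.0, 3.0, steps=7)

tanh_mean = nn.Tanh()(x).mean().item()        # close to 0
sigmoid_mean = nn.Sigmoid()(x).mean().item()  # close to 0.5
print(tanh_mean, sigmoid_mean)
```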

### **5. Softmax**
   - `nn.Softmax(dim)`
   - Converts logits into **probabilities** for multi-class classification.
   - Ensures the outputs along dimension `dim` are non-negative and sum to **1**.
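
The role of `dim` is easier to see on a batch. The sketch below uses a made-up batch of two rows with three class logits each and normalizes over the class dimension (`dim=1`), so each row sums to 1.

```python
import torch
import torch.nn as nn

# Hypothetical batch: 2 samples, 3 class logits each
logits = torch.tensor([[1.0, -2.0, 3.0],
                       [0.5,  0.5, 0.5]])

# dim=1 normalizes across classes, independently for each sample
probs = nn.Softmax(dim=1)(logits)
row_sums = probs.sum(dim=1)

print(probs)
print(row_sums)
```

Equal logits, as in the second row, produce a uniform distribution.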

### **6. ELU (Exponential Linear Unit)**
   - `nn.ELU(alpha)`
   - Identical to ReLU for positive inputs, but negative inputs follow the smooth curve `alpha * (exp(x) - 1)` instead of being set to zero.
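
As a quick check of that definition, the sketch below computes ELU by hand with `torch.where` and compares it with `nn.ELU`. The sample values are arbitrary.

```python
import torch
import torch.nn as nn

x = torch.tensor([1.0, -2.0, 3.0, -5.0])
alpha = 1.0

# Positive inputs pass through; negative inputs follow alpha * (exp(x) - 1)
elu_manual = torch.where(x > 0, x, alpha * (torch.exp(x) - 1))
elu_builtin = nn.ELU(alpha=alpha)(x)

print(elu_manual)
print(elu_builtin)
```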

### **7. SELU (Scaled Exponential Linear Unit)**
   - `nn.SELU()`
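   - A scaled ELU with fixed constants. It is **self-normalizing**: with suitable weight initialization, activations tend toward zero mean and unit variance across layers, which helps stabilize training.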
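
To make the relationship with ELU concrete, the sketch below rebuilds SELU as `scale * ELU(x, alpha)` using the two fixed constants from the SELU definition and compares the result with `nn.SELU`. The variable names `selu_alpha` and `selu_scale` are chosen for this example only.

```python
import torch
import torch.nn as nn

x = torch.tensor([1.0, -2.0, 3.0, -5.0])

# Fixed SELU constants (not tunable in nn.SELU)
selu_alpha = 1.6732632423543772
selu_scale = 1.0507009873554805

# SELU is an ELU with alpha = selu_alpha, multiplied by selu_scale
selu_manual = selu_scale * nn.ELU(alpha=selu_alpha)(x)
selu_builtin = nn.SELU()(x)

print(selu_manual)
print(selu_builtin)
```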

### **8. GELU (Gaussian Error Linear Unit)**
   - `nn.GELU()`
   - Weights each input by the standard normal CDF: `x * Φ(x)`.
   - Used in **transformer models** like BERT.
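
For reference, the exact form of GELU is `x * Φ(x)`, where `Φ` is the standard normal CDF. The sketch below writes that out with `torch.erf` and compares it with the default `nn.GELU()`. The sample values are arbitrary.

```python
import math

import torch
import torch.nn as nn

x = torch.tensor([1.0, -2.0, 3.0, -5.0])

# Standard normal CDF expressed through the error function
normal_cdf = 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))
gelu_manual = x * normal_cdf
gelu_builtin = nn.GELU()(x)

print(gelu_manual)
print(gelu_builtin)
```

The two results can differ in the last float32 digits for strongly negative inputs, where `Φ(x)` is close to 0.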

### **9. Swish**
   - `nn.SiLU()`
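   - Computes `x * sigmoid(x)`. PyTorch exposes it as SiLU, which is Swish with its scaling parameter fixed to 1.
   - A smooth, non-monotonic function that can match or outperform ReLU in some deep networks.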
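
A short sketch of the definition: multiplying the input by its own sigmoid reproduces `nn.SiLU`. The negative outputs also show the non-monotonic dip, since the output for -2 is lower than the output for -5.

```python
import torch
import torch.nn as nn

x = torch.tensor([1.0, -2.0, 3.0, -5.0])

# SiLU / Swish: the input gated by its own sigmoid
silu_manual = x * torch.sigmoid(x)
silu_builtin = nn.SiLU()(x)

print(silu_manual)
print(silu_builtin)
```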

### **10. Hard Sigmoid, Hard Tanh, Hard Swish**
   - `nn.Hardsigmoid()`, `nn.Hardtanh()`, `nn.Hardswish()`
   - Piecewise-linear approximations of their respective functions for **faster computation**.
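
To show what the piecewise-linear forms look like, the sketch below rebuilds each hard variant from simple clamping operations and compares it with the corresponding `torch.nn` module (default arguments assumed throughout).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([1.0, -2.0, 3.0, -5.0])

# Hard sigmoid: a line through (0, 0.5) with slope 1/6, clipped to [0, 1]
hard_sigmoid = F.relu6(x + 3) / 6
# Hard swish: the input gated by the hard sigmoid
hard_swish = x * hard_sigmoid
# Hard tanh: the identity clipped to [-1, 1]
hard_tanh = torch.clamp(x, min=-1.0, max=1.0)

print(hard_sigmoid, hard_swish, hard_tanh)
```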



In [1]:
import torch
import torch.nn as nn

# Sample input with both positive and negative values (float32)
input = torch.tensor([1.0, -2.0, 3.0, -5.0])

# ReLU

In [4]:
act = nn.ReLU()
output = act(input)
print(output)

tensor([1., 0., 3., 0.])


# Leaky ReLU

In [6]:
# negative_slope=0.2: negative inputs are multiplied by 0.2
act = nn.LeakyReLU(negative_slope=0.2)
output = act(input)
print(output)

tensor([ 1.0000, -0.4000,  3.0000, -1.0000])


# Sigmoid

In [7]:
act = nn.Sigmoid()
output = act(input)
print(output)

tensor([0.7311, 0.1192, 0.9526, 0.0067])


# Tanh

In [8]:
act = nn.Tanh()
output = act(input)
print(output)

tensor([ 0.7616, -0.9640,  0.9951, -0.9999])


# Softmax

In [9]:
# dim=0: normalize along the first (and here only) dimension; dimensions are indexed from 0
act = nn.Softmax(dim=0)
output = act(input)
print(output)

tensor([1.1846e-01, 5.8980e-03, 8.7534e-01, 2.9365e-04])


# ELU

In [10]:
act = nn.ELU(alpha=1.0)
output = act(input)
print(output)

tensor([ 1.0000, -0.8647,  3.0000, -0.9933])


# SELU

In [11]:
act = nn.SELU()
output = act(input)
print(output)

tensor([ 1.0507, -1.5202,  3.1521, -1.7463])


# SiLU - Swish

In [13]:
act = nn.SiLU()
output = act(input)
print(output)

tensor([ 0.7311, -0.2384,  2.8577, -0.0335])


# GELU

In [14]:
act = nn.GELU()
output = act(input)
print(output)

tensor([ 8.4134e-01, -4.5500e-02,  2.9959e+00, -1.1921e-06])


# Hard Sigmoid

In [15]:
act = nn.Hardsigmoid()
output = act(input)
print(output)

tensor([0.6667, 0.1667, 1.0000, 0.0000])


# Hard Tanh

In [16]:
act = nn.Hardtanh()
output = act(input)
print(output)

tensor([ 1., -1.,  1., -1.])


# Hard Swish

In [17]:
act = nn.Hardswish()
output = act(input)
print(output)

tensor([ 0.6667, -0.3333,  3.0000, -0.0000])
