https://mohitmishra786687.medium.com/activation-functions-in-neural-networks-56dc526ad63c

# Activation Function
Activation functions apply a non-linear transformation and decide whether a neuron should be activated or not.

- Without an activation function, a neural network is just a stack of linear layers, which collapses into a single linear model however many layers it has.

- A linear model fits a straight line (or, in higher dimensions, a flat hyperplane), so it can only separate linearly separable data.

- If the true relationship is curved or complex, a straight line cannot classify it correctly.
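
The collapse described above can be checked numerically. The following is a minimal sketch, assuming two small `nn.Linear` layers with arbitrary illustrative sizes (3, 4, 2) and a hypothetical `merged` layer built by hand; it is only meant to show that stacking linear layers without an activation adds no expressive power.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Two linear layers stacked with no activation in between (sizes are illustrative).
linear1 = nn.Linear(3, 4)
linear2 = nn.Linear(4, 2)

# Fold them into one equivalent layer: W = W2 @ W1, b = W2 @ b1 + b2.
merged = nn.Linear(3, 2)
with torch.no_grad():
    merged.weight.copy_(linear2.weight @ linear1.weight)
    merged.bias.copy_(linear2.weight @ linear1.bias + linear2.bias)

x = torch.randn(5, 3)
stacked_out = linear2(linear1(x))
merged_out = merged(x)

# The stacked layers and the single merged layer agree up to float error.
same_without_activation = torch.allclose(stacked_out, merged_out, atol=1e-5)

# With a ReLU in between, the single merged layer no longer reproduces the output.
relu_out = linear2(torch.relu(linear1(x)))
same_with_relu = torch.allclose(relu_out, merged_out, atol=1e-5)

print(same_without_activation, same_with_relu)
```

Once a non-linearity sits between the layers, no single weight matrix and bias can reproduce the network in general, which is what lets deeper networks model curved decision boundaries.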

![activation](https://miro.medium.com/1*AsjGnS6iOgsA5RecS_ig5Q.png)



# Popular Activation Functions
- Step Function
- Sigmoid
- TanH
- ReLU
- Leaky ReLU
- Softmax

## Step Function
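It outputs 1 if the input reaches a threshold and 0 otherwise, so it simply says whether a neuron is activated or not.
It is rarely used in practice any more, because its gradient is zero almost everywhere and gradient-based training cannot learn through it.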

![step](https://lh4.googleusercontent.com/proxy/EYhNGpiTTIdKyxjfWRdEnlsQJHOvvaeD0nOiXgVOr5XigTe3Meu5iCHuutEj5qmdyKzssXSzBNgNnYoQeNApw3g3ywg)
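
As a rough illustration, a step function can be written in one line. The helper name `step` and the default threshold of 0 are assumptions made for this sketch; PyTorch does not ship this exact helper.

```python
import torch

def step(x: torch.Tensor, threshold: float = 0.0) -> torch.Tensor:
    """Hypothetical helper: 1 where x >= threshold, else 0."""
    return (x >= threshold).float()

x = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])
y = step(x)
print(y)  # tensor([0., 0., 1., 1., 1.])
```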



## Sigmoid
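It squashes any real-valued input into the range (0, 1), so the output can be read as a probability. It is typically used in the output layer for binary classification.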

![sigmoid](https://media.geeksforgeeks.org/wp-content/uploads/20250131185746649092/Sigmoid-Activation-Function.png)
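
The sigmoid is defined as

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

A small sketch, assuming a hand-written `sigmoid` helper and a 0.5 decision threshold (a common default, not something the notes prescribe), compares the formula with `torch.sigmoid`:

```python
import torch

def sigmoid(x: torch.Tensor) -> torch.Tensor:
    """Hand-written sigmoid, for comparison with torch.sigmoid."""
    return 1.0 / (1.0 + torch.exp(-x))

logits = torch.tensor([-4.0, 0.0, 4.0])
probs = sigmoid(logits)

# The manual formula agrees with the built-in implementation.
matches_builtin = torch.allclose(probs, torch.sigmoid(logits))

# Turn probabilities into class labels with an assumed 0.5 threshold.
predicted_class = (probs >= 0.5).long()
print(probs, predicted_class)
```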

## TanH
It is a scaled and shifted sigmoid function whose output lies between -1 and 1. It is a good choice for **hidden layers**.

![tanh](https://miro.medium.com/v2/resize:fit:756/1*tOc--h-QU9_bHqWLPY9YLA.png)
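
The "scaled sigmoid" relationship is the identity $\tanh(x) = 2\,\sigma(2x) - 1$. A quick numerical check, using an arbitrary illustrative input range:

```python
import torch

x = torch.linspace(-3.0, 3.0, steps=7)

# tanh written as a scaled and shifted sigmoid.
scaled_sigmoid = 2.0 * torch.sigmoid(2.0 * x) - 1.0

identity_holds = torch.allclose(torch.tanh(x), scaled_sigmoid, atol=1e-6)
print(identity_holds)
```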

## ReLU
It is zero for negative inputs and linear for positive inputs, i.e. `max(0, x)`. It is used in hidden layers. If you do not know which activation function to use, ReLU is a sensible default.

![relu](https://www.dailydoseofds.com/content/images/2023/06/relu-graph-1-1.jpeg)
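
A short sketch of ReLU and its gradient, with input values chosen only for illustration:

```python
import torch

x = torch.tensor([-2.0, -0.5, 0.5, 2.0], requires_grad=True)

y = torch.relu(x)   # negative inputs become 0, positive inputs pass through
y.sum().backward()  # gradient of sum(y) with respect to each input

print(y.detach())   # tensor([0.0000, 0.0000, 0.5000, 2.0000])
print(x.grad)       # tensor([0., 0., 1., 1.])
```

The gradient is exactly 0 for negative inputs and 1 for positive inputs, which keeps backpropagation cheap and avoids the saturation seen in sigmoid and tanh for large positive values.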

## Leaky ReLU

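It is a variant of ReLU that multiplies negative inputs by a small slope (for example 0.01) instead of setting them to zero. This mitigates the "dying ReLU" problem, where a neuron that only receives negative inputs gets zero gradient and stops learning.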

![leaky](https://www.i2tutorials.com/wp-content/media/2019/09/Deep-learning-25-i2tutorials.png)
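
Leaky ReLU keeps a small slope for negative inputs, so the gradient there is small but not zero. A minimal sketch using `F.leaky_relu`, with `negative_slope=0.01` (PyTorch's default) and illustrative inputs:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.5, 2.0], requires_grad=True)

y = F.leaky_relu(x, negative_slope=0.01)  # negative inputs are scaled by 0.01
y.sum().backward()

print(y.detach())  # tensor([-0.0200, -0.0050,  0.5000,  2.0000])
print(x.grad)      # tensor([0.0100, 0.0100, 1.0000, 1.0000])
```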

## Softmax

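It turns a vector of raw scores into probabilities that are all positive and sum to 1. It is used in the output layer for multi-class classification.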

![softmax](https://i0.wp.com/sefiks.com/wp-content/uploads/2017/11/softmax1.png?resize=850%2C329&ssl=1)
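
For a score vector $z$, softmax is $\text{softmax}(z)_i = \dfrac{e^{z_i}}{\sum_j e^{z_j}}$. The sketch below, with made-up logits for a single three-class example, checks the formula against `torch.softmax`:

```python
import torch

# One sample with three class scores (values are illustrative).
logits = torch.tensor([[2.0, 1.0, 0.1]])

probs = torch.softmax(logits, dim=1)  # normalize across the class dimension
row_sum = probs.sum(dim=1)            # each row sums to 1
predicted = probs.argmax(dim=1)       # index of the most likely class

# The same result written out from the formula.
manual = torch.exp(logits) / torch.exp(logits).sum(dim=1, keepdim=True)

print(probs, row_sum, predicted)
```

Note that in PyTorch, `nn.CrossEntropyLoss` applies log-softmax internally, so models trained with it usually output raw logits and skip an explicit softmax layer.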

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Option 1: create the activations as nn modules in __init__
class NeuralNet(nn.Module):
  def __init__(self, input_size, hidden_size):
    super(NeuralNet, self).__init__()
    self.linear1 = nn.Linear(input_size, hidden_size)
    self.relu = nn.ReLU()
    self.linear2 = nn.Linear(hidden_size, 1)
    self.sigmoid = nn.Sigmoid()

  def forward(self, x):
    out = self.linear1(x)
    out = self.relu(out)     # hidden-layer activation
    out = self.linear2(out)
    out = self.sigmoid(out)  # squash the output into (0, 1)
    return out

# Option 2: call the activation functions directly in forward()
class NeuralNet2(nn.Module):
  def __init__(self, input_size, hidden_size):
    super(NeuralNet2, self).__init__()  # must name this class, not NeuralNet
    self.linear1 = nn.Linear(input_size, hidden_size)
    self.linear2 = nn.Linear(hidden_size, 1)

  def forward(self, x):
    out = F.relu(self.linear1(x))
    out = torch.sigmoid(self.linear2(out))
    return out
```
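
A possible way to exercise such a network is sketched below. To keep the snippet self-contained it uses an `nn.Sequential` model with the same layer order as the classes above (linear, ReLU, linear, sigmoid); the sizes, the random batch, and the targets are all made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Illustrative sizes, not taken from the notes.
input_size, hidden_size, batch_size = 10, 16, 4

# Same layer order as the classes above, written compactly.
model = nn.Sequential(
    nn.Linear(input_size, hidden_size),
    nn.ReLU(),
    nn.Linear(hidden_size, 1),
    nn.Sigmoid(),
)

x = torch.randn(batch_size, input_size)                # random input batch
targets = torch.tensor([[0.0], [1.0], [1.0], [0.0]])   # made-up binary labels

probs = model(x)                     # shape (4, 1), values in (0, 1)
loss = nn.BCELoss()(probs, targets)  # binary cross-entropy on probabilities
loss.backward()                      # gradients flow through both activations

print(probs.shape, loss.item())
```

Because the model ends in a sigmoid, `nn.BCELoss` is the matching loss; a model that outputs raw logits would instead pair with `nn.BCEWithLogitsLoss`.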