# Activation Functions:
Activation functions are mathematical functions applied to the outputs of individual neurons in a neural network. They introduce non-linearity into the network, allowing it to learn and approximate complex relationships between inputs and outputs.

Some commonly used activation functions in deep learning:
    
(1). Sigmoid function (logistic function): It maps the input to a value between 0 and 1. It was widely used in the past but is now less popular in hidden layers because of drawbacks such as vanishing gradients.
    
(2). Hyperbolic tangent function (tanh): Similar to the sigmoid function, but it maps the input to a value between -1 and 1. It is still used in some cases, but it also suffers from vanishing gradients.
    
(3). Rectified Linear Unit (ReLU): This function sets all negative values to zero and keeps positive values unchanged. It is the most popular activation function in deep learning due to its simplicity and effectiveness in training deep neural networks.
    
(4). Leaky ReLU: This function is similar to ReLU but applies a small non-zero slope to negative input values instead of setting them to zero. It helps mitigate the "dying ReLU" problem, where some neurons can become permanently inactive during training.
    
(5). Parametric ReLU (PReLU): PReLU is a generalization of ReLU that introduces a learnable parameter to determine the slope for negative input values. It offers more flexibility and can improve model performance.
    
(6). Exponential Linear Unit (ELU): ELU is a variation of ReLU that replaces the zero output for negative inputs with a smooth exponential curve that saturates at a negative value. It helps alleviate the dying ReLU problem and can produce more robust models.
    
(7). Softmax: Softmax is commonly used in the output layer of a neural network for multi-class classification problems. It normalizes the output values to represent probabilities, ensuring that the sum of all probabilities is 1.
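
The functions listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not a library implementation: the default slopes (`alpha=0.01` for Leaky ReLU, `alpha=1.0` for ELU) and the sample input are assumptions chosen for the example. PReLU has the same form as Leaky ReLU, with `alpha` learned during training.

```python
import numpy as np

def sigmoid(x):
    # Squashes inputs into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Squashes inputs into the open interval (-1, 1)
    return np.tanh(x)

def relu(x):
    # Zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Small slope alpha for negative inputs (alpha is an assumed default)
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # Smooth exponential curve for negative inputs, saturating at -alpha
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def softmax(x):
    # Subtract the maximum before exponentiating for numerical stability
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)

# Sample input chosen only for illustration
x = np.array([-2.0, -0.5, 0.0, 1.5])
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu),
                 ("leaky_relu", leaky_relu), ("elu", elu), ("softmax", softmax)]:
    print(name, fn(x))
```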


# Activation functions are essential in deep learning for the following reasons:

(1). Non-Linearity: Activation functions introduce non-linear transformations to the network, enabling it to learn complex patterns and relationships in the data. Without activation functions, a neural network would simply be a linear model, no matter how many layers it has.
    
(2). Gradient propagation: Activation functions help propagate gradients backward during the training process, allowing efficient optimization and learning. Different activation functions have different characteristics in terms of gradient behaviour, which can impact the model's training dynamics.
    
(3). Model capacity: The choice of activation function can influence the capacity and expressive power of a neural network. Non-Linear activation functions enable the network to represent more complex functions, expanding its ability to learn and generalize.
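
The non-linearity point can be checked numerically. The sketch below, using randomly generated weights (all names and shapes are assumptions for illustration), shows that two stacked layers without an activation function are equivalent to a single linear layer, while inserting a ReLU between them breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical weights and biases for two dense layers (4 -> 8 -> 2)
W1, b1 = rng.normal(size=(4, 8)), rng.normal(size=8)
W2, b2 = rng.normal(size=(8, 2)), rng.normal(size=2)
x = rng.normal(size=(5, 4))  # batch of 5 samples

# Two layers stacked with no activation in between
two_layer = (x @ W1 + b1) @ W2 + b2

# The same mapping written as one linear layer
W = W1 @ W2
b = b1 @ W2 + b2
one_layer = x @ W + b

print(np.allclose(two_layer, one_layer))  # True: the stack collapses

# With a ReLU between the layers the network is no longer a single linear map
with_relu = np.maximum(0.0, x @ W1 + b1) @ W2 + b2
print(np.allclose(with_relu, one_layer))
```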

# 1. Sigmoid:

In [23]:
import numpy as np
def sigmoid(x):
    return 1 / (1 + np.exp(-x))

#Example usage 
x=2.5
result=sigmoid(x)
print("Sigmoid(",x,")=",result)

Sigmoid( 2.5 )= 0.9241418199787566
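
The vanishing-gradient drawback mentioned in the overview follows from the sigmoid derivative, sigmoid(x) * (1 - sigmoid(x)), which peaks at 0.25 for x = 0 and shrinks quickly as |x| grows. A small sketch (the helper name `sigmoid_grad` and the sample inputs are assumptions for this example):

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def sigmoid_grad(x):
    # Derivative of the sigmoid: s * (1 - s)
    s = sigmoid(x)
    return s * (1 - s)

# The gradient is largest at 0 and decays toward 0 for large inputs
for value in [0.0, 2.5, 10.0]:
    print("sigmoid'(", value, ")=", sigmoid_grad(value))
```

Because backpropagation multiplies these factors layer by layer, a deep stack of saturated sigmoid units can pass very little gradient to its early layers.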


In [24]:
import numpy as np
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense,Activation
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder,StandardScaler

In [29]:
# Generate Classification dataset
x,y=make_classification(n_samples=1000,n_features=10,n_informative=5,random_state=42)

In [30]:
#Split the data into training and testing sets
x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.2,random_state=42)

In [31]:
#Build the neural network model
model=Sequential()
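model.add(Dense(64,activation='tanh',input_shape=(x_train.shape[1],)))#first hidden layer (also declares the input shape)
model.add(Dense(64,activation='tanh'))#second hidden layer
model.add(Dense(1,activation='sigmoid'))#output layer: sigmoid gives a probability for binary classification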

In [32]:
#Compile the model
model.compile(optimizer='adam',loss='binary_crossentropy',metrics=['accuracy'])




In [33]:
#Train the model
model.fit(x_train,y_train,epochs=10,batch_size=32)

Epoch 1/10


Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x23b8efde850>

In [35]:
#Evaluate the model on the testing data
loss,accuracy=model.evaluate(x_test,y_test)
print("Test Loss",loss)
print("Test Accuracy",accuracy)

Test Loss 0.24464663863182068
Test Accuracy 0.8999999761581421


# 2. ReLU:
    

In [45]:
import tensorflow as tf
import numpy as np
from tensorflow import keras

In [46]:
#Define the neural network architecture
input_size=4
hidden_size=8
output_size=2

In [47]:
#Define the model
model=keras.Sequential([
    keras.layers.Dense(hidden_size,activation='relu',input_shape=(input_size,)),
    keras.layers.Dense(output_size)
])

In [48]:
#Compile the model
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01),
             loss=keras.losses.MeanSquaredError())

In [49]:
#Define your input and target data as numpy arrays
input_data=np.array([[1.0,2.0,3.0,4.0],
                   [2.0,3.0,4.0,5.0],
                   [3.0,4.0,5.0,6.0]])
target_data=np.array([[0.5,0.8],
                    [0.6,0.9],
                    [0.7,1.0]])

In [50]:
#Train the model
model.fit(input_data,target_data,epochs=10,batch_size=32)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x23b9158ee90>

In [43]:
#Test the model
test_input=np.array([[1.0,2.0,3.0,4.0]])
predicted_output=model.predict(test_input)



In [44]:
print("Predicted Output:",predicted_output)

Predicted Output: [[0.32051203 0.75245047]]


# 3. Leaky ReLU

In [2]:
import tensorflow as tf
input_tensor=tf.constant([-1.0,2.0,-0.5,3.0])

output_tensor=tf.nn.leaky_relu(input_tensor,alpha=0.2)

print(output_tensor.numpy())

[-0.2  2.  -0.1  3. ]


In [1]:
import tensorflow as tf
import numpy as np




In [3]:
#Define the neural network architecture
input_size=4
hidden_size=8
output_size=2

In [4]:
#Create the input and target tensors
inputs=tf.keras.Input(shape=(input_size,))
targets=tf.keras.Input(shape=(output_size,))




In [5]:
#Define the weights and biases for the hidden layer
hidden_weights = tf.Variable(tf.random.normal(shape=(input_size,hidden_size)))
hidden_biases=tf.Variable(tf.zeros(shape=(hidden_size,)))

In [7]:
#Compute the hidden layer output with the Leaky ReLU activation function
hidden_layer_output=tf.nn.leaky_relu(tf.matmul(inputs,hidden_weights)+hidden_biases,alpha=0.2)



In [8]:
#Define the weights and biases for the output layer
output_weights=tf.Variable(tf.random.normal(shape=(hidden_size,output_size)))
output_biases=tf.Variable(tf.zeros(shape=(output_size,)))

In [9]:
#Compute the final output
output=tf.matmul(hidden_layer_output,output_weights)+output_biases



In [10]:
# Define the loss function
loss=tf.reduce_mean(tf.square(output-targets))

In [11]:
#Define the optimizer
optimizer=tf.keras.optimizers.SGD(learning_rate=0.01)

In [12]:
#Create the model
model=tf.keras.Model(inputs=[inputs,targets],outputs=output)
model.add_loss(loss)




In [13]:
#Compile the model
model.compile(optimizer=optimizer)

In [14]:
#Define your input and target data as numpy arrays
input_data=np.array([[1.0,2.0,3.0,4.0]])#replace with your input data
target_data=np.array([[0.5,0.8]])#replace with your target data

In [15]:
#Train the model
model.fit([input_data,target_data],epochs=1000,verbose=0)

<keras.src.callbacks.History at 0x2251ab498d0>

In [16]:
#test the trained network
test_input=np.array([[1.0,2.0,3.0,4.0]])
test_target=np.array([[0.0,0.0]])#dummy target for prediction
predicted_output=model.predict([test_input,test_target])

# 4. Parametric ReLU

The Parametric Rectified Linear Unit (PReLU) activation function is an extension of the Leaky ReLU activation function that allows the slope of the negative part of the function to be learned during training. Instead of using a fixed slope value, PReLU introduces a set of learnable parameters that control the slope.

The PReLU function is defined as follows:

PReLU(x) = max(0, x) + alpha * min(0, x)

where "x" is the input value and "alpha" is a learnable vector of slopes for the negative part.
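
The definition above can be written directly in NumPy. This is only a sketch of the forward pass: the `alpha` values below are hypothetical fixed numbers, whereas in a real network they would be updated by the optimizer (Keras provides this as the `tf.keras.layers.PReLU` layer).

```python
import numpy as np

def prelu(x, alpha):
    # PReLU(x) = max(0, x) + alpha * min(0, x), applied element-wise
    return np.maximum(0.0, x) + alpha * np.minimum(0.0, x)

x = np.array([[-1.0, 2.0, -0.5, 3.0]])
# Hypothetical slopes, one per input feature; learned during training in practice
alpha = np.array([0.2, 0.2, 0.1, 0.1])

# Negative entries are scaled to -0.2 and -0.05; positive entries pass through
print(prelu(x, alpha))
```

With every entry of `alpha` fixed to the same constant, this reduces to Leaky ReLU; with `alpha` set to 0, it reduces to ReLU.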

# 6. SoftMax Function

The softmax function is primarily used in the output layer of a neural network for multi-class classification problems. It produces a probability distribution over the classes: every output lies between 0 and 1 and the outputs sum to 1, so the prediction is the class with the highest probability. The most common loss function used with softmax is the categorical cross-entropy loss, which measures the difference between the predicted probabilities and the true class labels.
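
As a rough sketch of how softmax and categorical cross-entropy fit together, the NumPy code below treats two rows of numbers as raw class scores (logits) and pairs them with made-up one-hot labels. The function names, the labels, and the small `eps` constant are assumptions for this example, not part of any library API.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise maximum for numerical stability
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

def categorical_cross_entropy(probs, one_hot_labels, eps=1e-12):
    # Mean negative log-probability assigned to the true class
    return -np.mean(np.sum(one_hot_labels * np.log(probs + eps), axis=-1))

# Raw scores for 2 samples and 3 classes (illustrative values)
logits = np.array([[0.5, 1.2, 0.8],
                   [2.1, 0.3, 1.5]])
# Hypothetical true classes: class 1 for the first sample, class 0 for the second
labels = np.array([[0, 1, 0],
                   [1, 0, 0]])

probs = softmax(logits)
print("Row sums:", probs.sum(axis=-1))             # each row sums to 1
print("Predicted classes:", probs.argmax(axis=-1))  # index of the highest probability
print("Loss:", categorical_cross_entropy(probs, labels))
```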

In [1]:
import tensorflow as tf




In [2]:
#Define input data
input_data=[[0.5,1.2,0.8],[2.1,0.3,1.5]]

In [4]:
#Create a Sequential model
model=tf.keras.Sequential()




In [5]:
#Add a dense layer with 3 units (one per class)
model.add(tf.keras.layers.Dense(3,input_shape=(3,)))

In [6]:
#Apply softmax activation function
model.add(tf.keras.layers.Activation('softmax'))

In [7]:
#Print model summary
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 3)                 12        
                                                                 
 activation (Activation)     (None, 3)                 0         
                                                                 
Total params: 12 (48.00 Byte)
Trainable params: 12 (48.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [8]:
#Compute softmax output
softmax_output=model.predict(input_data)



In [9]:
print("Softmax Output:")
print(softmax_output)

Softmax Output:
[[0.6571169  0.19742642 0.14545669]
 [0.2208712  0.73685867 0.04227015]]
