In [1]:
#1- Explain the role of activation functions in neural networks. Compare and contrast linear and nonlinear activation functions. Why are nonlinear activation functions preferred in hidden layers?

In [None]:
#Activation functions determine each neuron's output from its weighted input, and they are what introduce nonlinearity into a neural network. With linear (identity) activations, any stack of layers collapses into a single linear transformation, so extra depth adds no representational power and the network can only model linear relationships. Nonlinear activation functions, such as ReLU and Sigmoid, remove this limitation, which is why they are preferred in hidden layers: they let the network learn complex, nonlinear patterns in the data.
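
In [None]:
#A minimal NumPy sketch of the point above, not part of the original experiment: two layers with linear activations give the same output as one combined linear layer, while a ReLU between them generally does not. The weight shapes and the random seed are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only

# Two layers with linear (identity) activation: y = W2 @ (W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)

# The single linear layer they collapse into
W_single = W2 @ W1
b_single = W2 @ b1 + b2

x = rng.normal(size=3)
two_layer_linear = W2 @ (W1 @ x + b1) + b2
one_layer_linear = W_single @ x + b_single

# With a ReLU between the layers, the collapsed layer no longer matches in general
two_layer_relu = W2 @ np.maximum(0.0, W1 @ x + b1) + b2

print("linear stack equals single layer:", np.allclose(two_layer_linear, one_layer_linear))
print("ReLU stack equals single layer:  ", np.allclose(two_layer_relu, one_layer_linear))
```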

In [None]:
#2- Describe the Sigmoid activation function. What are its characteristics, and in what type of layers is it commonly used? Explain the Rectified Linear Unit (ReLU) activation function. Discuss its advantages and potential challenges. What is the purpose of the Tanh activation function? How does it differ from the Sigmoid activation function?

In [None]:
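#Sigmoid: Squashes inputs into the range (0, 1), so it is commonly used in the output layer for binary classification, where the output is read as a probability. In hidden layers it suffers from vanishing gradients, because it saturates for large positive or negative inputs, and its outputs are not zero-centered.
#ReLU: Outputs max(0, x), that is, zero for negative inputs and the input itself for positive inputs. It is a common default in the hidden layers of deep networks because it is cheap to compute and its gradient is 1 for positive inputs, which mitigates vanishing gradients. However, neurons whose inputs stay negative output zero and stop updating (the "dying ReLU" problem).
#Tanh: Squashes inputs into the range (-1, 1) and is zero-centered, which often makes it a better choice than Sigmoid in hidden layers, but it also saturates and so still faces the vanishing gradient issue.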
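
In [None]:
#To make the ranges and gradient behaviour above concrete, here is a small NumPy sketch of the three functions and their derivatives. The helper names (sigmoid_grad, tanh_grad, relu_grad) and the sample inputs are assumptions made for illustration; they are not Keras APIs.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # peaks at 0.25 when z == 0

def tanh_grad(z):
    return 1.0 - np.tanh(z) ** 2  # peaks at 1.0 when z == 0

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # 1 for z > 0, 0 for z < 0; 0 is used at z == 0 by convention in this sketch
    return (z > 0).astype(float)

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
for name, f, g in [("sigmoid", sigmoid, sigmoid_grad),
                   ("tanh", np.tanh, tanh_grad),
                   ("relu", relu, relu_grad)]:
    print(f"{name:8s} outputs: {np.round(f(z), 4)}  gradients: {np.round(g(z), 4)}")
```

#At z = -10 and z = 10 the Sigmoid and Tanh gradients are close to zero (saturation), while the ReLU gradient is exactly 0 for negative inputs and 1 for positive inputs.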

In [2]:
#3- Discuss the significance of activation functions in the hidden layers of a neural network.

In [None]:
#Activation functions in hidden layers are what allow a neural network to model complex, nonlinear relationships; without them, the network could only represent linear functions of its inputs. The choice of hidden activation also affects how well gradients propagate during backpropagation, and therefore learning efficiency, convergence speed, and final model performance, making it a key consideration in neural network design.
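
In [None]:
#A rough sketch of why the hidden activation matters for gradient propagation. It uses a deliberately simplified scalar chain a_{k+1} = f(w * a_k) with w = 1 (an assumption; real networks have weight matrices that also scale the gradient) and multiplies the per-layer derivatives, as the chain rule does in backpropagation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)  # never larger than 0.25

def relu(z):
    return max(0.0, z)

def relu_grad(z):
    return 1.0 if z > 0 else 0.0

def chain_gradient(activation, activation_grad, a0, depth, w=1.0):
    """Return d a_depth / d a_0 for the scalar chain a_{k+1} = activation(w * a_k)."""
    a, grad = a0, 1.0
    for _ in range(depth):
        z = w * a
        grad *= activation_grad(z) * w  # chain rule: multiply per-layer derivatives
        a = activation(z)
    return grad

for depth in (1, 5, 10):
    s = chain_gradient(sigmoid, sigmoid_grad, a0=0.5, depth=depth)
    r = chain_gradient(relu, relu_grad, a0=0.5, depth=depth)
    print(f"depth={depth:2d}  sigmoid chain gradient={s:.2e}  relu chain gradient={r:.1f}")
```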

In [None]:
#4- Explain the choice of activation functions for different types of problems (e.g., classification, regression) in the output layer.

In [None]:
#The output-layer activation must match the type of prediction, because it determines how the model's outputs are interpreted. For binary (or multi-label) classification, Sigmoid gives an independent probability for each output unit; for multi-class classification, Softmax gives a probability distribution over the classes that sums to 1. For regression, a linear (identity) activation is used so that the model can predict unbounded continuous values. An activation that does not match the problem makes the outputs hard to interpret and usually hurts performance.
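
In [None]:
#A short NumPy sketch of how the three output activations turn the same raw outputs (logits) into predictions. The logit values are made up for illustration, and the functions are written by hand here rather than taken from Keras.

```python
import numpy as np

def softmax(logits):
    """Map a vector of logits to probabilities that sum to 1."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def sigmoid(logit):
    return 1.0 / (1.0 + np.exp(-logit))

logits = np.array([2.0, 1.0, 0.1])  # illustrative raw outputs of a 3-unit output layer

multiclass_probs = softmax(logits)  # multi-class: one distribution over all classes
binary_prob = sigmoid(logits[0])    # binary: a single unit, read as P(class = 1)
regression_value = logits[0]        # regression: linear (identity) output, unbounded

print("softmax:", np.round(multiclass_probs, 3), "sum =", multiclass_probs.sum())
print("sigmoid:", round(float(binary_prob), 3))
print("linear: ", regression_value)
```

#In Keras these correspond to activation='softmax', activation='sigmoid', and activation='linear' on the final Dense layer, typically paired with a cross-entropy loss for classification and mean squared error for regression.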

In [None]:
#5- Experiment with different activation functions (e.g., ReLU, Sigmoid, Tanh) in a simple neural network architecture. Compare their effects on convergence and performance.

In [8]:
import warnings
warnings.filterwarnings("ignore")

In [10]:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler

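# Load the Iris dataset: 150 samples, 4 features, 3 classes
data = load_iris()
X = data.data
y = data.target

# The Iris targets are already integers (0, 1, 2), so this encoding leaves them unchanged
y = LabelEncoder().fit_transform(y)

# Split before scaling so that test-set statistics do not leak into training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit the scaler on the training data only, then apply it to both sets
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)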

# Build and compile a one-hidden-layer classifier that uses the given hidden-layer activation
def create_model(activation_function):
    model = Sequential()
    model.add(Dense(10, input_dim=4, activation=activation_function))
    model.add(Dense(3, activation='softmax'))
    model.compile(optimizer=Adam(), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

# Activation functions to experiment with
activation_functions = ['relu', 'sigmoid', 'tanh']

# Train models with different activation functions and compare their performance
for activation in activation_functions:
    print(f"\nTraining model with {activation} activation function:")
    model = create_model(activation)
    model.fit(X_train, y_train, epochs=100, batch_size=10, verbose=0)  # Training without verbose output
    loss, accuracy = model.evaluate(X_test, y_test, verbose=0)
    print(f"Test Accuracy: {accuracy:.4f}, Test Loss: {loss:.4f}")



Training model with relu activation function:
Test Accuracy: 0.9556, Test Loss: 0.1417

Training model with sigmoid activation function:
Test Accuracy: 0.8667, Test Loss: 0.3483

Training model with tanh activation function:
Test Accuracy: 0.9333, Test Loss: 0.2126
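
In [None]:
#The run above reports only final test accuracy and loss, so it says little about convergence. One possible way to compare convergence, sketched below, is to look at the per-epoch training loss and find the first epoch at which it drops below a chosen threshold. The helper epochs_to_reach and the loss curves are hypothetical and for illustration only; they are not results from this notebook.

```python
def epochs_to_reach(loss_per_epoch, threshold):
    """Return the 1-based epoch at which the loss first drops to or below threshold, or None."""
    for epoch, loss in enumerate(loss_per_epoch, start=1):
        if loss <= threshold:
            return epoch
    return None

# Made-up loss curves, NOT measurements from the models trained above
example_histories = {
    "fast": [1.0, 0.6, 0.3, 0.2],
    "slow": [1.0, 0.9, 0.8, 0.7],
}

for name, losses in example_histories.items():
    print(name, "reaches loss <= 0.5 at epoch:", epochs_to_reach(losses, threshold=0.5))
```

#With Keras, model.fit(...) returns a History object whose history["loss"] list could be passed to this helper for each activation function. Because weights are initialized randomly, repeating each run a few times would give a fairer comparison than a single run.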
