# Solve the Following Task

Question 1: Explain the basic structure and working of a simple Artificial Neural Network (ANN).
In your explanation, include the following points: (Marks 20)

● The role of neurons and layers in an ANN.
● How information flows through the network.
● The significance of activation functions and provide examples of commonly used activation
functions.
● The concept of weights and biases and their role in training an ANN.

# What is an Artificial Neural Network (ANN)?

# Ans:
An ANN is a system inspired by the way our brains work. It's made up of layers of "neurons" that work together to learn and make decisions.

# The role of neurons and layers in an ANN.

# Ans:

In an Artificial Neural Network (ANN), neurons and layers work together to process data. Neurons are basic units that receive input, process it, and pass the output to other neurons, detecting patterns in the data. Layers, which are groups of neurons, include an input layer, hidden layers, and an output layer. The input layer receives raw data, the hidden layers process this information, and the output layer produces the network's final decision or prediction. This structure enables ANNs to learn from data and perform complex tasks like image recognition and natural language processing.
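
As a rough illustration of the description above, a single neuron can be sketched as a weighted sum plus a bias passed through an activation, and a layer as a group of such neurons reading the same inputs. The function names, the choice of sigmoid, and all numeric values are assumptions made for this sketch.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    """A layer is a group of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Illustrative values only: two inputs feeding a layer of two neurons
x = [0.5, -1.0]
hidden = layer(x, weight_rows=[[1.0, 2.0], [-1.0, 0.5]], biases=[0.0, 1.0])
print(hidden)
```

The second neuron's weighted sum is exactly zero here, so its sigmoid output is 0.5.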



# How information flows through the network.

# Ans:


Information flows through an Artificial Neural Network (ANN) in several steps. First, the input layer receives raw data, such as images or text encoded as numbers. This data is then passed to the hidden layers, where each neuron multiplies its inputs by its weights, sums the results, and adds its bias. The neuron's activation function is then applied to this weighted sum, introducing non-linearity and determining how strong a signal is passed to the next layer, which enables the network to learn complex patterns. Finally, the processed information reaches the output layer, which generates the network's final decision or prediction. This one-way flow from input to output is called forward propagation; the weights and biases themselves are adjusted only during training.
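
The flow described above can be sketched as a forward pass in NumPy. This is a minimal sketch under stated assumptions: the layer sizes, the random weights, and the helper names `relu`, `softmax`, and `forward` are hypothetical and only serve the illustration.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the row maximum for numerical stability
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward(x, params):
    """Input -> hidden layers (ReLU) -> output layer (softmax)."""
    a = x
    for W, b in params[:-1]:
        a = relu(a @ W + b)          # weighted sum, bias, activation
    W_out, b_out = params[-1]
    return softmax(a @ W_out + b_out)

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]                 # hypothetical layer sizes
params = [(rng.normal(scale=0.5, size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

x = rng.normal(size=(5, 4))          # a batch of 5 made-up samples
probs = forward(x, params)
print(probs.shape)                   # (5, 3)
```

Each row of `probs` sums to 1, so it can be read as class probabilities for one sample.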


# The significance of activation functions and examples of commonly used activation functions.

# Ans:

Activation functions are crucial in artificial neural networks (ANNs) because they introduce non-linearity, allowing the network to learn and model complex patterns in data. Without activation functions, an ANN would only be able to perform linear transformations, severely limiting its ability to solve real-world problems. Activation functions also help with learning by providing gradients needed for optimization algorithms like gradient descent, ensuring the network can adjust weights and biases during training.
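
The claim that a network without activation functions can only perform linear transformations can be checked numerically. In this sketch (random weights and shapes are arbitrary assumptions), two stacked layers with no activation are shown to equal one linear layer, while inserting ReLU breaks that equivalence.

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 2)), rng.normal(size=2)
x = rng.normal(size=(6, 3))

# Two layers with no activation in between...
two_layers = (x @ W1 + b1) @ W2 + b2

# ...equal a single linear layer with combined weights and bias
W, b = W1 @ W2, b1 @ W2 + b2
one_layer = x @ W + b
print(np.allclose(two_layers, one_layer))

# With ReLU between the layers, the network is no longer that linear map
with_relu = np.maximum(0.0, x @ W1 + b1) @ W2 + b2
print(np.allclose(with_relu, one_layer))
```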

Commonly used activation functions include:

1. Sigmoid Function: Outputs values between 0 and 1, making it useful for binary classification problems.

2. Tanh (Hyperbolic Tangent) Function: Outputs values between -1 and 1. Because its outputs are zero-centered, it often makes optimization easier than sigmoid in hidden layers.

3. ReLU (Rectified Linear Unit): Outputs the input directly if it is positive; otherwise, it outputs zero. Because its gradient is 1 for positive inputs, it mitigates the vanishing gradient problem and is widely used in hidden layers of deep learning models.

4. Leaky ReLU: Similar to ReLU but allows a small, non-zero gradient when the input is negative, preventing dying neurons.
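
The four functions listed above can be written in a few lines of plain Python. This is a sketch for scalar inputs; the `alpha=0.01` slope for Leaky ReLU is a common default chosen here as an assumption.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tanh(z):
    return math.tanh(z)

def relu(z):
    return max(0.0, z)

def leaky_relu(z, alpha=0.01):
    # alpha is the small slope applied to negative inputs
    return z if z > 0 else alpha * z

for z in (-2.0, 0.0, 2.0):
    print(z, round(sigmoid(z), 4), round(tanh(z), 4), relu(z), leaky_relu(z))
```

For the negative input, ReLU returns 0 while Leaky ReLU returns a small negative value, which is what keeps its gradient non-zero.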

# The concept of weights and biases and their role in training an ANN.

# Ans:

Weights and biases are fundamental components of an Artificial Neural Network (ANN) that play a crucial role in its ability to learn and make accurate predictions.

# Weights:
Weights are the parameters that determine the strength of the connection between neurons. They influence how much impact the input will have on the neuron's output. During training, the ANN adjusts the weights to minimize the error between the predicted output and the actual output. This process involves optimization algorithms, such as gradient descent, which iteratively update the weights to improve the network's performance.

# Biases:
Biases are additional parameters in each neuron that allow the activation function to be shifted left or right. They help the network model patterns that do not pass through the origin (0,0). Biases ensure that neurons can activate even when the input is zero, providing the network with the flexibility to learn a wider range of patterns.

# Role in Training:
During training, both weights and biases are adjusted based on the errors the network makes. Backpropagation calculates the gradient of the loss function with respect to each weight and bias, and an optimizer such as gradient descent then uses those gradients to update the parameters. By repeating these updates, the network learns to reduce the error, improving its ability to make accurate predictions on new data.
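
The update rule can be sketched on the smallest possible case: a single linear neuron with one weight and one bias, trained with gradient descent on mean squared error. The toy data, learning rate, and step count are assumptions for illustration; in a multi-layer network the same gradients would be obtained through backpropagation rather than written out by hand.

```python
# Toy data generated from y = 2x + 1 (illustrative values)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0          # parameters to learn
learning_rate = 0.05

def mse(w, b):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

initial_loss = mse(w, b)
for _ in range(2000):
    # Gradients of the mean squared error with respect to w and b
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    # Gradient descent step: move each parameter against its gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mse(w, b)
print(round(w, 3), round(b, 3))
```

The learned weight approaches the slope of the data and the learned bias approaches its intercept, which is the offset a weight alone could not represent.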

# Summary:
Weights determine the strength of connections between neurons, while biases allow neurons to activate under different conditions. Together, they enable the ANN to learn from data and improve its performance through training, making them essential for the network's ability to model complex patterns and make accurate predictions.

Question 2: Build a FCNN to recognize handwritten digits. Use MNIST handwritten digit data set.
Your network should have: (Marks 20)
● An input layer of size 784 (28x28 pixels)
● Two hidden layers with 128 neurons each and ReLU activation
● An output layer with 10 neurons (one for each digit) and softmax activation (Use accuracy as
an error measure).


In [None]:
# Import libraries
import numpy as np
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical

In [None]:
# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [None]:
# Preprocess the data
x_train = x_train.reshape((x_train.shape[0], 28 * 28)).astype('float32') / 255
x_test = x_test.reshape((x_test.shape[0], 28 * 28)).astype('float32') / 255

# One-hot encode the labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)


In [None]:
# Build the model
model = Sequential()
model.add(Dense(128, input_shape=(784,), activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)



Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7e0b1f29c8e0>

In [None]:
# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f'Test accuracy: {test_acc:.4f}')

Test accuracy: 0.9748
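
As a quick sanity check on the architecture above, the number of trainable parameters can be worked out by hand: each Dense layer has one weight per input-output pair plus one bias per output. The helper name `dense_params` is hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output neuron."""
    return n_in * n_out + n_out

layer_sizes = [784, 128, 128, 10]   # input, two hidden layers, output
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
print(per_layer, sum(per_layer))    # [100480, 16512, 1290] 118282
```

If the model is built as shown, `model.summary()` should report the same per-layer counts.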


# Question 3:
Explain the importance of hyperparameter tuning in training an ANN. List at least
three hyperparameters commonly tuned in ANN training and describe their impact on the
model's performance.

# Ans:


# Importance of Hyperparameter Tuning in Training an ANN

Hyperparameter tuning is crucial in training an Artificial Neural Network (ANN) because it directly affects the model's performance and ability to generalize to new data. Hyperparameters are settings that are not learned from the data but are set before the training process begins. Proper tuning of these hyperparameters ensures that the model can learn effectively, avoid overfitting, and achieve high accuracy. Without careful tuning, an ANN may underperform, either by failing to learn the patterns in the data or by overfitting, where it learns the training data too well but performs poorly on new, unseen data.

# Commonly Tuned Hyperparameters and Their Impact

1. Learning Rate:

# Description:
The learning rate determines the size of the steps the optimization algorithm takes to adjust the weights during training.

# Impact on Performance:
A learning rate that is too high can make the updates overshoot the minimum, so the loss oscillates or diverges, or the model settles quickly on a suboptimal solution. Conversely, a learning rate that is too low makes training extremely slow and can leave the model stuck in a poor local minimum or plateau. Proper tuning helps the model converge efficiently to a good solution.

# Example:
If the learning rate is set to 0.1, the model might overshoot the optimal parameters, while a learning rate of 0.0001 might make the training process too slow.
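
The effect can be seen on a toy problem: gradient descent on f(w) = w², whose minimum is at w = 0. This is only a sketch; the three learning rates are chosen for this function, and the values at which a real network becomes too slow or unstable depend on the model, data, and optimizer.

```python
def minimize(learning_rate, steps=20, w=1.0):
    """Gradient descent on f(w) = w**2, whose gradient is 2*w."""
    for _ in range(steps):
        w -= learning_rate * 2 * w
    return w

for lr in (0.001, 0.1, 1.1):
    print(lr, minimize(lr))
```

After 20 steps the smallest rate has barely moved from the starting point, the middle rate is close to the minimum, and the largest rate has overshot on every step and moved further away.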

2. Batch Size:

# Description:
The batch size is the number of training examples used in one iteration to update the model’s weights.

# Impact on Performance:
A small batch size gives a noisier estimate of the gradient. The noise can act as a regularizer and often helps generalization, but each epoch requires more weight updates and makes less efficient use of parallel hardware, so training takes longer. A large batch size gives a more accurate gradient estimate and makes each epoch faster, but it might result in poorer generalization to new data. Finding a balance is key to efficient and effective training.

# Example:
A batch size of 32 might make each epoch slower but generalize better, while a batch size of 1024 might speed up training but reduce accuracy on the test data.
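
To make the trade-off concrete, the number of weight updates per epoch can be computed directly. The sketch assumes the 60,000-image MNIST training set with the 20% validation split used in Question 2; the helper name `updates_per_epoch` is hypothetical.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """One weight update per batch; the last batch may be smaller."""
    return math.ceil(num_samples / batch_size)

train_samples = 60000 * 80 // 100   # 48000 samples left after a 20% validation split

for batch_size in (32, 1024):
    print(batch_size, updates_per_epoch(train_samples, batch_size))
# 32 -> 1500 updates per epoch, 1024 -> 47 updates per epoch
```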

3. Number of Epochs:

# Description:
The number of epochs is the number of times the entire training dataset passes through the neural network during training.

# Impact on Performance:
Too few epochs can lead to underfitting, where the model does not learn enough from the data. Too many epochs can lead to overfitting, where the model learns the noise and specific details of the training data, which do not generalize well to new data. The right number of epochs ensures that the model has learned enough to perform well on both training and unseen data.

# Example:
Training for 5 epochs might not be enough for the model to learn the data patterns, while training for 1000 epochs might cause overfitting.
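
One common way to avoid picking the number of epochs by hand is early stopping: training halts once the validation loss has not improved for a set number of epochs. The logic can be sketched in plain Python; the function name, the `patience` value, and the loss values are all hypothetical. Keras offers this behaviour through its `EarlyStopping` callback.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)

# Hypothetical validation losses: improving, then rising as overfitting sets in
losses = [0.90, 0.60, 0.45, 0.40, 0.42, 0.43, 0.47, 0.55]
print(early_stopping_epoch(losses))  # 7
```

The best loss occurs at epoch 4, and training stops three non-improving epochs later.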

# Summary:

Hyperparameter tuning is essential for optimizing the performance of an ANN. By carefully adjusting hyperparameters like the learning rate, batch size, and number of epochs, you can significantly enhance the model’s ability to learn from training data and generalize to new data. Proper tuning helps achieve a balance between underfitting and overfitting, ensuring that the model performs well in real-world applications.

Question 4: Write a Python script to create and compare the performance of an ANN using
different activation functions (ReLU, Sigmoid, and Tanh) for the hidden layers. Use a simple
dataset (e.g., Iris dataset) to train and evaluate the models. Report the accuracy for each
activation function.

In [None]:
# Import libraries
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
from keras.utils import to_categorical

In [None]:
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target

In [None]:
# Preprocess the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

In [None]:
# One-hot encode the labels
y_encoded = to_categorical(y)

# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y_encoded, test_size=0.2, random_state=42)


In [None]:
# Function to create and train an ANN model with a specified activation function
def create_and_train_model(activation_function):
    model = Sequential()
    model.add(Dense(128, input_shape=(4,), activation=activation_function))
    model.add(Dense(128, activation=activation_function))
    model.add(Dense(3, activation='softmax'))

    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(X_train, y_train, epochs=50, batch_size=5, verbose=0)

    y_pred = model.predict(X_test)
    y_pred_classes = np.argmax(y_pred, axis=1)
    y_test_classes = np.argmax(y_test, axis=1)

    accuracy = accuracy_score(y_test_classes, y_pred_classes)
    return accuracy

In [None]:
# Evaluate models with different activation functions
activation_functions = ['relu', 'sigmoid', 'tanh']
accuracies = {}

for activation in activation_functions:
    accuracy = create_and_train_model(activation)
    accuracies[activation] = accuracy
    print(f'Accuracy with {activation} activation: {accuracy:.4f}')

Accuracy with relu activation: 1.0000
Accuracy with sigmoid activation: 0.9667
Accuracy with tanh activation: 1.0000


Question 5: Weather Prediction Using Fully Connected Neural Networks. (Marks 20)
● Write code to load historical weather data from a CSV file.
● Preprocess the data by handling missing values, converting categorical variables to numerical,
and normalizing numerical features.
● Write code to create a neural network model using TensorFlow or Keras, with appropriate input,
hidden, and output layers.
● Write code to train the neural network model using the training data, and include validation to
monitor performance.
● Write code to evaluate the model on the test dataset and print the mean squared error.
● Experiment with different configurations of the model and hyperparameters to improve
performance. Document the impact of changes.
● Write code to predict weather conditions using new or unseen data and interpret the results.

In [16]:
import pandas as pd

# Load the data from a CSV file
data = pd.read_csv('/content/Weather Training Data.csv')

In [17]:
data.head()

Unnamed: 0,row ID,Location,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustDir,WindGustSpeed,WindDir9am,...,Humidity9am,Humidity3pm,Pressure9am,Pressure3pm,Cloud9am,Cloud3pm,Temp9am,Temp3pm,RainToday,RainTomorrow
0,Row0,Albury,13.4,22.9,0.6,,,W,44.0,W,...,71.0,22.0,1007.7,1007.1,8.0,,16.9,21.8,No,0.0
1,Row1,Albury,7.4,25.1,0.0,,,WNW,44.0,NNW,...,44.0,25.0,1010.6,1007.8,,,17.2,24.3,No,0.0
2,Row2,Albury,17.5,32.3,1.0,,,W,41.0,ENE,...,82.0,33.0,1010.8,1006.0,7.0,8.0,17.8,29.7,No,0.0
3,Row3,Albury,14.6,29.7,0.2,,,WNW,56.0,W,...,55.0,23.0,1009.2,1005.4,,,20.6,28.9,No,0.0
4,Row4,Albury,7.7,26.7,0.0,,,W,35.0,SSE,...,48.0,19.0,1013.4,1010.1,,,16.3,25.5,No,0.0


In [18]:
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 82384 entries, 0 to 82383
Data columns (total 23 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   row ID         82384 non-null  object 
 1   Location       82384 non-null  object 
 2   MinTemp        81991 non-null  float64
 3   MaxTemp        82178 non-null  float64
 4   Rainfall       81486 non-null  float64
 5   Evaporation    47144 non-null  float64
 6   Sunshine       43651 non-null  float64
 7   WindGustDir    76064 non-null  object 
 8   WindGustSpeed  76081 non-null  float64
 9   WindDir9am     76270 non-null  object 
 10  WindDir3pm     79889 non-null  object 
 11  WindSpeed9am   81499 non-null  float64
 12  WindSpeed3pm   80603 non-null  float64
 13  Humidity9am    81248 non-null  float64
 14  Humidity3pm    80482 non-null  float64
 15  Pressure9am    75447 non-null  float64
 16  Pressure3pm    75463 non-null  float64
 17  Cloud9am       52341 non-null  float64
 18  Cloud3

In [26]:
data.select_dtypes("object").isnull().sum()

Location       0
WindGustDir    0
WindDir9am     0
WindDir3pm     0
dtype: int64

In [25]:
from sklearn.preprocessing import StandardScaler, LabelEncoder

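# Fill missing values forward, then backward for any gaps left at the start.
# DataFrame.ffill/bfill replace the deprecated fillna(method=...) form.
data.ffill(inplace=True)
data.bfill(inplace=True)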

In [23]:
data.isnull().sum()

Location         0
MinTemp          0
MaxTemp          0
Rainfall         0
Evaporation      0
Sunshine         0
WindGustDir      0
WindGustSpeed    0
WindDir9am       0
WindDir3pm       0
WindSpeed9am     0
WindSpeed3pm     0
Humidity9am      0
Humidity3pm      0
Pressure9am      0
Pressure3pm      0
Cloud9am         0
Cloud3pm         0
Temp9am          0
Temp3pm          0
RainTomorrow     0
dtype: int64

In [24]:
# data["Sunshine"].median()
# Drop the identifier column and RainToday; errors="ignore" keeps the cell safe to re-run
data.drop(columns=["row ID", "RainToday"], errors="ignore", inplace=True)
data.head()

In [30]:
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split

categorical_columns = data.select_dtypes(include=['object']).columns
label_encoders = {}
for col in categorical_columns:
    le = LabelEncoder()
    data[col] = le.fit_transform(data[col])
    label_encoders[col] = le

# data["RainTomorrow"]

numerical_columns = data.select_dtypes(include=['float64', 'int64']).columns[:-1]
scaler = StandardScaler()
data[numerical_columns] = scaler.fit_transform(data[numerical_columns])

X = data.drop('RainTomorrow', axis=1)
y = data['RainTomorrow']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [31]:
X_train.shape

(65907, 20)

In [32]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical

# Assuming y_train has labels 0 and 1: one-hot encode them to match the two-unit output layer
y_train_one_hot = to_categorical(y_train)
y_test_one_hot = to_categorical(y_test)

# Two hidden layers of 100 ReLU units. The two sigmoid outputs with binary cross-entropy
# score each class column as an independent binary target.
model = Sequential([
    Dense(100, input_shape=(X_train.shape[1],), activation="relu"),
    Dense(100, activation="relu"),
    Dense(2, activation="sigmoid")
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hold out 20% of the training data for validation during training
model.fit(X_train, y_train_one_hot, batch_size=20, epochs=10, validation_split=0.2)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7e88a25c9390>

In [33]:
loss, accuracy = model.evaluate(X_test, y_test_one_hot)
print(f"Test loss: {loss}")
print(f"Test accuracy: {accuracy}")

Test loss: 0.3521389365196228
Test accuracy: 0.846149206161499


In [34]:
import numpy as np
y_pred_prob = model.predict(X_test)
y_pred = np.argmax(y_pred_prob,axis=1)



In [35]:
from sklearn.metrics import confusion_matrix,classification_report
confusion_matrix(y_test,y_pred)

array([[11873,   813],
       [ 1722,  2069]])

In [36]:
print("Classification Report:")
print(classification_report(y_test, y_pred))

Classification Report:
              precision    recall  f1-score   support

         0.0       0.87      0.94      0.90     12686
         1.0       0.72      0.55      0.62      3791

    accuracy                           0.85     16477
   macro avg       0.80      0.74      0.76     16477
weighted avg       0.84      0.85      0.84     16477
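
The figures in the report for class 1 can be reproduced by hand from the confusion matrix printed earlier. The sketch below uses scikit-learn's convention that rows are true classes and columns are predicted classes; the variable names are chosen for this illustration.

```python
# Confusion matrix from the output above: rows are true classes, columns are predictions
cm = [[11873, 813],
      [1722, 2069]]

tn, fp = cm[0]   # true class 0: correctly rejected, wrongly flagged as 1
fn, tp = cm[1]   # true class 1: missed, correctly detected

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of the predicted 1s, how many were right
recall = tp / (tp + fn)      # of the actual 1s, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# accuracy=0.85 precision=0.72 recall=0.55 f1=0.62
```

The recall of 0.55 shows that the model misses almost half of the class 1 cases even though overall accuracy is 0.85, which reflects the imbalance between the two classes in the test set.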

