# DEEP LEARNING ASSIGNMENT-013

1. Why is it generally preferable to use a Logistic Regression classifier rather than a classical
Perceptron (i.e., a single layer of linear threshold units trained using the Perceptron training
algorithm)? How can you tweak a Perceptron to make it equivalent to a Logistic Regression
classifier?
2. Why was the logistic activation function a key ingredient in training the first MLPs?
3. Name three popular activation functions. Can you draw them?
4. Suppose you have an MLP composed of one input layer with 10 passthrough neurons,
followed by one hidden layer with 50 artificial neurons, and finally one output layer with 3
artificial neurons. All artificial neurons use the ReLU activation function.
   - What is the shape of the input matrix X?
   - What about the shape of the hidden layer’s weight matrix Wh, and the shape of its
     bias vector bh?
   - What is the shape of the output layer’s weight matrix Wo, and its bias vector bo?
   - What is the shape of the network’s output matrix Y?
   - Write the equation that computes the network’s output matrix Y as a function
     of X, Wh, bh, Wo and bo.

5. How many neurons do you need in the output layer if you want to classify email into spam
or ham? What activation function should you use in the output layer? If instead you want to
tackle MNIST, how many neurons do you need in the output layer, using what activation
function?
6. What is backpropagation and how does it work? What is the difference between
backpropagation and reverse-mode autodiff?
7. Can you list all the hyperparameters you can tweak in an MLP? If the MLP overfits the
training data, how could you tweak these hyperparameters to try to solve the problem?
8. Train a deep MLP on the MNIST dataset and see if you can get over 98% accuracy. Try
adding all the bells and whistles (i.e., save checkpoints, restore the last checkpoint in case of
an interruption, add summaries, plot learning curves using TensorBoard, and so on).
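
For question 4, the shapes can be checked with a small pure-Python sketch. It assumes the usual convention of one instance per row, so the output is Y = ReLU(ReLU(X·Wh + bh)·Wo + bo). The batch size `m = 4`, the random placeholder weights, and the helper names (`relu`, `affine`, `shape`, `random_matrix`) are assumptions made for illustration and are not part of the assignment.

```python
import random


def relu(matrix):
    """Apply ReLU element-wise to a list-of-lists matrix."""
    return [[max(0.0, v) for v in row] for row in matrix]


def affine(X, W, b):
    """Compute X @ W + b, adding the bias vector b to every row of the product."""
    return [
        [sum(x * w for x, w in zip(row, col)) + bias
         for col, bias in zip(zip(*W), b)]
        for row in X
    ]


def shape(matrix):
    """Return (rows, columns) of a list-of-lists matrix."""
    return (len(matrix), len(matrix[0]))


def random_matrix(rows, cols, rng):
    """Build a rows x cols matrix of placeholder values in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
m = 4  # batch size: an arbitrary value chosen for this illustration

X = random_matrix(m, 10, rng)    # one row per instance, one column per input feature
Wh = random_matrix(10, 50, rng)  # one row per input, one column per hidden neuron
bh = [0.0] * 50                  # one bias per hidden neuron
Wo = random_matrix(50, 3, rng)   # one row per hidden neuron, one column per output neuron
bo = [0.0] * 3                   # one bias per output neuron

H = relu(affine(X, Wh, bh))      # hidden activations, shape (m, 50)
Y = relu(affine(H, Wo, bo))      # network output, shape (m, 3)

print(shape(X), shape(Wh), shape(H), shape(Wo), shape(Y))
# (4, 10) (10, 50) (4, 50) (50, 3) (4, 3)
```

Only the first dimension of X and Y depends on the batch size; the weight and bias shapes are fixed by the layer sizes.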

8A) Here's an example that trains a deep MLP on the MNIST dataset using TensorFlow (Keras), with model checkpointing and TensorBoard logging:

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Preprocess the data
# Flatten each 28x28 image into a 784-element vector and scale pixels to [0, 1]
x_train = x_train.reshape(-1, 784).astype('float32') / 255
x_test = x_test.reshape(-1, 784).astype('float32') / 255
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

# Define the model architecture
model = Sequential()
model.add(Dense(512, activation='relu', input_shape=(784,)))
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(10, activation='softmax'))

# Compile the model
model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

# Set up callbacks
checkpoint_path = "model_checkpoint.h5"
model_checkpoint = ModelCheckpoint(checkpoint_path, save_best_only=True)
tensorboard_callback = TensorBoard(log_dir="./logs", histogram_freq=1)

# Train the model
history = model.fit(x_train, y_train,
                    batch_size=128,
                    epochs=20,
                    verbose=1,
                    validation_split=0.1,  # hold out 10% of the training data so the test set stays unseen
                    callbacks=[model_checkpoint, tensorboard_callback])

# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])


This code defines a deep MLP with two hidden layers of 512 neurons each, ReLU activations, and dropout regularization (rate 0.2) after each hidden layer. It trains the model for 20 epochs with a batch size of 128, using the RMSprop optimizer and categorical cross-entropy loss. The `ModelCheckpoint` callback, with `save_best_only=True`, keeps only the model with the lowest validation loss, and the `TensorBoard` callback writes training summaries to `./logs`, where the learning curves can be viewed by running `tensorboard --logdir ./logs`. The code does not restore a checkpoint after an interruption. A setup like this typically reaches about 98% accuracy on the MNIST test set, although the exact figure varies from run to run.
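
To get a feel for the size of this architecture, the number of trainable parameters can be worked out by hand. The sketch below is plain Python and does not need TensorFlow. The helper name `dense_param_count` is hypothetical, and the layer sizes are read off the model definition above. Dropout layers are left out because they add no trainable parameters.

```python
def dense_param_count(n_inputs, n_units):
    """Parameters in a fully connected layer: one weight per input/unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units


# Layer sizes read off the model above: 784 inputs -> 512 -> 512 -> 10 outputs.
layer_sizes = [784, 512, 512, 10]

# Pair each layer's input size with its output size and count its parameters.
per_layer = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total = sum(per_layer)

print(per_layer)  # [401920, 262656, 5130]
print(total)      # 669706
```

Calling `model.summary()` on the Keras model should report the same per-layer counts, which makes it a quick cross-check of the arithmetic.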