This code demonstrates how to calculate the activated output of a neuron with a PReLU activation function and determine the gradients with respect to the weights and the learnable PReLU parameter. It's a simple example to illustrate the concept of PReLU and gradient computation in a neural network context.

In [1]:
import numpy as np

# Define the inputs, weights, and PReLU parameter
x1, x2, x3 = 1.0, 2.0, -1.0
w1, w2, w3 = 0.5, -1.0, 2.0
alpha = 0.2

# Calculate the weighted sum
z = w1 * x1 + w2 * x2 + w3 * x3

# Activation function with PReLU
def prelu(z, alpha):
    if z > 0:
        return z
    else:
        return alpha * z

# Calculate the activated output
a = prelu(z, alpha)

# Calculate the gradient with respect to the weights.
# By the chain rule, da/dw_i = (da/dz) * (dz/dw_i) = (da/dz) * x_i,
# where da/dz is 1 for z > 0 and alpha otherwise.
da_dz = 1 if z > 0 else alpha
grad_w1 = da_dz * x1
grad_w2 = da_dz * x2
grad_w3 = da_dz * x3

# Calculate the gradient with respect to the PReLU parameter α
# (da/dalpha is 0 for z > 0 and z otherwise)
grad_alpha = min(0, z)

# Print the results
print("Activated Output (a):", a)
print("Gradient with respect to w1:", grad_w1)
print("Gradient with respect to w2:", grad_w2)
print("Gradient with respect to w3:", grad_w3)
print("Gradient with respect to alpha:", grad_alpha)


Activated Output (a): -0.7000000000000001
Gradient with respect to w1: 0.2
Gradient with respect to w2: 0.4
Gradient with respect to w3: -0.2
Gradient with respect to alpha: -3.5
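
Analytic gradients like these are easy to get subtly wrong, so it can help to compare them against a finite-difference estimate. The following is a minimal sketch, not part of the original notebook: the helper names `neuron`, `analytic_grads`, and `numeric_grads` and the step size `eps` are assumptions made for illustration. It uses the chain rule, where the gradient for weight w_i is the PReLU slope at z multiplied by the input x_i.

```python
import numpy as np

def prelu(z, alpha):
    """PReLU: identity for z > 0, slope alpha otherwise."""
    return np.where(z > 0, z, alpha * z)

def neuron(w, x, alpha):
    """Activated output of a single neuron (hypothetical helper)."""
    return float(prelu(np.dot(w, x), alpha))

def analytic_grads(w, x, alpha):
    """Chain rule: da/dw_i = (da/dz) * x_i and da/dalpha = min(0, z)."""
    z = np.dot(w, x)
    da_dz = 1.0 if z > 0 else alpha
    return da_dz * x, min(0.0, z)

def numeric_grads(w, x, alpha, eps=1e-6):
    """Central finite differences on each weight and on alpha."""
    grad_w = np.zeros_like(w)
    for i in range(len(w)):
        step = np.zeros_like(w)
        step[i] = eps
        grad_w[i] = (neuron(w + step, x, alpha) - neuron(w - step, x, alpha)) / (2 * eps)
    grad_alpha = (neuron(w, x, alpha + eps) - neuron(w, x, alpha - eps)) / (2 * eps)
    return grad_w, grad_alpha

# Same inputs, weights, and alpha as in the cell above
x = np.array([1.0, 2.0, -1.0])
w = np.array([0.5, -1.0, 2.0])
alpha = 0.2

gw_analytic, ga_analytic = analytic_grads(w, x, alpha)
gw_numeric, ga_numeric = numeric_grads(w, x, alpha)

print("Analytic weight gradients:", gw_analytic)
print("Numeric weight gradients: ", gw_numeric)
print("Analytic alpha gradient:", ga_analytic)
print("Numeric alpha gradient: ", ga_numeric)
```

The check is only meaningful away from z = 0, where PReLU is not differentiable. Here z = -3.5, so the two estimates should agree closely.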


Below is the code for a dense neural network that classifies the CIFAR-10 dataset.

This is a feedforward network trained with the SGD optimizer. It uses MinMaxScaler for normalization, ReLU and softmax activations, Dropout for regularization, and one-hot encoding for the labels.
The code also uses an EarlyStopping callback, which stops training once the validation loss has not improved for a set number of epochs (the patience) and then restores the best weights.
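
To make the patience rule concrete, here is a minimal plain-Python sketch of the idea behind early stopping. It is an illustration only, not the Keras implementation: the function name `stop_epoch_with_patience` and the loss values are hypothetical, and details such as a minimum improvement threshold are left out.

```python
def stop_epoch_with_patience(val_losses, patience):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve on its best value
    for `patience` consecutive epochs. Epochs are numbered from 0.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience was never exhausted: training ran to the last epoch
    return len(val_losses) - 1, best_epoch

# Hypothetical validation losses, one per epoch
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
stop_epoch, best_epoch = stop_epoch_with_patience(losses, patience=3)
print("Stopped at epoch", stop_epoch, "with best epoch", best_epoch)
```

Restoring the best weights corresponds to rolling the model back to its state at `best_epoch` rather than keeping the state at `stop_epoch`.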

I ran this code several times while adjusting the hyperparameters, and the version below gave the best accuracy I could find within the allotted time.

There are most likely further adjustments I could make using what we have learned in the lectures and by applying new techniques later on.
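
As a small illustration of how the one-hot labels and the softmax output fit together in the categorical cross-entropy loss, here is a NumPy sketch. It is a simplified stand-in for what the Keras utilities compute, not their actual implementation: the helper names and the toy logits are assumptions, and only three classes are used to keep the arrays readable.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class ids into one-hot rows."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean over samples of -sum(y_true * log(y_prob))."""
    return float(-np.mean(np.sum(y_true * np.log(y_prob + eps), axis=1)))

# Toy example: two samples, three classes (hypothetical values)
labels = np.array([0, 2])
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 3.0]])

y_true = one_hot(labels, num_classes=3)
probs = softmax(logits)
loss = categorical_cross_entropy(y_true, probs)

print("One-hot labels:\n", y_true)
print("Softmax probabilities:\n", probs)
print("Categorical cross-entropy:", loss)
```

Because each one-hot row has a single 1, the loss for a sample reduces to the negative log of the probability assigned to its true class.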

In [8]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.losses import CategoricalCrossentropy
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.initializers import HeNormal, GlorotNormal
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.utils import plot_model

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.metrics import classification_report

# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

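# Split the training data into a training set (80%) and a hold-out set (20%)
x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.2, random_state=1)
# Split the hold-out set again: 80% becomes the test set and 20% the validation set.
# Note: this overwrites the official CIFAR-10 test set returned by load_data().
x_test, x_val, y_test, y_val = train_test_split(x_val, y_val, test_size=0.2, random_state=1)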

# Data normalization
# scaler = StandardScaler()
scaler = MinMaxScaler()
x_train = scaler.fit_transform(x_train.reshape(-1, 32*32*3))
x_val = scaler.transform(x_val.reshape(-1, 32*32*3))
x_test = scaler.transform(x_test.reshape(-1, 32*32*3))

# There are 10 classes in CIFAR-10
num_classes = 10

# One-hot encode the labels
y_train = to_categorical(y_train, num_classes)
y_val = to_categorical(y_val, num_classes)
y_test = to_categorical(y_test, num_classes)

# Define a function to create a neural network model
def create_model():
    model = Sequential()
    model.add(Dense(128, activation='relu', input_shape=(32*32*3,)))
    model.add(Dropout(0.2))  # Dropout for regularization
    model.add(Dense(64, activation='relu'))
    model.add(Dropout(0.2))  # Dropout for regularization
    model.add(Dense(10, activation='softmax'))  # Output layer for 10 classes
    return model

# Initialize SGD optimizer with a learning rate
sgd = SGD(learning_rate=0.1)

# Build the model
model = create_model()

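# Append another softmax layer.
# Note: create_model() already ends in a 10-unit softmax layer, so this stacks
# a second softmax on top of it (dense_18 and dense_19 in the summary).
# Drop this line to keep a single output layer.
model.add(Dense(num_classes, activation='softmax'))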

# Compile the model
model.compile(optimizer=sgd, loss=CategoricalCrossentropy(), metrics=['accuracy'])

# Early stopping callback to prevent overfitting
early_stopping = EarlyStopping(monitor='val_loss', patience=10, verbose=1, restore_best_weights=True)

# print a summary of the model
model.summary()

# Train the model
maxEpoch = 100
H = model.fit(x_train, y_train, epochs=maxEpoch, batch_size=32, validation_data=(x_val, y_val), callbacks=[early_stopping])

# Evaluate the model on the training and test sets.
# The labels are one-hot encoded, so convert both the predictions and the
# labels back to integer class ids with argmax before building the reports.
# Training set report
predictedY = np.argmax(model.predict(x_train), axis=1)
trainY = np.argmax(y_train, axis=1)

# Calculate classification report
print("Training set accuracy")
print(classification_report(trainY, predictedY))

# Print the Test Set Accuracy
predictedY = np.argmax(model.predict(x_test), axis=1)
testY = np.argmax(y_test, axis=1)

# Calculate classification report
print("Test set accuracy")
print(classification_report(testY, predictedY))

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_16 (Dense)            (None, 128)               393344    
                                                                 
 dropout_8 (Dropout)         (None, 128)               0         
                                                                 
 dense_17 (Dense)            (None, 64)                8256      
                                                                 
 dropout_9 (Dropout)         (None, 64)                0         
                                                                 
 dense_18 (Dense)            (None, 10)                650       
                                                                 
 dense_19 (Dense)            (None, 10)                110       
                                                                 
Total params: 402360 (1.53 MB)
Trainable params: 40236