<a href="https://colab.research.google.com/github/ashleybrea/05-AIT-HW/blob/main/04_DeepLearning_HW_AshleyBrea.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Objectives:
Design and implement an MLP using Keras that incorporates both residual and additional skip connections. Your model will be trained to perfectly overfit a single batch (batch size = 128) from a large dataset while performing poorly on validation data. Additionally, you will visualize your network architecture using the Netron app and include the exported diagram. The final submission must be uploaded to GitHub, and the submission text must start with the GitHub links.

# Dataset Prep & Processing


Use a large dataset such as the UCI Covertype Dataset, but you can use your own (e.g. from project work).

Preprocess the data by:

* Handling missing values.

*   Normalizing numerical features.
*   Encoding categorical variables.
*   Split the dataset into training and validation sets.



In [1]:
pip install ucimlrepo

Collecting ucimlrepo
  Downloading ucimlrepo-0.0.7-py3-none-any.whl.metadata (5.5 kB)
Downloading ucimlrepo-0.0.7-py3-none-any.whl (8.0 kB)
Installing collected packages: ucimlrepo
Successfully installed ucimlrepo-0.0.7


In [2]:
from ucimlrepo import fetch_ucirepo
import pandas as pd

# fetch dataset
covertype = fetch_ucirepo(id=31)

# data (as pandas dataframes)
X = covertype.data.features
y = covertype.data.targets

# metadata
print(covertype.metadata)

# variable information
print(covertype.variables)

{'uci_id': 31, 'name': 'Covertype', 'repository_url': 'https://archive.ics.uci.edu/dataset/31/covertype', 'data_url': 'https://archive.ics.uci.edu/static/public/31/data.csv', 'abstract': 'Classification of pixels into 7 forest cover types based on attributes such as elevation, aspect, slope, hillshade, soil-type, and more.', 'area': 'Biology', 'tasks': ['Classification'], 'characteristics': ['Multivariate'], 'num_instances': 581012, 'num_features': 54, 'feature_types': ['Categorical', 'Integer'], 'demographics': [], 'target_col': ['Cover_Type'], 'index_col': None, 'has_missing_values': 'no', 'missing_values_symbol': None, 'year_of_dataset_creation': 1998, 'last_updated': 'Sat Mar 16 2024', 'dataset_doi': '10.24432/C50K5N', 'creators': ['Jock Blackard'], 'intro_paper': None, 'additional_info': {'summary': 'Predicting forest cover type from cartographic variables only (no remotely sensed data).  The actual forest cover type for a given observation (30 x 30 meter cell) was determined from

In [4]:
# HANDLING MISSING VALUES - the website states that there are no missing values

# Split the dataset
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=42, shuffle=True)

# Normalizing Numerical Values - only our X in this case
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
X = scaler.fit_transform(X)

# Encoding Categorical Variables - our target
from tensorflow.keras.utils import to_categorical
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
y_train["Cover_Type"] = to_categorical(y_train["Cover_Type"]-1, num_classes=7)
y_val["Cover_Type"] = to_categorical(y_val["Cover_Type"]-1, num_classes=7)


# Model Architecture

*keep the number of trainable parameters as low as possible. Define the following neural network:

**Initial Layers**: Build an MLP in Keras to process the input features.

**Custom Residual Block**:

*   Using the Keras Functional API, create a block with at least two Dense layers with ReLU activations.
*   Implement a residual connection by adding the block’s input to its output (apply a linear projection with an extra Dense layer if the dimensions differ).

**Additional Skip Connection:**

* Implement an extra skip connection that bypasses one or more intermediate layers outside the residual block.

**Final Layers:**
* Add further Dense layers.
* Include an output layer appropriate for the task (e.g., a single unit with sigmoid activation for binary classification).*

In [5]:
from ast import mod
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input, Add

from keras import layers
from keras import activations

# creating custom residual block with residual connection
# we have 54 features, e.g. inputs
inputs = Input(shape=(54,))

layer1 = Dense(7, activation='relu')(inputs)
layer2 = Dense(7, activation='relu')(layer1)
layer3 = Dense(7, activation='relu')(layer2)
layer4 = Dense(7, activation='relu')(layer3)

residual_connection = Add()([layer1, layer4])

layer5 = Dense(7, activation='relu')(residual_connection)

# additional skip connection and final layers
skip_connection = Add()([layer5, layer1])

layer6 = Dense(7, activation='relu')(skip_connection)

layer6 = Dense(7, activation='relu')(layer6)

# output layer
output = Dense(7, activation='softmax')(layer6)
model = Model(inputs=inputs, outputs=output)
model.summary()
model.save('Assmt05_ashley_brea_overfitting.h5')




# Visualization


* Save your complete model (e.g., as a .h5 file or in JSON format).
* Open the saved model in the Netron app (https://netron.app/) and export the network diagram as an image.
* Ensure that the exported image clearly shows all parts of your architecture, including both residual and skip connections.



# Training & Evaluation



**Overfitting Experiment:**
*   Select a single batch of 128 samples from the training set.
* Train your model exclusively on this batch until you approach 0 loss.

**Validation Check:**
* Evaluate the overfitted model on the validation set to confirm that it performs poorly, demonstrating a lack of generalization.

**Conclusions:**
* At the end of your code, print the following information:
* Number of parameters:
* Final training loss:
* Final validation loss:



In [7]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

In [8]:
# 128 samples
x_sample, y_sample = X_train[:128], y_train[:128]

model_train = model.fit(x_sample, y_sample,
                        validation_data=(X_val, y_val),
                        epochs=1000,
                        batch_size=128,
                        verbose=2)

Epoch 1/1000
1/1 - 5s - 5s/step - accuracy: 0.1016 - loss: 1.7671 - val_accuracy: 0.0965 - val_loss: 1.7409
Epoch 2/1000
1/1 - 3s - 3s/step - accuracy: 0.1094 - loss: 1.7577 - val_accuracy: 0.1213 - val_loss: 1.7314
Epoch 3/1000
1/1 - 5s - 5s/step - accuracy: 0.1172 - loss: 1.7483 - val_accuracy: 0.1289 - val_loss: 1.7217
Epoch 4/1000
1/1 - 3s - 3s/step - accuracy: 0.1250 - loss: 1.7388 - val_accuracy: 0.1366 - val_loss: 1.7120
Epoch 5/1000
1/1 - 3s - 3s/step - accuracy: 0.1406 - loss: 1.7293 - val_accuracy: 0.1451 - val_loss: 1.7023
Epoch 6/1000
1/1 - 5s - 5s/step - accuracy: 0.1484 - loss: 1.7198 - val_accuracy: 0.1628 - val_loss: 1.6925
Epoch 7/1000
1/1 - 5s - 5s/step - accuracy: 0.1719 - loss: 1.7102 - val_accuracy: 0.1803 - val_loss: 1.6827
Epoch 8/1000
1/1 - 3s - 3s/step - accuracy: 0.1797 - loss: 1.7006 - val_accuracy: 0.1896 - val_loss: 1.6729
Epoch 9/1000
1/1 - 5s - 5s/step - accuracy: 0.1797 - loss: 1.6910 - val_accuracy: 0.1977 - val_loss: 1.6630
Epoch 10/1000
1/1 - 5s - 5s/

In [13]:

preds = model(X_val)
crt = 0

for pd, label in zip(preds, y_val):
    if np.argmax(pd) == np.argmax(label):
        crt += 1

print("correct num:", crt)
print("validation accuracy", crt/len(y_val))


print("number correct:", correct)
print("validation accuracy:", correct/len(y_val))

number correct: 0
validation accuracy: 0.0


# *Conclusion*

Number of Parameters: 777

Final Training Loss: 0.0606

Final Validation Loss: 5.7411