# Lab 7: Deep Learning for Multiclass Classification
## Multiclass Classification for Covertype Dataset
This notebook demonstrates a step-by-step implementation of a neural network for multiclass classification using the Covertype dataset. The goal is to classify forest cover types based on input features.

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam


## Step 1: Load and Preprocess the Data
We load the Covertype dataset, split it into training and test sets, standardize the features, and one-hot encode the target labels for multiclass classification.

Classes: 7
<br/>Samples total: 581012
<br/>Features: 54
<br/>Features type: int

1. Elevation / quantitative / meters / Elevation in meters
2. Aspect / quantitative / azimuth / Aspect in degrees azimuth
3. Slope / quantitative / degrees / Slope in degrees
4. Horizontal_Distance_To_Hydrology / quantitative / meters / Horz Dist to nearest surface water features
5. Vertical_Distance_To_Hydrology / quantitative / meters / Vert Dist to nearest surface water features
6. Horizontal_Distance_To_Roadways / quantitative / meters / Horz Dist to nearest roadway
7. Hillshade_9am / quantitative / 0 to 255 index / Hillshade index at 9am, summer solstice
8. Hillshade_Noon / quantitative / 0 to 255 index / Hillshade index at noon, summer solstice
9. Hillshade_3pm / quantitative / 0 to 255 index / Hillshade index at 3pm, summer solstice
10. Horizontal_Distance_To_Fire_Points / quantitative / meters / Horz Dist to nearest wildfire ignition points
11. Wilderness_Area (4 binary columns) / qualitative / 0 (absence) or 1 (presence) / Wilderness area designation
12. Soil_Type (40 binary columns) / qualitative / 0 (absence) or 1 (presence) / Soil Type designation
13. Cover_Type (7 types) / integer / 1 to 7 / Forest Cover Type designation
<br/>Target: Cover_Type

In [2]:
# Load the Covertype dataset into a pandas DataFrame (df)
# Separate the features (X) and labels (y)

# -------------------------------
# Your code here
# -------------------------------

# Convert labels to integers (from 1-7 to 0-6 for classification)
# -------------------------------
# Your code here
# -------------------------------
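As a rough illustration of the two steps in this cell, the sketch below builds a tiny stand-in DataFrame and separates features from labels. The values and the name `toy_df` are made up; only the `target` column name is taken from the table that `df.head()` prints in this lab. It assumes the raw labels run from 1 to 7, as in the dataset description, so subtracting 1 maps them to 0-6.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset: two feature columns and a 1-7 label.
toy_df = pd.DataFrame({
    "Elevation": [0.37, 0.47, 0.46, 0.36],
    "Slope": [0.05, 0.14, 0.27, 0.03],
    "target": [5, 2, 2, 7],
})

# Features are every column except the label column.
X_toy = toy_df.drop(columns=["target"]).values
# Shift labels from 1-7 to 0-6 so they can be used as class indices.
y_toy = toy_df["target"].astype(int).values - 1

print(X_toy.shape)  # (4, 2)
print(y_toy)        # [4 1 1 6]
```

If the file you load already stores labels as 0-6, the subtraction should be skipped; checking `df["target"].min()` first is a cheap safeguard. The `Unnamed: 0` column in the printed table looks like a saved row index, and if that is the case it probably should not be used as a feature.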

In [3]:
# Show the first 5 rows of the dataset
df.head()

Unnamed: 0,Elevation,Aspect,Slope,Horizontal_Distance_To_Hydrology,Vertical_Distance_To_Hydrology,Horizontal_Distance_To_Roadways,Hillshade_9am,Hillshade_Noon,Hillshade_3pm,Horizontal_Distance_To_Fire_Points,...,Soil_Type32,Soil_Type33,Soil_Type34,Soil_Type35,Soil_Type36,Soil_Type37,Soil_Type38,Soil_Type39,Soil_Type40,target
0,0.368684,0.141667,0.045455,0.184681,0.223514,0.071659,0.870079,0.913386,0.582677,0.875366,...,0,0,0,0,0,0,0,0,0,5
1,0.365683,0.155556,0.030303,0.151754,0.215762,0.054798,0.866142,0.925197,0.594488,0.867838,...,0,0,0,0,0,0,0,0,0,5
2,0.472736,0.386111,0.136364,0.19184,0.307494,0.446817,0.92126,0.937008,0.531496,0.853339,...,0,0,0,0,0,0,0,0,0,2
3,0.463232,0.430556,0.272727,0.173228,0.375969,0.434172,0.937008,0.937008,0.480315,0.865886,...,0,0,0,0,0,0,0,0,0,2
4,0.368184,0.125,0.030303,0.10952,0.222222,0.054939,0.866142,0.92126,0.590551,0.860449,...,0,0,0,0,0,0,0,0,0,5


## Split and Scale the Dataset
We split the dataset into training and test sets with a test size of 20% and scale the features using the StandardScaler.

In [4]:
# Split into training and test sets using 80% training and 20% testing
# Set random_state to 42 for reproducibility
# Use train_test_split function from scikit-learn
# -------------------------------
# Your code here
# -------------------------------

# Standardize the feature data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
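The scaler is fitted on the training split only and then reused on the test split. The NumPy sketch below shows the arithmetic this corresponds to (per-feature mean and population standard deviation). The arrays are made-up toy values, and the code is meant as an illustration of the idea rather than a replacement for `StandardScaler`.

```python
import numpy as np

# Toy feature matrices (made-up values): 4 training rows, 2 test rows, 2 features.
X_tr = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
X_te = np.array([[2.5, 25.0], [5.0, 50.0]])

# "fit": statistics come from the training split only.
mean = X_tr.mean(axis=0)
std = X_tr.std(axis=0)  # population standard deviation (ddof=0)

# "transform": the same training statistics are applied to both splits.
X_tr_scaled = (X_tr - mean) / std
X_te_scaled = (X_te - mean) / std

print(X_tr_scaled.mean(axis=0))  # approximately zero for every feature
print(X_te_scaled[0])            # zeros, because 2.5 and 25.0 equal the training means
```

Reusing the training statistics keeps information about the test distribution out of preprocessing, which is why the test split is passed to `transform` and not to `fit_transform`.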


## Encode the Target Labels
We one-hot encode the target labels using the `to_categorical` function from `tensorflow.keras.utils`.

In [5]:
# One-hot encode the labels using to_categorical function from Keras
# -------------------------------
# Your code here
# -------------------------------
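To see what one-hot encoding produces, here is a NumPy-only sketch on made-up labels. For integer labels in the range 0-6, `to_categorical(labels, num_classes=7)` should yield the same pattern of zeros and ones.

```python
import numpy as np

num_classes = 7
labels = np.array([4, 1, 1, 6])  # made-up integer labels in the range 0-6

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes)[labels]

print(one_hot.shape)  # (4, 7)
print(one_hot[0])     # [0. 0. 0. 0. 1. 0. 0.]
```

Each row contains a single 1 in the column of its class, which is the target format that categorical crossentropy expects.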


## Step 2: Build the Model
We define a neural network with multiple dense layers and dropout layers to prevent overfitting. The output layer uses a softmax activation for multiclass classification.

In [6]:
# Define the model
# Complete the code to build a Sequential model
# The model should have 4 Dense layers with 256, 128, 64, and 7 units
# Use 'relu' activation function for the first 3 layers and 'softmax' for the last layer
# Add Dropout layers with 0.3 dropout rate after the first 2 Dense layers

model = Sequential([
    Dense(..., activation=..., input_shape=(...,)),
    Dropout(...),
    Dense(..., activation=...),
    Dropout(...),
    Dense(..., activation=...),
    Dense(..., activation=...)
])
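A quick way to sanity-check the finished architecture is to count its trainable parameters by hand and compare the total with `model.summary()`. The sketch below assumes 54 input features (the count given in the dataset description) and the layer widths from the instructions in this cell. Dropout layers add no parameters.

```python
# Layer widths: 54 input features, then the four Dense layers from the instructions.
layer_sizes = [54, 256, 128, 64, 7]

total = 0
for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    params = n_in * n_out + n_out  # weights plus one bias per unit
    print(f"Dense {n_in} -> {n_out}: {params} parameters")
    total += params

print("Total trainable parameters:", total)  # 55687
```

If your feature matrix has a different number of columns, for example because an index column was kept, the first layer's count and the total will differ accordingly.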



## Step 3: Compile the Model
We compile the model using the Adam optimizer with a learning rate of 0.001, categorical crossentropy loss, and accuracy as the evaluation metric.

In [7]:
# Compile the model
# Use Adam optimizer with learning rate of 0.001

# -------------------------------
# Your code here
# -------------------------------
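For intuition about the loss named above, the following NumPy sketch computes softmax probabilities and categorical crossentropy by hand. The function names, the `eps` clipping value, and the toy scores are illustrative assumptions, not the Keras internals.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true_one_hot, probs, eps=1e-7):
    # Mean over samples of -log(probability assigned to the true class).
    probs = np.clip(probs, eps, 1.0)
    return float(-np.mean(np.sum(y_true_one_hot * np.log(probs), axis=1)))

logits = np.array([[2.0, 0.5, 0.1], [0.2, 0.2, 3.0]])  # made-up scores, 3 classes
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # one-hot labels

probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=1))  # each row sums to 1
print(round(loss, 4))
```

A useful reference point when reading the training log: a model that predicts a uniform distribution over 7 classes has a loss of ln 7, roughly 1.946, so losses well below that indicate the network has learned something.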

## Step 4: Train the Model
We train the model on the training data for 20 epochs, using a batch size of 32 and validating on the test set.

In [8]:
# Train the model
# Use 20 epochs and a batch size of 32
# Use the training set for fitting and the test set as validation data
# Save the training history to a variable (history)

# -------------------------------
# Your code here
# -------------------------------
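The step counts in the training and evaluation logs of this lab (14526 and 3632) can be reproduced with a little arithmetic, which is a handy check on the batch size actually in use. The sketch assumes the test split size is rounded up, as `train_test_split` does for a fractional `test_size`, and that evaluation uses a batch size of 32.

```python
import math

n_samples = 581012   # total samples, from the dataset description
test_fraction = 0.2
batch_size = 32

# Assumption: the test split is rounded up for a fractional test size.
n_test = math.ceil(test_fraction * n_samples)
n_train = n_samples - n_test

# The last batch of an epoch may be smaller, so round up.
steps_per_epoch = math.ceil(n_train / batch_size)
eval_steps = math.ceil(n_test / batch_size)

print(n_train, n_test)   # 464809 116203
print(steps_per_epoch)   # 14526
print(eval_steps)        # 3632
```

With a batch size of 64, the same calculation gives 7263 steps per epoch, so a log showing 14526 steps corresponds to a batch size of 32.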

Epoch 1/20
14526/14526 ━━━━━━━━━━━━━━━━━━━━ 11s 675us/step - accuracy: 0.7247 - loss: 0.6464 - val_accuracy: 0.8049 - val_loss: 0.4654
Epoch 2/20
14526/14526 ━━━━━━━━━━━━━━━━━━━━ 8s 526us/step - accuracy: 0.7869 - loss: 0.5038 - val_accuracy: 0.8251 - val_loss: 0.4211
Epoch 3/20
14526/14526 ━━━━━━━━━━━━━━━━━━━━ 8s 538us/step - accuracy: 0.8038 - loss: 0.4644 - val_accuracy: 0.8383 - val_loss: 0.3933
Epoch 4/20
14526/14526 ━━━━━━━━━━━━━━━━━━━━ 8s 532us/step - accuracy: 0.8142 - loss: 0.4440 - val_accuracy: 0.8482 - val_loss: 0.3681
Epoch 5/20
14526/14526 ━━━━━━━━━━━━━━━━━━━━ 8s 528us/step - accuracy: 0.8191 - loss: 0.4315 - val_accuracy: 0.8500 - val_loss: 0.3640
Epoch 6/20
14526/14526 ━━━━━━━━━━━━━━━━━━━━ 8s 521us/step - accuracy: 0.8234 - loss: 0.4228 - val_accuracy: 0.8601 - val_loss: 0.3477
...

## Step 5: Evaluate the Model
We evaluate the model on the test set and print the test accuracy.

In [9]:
# Evaluate the model
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=2)
print(f"Test accuracy: {test_accuracy:.2f}")


3632/3632 - 1s - 311us/step - accuracy: 0.8841 - loss: 0.2920
Test accuracy: 0.88
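The reported accuracy is the fraction of samples whose most probable predicted class matches the true class. A small NumPy sketch on made-up probabilities and labels shows the computation; the array names here are illustrative only.

```python
import numpy as np

# Made-up predicted probabilities for 4 samples and 3 classes.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
])
# One-hot true labels for classes 0, 1, 0, 2.
y_true = np.eye(3)[[0, 1, 0, 2]]

# A prediction counts as correct when the most probable class matches the true class.
correct = np.argmax(probs, axis=1) == np.argmax(y_true, axis=1)
accuracy = correct.mean()
print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 0.75
```

The third sample is the only miss: the model favors class 1 while the true class is 0.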


## Step 6: Make Predictions
We use the trained model to make predictions for the first 10 samples in the test set.

In [10]:
# Make predictions
# Use the first 10 test samples to make predictions (store them in predictions)

# -------------------------------
# Your code here
# -------------------------------

# Show the predicted probabilities
print('\n Predicted Probabilities:')
print(f"{'Class 0':<10}{'Class 1':<10}{'Class 2':<10}{'Class 3':<10}{'Class 4':<10}{'Class 5':<10}{'Class 6':<10}")
for pred_prob in predictions:
    print(f"{pred_prob[0]:<10.3}{pred_prob[1]:<10.3}{pred_prob[2]:<10.3}{pred_prob[3]:<10.3}{pred_prob[4]:<10.3}{pred_prob[5]:<10.3}{pred_prob[6]:<10.3}")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 487ms/step

 Predicted Probabilities:
Class 0   Class 1   Class 2   Class 3   Class 4   Class 5   Class 6   
0.907     0.00306   4.13e-17  3.96e-22  4.69e-06  2.53e-18  0.0896    
0.21      0.774     0.00272   0.000152  0.00491   0.00626   0.00257   
0.00751   0.906     0.00867   5.33e-06  0.0249    0.0532    1.34e-05  
0.0434    0.957     2.68e-13  4.85e-15  1.78e-06  9.85e-11  8.59e-08  
4.71e-06  0.908     3.09e-05  2.79e-24  0.0919    4.15e-07  3.02e-08  
4e-15     3.52e-06  0.995     2.93e-06  5.7e-10   0.00504   8.72e-34  
0.0626    0.937     6.56e-17  4.4e-20   1.03e-07  2.68e-13  2.97e-08  
0.994     0.00333   3.43e-13  1.22e-14  7.85e-06  2.08e-13  0.00241   
0.0888    0.911     1.09e-12  4.05e-22  7.03e-06  6.09e-11  1.63e-05  
0.00226   0.991     1.28e-07  2.02e-09  0.00711   1.57e-05  1.88e-08  


## Step 7: Interpret the Predictions
We convert the predicted probabilities into class labels using `np.argmax`, and compare them with the true labels.

In [11]:
# Interpret the predictions
# Use np.argmax() to get the predicted class labels from the predicted probabilities

# -------------------------------
# Your code here
# -------------------------------


# Show the predicted and true labels
print("Predicted labels:", predicted_labels)
print("True labels:", true_labels)


Predicted labels: [0 1 1 1 1 2 1 0 1 1]
True labels: [0 1 1 1 1 2 1 0 1 1]
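As a worked example of the `np.argmax` step, the sketch below uses rows 1, 2, and 6 of the probability table printed in Step 6, together with one-hot versions of the true labels printed for those samples (0, 1, and 2). The variable names are chosen for illustration and are not the ones the cell expects.

```python
import numpy as np

# Rows 1, 2, and 6 of the probability table printed in Step 6.
sample_probs = np.array([
    [0.907, 0.00306, 4.13e-17, 3.96e-22, 4.69e-06, 2.53e-18, 0.0896],
    [0.21, 0.774, 0.00272, 0.000152, 0.00491, 0.00626, 0.00257],
    [4e-15, 3.52e-06, 0.995, 2.93e-06, 5.7e-10, 0.00504, 8.72e-34],
])
# One-hot true labels for the same three samples (classes 0, 1, 2).
sample_true_one_hot = np.eye(7)[[0, 1, 2]]

# axis=1 picks the most probable class within each row.
sample_pred_labels = np.argmax(sample_probs, axis=1)
sample_true_labels = np.argmax(sample_true_one_hot, axis=1)

print("Predicted labels:", sample_pred_labels)  # Predicted labels: [0 1 2]
print("True labels:", sample_true_labels)       # True labels: [0 1 2]
```

These labels are in the shifted 0-6 range; adding 1 recovers the original Cover_Type codes 1-7.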
