# Lab 7: Deep Learning for Multiclass Classification
## Multiclass Classification for Covertype Dataset
This notebook demonstrates a step-by-step implementation of a neural network for multiclass classification using the Covertype dataset. The goal is to classify forest cover types based on input features.

In [10]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam

## Step 1: Load and Preprocess the Data
We load the Covertype dataset, split it into training and test sets, standardize the features, and one-hot encode the target labels for multiclass classification.

Classes: 7
<br/>Samples total: 581012
<br/>Features: 54
<br/>Features type: int

1. Elevation / quantitative / meters / Elevation in meters
2. Aspect / quantitative / azimuth / Aspect in degrees azimuth
3. Slope / quantitative / degrees / Slope in degrees
4. Horizontal_Distance_To_Hydrology / quantitative / meters / Horizontal distance to nearest surface water features
5. Vertical_Distance_To_Hydrology / quantitative / meters / Vertical distance to nearest surface water features
6. Horizontal_Distance_To_Roadways / quantitative / meters / Horizontal distance to nearest roadway
7. Hillshade_9am / quantitative / 0 to 255 index / Hillshade index at 9am, summer solstice
8. Hillshade_Noon / quantitative / 0 to 255 index / Hillshade index at noon, summer solstice
9. Hillshade_3pm / quantitative / 0 to 255 index / Hillshade index at 3pm, summer solstice
10. Horizontal_Distance_To_Fire_Points / quantitative / meters / Horizontal distance to nearest wildfire ignition points
11. Wilderness_Area (4 binary columns) / qualitative / 0 (absence) or 1 (presence) / Wilderness area designation
12. Soil_Type (40 binary columns) / qualitative / 0 (absence) or 1 (presence) / Soil Type designation
13. Cover_Type (7 types) / integer / 1 to 7 / Forest Cover Type designation
<br/>Target: Cover_Type
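
Since Cover_Type is coded 1 to 7, the labels are shifted to zero-based class indices before one-hot encoding. The following is a minimal sketch of that shift; the five values are the `target` entries shown by `df.head()` in this notebook, and the variable names are illustrative only.

```python
import numpy as np

# First five target values as shown by df.head() (Cover_Type is coded 1 to 7)
cover_type = np.array([5, 5, 2, 2, 5])

# Shift to zero-based class indices (0 to 6), the form one-hot encoders usually expect
class_index = cover_type - 1

# Adding 1 again recovers the original Cover_Type codes
restored = class_index + 1

print(class_index)  # [4 4 1 1 4]
print(restored)     # [5 5 2 2 5]
```

Keeping the inverse step in mind is useful later, when predicted class indices need to be reported as Cover_Type codes.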

In [11]:
# Load the Covertype dataset to a pandas DataFrame(df)
# Separate the features(X) and labels(y)
df = pd.read_csv('cover_dataset.csv')
X = df.drop(['target'],axis=1)
y = df['target']

# -------------------------------
# Your code here
# -------------------------------

# Convert labels to integers (from 1-7 to 0-6 for classification)
y = y - 1
# -------------------------------
# Your code here
# -------------------------------


In [12]:
# Show the first 5 rows of the dataset
df.head()

| | Elevation | Aspect | Slope | Horizontal_Distance_To_Hydrology | Vertical_Distance_To_Hydrology | Horizontal_Distance_To_Roadways | Hillshade_9am | Hillshade_Noon | Hillshade_3pm | Horizontal_Distance_To_Fire_Points | ... | Soil_Type32 | Soil_Type33 | Soil_Type34 | Soil_Type35 | Soil_Type36 | Soil_Type37 | Soil_Type38 | Soil_Type39 | Soil_Type40 | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.368684 | 0.141667 | 0.045455 | 0.184681 | 0.223514 | 0.071659 | 0.870079 | 0.913386 | 0.582677 | 0.875366 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |
| 1 | 0.365683 | 0.155556 | 0.030303 | 0.151754 | 0.215762 | 0.054798 | 0.866142 | 0.925197 | 0.594488 | 0.867838 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |
| 2 | 0.472736 | 0.386111 | 0.136364 | 0.19184 | 0.307494 | 0.446817 | 0.92126 | 0.937008 | 0.531496 | 0.853339 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 3 | 0.463232 | 0.430556 | 0.272727 | 0.173228 | 0.375969 | 0.434172 | 0.937008 | 0.937008 | 0.480315 | 0.865886 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 |
| 4 | 0.368184 | 0.125 | 0.030303 | 0.10952 | 0.222222 | 0.054939 | 0.866142 | 0.92126 | 0.590551 | 0.860449 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 5 |


## Split and Scale the Dataset
We split the dataset into training and test sets with a test size of 20% and scale the features using the StandardScaler.
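
To make the scaling step concrete, here is a minimal NumPy sketch of what standardization does: the mean and standard deviation are estimated on the training rows only and then reused for the test rows. The toy arrays and variable names are hypothetical stand-ins, not the Covertype data.

```python
import numpy as np

rng = np.random.default_rng(42)
# Toy data standing in for the real features (hypothetical values)
X_train_toy = rng.normal(loc=10.0, scale=3.0, size=(80, 2))
X_test_toy = rng.normal(loc=10.0, scale=3.0, size=(20, 2))

# "fit": estimate per-column mean and standard deviation from the training rows only
mean = X_train_toy.mean(axis=0)
std = X_train_toy.std(axis=0)  # population std (ddof=0), as StandardScaler uses

# "transform": apply the same training statistics to both splits
X_train_scaled = (X_train_toy - mean) / std
X_test_scaled = (X_test_toy - mean) / std

print(X_train_scaled.mean(axis=0))  # close to 0 for each column
print(X_train_scaled.std(axis=0))   # close to 1 for each column
```

Reusing the training statistics on the test split keeps information about the test data out of preprocessing, which is why the notebook calls `fit_transform` on the training set and only `transform` on the test set.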

In [13]:
# Split into training and test sets using 80% training and 20% testing
# Set random_state to 42 for reproducibility
# Use train_test_split function from scikit-learn
# -------------------------------
# Your code here
# -------------------------------
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the feature data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)


## Encode the Target Labels
We one-hot encode the target labels with the `to_categorical` function from `tensorflow.keras.utils`, turning each class index (0 to 6) into a 7-element indicator vector.
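
As an illustration of the encoding itself, the sketch below builds one-hot vectors with plain NumPy. The helper `one_hot` is a hypothetical stand-in written for this example, not the Keras implementation, and the label values are made up.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a 1.0 at each label's index."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype=np.float32)[labels]

# Zero-based labels (0 to 6), hypothetical example values
encoded = one_hot([4, 1, 0, 6], num_classes=7)

print(encoded)
print(encoded.argmax(axis=1))  # argmax inverts the encoding: [4 1 0 6]
```

Each row contains exactly one 1, so taking the argmax of a row recovers the original class index.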

In [14]:
# One-hot encode the labels using to_categorical function from Keras
y_train = to_categorical(y_train, num_classes=7)
y_test = to_categorical(y_test, num_classes=7)
# -------------------------------
# Your code here
# -------------------------------


## Step 2: Build the Model
We define a neural network with multiple dense layers and dropout layers to prevent overfitting. The output layer uses a softmax activation for multiclass classification.
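
The softmax activation turns the 7 raw output scores into probabilities that sum to 1. A minimal NumPy sketch follows; the `softmax` helper and the logit values are illustrative assumptions, not taken from the trained model.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax along the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)  # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical raw scores (logits) for one sample over 7 classes
logits = np.array([2.0, 1.0, 0.1, -1.0, 0.0, -2.0, 0.5])
probs = softmax(logits)

print(probs.round(3))
print(probs.sum())
```

Subtracting the maximum before exponentiating does not change the result, because softmax is invariant to adding a constant to every logit.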

In [15]:
# Define the model
# Complete the code to build a Sequential model
# The model should have 4 Dense layers with 256, 128, 64, and 7 units
# Use 'relu' activation function for the first 3 layers and 'softmax' for the last layer
# Add Dropout layers with 0.3 dropout rate after the first 2 Dense layers

model = Sequential([
    Dense(256, activation='relu', input_shape=(X_train.shape[1],)),
    Dropout(0.3),
    Dense(128, activation='relu'),
    Dropout(0.3),
    Dense(64, activation='relu'),
    Dense(7, activation='softmax')
])
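
As a quick sanity check on model size, the number of trainable parameters can be worked out by hand. The sketch below assumes 54 input features, as listed in the dataset summary; the actual input width is whatever `X_train.shape[1]` is at run time. The helper names are hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

def count_params(n_features, layer_units=(256, 128, 64, 7)):
    """Total trainable parameters of the Dense stack; Dropout layers add none."""
    total, n_in = 0, n_features
    for n_out in layer_units:
        total += dense_params(n_in, n_out)
        n_in = n_out
    return total

# Assumption: 54 input features
print(count_params(54))  # 55687
```

The result can be compared against the total reported by `model.summary()`.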




## Step 3: Compile the Model
We compile the model using the Adam optimizer with a learning rate of 0.001, categorical crossentropy loss, and accuracy as the evaluation metric.
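
For intuition about the loss, categorical crossentropy for one sample is the negative log of the probability assigned to the true class, averaged over the batch. The following is a minimal NumPy sketch with made-up 3-class values; the helper is illustrative and not the Keras implementation.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)  # guard against log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Hypothetical example: two samples, one-hot labels and predicted probabilities
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])

loss = categorical_crossentropy(y_true, y_pred)
print(round(loss, 4))
```

Here the loss is the average of -ln(0.7) and -ln(0.8); it shrinks toward 0 as the probability of the true class approaches 1.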

In [16]:
# Compile the model
# Use Adam optimizer with learning rate of 0.001
optimizer = Adam(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

# -------------------------------
# Your code here
# -------------------------------

## Step 4: Train the Model
We train the model on the training data for 20 epochs, using a batch size of 64 and validating on the test set.
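
The number of steps per epoch follows from the split sizes and the batch size. The arithmetic below is a sketch that assumes the 20% test split is rounded up and the remaining rows go to training; under that assumption it reproduces the `7263/7263` step count shown in the training log.

```python
import math

n_samples = 581012      # total rows, from the dataset summary
test_fraction = 0.2
batch_size = 64

# Assumption: the test split is rounded up and the remainder goes to training
n_test = math.ceil(n_samples * test_fraction)
n_train = n_samples - n_test

# The last, partially filled batch still counts as one step
steps_per_epoch = math.ceil(n_train / batch_size)

print(n_train, n_test, steps_per_epoch)  # 464809 116203 7263
```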

In [21]:
# Train the model
# Use 20 epochs and a batch size of 64
# Use the training and test sets
# Save the training history to a variable(history)
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=20, batch_size=64)

# -------------------------------
# Your code here
# -------------------------------

Epoch 1/20
7263/7263 ━━━━━━━━━━━━━━━━━━━━ 21s 3ms/step - accuracy: 0.8537 - loss: 0.3561 - val_accuracy: 0.8896 - val_loss: 0.2762
Epoch 2/20
7263/7263 ━━━━━━━━━━━━━━━━━━━━ 21s 3ms/step - accuracy: 0.8538 - loss: 0.3556 - val_accuracy: 0.8907 - val_loss: 0.2770
Epoch 3/20
7263/7263 ━━━━━━━━━━━━━━━━━━━━ 22s 3ms/step - accuracy: 0.8546 - loss: 0.3538 - val_accuracy: 0.8923 - val_loss: 0.2738
Epoch 4/20
7263/7263 ━━━━━━━━━━━━━━━━━━━━ 20s 3ms/step - accuracy: 0.8556 - loss: 0.3533 - val_accuracy: 0.8910 - val_loss: 0.2736
Epoch 5/20
7263/7263 ━━━━━━━━━━━━━━━━━━━━ 26s 4ms/step - accuracy: 0.8562 - loss: 0.3512 - val_accuracy: 0.8926 - val_loss: 0.2697
Epoch 6/20
7263/7263 ━━━━━━━━━━━━━━━━━━━━ 20s 3ms/step - accuracy: 0.8568 - loss: 0.3490 - val_accuracy: 0.8927 - val_loss: 0.2694
Epoch 7/20

## Step 5: Evaluate the Model
We evaluate the model on the test set and print the test accuracy.

In [18]:
# Evaluate the model
test_loss, test_accuracy = model.evaluate(X_test, y_test, verbose=2)
print(f"Test accuracy: {test_accuracy:.2f}")


3632/3632 - 5s - 1ms/step - accuracy: 0.8882 - loss: 0.2792
Test accuracy: 0.89
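
The accuracy metric counts how often the most probable class matches the true class. A minimal NumPy sketch of that definition, using a made-up 3-class example and a hypothetical `accuracy` helper:

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    """Fraction of rows where the most probable class equals the true class."""
    true_cls = np.argmax(y_true_one_hot, axis=1)
    pred_cls = np.argmax(y_prob, axis=1)
    return float(np.mean(true_cls == pred_cls))

# Hypothetical example with four samples
y_true = np.eye(3)[[0, 1, 2, 1]]
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.3, 0.4, 0.3],   # wrong: predicts class 1, true class is 2
                   [0.1, 0.6, 0.3]])

print(accuracy(y_true, y_prob))  # 0.75
```

Accuracy ignores how confident each prediction is, which is why the loss is reported alongside it.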


## Step 6: Make Predictions
We use the trained model to make predictions for the first 10 samples in the test set.

In [19]:
# Make predictions
# Use the first 10 test data points to make predictions(predictions)
predictions = model.predict(X_test[:10])

# -------------------------------
# Your code here
# -------------------------------

# Show the predicted probabilities
print('\n Predicted Probabilities:')
print(f"{'Class 0':<10}{'Class 1':<10}{'Class 2':<10}{'Class 3':<10}{'Class 4':<10}{'Class 5':<10}{'Class 6':<10}")
for pred_prob in predictions:
    print(f"{pred_prob[0]:<10.3}{pred_prob[1]:<10.3}{pred_prob[2]:<10.3}{pred_prob[3]:<10.3}{pred_prob[4]:<10.3}{pred_prob[5]:<10.3}{pred_prob[6]:<10.3}")

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 98ms/step

 Predicted Probabilities:
Class 0   Class 1   Class 2   Class 3   Class 4   Class 5   Class 6   
0.768     0.00025   1.14e-15  5.06e-21  4.64e-11  6.44e-16  0.232     
0.169     0.789     0.00428   0.000147  0.011     0.0117    0.0152    
0.00102   0.99      0.00201   6.45e-10  0.00482   0.00232   5.15e-09  
0.236     0.764     9.32e-11  3.74e-12  2.14e-06  2.14e-08  9.63e-08  
0.00606   0.799     0.000636  3.4e-06   0.194     0.000163  3.2e-05   
5.3e-11   4.4e-06   0.999     1.51e-06  2.6e-10   0.000554  3.64e-23  
0.0508    0.949     5.7e-08   1.01e-09  1.44e-06  8.69e-09  1.69e-05  
0.993     0.00176   4.18e-13  2.27e-18  3.06e-08  4.63e-12  0.00569   
0.237     0.763     3.14e-08  2.4e-11   1.06e-05  1.47e-06  7.81e-05  
0.0174    0.978     1.29e-05  2.07e-08  0.00427   0.000185  2.66e-06  


## Step 7: Interpret the Predictions
We convert the predicted probabilities into class labels using `np.argmax`, and compare them with the true labels.

In [20]:
# Interpret the predictions
# Use np.argmax() to get the predicted class labels from the predicted probabilities
predicted_labels = np.argmax(predictions, axis=1)
true_labels = np.argmax(y_test[:10], axis=1)

# -------------------------------
# Your code here
# -------------------------------


# Show the predicted and true labels
print("Predicted labels:", predicted_labels)
print("True labels:", true_labels)


Predicted labels: [0 1 1 1 1 2 1 0 1 1]
True labels: [0 1 1 1 1 2 1 0 1 1]
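
To report predictions in the dataset's original coding, the zero-based class index can be shifted back to Cover_Type (1 to 7). The sketch below uses three rows copied from the probability table printed in Step 6 (rows 1, 2 and 6); the variable names are illustrative.

```python
import numpy as np

# Three rows copied from the printed probability table (rows 1, 2 and 6)
probs = np.array([
    [0.768,   0.00025, 1.14e-15, 5.06e-21, 4.64e-11, 6.44e-16, 0.232],
    [0.169,   0.789,   0.00428,  0.000147, 0.011,    0.0117,   0.0152],
    [5.3e-11, 4.4e-06, 0.999,    1.51e-06, 2.6e-10,  0.000554, 3.64e-23],
])

class_index = np.argmax(probs, axis=1)  # zero-based classes used by the model
confidence = probs.max(axis=1)          # probability assigned to the chosen class
cover_type = class_index + 1            # back to the dataset's 1 to 7 coding

for idx, conf, ct in zip(class_index, confidence, cover_type):
    print(f"class {idx} (Cover_Type {ct}), probability {conf:.3f}")
```

The resulting class indices 0, 1 and 2 agree with the corresponding entries of the predicted labels printed above.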
