# Nicholas Thomson
### Landmine Detection Course Project

The dataset comes from the UCI Machine Learning Repository: https://archive.ics.uci.edu/dataset/763/land+mines-1

It contains the measurements a sensor would use to detect whether a landmine is present and, if so, which type of landmine it is.

### Import Packages

In [1]:
!pip install ucimlrepo imbalanced-learn



In [2]:
from ucimlrepo import fetch_ucirepo 
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from torch.utils.data import TensorDataset, DataLoader

### Load the dataset

In [3]:
# fetch dataset 
land_mines = fetch_ucirepo(id=763) 
  
# data (as pandas dataframes) 
X = land_mines.data.features 
y = land_mines.data.targets

In [4]:
X.head()

|   | V | H | S |
|---|---|---|---|
| 0 | 0.338157 | 0.0 | 0.0 |
| 1 | 0.320241 | 0.181818 | 0.0 |
| 2 | 0.287009 | 0.272727 | 0.0 |
| 3 | 0.256284 | 0.454545 | 0.0 |
| 4 | 0.26284 | 0.545455 | 0.0 |


In [5]:
X['S'].unique()

array([0. , 0.6, 0.2, 0.8, 0.4, 1. ])

In [6]:
# Print the observed range of each feature
print(f"Voltage(V) range: {X['V'].min()} to {X['V'].max()}")
print(f"Height(H) range: {X['H'].min()} to {X['H'].max()}")
print(f"Soil(S) range: {X['S'].min()} to {X['S'].max()}")

Voltage(V) range: 0.197733879 to 0.999998728
Height(H) range: 0.0 to 1.0
Soil(S) range: 0.0 to 1.0


The voltage, height, and soil variables all take values between 0 and 1, which means the dataset has already been scaled to the [0, 1] range.
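
Values confined to [0, 1] are what min-max scaling produces: x' = (x - min) / (max - min). As a minimal sketch of that transform (the raw heights are made-up numbers, not taken from the dataset, and `min_max_scale` is a hypothetical helper):

```python
def min_max_scale(values):
    """Rescale numbers linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant column")
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical raw sensor heights (illustrative only, not from the dataset).
raw_heights = [0.0, 5.0, 10.0, 20.0]
scaled = min_max_scale(raw_heights)
print(scaled)
```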

V = voltage: output voltage value of the FLC sensor due to magnetic distortion

H = height: the height of the sensor from the ground

S = soil type: 6 different soil types depending on the moisture condition
- 0.0 = Dry and Sandy
- 0.2 = Dry and Humus
- 0.4 = Dry and Limy
- 0.6 = Humid and Sandy
- 0.8 = Humid and Humus
- 1.0 = Humid and Limy

In [7]:
y.head()

|   | M |
|---|---|
| 0 | 1 |
| 1 | 1 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |


In [8]:
print(y['M'].unique())

[1 2 3 4 5]


M = mine type: mine types commonly encountered on land (5 different mine classes)
- 1 = Null
- 2 = Anti-Tank
- 3 = Anti-personnel
- 4 = Booby-trapped Anti-personnel
- 5 = M14 Anti-personnel

There is an issue with how the mines are labeled: the labels run from 1 to 5, but `nn.CrossEntropyLoss` expects class indices from 0 to `num_classes - 1`. The labels therefore need to be shifted so that they start from 0.

In [9]:
y = y.replace([1, 2, 3, 4, 5], [0, 1, 2, 3, 4])
print(y['M'].unique())

[0 1 2 3 4]
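
The reason zero-based labels matter is that a cross-entropy loss over class indices uses each label to pick one entry out of the model's output vector, so with five outputs the valid labels are 0 through 4. A plain-Python sketch of that lookup (`cross_entropy` here is a hypothetical stand-in for illustration, not PyTorch's implementation):

```python
import math

def cross_entropy(logits, label):
    """Negative log of the softmax probability assigned to the true class."""
    if not 0 <= label < len(logits):
        raise IndexError(f"label {label} is outside 0..{len(logits) - 1}")
    peak = max(logits)
    shifted = [z - peak for z in logits]          # subtract the max for numerical stability
    log_sum_exp = math.log(sum(math.exp(z) for z in shifted))
    return log_sum_exp - shifted[label]

logits = [0.0, 0.0, 0.0, 0.0, 0.0]    # five outputs, one per mine class
print(cross_entropy(logits, 4))        # last valid label after the shift to 0..4
try:
    cross_entropy(logits, 5)           # an original 1-based label of 5 has no matching output
except IndexError as err:
    print("rejected:", err)
```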


In [10]:
y.shape

(338, 1)

In [11]:
y = y.squeeze() # Convert the single-column DataFrame to a Series so the labels are one-dimensional
y.shape

(338,)

In [12]:
# Use oversampling to address the class imbalance issue

from imblearn.over_sampling import RandomOverSampler

oversampler = RandomOverSampler(random_state=42)
X, y = oversampler.fit_resample(X, y)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize the features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# StandardScaler returns numpy arrays, while y_train and y_test are still pandas Series.
# Converting X back to pandas DataFrames lets X and y use the same `.values` syntax when building PyTorch tensors.

X_train = pd.DataFrame(X_train) 
X_test = pd.DataFrame(X_test) 
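
One point to keep in mind with the cell above: because resampling happens before the split, copies of the same original row can end up in both the training and the test set, which may make test accuracy look better than it would on unseen data. A common alternative is to split first and oversample only the training portion. The sketch below illustrates the idea in plain Python on toy data; `random_oversample` is a hypothetical helper standing in for `RandomOverSampler`, not its actual implementation.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=42):
    """Duplicate randomly chosen minority-class rows until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

# Toy data: 10 unique rows with imbalanced labels (illustrative only).
rows = list(range(10))
labels = [0] * 7 + [1] * 3

# 1) Hold out a test set first (here simply rows 4 and 9).
test_idx = {4, 9}
train_rows = [r for r in rows if r not in test_idx]
train_labels = [l for r, l in zip(rows, labels) if r not in test_idx]
test_rows = [r for r in rows if r in test_idx]

# 2) Oversample the training portion only.
train_rows, train_labels = random_oversample(train_rows, train_labels)

print(Counter(train_labels))                # classes are balanced in the training data
print(set(train_rows) & set(test_rows))     # empty: no test row appears in training
```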

In [13]:
# Convert the data to PyTorch tensors
# `.values` returns the underlying numpy array, which torch.tensor() accepts directly.
X_train_tensor = torch.tensor(X_train.values, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train.values, dtype=torch.long) 
X_test_tensor = torch.tensor(X_test.values, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test.values, dtype=torch.long) 

print(X_train_tensor.shape, y_train_tensor.shape, X_test_tensor.shape, y_test_tensor.shape)

torch.Size([284, 3]) torch.Size([284]) torch.Size([71, 3]) torch.Size([71])
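
These shapes can be checked by hand. The original data has 338 rows; after oversampling there are 284 + 71 = 355 rows, which is 5 classes times 71, so the largest class presumably has 71 samples. A 20% test split of 355 rows then gives 71 test rows and 284 training rows. The sketch assumes the test count is rounded up:

```python
import math

n_after_oversampling = 284 + 71      # train + test rows from the printed shapes
num_classes = 5
per_class = n_after_oversampling // num_classes   # rows per class once balanced

# 20% held out for testing; assumes the count is rounded up to a whole row.
n_test = math.ceil(n_after_oversampling * 20 / 100)
n_train = n_after_oversampling - n_test
print(per_class, n_train, n_test)
```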


In [14]:
# Create TensorDataset objects for train and test data
train_dataset = TensorDataset(X_train_tensor, y_train_tensor)
test_dataset = TensorDataset(X_test_tensor, y_test_tensor)

# Create DataLoader objects for train and test datasets
train_dataloader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=64, shuffle=False)
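
With 284 training rows and `batch_size=64`, each epoch yields four full batches and one smaller final batch, since `DataLoader` keeps the last partial batch by default (`drop_last=False`). The arithmetic, as a quick check:

```python
import math

n_train, n_test, batch_size = 284, 71, 64

train_batches = math.ceil(n_train / batch_size)            # full batches plus one partial batch
last_batch = n_train - (train_batches - 1) * batch_size    # size of the final training batch
test_batches = math.ceil(n_test / batch_size)

print(train_batches, last_batch, test_batches)
```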

### Define the neural network

In [15]:
# Define the neural network architecture
class NeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, num_classes):
        super(NeuralNetwork, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(hidden_size, num_classes)
    
    def forward(self, x):
        out = self.fc1(x)
        out = self.relu(out)
        out = self.fc2(out)
        return out

In [16]:
# Define hyperparameters

input_size = X_train_tensor.shape[1]
hidden_size = X_train_tensor.shape[1] * 2 # The size of hidden layer is arbitrarily chosen and can be tuned.
num_classes = 5 # Five mine classes (labels 0 to 4)

# Instantiate the neural network model
model = NeuralNetwork(input_size, hidden_size, num_classes)
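
With three input features, these settings give a 3-6-5 network. Each `nn.Linear(in, out)` layer holds `in * out` weights plus `out` biases, so the model is very small. A quick count (`linear_params` is a hypothetical helper):

```python
def linear_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

input_size, hidden_size, num_classes = 3, 6, 5
total = linear_params(input_size, hidden_size) + linear_params(hidden_size, num_classes)
print(total)
```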

In [17]:
criterion = nn.CrossEntropyLoss() # Cross-entropy loss for multi-class classification; takes raw logits and integer class labels
optimizer = optim.Adam(model.parameters(), lr=0.001)
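
To give a feel for what the optimizer does with `lr=0.001`, here is a sketch of the textbook Adam update for a single scalar parameter. It assumes the commonly used defaults beta1=0.9, beta2=0.999 and eps=1e-8, and `adam_step` is a hypothetical helper; it is a simplification for illustration, not PyTorch's implementation.

```python
import math

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; returns the new (theta, m, v)."""
    m = beta1 * m + (1 - beta1) * grad           # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2      # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias correction for step t (1-based)
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

theta, m, v = adam_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1)
print(theta)   # the first step moves theta by roughly lr, regardless of the gradient's scale
```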

### Train the neural network

In [18]:
# Training loop
num_epochs = 200
for epoch in range(num_epochs):
    for inputs, labels in train_dataloader:
        # Forward pass
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        
        # Backward pass and optimization
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

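        # Print progress: on every 10th epoch, print the loss of each batch
        # (this check sits inside the batch loop, so it prints one line per batch)
        if (epoch+1) % 10 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')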

print()

# Evaluate the model's accuracy on the training set
with torch.no_grad():
    correct = 0
    total = 0
    for inputs, labels in train_dataloader:
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    accuracy = correct / total
    print('Accuracy:', accuracy)
    

Epoch [10/200], Loss: 1.6632
Epoch [10/200], Loss: 1.5765
Epoch [10/200], Loss: 1.6028
Epoch [10/200], Loss: 1.5990
Epoch [10/200], Loss: 1.6271
Epoch [20/200], Loss: 1.5369
Epoch [20/200], Loss: 1.6150
Epoch [20/200], Loss: 1.6003
Epoch [20/200], Loss: 1.5865
Epoch [20/200], Loss: 1.5254
Epoch [30/200], Loss: 1.5335
Epoch [30/200], Loss: 1.5687
Epoch [30/200], Loss: 1.5538
Epoch [30/200], Loss: 1.5376
Epoch [30/200], Loss: 1.5233
Epoch [40/200], Loss: 1.5414
Epoch [40/200], Loss: 1.4832
Epoch [40/200], Loss: 1.4692
Epoch [40/200], Loss: 1.5413
Epoch [40/200], Loss: 1.5357
Epoch [50/200], Loss: 1.4710
Epoch [50/200], Loss: 1.4901
Epoch [50/200], Loss: 1.4776
Epoch [50/200], Loss: 1.4451
Epoch [50/200], Loss: 1.4916
Epoch [60/200], Loss: 1.4981
Epoch [60/200], Loss: 1.4192
Epoch [60/200], Loss: 1.4167
Epoch [60/200], Loss: 1.4209
Epoch [60/200], Loss: 1.3854
Epoch [70/200], Loss: 1.4207
Epoch [70/200], Loss: 1.3961
Epoch [70/200], Loss: 1.3271
Epoch [70/200], Loss: 1.4470
Epoch [70/200]

### Evaluate the neural network

In [19]:
# Evaluate the model's accuracy on the testing set
with torch.no_grad():
    correct = 0
    total = 0
    for inputs, labels in test_dataloader:
        outputs = model(inputs)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

    accuracy = correct / total
    print('Accuracy:', accuracy)

Accuracy: 0.43661971830985913


On the test set, the neural network identifies the correct mine class about 44% of the time (accuracy 0.437).
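
To put the number in context: the printed accuracy corresponds to 31 correct predictions out of 71 test rows (the count of 31 is inferred from the printed value, not reported by the notebook), and a classifier guessing uniformly among five classes would be right about 20% of the time.

```python
correct, total, num_classes = 31, 71, 5   # 31 is inferred from the printed accuracy

accuracy = correct / total
chance = 1 / num_classes                  # uniform random guessing over five classes
print(f"accuracy={accuracy:.4f}, chance={chance:.2f}, ratio={accuracy / chance:.2f}")
```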