# Introduction

This notebook will guide you through the process of collecting data, defining, training and evaluating a grass classification system based on Acconeer's A121 sensor. The intended use of the system is to be deployed on an autonomous lawn-mower, thus removing the need for a barrier that informs the system whether it should cut or not.

We will do some hands-on machine learning and introduce the reader to some basic radar concepts. This is intended as a starting point for a project aimed at implementing a grass detection system for autonomous lawn-mowers.

For further reading about Acconeer's sensors, algorithms and more, please visit the documentation and developer site, found [here](https://docs.acconeer.com/en/latest/index.html).

# System

The platform used here to simulate the autonomous lawn-mower is the Kobuki (http://kobuki.yujinrobot.com/about2/), fitted with the Acconeer A121 sensor. The sensor is mounted at a height of 85 mm, looking down towards the ground at an angle of about 45 degrees (see picture). This placement depends on the Kobuki geometry and should be experimented with.

<figure>
    <img src="./doc/sensor_placement.jpg" alt="Sensor Placement" class="bg-primary mb-1" width="70%">
    <div class="caption" align="center">Sensor Placement</div>
</figure>

An initial formulation based on the intended use case is to separate the classification into two classes: "Grass" and "Other". The system will continuously make decisions on which type of surface it believes it is travelling on. We assume that the system only needs to make a classification when moving, so all data is collected with the system in motion.

# Data Collection and Sensor Settings
The data collection is done iteratively: collect data, update the algorithm, evaluate, and repeat. This type of iteration is typical for this kind of algorithm development project. We want a dataset that yields similar algorithms regardless of which files are used for training and testing, and that remains stable when new data is added. If we run another data collection session and the current "best" algorithm has issues classifying the new data, it is an indication that the dataset needs to be extended.

The sensor settings used for the data collection are determined by measurements and experience. View them as a suggested starting point for your own project. Because the settings have a large impact on the performance, it is good practice to re-evaluate them throughout the project.

The starting point is set to 20 and the step length to 1. The number of points is set to 40, which yields an approximate range of 0.05 m to 0.147 m. The profile is set to 1, which has the shortest possible pulse length. This minimizes the effect of the direct leakage and allows for the best possible resolution.
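
As a rough check of the quoted range, the sketch below converts point indices to distances. It assumes a spacing of about 2.5 mm between adjacent points, which is consistent with the numbers above. `APPROX_POINT_SPACING_M` and `approx_range_m` are names made up for this illustration and are not part of the Acconeer API.

```python
# Assumption: adjacent measurement points are roughly 2.5 mm apart.
APPROX_POINT_SPACING_M = 2.5e-3


def approx_range_m(start_point, step_length, num_points, spacing=APPROX_POINT_SPACING_M):
    """Return the approximate distance of the first and last measured point."""
    first = start_point * spacing
    last = (start_point + (num_points - 1) * step_length) * spacing
    return first, last


first, last = approx_range_m(start_point=20, step_length=1, num_points=40)
print(f"Approximate range: {first:.4f} m to {last:.4f} m")
```

The same helper can be used to see how a different starting point or step length would move the measured interval before trying it on the sensor.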

The sensor data is packaged into "frames" consisting of several sweeps over the entire range. In our case, we wish to have a high update rate paired with a high sampling frequency. To achieve this, we use the "continuous sweep mode". We set the sweeps per frame to 1 and the sweep rate to 200 Hz. This yields 200 sweeps every second, continuously sampled. The full sensor settings are shown here as a snippet of Python code.

    config = a121.SensorConfig()
    config.sweep_rate = 200
    config.hwaas = 20
    config.sweeps_per_frame = 1
    config.start_point = 20
    config.step_length = 1
    config.num_points = 40
    config.profile = a121.Profile.PROFILE_1
    config.phase_enhancement = False
    config.continuous_sweep_mode = True
    config.double_buffering = False
    config.enable_loopback = False
    config.receiver_gain = 16
    config.inter_sweep_idle_state = a121.IdleState.READY
    config.inter_frame_idle_state = a121.IdleState.READY
    config.prf = a121.PRF.PRF_19_5_MHz
    config.enable_tx = True

When collecting data, we let the Kobuki run in a straight line (roughly 20 m, but varies) while the sensor is running. The type of surface that the Kobuki is running on is noted in a protocol. This procedure is repeated for different surface types.

<table>
    <td>
        <figure>
            <img src="./doc/kobuki_gravel.jpg" alt="Kobuki running on gravel" class="bg-primary mb-1" width="80%">
            <div class="caption" align="center">Kobuki running on gravel</div>
        </figure>
    </td>
    <td> 
        <figure>
            <img src="./doc/kobuki_grass.jpg" alt="Kobuki running on grass" class="bg-primary mb-1" width="80%">
            <div class="caption" align="center">Kobuki running on grass</div>
        </figure>
    </td>
</table>

All the collected data as well as the protocol can be found in the supplied folder. The files are named with a number that is referenced in the protocol as well.

# Algorithm overview

The algorithm consists of three main parts:
- Preprocessing
- Neural network predictor
- Postprocessor classifier

The preprocessor mainly creates a time window and reshapes the data to fit the neural network. Some feature extraction can also be placed here.

The predictor is a standard neural network that has been trained on preprocessed data.

The postprocessor aggregates predictions from the network and applies a majority vote over a time interval (roughly a second). This vote yields an output 200 times per second, either "grass" or "other", which is what we use as the classification.

# Evaluation

In order to compare and evaluate our different settings (preprocessing methods, network type, postprocessor thresholds and methods, etc.), we need a measurement that is independent of our processing. We use the standard true/false positive rates and true/false negative rates for the different cases. See more here: https://en.wikipedia.org/wiki/F-score

For our use case, we know what the system should classify, so we use these terms instead. We end up with four categories using the syntax "classification, actual class": "Grass, Grass", "Grass, Other", "Other, Grass" and "Other, Other". For example, "Grass, Other" (or just "GO") means that the system classified the sample as grass, but it was in reality something else.

As an additional check, we also compute the total accuracy of the system. Note that these measurements reflect not only the quality of your model, but that of the dataset as well. If the dataset lacks a specific corner case or has too few or too many entries from a certain category, this will also affect your scores. So an algorithm with a good score might still perform badly in practice. If the dataset is incomplete, algorithm optimization will not mitigate the issue.

But assuming that the dataset has enough coverage of corner cases as well as the typical use cases, a high true positive rate and a high true negative rate (equivalently, low false negative and false positive rates) should produce a good algorithm for the use case.
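
To make the definitions concrete, here is a minimal sketch that turns the four counts into rates, with "grass" treated as the positive class. The function name `surface_metrics` and the example counts are hypothetical and are not measured results. The F1 score for the grass class is included because the linked article describes it.

```python
def surface_metrics(gg, og, go, oo):
    """Compute rates from counts named as "classification, actual class".

    gg: classified grass, actually grass   (true positive)
    og: classified other, actually grass   (false negative)
    go: classified grass, actually other   (false positive)
    oo: classified other, actually other   (true negative)
    """
    total = gg + og + go + oo
    return {
        "accuracy": (gg + oo) / total,
        "true_positive_rate": gg / (gg + og),
        "false_negative_rate": og / (gg + og),
        "true_negative_rate": oo / (oo + go),
        "false_positive_rate": go / (oo + go),
        "f1_grass": 2 * gg / (2 * gg + go + og),
    }


# Made-up counts, for illustration only.
example = surface_metrics(gg=90, og=10, go=5, oo=95)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Note that the rates are undefined if the evaluation set contains no grass samples or no other samples, so both classes need to be present.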


# Loading and preprocessing

The Acconeer Exploration Tool contains code for loading and saving data, so the following sections assume that it is installed on the system. More info can be found here: https://docs.acconeer.com/en/latest/exploration_tool/index.html

In addition, you need the following Python packages installed:
 - numpy
 - pandas
 - odfpy (used by pandas to read the .ods protocol file)
 - tensorflow
 - keras
 
(Depending on the version, the last two might be a single package.)

Let's get into some code. First, we need some imports for loading the data as well as a few helper functions for loading.

Here we also define the preprocessing, which slides a rolling window over the time dimension. It also reshapes the data and keeps the amplitude of the complex samples (the phase is not used in this version). As a design choice, we leave as much as possible of the feature selection to the neural network.


In [None]:
import acconeer.exptool.a121 as a121
import numpy as np
import pandas as pd

import collections

from tensorflow.keras.layers import Conv2D, BatchNormalization, Reshape, Dropout, Conv1D, Dense, Flatten
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
import tensorflow.keras.models

def load_data_from_ind(ind):
    path = "data/{}.h5".format(ind)
    record = a121.load_record(path)
    data = []
    for elm in record.extended_stacked_results:
        dframe = [elm[key].frame for key in elm.keys()]
        dframe = np.array(dframe)
        data.append(dframe)

    data = np.squeeze(np.array(data))
    return data

def get_label_from_ind(ind):
    protocol = pd.read_excel("data/protocol.ods", engine="odf")
    row = protocol.loc[protocol['id'] == ind]
    label = row['surface type'].item()
    return label

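def preprocess_data(data):
    window_length = 100 #Can be adjusted, evaluation is encouraged.
    amp_data = np.abs(data) #Amplitude of the complex samples, shape (sweeps, points).
    #Rolling window over time, shape (sweeps - window_length + 1, 1, window_length, points).
    windows = np.lib.stride_tricks.sliding_window_view(amp_data, (window_length,amp_data.shape[1]))
    #Drop the singleton axis and add a trailing channel axis for Conv2D.
    ret = np.reshape(windows, (windows.shape[0], window_length, amp_data.shape[1], 1))
    return ret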
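
To see what the rolling window does to the array shapes, the following sketch runs the same NumPy calls on synthetic complex data instead of a recording. The sizes (250 sweeps, 40 points, window length 100) are chosen for illustration only.

```python
import numpy as np

num_sweeps, num_points, window_length = 250, 40, 100

# Synthetic complex "sweeps" standing in for a recording.
rng = np.random.default_rng(0)
sweeps = rng.normal(size=(num_sweeps, num_points)) + 1j * rng.normal(size=(num_sweeps, num_points))

amplitude = np.abs(sweeps)
windows = np.lib.stride_tricks.sliding_window_view(amplitude, (window_length, num_points))
batch = windows.reshape(windows.shape[0], window_length, num_points, 1)

print(windows.shape)  # (151, 1, 100, 40)
print(batch.shape)    # (151, 100, 40, 1)

# Window k holds sweeps k .. k + window_length - 1.
print(np.array_equal(batch[3, :, :, 0], amplitude[3:103]))  # True
```

Each window overlaps its neighbour in all but one sweep, and the reshape makes a full copy of the windows, so memory use grows with the window length.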

We continue by defining a training set (as a list of indexes) and running the preprocessing on each file. Note that we subsample the data. This is good practice when dealing with time series, since neighbouring windows overlap almost completely, and it also reduces the dataset to a more manageable size. In addition to the subsampling, we crop each file, because the recording is started while the Kobuki is still standing and stopped after the Kobuki has stopped. Each file therefore begins and ends with a non-movement segment, which is ambiguous and should be removed from training. The whole classification assumes that the Kobuki is moving when it is run.

In [None]:
training_indexes = [1,2,3,4,5,6,7,9,10,11,12,14,15,17,18,20,21,24,28,29,34,35,37,42,46,47]
evaluation_indexes = [8,13,16,19,22,23,30,31,36,43]

subsampling_rate = 10 #Even for small datasets, this should be larger than 1. Typically at least 5.

training_data = []
training_labels = []

trim_length = 200 #Number of samples to remove at each end, where the collection robot is not moving.

for ind in training_indexes:
    data = load_data_from_ind(ind)
    if(trim_length > 0):
        data = data[trim_length:-trim_length]
    pre_data = preprocess_data(data)
    pre_data = pre_data[::subsampling_rate]
    training_data.append(pre_data)
    raw_label = get_label_from_ind(ind)
    if(raw_label != "grass"):
        label = 0.0
    else:
        label = 1.0
        
    labels = [label] * pre_data.shape[0]
    training_labels += labels

training_data = np.concatenate(training_data)
training_labels = np.array(training_labels)

print(training_data.shape)
print(training_labels.shape)
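
The effect of trimming, windowing and subsampling on the dataset size can be estimated with a little arithmetic. The helper `windows_per_recording` below is a hypothetical name used for this sketch. Its defaults mirror the values used above, and the 4000-sweep recording is a made-up example (20 s at 200 Hz).

```python
def windows_per_recording(num_sweeps, trim_length=200, window_length=100, subsampling_rate=10):
    """Estimate how many training windows one recording contributes."""
    remaining = num_sweeps - 2 * trim_length      # sweeps left after trimming both ends
    num_windows = remaining - window_length + 1   # rolling windows with a stride of one sweep
    if num_windows <= 0:
        # The recording is too short to hold a single window after trimming.
        return 0
    return -(-num_windows // subsampling_rate)    # ceiling division, matches slicing with [::rate]


print(windows_per_recording(4000))
```

Summing this over all training files gives the first dimension that `training_data.shape` should report.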

# Defining the neural net structure and training
The neural network has a 2D convolution layer as its first layer. This is a good way to let the network figure out relevant features itself instead of designing them manually. When evaluating neural networks, it is considered good practice to train several different networks and compare the evaluations to get a feel for what works and what doesn't. It is very difficult to predict what will work well for a specific problem, since the model is very complex. Therefore, the network used here should not be considered final; any implementation should experiment a lot with the settings and structure.

In [None]:
model = Sequential()
model.add(Conv2D(kernel_size = (20,5), 
                 strides = (5,1), 
                 filters = 4, 
                 input_shape = training_data.shape[1:], 
                 activation = "relu"))
model.add(BatchNormalization())
model.add(Conv2D(kernel_size = (17,5), 
                 strides = (1,1), 
                 filters = 4, 
                 activation = "relu"))
model.add(Flatten())
model.add(BatchNormalization())
model.add(Dense(units = 20, activation = "relu"))
model.add(BatchNormalization())
model.add(Dense(units = 6, activation = "relu"))
model.add(BatchNormalization())
model.add(Dense(units = 1, activation = "sigmoid"))
model.compile(loss = 'binary_crossentropy', 
              optimizer = Adam(learning_rate = 0.001), 
              metrics = ['accuracy'])

#training settings
epochs = 12
batch_size = 200
model.fit(x = training_data, 
          y = training_labels, 
          batch_size = batch_size, 
          epochs = epochs, 
          shuffle = True)

model.save("out_model.h5")

#Free up some memory
del training_data
del training_labels
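
When experimenting with the structure, it helps to know how the kernel sizes and strides shape the data passing through the two convolution layers. The sketch below applies the output-length formula for unpadded ("valid") convolutions, which is the Keras default, to the layer settings above. The helper name `conv_output_length` is made up for this illustration, and the input size assumes a window length of 100 and 40 points.

```python
def conv_output_length(input_length, kernel_size, stride=1):
    """Output length of an unpadded ("valid") convolution along one axis."""
    return (input_length - kernel_size) // stride + 1


window_length, num_points, filters = 100, 40, 4

# First Conv2D: kernel (20, 5), strides (5, 1).
time_1 = conv_output_length(window_length, kernel_size=20, stride=5)
dist_1 = conv_output_length(num_points, kernel_size=5, stride=1)

# Second Conv2D: kernel (17, 5), strides (1, 1).
time_2 = conv_output_length(time_1, kernel_size=17, stride=1)
dist_2 = conv_output_length(dist_1, kernel_size=5, stride=1)

flattened = time_2 * dist_2 * filters
print((time_1, dist_1), (time_2, dist_2), flattened)
```

With these settings the second kernel spans the whole remaining time axis, so a different window length in the preprocessor would most likely call for a different kernel size in the second layer.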


# Postprocessing

The output from the neural network is a number between 0 and 1, and it is often interpreted as a probability. This is a risky interpretation, however, because it leads us to believe that an output value of 0.8 is very likely grass while an output value of 0.2 is likely something else. In reality, the best grass identification might arise from interpreting values above 0.95 as grass and everything below as something else.

Therefore, we introduce a threshold, in our case called "grass_threshold" which we can adjust to get the desired behavior. 

Another typical behavior of a neural network is that it might output "spikes" of erroneous classifications. We want to have some kind of systematic inertia to avoid "flickering" between classifications. In our case, we use a variation of a median filter to achieve this. We call this a "qualified majority filter". Once we have a classification, we require a fraction of the previous classifications (typically around 70%; the code below uses 85%) to differ in order to switch to the other class.


In [None]:

def get_class(prediction):
    grass_threshold = 0.6 #This is subject to optimization
    if(prediction > grass_threshold):
        return "grass"
    return "other"

def postprocessing(predictions):
    buffer_length = 199 #Roughly one second of predictions at 200 Hz, this can be adjusted.
    qualified_majority = 0.85 #Fraction of samples to switch classification.
    buffer = collections.deque(maxlen = buffer_length)
    
    classifications = []
    
    for prediction in predictions:
        prel_class = get_class(prediction)
        buffer.append(prel_class)
        if(len(classifications) == 0):
            classifications.append(prel_class)
        else:
            
            grass_count = buffer.count("grass")
            other_count = len(buffer) - grass_count
            
            if(classifications[-1] == "grass"):
                if(other_count > len(buffer) * qualified_majority):
                    classification = "other"
                else:
                    classification = "grass"
            else:
                if(grass_count > len(buffer) * qualified_majority):
                    classification = "grass"
                else:
                    classification = "other"
            classifications.append(classification)
    return np.array(classifications)
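
The inertia of the filter comes at the price of a delay when the surface really changes. The sketch below is a compact restatement of the same switching rule on boolean input (True meaning grass), run on a synthetic signal to show both effects. The function name `qualified_majority_filter` and the test signal are made up for this illustration.

```python
import collections

import numpy as np


def qualified_majority_filter(is_grass, buffer_length=199, qualified_majority=0.85):
    """Switch class only when enough of the buffered samples disagree."""
    buffer = collections.deque(maxlen=buffer_length)
    current = None
    filtered = []
    for flag in is_grass:
        flag = bool(flag)
        buffer.append(flag)
        if current is None:
            current = flag  # The first sample sets the initial class.
        else:
            grass_count = sum(buffer)
            opposing = len(buffer) - grass_count if current else grass_count
            if opposing > len(buffer) * qualified_majority:
                current = not current
        filtered.append(current)
    return np.array(filtered)


# 300 grass samples with a short spike of "other", then 300 samples of "other".
signal = np.r_[np.ones(300, dtype=bool), np.zeros(300, dtype=bool)]
signal[50:55] = False

filtered = qualified_majority_filter(signal)
first_other = int(np.argmax(~filtered))
print(f"Spike suppressed: {bool(filtered[50:55].all())}")
print(f"Delay after the surface change: {first_other - 300} samples")
```

With a full buffer of 199 samples and a fraction of 0.85, the switch needs 170 disagreeing samples, so the output lags a real surface change by a little under a second at 200 Hz. Shortening the buffer or lowering the fraction trades spike rejection for a faster response.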


# Evaluation code
So how do we do the actual evaluation of the algorithm? In the "Loading and Preprocessing" section, there is a line of unused code:

    evaluation_indexes = [8,13,16,19,22,23,30,31,36,43]

Since these indexes were not used in training of the algorithm, they are suitable for usage as evaluation data. We will write some code to evaluate the test data and get some metrics that we can further optimize.

Note: For development of a deployable algorithm, a significant portion of the data collection should be dedicated to the task of collecting an evaluation set.

In [None]:

model = tensorflow.keras.models.load_model("out_model.h5")

#Our Prediction, Actual surface
gg = 0 #Grass, Grass
og = 0 #Other, Grass
go = 0 #Grass, Other
oo = 0 #Other, Other

for ind in evaluation_indexes:
    label = get_label_from_ind(ind)
    if(label != "grass"):
        label = "other"
    
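    data = load_data_from_ind(ind)
    if(trim_length > 0):
        data = data[trim_length:-trim_length] #Drop the stationary start and end, as in training.
    pre_data = preprocess_data(data)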
    predictions = model.predict(pre_data, verbose = 0)
    classes = postprocessing(predictions)

    total = len(classes)
    correct = sum(classes == label)
    miss = total - correct
    
    if(label == "grass"):
        gg += correct
        og += miss
    else:
        oo += correct
        go += miss
    
accuracy = (gg + oo) / (gg+oo+og+go)
correct_grass = gg / (og + gg)
miss_grass = og / (og + gg)
correct_other = oo / (go + oo)
miss_other = go / (go + oo)

print(f"Total Accuracy {accuracy}")
print(f"True positive rate: {correct_grass}, False positive rate: {miss_other}")
print(f"False negative rate: {miss_grass}, True negative rate: {correct_other}")

# Further optimization
Depending on the outcome of the neural network training, the numbers will vary. The "grass_threshold" parameter used in the postprocessing has a very high impact on the performance of the system. For a more ambitious system, there is room for writing an automatic optimization of the postprocessing based on the accuracy or any other evaluation metric. We are content with adjusting the postprocessor numbers manually.
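
As a starting point for such an automatic optimization, the sketch below sweeps the threshold over a set of network outputs and reports the true positive and true negative rates for each value. It works on the raw network outputs, before the majority filter, and uses synthetic scores in place of real predictions. The function name `sweep_threshold` and the selection rule (maximize the worse of the two rates) are assumptions made for this example, not the method used in this notebook.

```python
import numpy as np


def sweep_threshold(scores, is_grass, thresholds):
    """Return (threshold, true positive rate, true negative rate) for each threshold."""
    scores = np.asarray(scores).ravel()
    is_grass = np.asarray(is_grass, dtype=bool).ravel()
    rows = []
    for threshold in thresholds:
        predicted_grass = scores > threshold
        tpr = np.mean(predicted_grass[is_grass])     # grass samples classified as grass
        tnr = np.mean(~predicted_grass[~is_grass])   # other samples classified as other
        rows.append((float(threshold), float(tpr), float(tnr)))
    return rows


# Synthetic network outputs: grass tends to score high, other tends to score low.
rng = np.random.default_rng(1)
is_grass = np.r_[np.ones(500, dtype=bool), np.zeros(500, dtype=bool)]
scores = np.clip(
    np.where(is_grass, rng.normal(0.8, 0.15, 1000), rng.normal(0.3, 0.2, 1000)), 0, 1
)

rows = sweep_threshold(scores, is_grass, np.linspace(0.1, 0.9, 9))
best = max(rows, key=lambda row: min(row[1], row[2]))
for threshold, tpr, tnr in rows:
    print(f"threshold {threshold:.1f}: TPR {tpr:.3f}, TNR {tnr:.3f}")
print(f"Selected threshold: {best[0]:.1f}")
```

In a real project, the sweep should run on evaluation data that is not used for the final score, and the selection rule should reflect which kind of error is more costly for the mower.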

Another parameter that is interesting to adjust is the window size of the preprocessor. One could also take a more direct approach to the feature selection and let the preprocessor handle more advanced feature extraction.

It is also good practice not to evaluate just a single net with the given structure, but instead to repeat the training process a few times and evaluate each net to make a more informed choice.

# Conclusions and reflection
We have a classifier as well as the accompanying pre- and postprocessing needed to run it effectively. All of these steps have an impact on the final result, so in order to develop a deployable system, it is important to have a thorough understanding of all parameters.

The dataset that was used for this training and evaluation was collected over three different sessions on a single type of grass. For a deployable system, a significantly larger dataset is required.

The actual structure of the network might be limited by constraints not considered in this document (for example memory size, processor speed, and implementation availability). The structure of the network should be evaluated under relevant constraints. The supplied structure is therefore only to be considered as a guideline.