SPDX-FileCopyrightText: SAS research group, HFT, Helmut Schmidt University

SPDX-License-Identifier: CC0-1.0

https://github.com/hsu-sonar/icua24-geopackage

## Simplifying trajectories

We typically do not want to store every INS (inertial navigation system) point recorded by the sonar in the database. This notebook implements the trajectory simplification routine detailed in the paper.

In [None]:
import numpy as np
from matplotlib import pyplot as plt

%matplotlib ipympl

## Generating a test trajectory

For test purposes, we will generate a trajectory using a random walk to model the non-ideal motion of a system. The first step is to set some parameters of the trajectory.

In [None]:
length = 600  # Along-track length of the trajectory in metres.
dx = 0.15  # Along-track distance between INS samples in metres.
walk_std = 0.1  # Standard deviation of the sideways steps in the walk in metres.
angle = 15  # Angle of trajectory from north towards east in degrees.

Initialise a pseudo-random number generator. To get a repeatable trajectory, pass a non-negative integer seed to `np.random.default_rng`.

In [None]:
rng = np.random.default_rng()
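As a small sketch of what seeding does: two generators created with the same seed produce identical draws, so a seeded notebook regenerates the same trajectory on every run. The seed value `12345` and the names `rng_a`, `rng_b` are arbitrary choices for this illustration.

```python
import numpy as np

seed = 12345  # Arbitrary illustrative seed; any non-negative integer works.

# Two independent generators initialised with the same seed.
rng_a = np.random.default_rng(seed)
rng_b = np.random.default_rng(seed)

# Draw the same number of normally distributed steps from each.
steps_a = rng_a.normal(0, 0.1, 5)
steps_b = rng_b.normal(0, 0.1, 5)

# The two sequences are identical.
same_draws = np.array_equal(steps_a, steps_b)
```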

Generate the along-track position of each sample, centered at 0.

In [None]:
along_track = np.arange(-length / 2, length / 2 + 1e-6, dx)

And then cumulatively sum a sequence of normally distributed variables to get the across-track position at each sample.

In [None]:
across_track = np.cumsum(rng.normal(0, walk_std, along_track.shape))
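Because the across-track position is a sum of independent normal steps, its standard deviation after $n$ steps is `walk_std` multiplied by $\sqrt{n}$. The following sketch uses that to estimate how far the walk is expected to drift by the end of the trajectory, which can help when choosing `walk_std`. The names `n_samples` and `expected_final_std` are introduced here for illustration only.

```python
import numpy as np

length = 600
dx = 0.15
walk_std = 0.1

# Number of INS samples generated for these parameters.
n_samples = np.arange(-length / 2, length / 2 + 1e-6, dx).size

# Standard deviation of the across-track position at the final sample:
# the sum of n independent N(0, walk_std^2) steps has std walk_std * sqrt(n).
expected_final_std = walk_std * np.sqrt(n_samples)

print(n_samples, expected_final_std)
```

With the values above this gives a drift of a few metres (one standard deviation) over the full track.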

Rotate this into the eastings and northings with the desired heading.

In [None]:
_angle = np.radians(angle)
eastings = np.cos(_angle) * across_track + np.sin(_angle) * along_track
northings = -np.sin(_angle) * across_track + np.cos(_angle) * along_track
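As a quick check of the rotation convention, the sketch below applies the same two expressions to a purely along-track displacement and recovers the heading as an angle from north towards east. The two-point test track and the names `theta`, `heading` and `distance` are assumptions made for this example.

```python
import numpy as np

angle = 15
theta = np.radians(angle)

# A 10 m step straight along the track, with no across-track motion.
along_track = np.array([0.0, 10.0])
across_track = np.zeros_like(along_track)

# Same rotation as used for the test trajectory.
eastings = np.cos(theta) * across_track + np.sin(theta) * along_track
northings = -np.sin(theta) * across_track + np.cos(theta) * along_track

# Heading measured from north towards east, and the distance travelled.
d_east = eastings[1] - eastings[0]
d_north = northings[1] - northings[0]
heading = np.degrees(np.arctan2(d_east, d_north))
distance = np.hypot(d_east, d_north)
```

The heading comes out equal to `angle` and the distance is unchanged, as expected for a pure rotation.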

Plot this; the cross marks the start point.

In [None]:
plt.figure()
plt.plot(eastings, northings)
plt.plot(eastings[0], northings[0], "C0x")
plt.xlabel("Eastings (m)")
plt.ylabel("Northings (m)");

For ease of use, stack the eastings and northings into a single N×2 array.

In [None]:
ins_points = np.stack([eastings, northings], axis=1)

## Error metric

For our error metric, we will use the mean perpendicular distance between a set of points and the line segment proposed to replace them. The following function calculates this metric given the start and end points of the line and the INS points to check. It uses a parametric representation of the line to find the perpendicular points; see the *calculating line piece distances* notebook for more details.

In [None]:
def mean_perpendicular_distance(start, end, points):
    """Find the mean perpendicular distance between a line and a set of points.

    Parameters
    ----------
    start, end : numpy.ndarray
        The start and end point of the line.
    points : numpy.ndarray
        An Nx2 array of the points to calculate the distance from.

    Returns
    -------
    mean_distance : float
        The mean of the perpendicular distances from the points to the line.

    """
    direction_vector = end - start
    offset_vectors = points - start

    t = np.sum(direction_vector * offset_vectors, axis=-1) / np.sum(
        direction_vector**2, axis=-1
    )

    P = start + t[:, np.newaxis] * direction_vector

    d = np.linalg.norm(P - points, axis=-1)

    return np.mean(d)
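As a compact restatement of what the function computes, with notation chosen here rather than taken from the paper: $\mathbf{s}$ and $\mathbf{e}$ are the start and end points, $\mathbf{p}_i$ are the $N$ INS points, and $\epsilon$ is the returned metric.

$$
\mathbf{d} = \mathbf{e} - \mathbf{s}, \qquad
t_i = \frac{\mathbf{d} \cdot (\mathbf{p}_i - \mathbf{s})}{\mathbf{d} \cdot \mathbf{d}}, \qquad
\mathbf{P}_i = \mathbf{s} + t_i \mathbf{d}, \qquad
\epsilon = \frac{1}{N} \sum_{i=1}^{N} \lVert \mathbf{P}_i - \mathbf{p}_i \rVert
$$

For example, with $\mathbf{s} = (0, 0)$, $\mathbf{e} = (10, 0)$ and points $(2, 1)$, $(5, -3)$ and $(8, 2)$, the perpendicular distances are 1, 3 and 2, so $\epsilon = 2$. Note that $t_i$ is not clipped to $[0, 1]$ in the code, so the distance is measured to the infinite line through the two end points rather than to the finite segment.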

## Simplification

First, we need to calculate $L$, the cumulative distance travelled by the sonar to reach each point.

In [None]:
point_to_point = np.linalg.norm(np.diff(ins_points, axis=0), axis=1)

cumulative_distance = np.zeros_like(eastings)
cumulative_distance[1:] = np.cumsum(point_to_point)
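To make the calculation concrete, here is the same computation on three hand-picked points whose spacings are easy to verify by eye. The array `pts` and the names `spacing` and `cumulative` are made up for this example.

```python
import numpy as np

# Three points: a 3-4-5 triangle step, then 6 m due north.
pts = np.array([[0.0, 0.0], [3.0, 4.0], [3.0, 10.0]])

# Distance between consecutive points: 5 m, then 6 m.
spacing = np.linalg.norm(np.diff(pts, axis=0), axis=1)

# Cumulative distance to reach each point, starting from zero.
cumulative = np.zeros(len(pts))
cumulative[1:] = np.cumsum(spacing)

print(cumulative)
```

The first entry is always zero, and each later entry is the path length travelled to reach that point.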

Then we set the bounds of $L$ for each segment, the reduction factor $\alpha$ and our maximum error threshold.

In [None]:
L_max = 50.0
L_min = 10.0
alpha = 0.8
epsilon_max = 0.5
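With these values, the sequence of segment lengths that can be tried for any one segment is fixed in advance: start at `L_max`, multiply by `alpha` after each failed trial, and clip to `L_min`. The sketch below lists that sequence; the name `trial_lengths` is introduced here for illustration.

```python
L_max = 50.0
L_min = 10.0
alpha = 0.8

# Build the list of lengths that would be tried if every trial failed.
trial_lengths = [L_max]
while trial_lengths[-1] > L_min:
    # Reduce by the factor alpha, but never go below the minimum length.
    trial_lengths.append(max(trial_lengths[-1] * alpha, L_min))

print(len(trial_lengths), trial_lengths)
```

For these parameters there are at most nine trials per segment, with the last one at exactly `L_min`.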

The following cell implements the simplification for the first segment.

In [None]:
# Index of the INS point which starts the segment.
start_index = 0

# Current segment length.
L = L_max

# Loop until we have determined the segment.
while True:
    # Find indices of points within the L metres after the start.
    from_start = cumulative_distance - cumulative_distance[start_index]
    within_L = np.where((from_start > 0) & (from_start <= L))[0]

    # Split into the index of the proposed end, and the points in between.
    end_index = within_L[-1]
    point_indices = within_L[:-1]

    # Get these points.
    start = ins_points[start_index]
    end = ins_points[end_index]
    points = ins_points[point_indices]

    # Calculate the error metric.
    epsilon = mean_perpendicular_distance(start, end, points)

    # Below the threshold: we have a good segment.
    if epsilon <= epsilon_max:
        break

    # Reached our minimum length.
    if np.isclose(L, L_min):
        break

    # Reduce the length, clipping to the minimum, and try again.
    L *= alpha
    if L < L_min:
        L = L_min

We can then plot this first segment and the original INS points it replaces.

In [None]:
plt.figure()
plt.scatter(*points.T, alpha=0.4, ec="none")
plt.plot([start[0], end[0]], [start[1], end[1]], "C1o-")
plt.xlabel("Eastings (m)")
plt.ylabel("Northings (m)");

We can then put another loop around this to find all segments. This forms the basis of the simplification module of the library.

In [None]:
# List of points in the simplified trajectory.
simplified = [ins_points[0]]

# Steps we took to find it.
steps = []

# Initialise some variables.
start_index = 0
segment_number = 1

# Loop until complete.
while True:
    # Start with our maximum length.
    trial_number = 1
    L = L_max
    while True:
        # Find indices of points within the L metres after the start.
        from_start = cumulative_distance - cumulative_distance[start_index]
        within_L = np.where((from_start > 0) & (from_start <= L))[0]

        # Split into the index of the proposed end, and the points in between.
        end_index = within_L[-1]
        point_indices = within_L[:-1]

        # Get these points.
        start = ins_points[start_index]
        end = ins_points[end_index]
        points = ins_points[point_indices]
    
        # Calculate the error metric.
        epsilon = mean_perpendicular_distance(start, end, points)
        steps.append([segment_number, trial_number, L, epsilon])
        trial_number += 1
    
        # Below the threshold: we have a good segment.
        if epsilon <= epsilon_max:
            break
    
        # Reached our minimum length.
        if np.isclose(L, L_min):
            break
    
        # Reduce the length, clipping to the minimum, and try again.   
        L *= alpha
        if L < L_min:
            L = L_min

    # Finished this segment.
    simplified.append(end)

    # Start the next segment at the end of this one.
    start_index = end_index
    segment_number += 1

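    # If we are closer than the minimum distance to the end, we can just take the end
    # point and stop. Skip the append if this segment already ended on the final
    # point, so that it is not added twice.
    remaining = cumulative_distance[-1] - cumulative_distance[start_index]
    if remaining < L_min:
        if start_index != len(ins_points) - 1:
            simplified.append(ins_points[-1])
        break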

# Turn the list of segment points into an array.
simplified = np.array(simplified)
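For reuse outside the notebook, the nested loops can be wrapped in a function. The following is a lightly restructured sketch, not the library's actual implementation: the name `simplify_trajectory`, its signature, and the zero-error convention for a segment with no intermediate points are all assumptions made here. It also assumes the spacing between INS samples is smaller than `L_min`, so that every trial contains at least one candidate end point. The error metric is repeated in condensed form so the snippet runs on its own.

```python
import numpy as np


def mean_perpendicular_distance(start, end, points):
    """Mean perpendicular distance from points to the line through start and end."""
    if len(points) == 0:
        # Assumption for this sketch: no intermediate points means no error.
        return 0.0
    direction = end - start
    t = (points - start) @ direction / (direction @ direction)
    foot = start + t[:, np.newaxis] * direction
    return float(np.mean(np.linalg.norm(foot - points, axis=-1)))


def simplify_trajectory(ins_points, L_max=50.0, L_min=10.0, alpha=0.8, epsilon_max=0.5):
    """Hypothetical wrapper: return the points of the simplified trajectory."""
    spacing = np.linalg.norm(np.diff(ins_points, axis=0), axis=1)
    cumulative = np.concatenate([[0.0], np.cumsum(spacing)])

    last_index = len(ins_points) - 1
    kept = [0]  # Indices of the INS points kept in the simplified trajectory.
    start_index = 0

    while start_index < last_index:
        # Closer than L_min to the end: take the final point and stop.
        if cumulative[-1] - cumulative[start_index] < L_min:
            kept.append(last_index)
            break

        L = L_max
        while True:
            # Indices of points within the L metres after the start.
            from_start = cumulative - cumulative[start_index]
            within_L = np.where((from_start > 0) & (from_start <= L))[0]
            end_index = within_L[-1]

            epsilon = mean_perpendicular_distance(
                ins_points[start_index], ins_points[end_index], ins_points[within_L[:-1]]
            )

            # Accept a good segment, or give up once at the minimum length.
            if epsilon <= epsilon_max or np.isclose(L, L_min):
                break
            L = max(L * alpha, L_min)

        kept.append(end_index)
        start_index = end_index

    return ins_points[kept]


# Usage on a perfectly straight 100 m track sampled every metre.
line = np.column_stack([np.arange(101.0), np.zeros(101)])
simplified_line = simplify_trajectory(line)
```

On a straight track every trial succeeds at `L_max`, so the result contains only the points at 0 m, 50 m and 100 m.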

We can see how many points the simplified trajectory contains; the number of segments is one fewer:

In [None]:
simplified.shape[0]

And how many steps it took to get there.

In [None]:
len(steps)

The `steps` variable lists all these trials; each entry contains the segment number, the trial number, the segment length being tested, and the value of the error metric.

In [None]:
steps

We can now plot the original trajectory and the simplified version.

In [None]:
plt.figure()
plt.plot(*ins_points.T, "C0o", alpha=0.4, markeredgecolor="none", markersize=3)
plt.plot(simplified[:, 0], simplified[:, 1], "C1-o", markersize=4)
plt.xlabel("Eastings (m)")
plt.ylabel("Northings (m)");