**AI and locomotion: human gait analysis**

# Introduction

**Context**

The study of human gait is a central problem in medical research with far-reaching consequences in the public health domain.
This complex mechanism can be altered by a wide range of pathologies (such as Parkinson’s disease, arthritis, stroke, etc.), often resulting in a significant loss of autonomy and an increased risk of falls.
Understanding the influence of such medical disorders on a subject's gait would greatly facilitate early detection and prevention of those possibly harmful situations.
To address these issues, clinical and biomechanical researchers have worked to objectively quantify gait characteristics.


Among the gait features that have proved their relevance in a medical context, several are linked to the notion of step (step duration, variation in step length, etc.), which can be seen as the core atom of the locomotion process.
Many algorithms have therefore been developed to automatically (or semi-automatically) detect gait events (such as heel-strikes, heel-off, etc.) from accelerometer/gyrometer signals.

Most of the time, the algorithms used for step detection are dedicated to a specific population (healthy subjects, elderly subjects, patients with Parkinson's disease, etc.) and only a few publications deal with heterogeneous populations composed of several types of subjects.
Another limitation of existing algorithms is that they often focus on steady-state locomotion (once the subject has initiated their gait) and do not deal with steps during U-turns, gait initiation or gait termination.
Yet, initiation and termination steps are particularly sensitive to pathological states.
For example, the first step of Parkinsonian patients has been described as slower and smaller than the first step of age-matched subjects.
U-turn steps are also interesting since 45% of daily living walking is made up of turning steps, and when compared to straight-line walking, turning has been emphasized as a high-risk fall situation.
This argues for reliable algorithms that could detect initiation, termination and turning steps in both healthy and pathological subjects.


**Step detection**

The objective is to recognize the **start and end times of footsteps** contained in accelerometer and gyrometer signals recorded with Inertial Measurement Units (IMUs).

## Setup

**Import**

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from loadmydata.load_human_locomotion import (
    get_code_list,
    load_human_locomotion_dataset,
)

from convsparsecoder import ConvSparseCoder
from locogram import get_locogram
from utils import pad_at_the_end, plot_CDL, plot_steps

**Utility functions**

In [None]:
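def get_signal(sensor_data, dim_name="LRY") -> np.ndarray:
    """Select one dimension of a trial and standardize it (zero mean, unit variance)."""
    # Copy as float so that the in-place operations below do not modify
    # the DataFrame stored in `sensor_data.signal`.
    signal = sensor_data.signal[dim_name].to_numpy(dtype=float, copy=True)
    signal -= signal.mean()
    signal /= signal.std()
    return signal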

In [None]:
def fig_ax(figsize=(15, 3)):
    return plt.subplots(figsize=figsize)

# Data description

## Data collection and clinical protocol

#### Participants

The data was collected between April 2014 and October 2015 by monitoring healthy (control) subjects and patients from several medical departments (see [publication](#Publication) for more information).
Participants are divided into three groups depending on their impairment:
- **Healthy** subjects had no known medical impairment.
- The **orthopedic group** is composed of 2 cohorts of distinct pathologies: lower limb osteoarthrosis and cruciate ligament injury.
- The **neurological group** is composed of 4 cohorts: hemispheric stroke, Parkinson's disease, toxic peripheral neuropathy and radiation-induced leukoencephalopathy.

Note that certain participants were recorded on multiple occasions, therefore several trials may correspond to the same person.
In the training set and in the testing set, the proportion of trials coming from the "healthy", "orthopedic" and "neurological" groups is roughly the same, 24%, 24% and 52% respectively.

#### Protocol and equipment

All subjects underwent the same protocol, described below. First, an IMU (Inertial Measurement Unit) recording accelerations and angular velocities was attached to the dorsal face of each foot.
All signals were acquired at 100 Hz with one of two brands of IMU: XSens&trade; or Technoconcept&reg;.
Within a trial, both feet wore the same brand.
After sensor fixation, participants were asked to perform the following sequence of activities:
- stand for 6 s,
- walk 10 m at preferred walking speed on a level surface to a previously shown turn point,
- turn around (without previous specification of a turning side),
- walk back to the starting point,
- stand for 2 s.

Subjects walked at their comfortable speed with their shoes and without walking aid.
This protocol is schematically illustrated in the following figure.


<div style="text-align: center">
<img src="https://raw.githubusercontent.com/ramp-kits/human_locomotion/master/images/protocol-schema.png" width="500px">
</div>


Each IMU records its acceleration and angular velocity in the $(X, Y, Z, V)$ set of axes defined in the following figure.
The $V$ axis is aligned with gravity, while the $X$, $Y$ and $Z$ axes are attached to the sensor.
<div style="text-align: center">
<img src="https://raw.githubusercontent.com/ramp-kits/human_locomotion/master/images/sensor-photo.png" width="500px">
</div>

<div style="text-align: center">
<img src="https://raw.githubusercontent.com/ramp-kits/human_locomotion/master/images/sensor-position.png" width="500px">
</div>

## Step detection in a clinical context

The following schema describes how step detection methods are integrated in a clinical context.
<br/><br/>
<div style="text-align: center">
<img src="https://raw.githubusercontent.com/ramp-kits/human_locomotion/master/images/step-detection-schema.png" width="500px">
</div>

(1) During a trial, sensors send their own acceleration and angular velocity to the physician's computer.

(2) Software on the physician's computer synchronizes the data sent from both sensors and produces two multivariate signals (of the same shape), one per foot.


A step detection procedure is applied to each signal to produce two lists of footsteps (one per foot/sensor).
The numbers of left footsteps and right footsteps are not necessarily the same.
Indeed, subjects often have a preferred foot to initiate and terminate a walk or a U-turn, resulting in one or more additional footsteps from this preferred foot.
The starts and ends of footsteps are then used to create meaningful features to characterize the subject's gait.
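As an illustration of this last point, the sketch below derives a few simple features from a list of footsteps. It assumes each footstep is a `(start, end)` pair of sample indexes at the 100 Hz rate stated in the protocol; the function name `step_features`, the feature names and the toy annotations are hypothetical and not part of the data set or its tooling.

```python
import numpy as np

SAMPLING_FREQ_HZ = 100  # acquisition frequency stated in the protocol


def step_features(steps, fs=SAMPLING_FREQ_HZ):
    """Compute simple gait features from (start, end) sample indexes."""
    steps = np.asarray(steps)
    # duration of each footstep, in seconds
    durations = (steps[:, 1] - steps[:, 0]) / fs
    # time between the end of a footstep and the start of the next one
    gaps = (steps[1:, 0] - steps[:-1, 1]) / fs
    return {
        "n_steps": len(steps),
        "mean_duration_s": durations.mean(),
        "std_duration_s": durations.std(),
        "mean_gap_s": gaps.mean(),
    }


# Hypothetical annotations, for illustration only.
toy_steps = [[700, 760], [820, 885], [940, 1000]]
features = step_features(toy_steps)
print(features)
```

Features of this kind (step duration and its variability) are among those mentioned in the introduction; clinically useful features would likely be more elaborate.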

## Data exploration

During a trial, a subject executes the protocol described above.
This produces two multivariate signals (one for each foot/sensor) and, for each signal, a number of footsteps have been annotated.
In addition, information (metadata) about the trial and participant is provided.
All three elements (signal, step annotation and metadata) are detailed in this section.

In [None]:
# This will download the data on the first run
trial_1 = load_human_locomotion_dataset("17-2")
trial_2 = load_human_locomotion_dataset("56-2")

print(trial_1.description)

### Signal

Each IMU that the participants wore provided $\mathbb{R}^{8}$-valued signals, sampled at 100 Hz.
In this setting, each dimension is defined by the foot (`L` for left, `R` for right), the signal type (`A` for acceleration, `R` for angular velocity) and the axis (`X`, `Y`, `Z` or `V`).
For instance, `RRX` denotes the angular velocity around the `X`-axis of the right foot.
Accelerations are given in $\mathrm{m/s^2}$ and angular velocities in $\mathrm{deg/s}$.
The signal is available in the `.signal` attribute as a pandas `DataFrame`.

Note that this multivariate signal originates from two sensors (one on each foot).
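The naming convention can be made explicit with a short sketch that enumerates every dimension name and decodes one of them. It assumes that the dataframe holds exactly the 16 combinations of foot, signal type and axis; the helper `describe_dimension` is hypothetical.

```python
from itertools import product

FEET = {"L": "left foot", "R": "right foot"}
SIGNAL_TYPES = {"A": "acceleration", "R": "angular velocity"}
AXES = ["X", "Y", "Z", "V"]

# every combination foot + signal type + axis, e.g. "LAX" or "RRV"
dimension_names = ["".join(parts) for parts in product(FEET, SIGNAL_TYPES, AXES)]


def describe_dimension(dim_name):
    """Decode a three-letter dimension name such as "RRX"."""
    foot, signal_type, axis = dim_name
    return f"{FEET[foot]}, {SIGNAL_TYPES[signal_type]}, {axis} axis"


print(dimension_names)
print(describe_dimension("RRX"))
```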

In [None]:
# The signal is available in the `signal` attribute.

fig, (ax_0, ax_1) = plt.subplots(nrows=1, ncols=2, figsize=(20, 3))

# Here we show the left foot (`L`)
trial_1.signal[["LAX", "LAY", "LAZ", "LAV"]].plot(
    ax=ax_0
)  # select the accelerations
trial_1.signal[["LRX", "LRY", "LRZ", "LRV"]].plot(
    ax=ax_1
)  # select the angular velocities

trial_1.signal.head()

The "flat part" at the beginning of each dimension is the result of the participants standing still for a few
seconds before walking (see [Protocol](#Protocol-and-equipment)).
The same behaviour can often (but not always) be seen at the end of each dimension, though for a much shorter duration.

### Metadata
Several metadata fields (either numerical or categorical) are provided for each sensor recording, detailing the participant being monitored and the sensor position:

- `trial_code`: unique identifier for the trial;
- `age` (in years);
- `gender`: male ("M") or female ("F");
- `height` (in meters);
- `weight` (in kilograms);
- `bmi` (in kg/m²): body mass index;
- `laterality`: subject's "footedness" or "foot to kick a ball" ("Left", "Right" or "Ambidextrous");
- `sensor`: brand of the IMU used for the recording (“XSens” or “TCon”);
- `pathology_group`: this variable takes value in {“Healthy”, “Orthopedic”, “Neurological”};
- `is_control`: whether the subject is a control subject ("Yes" or "No");
- `foot`: foot on which the sensor was attached ("Left" or "Right").

These are accessible through the `.metadata` attribute of a trial (e.g. `trial_1.metadata`).

In [None]:
print("-" * 10 + " Trial 1 " + "-" * 10)
print(trial_1.metadata)
print("-" * 10 + " Trial 2 " + "-" * 10)
print(trial_2.metadata)

### Step annotation (the "label" to predict)
Footsteps were manually annotated by specialists using software that displayed the signals from the relevant sensor (left or right foot) and allowed the specialist to indicate the start and end of each step.

A footstep is defined as the period during which the foot is moving.
Footsteps are separated by periods when the foot is still and flat on the floor.
Therefore, in our setting, a footstep starts with a heel-off and ends with the following toe-strike of the same foot.


Footsteps (the "label" to predict from the signal) are stored as a sequence of pairs of integers, the start and end indexes of each footstep. For instance:

In [None]:
# left foot
print(trial_1.left_steps)

In [None]:
msg = f"{trial_1.left_steps.shape[0]} footsteps were annotated on the left foot, and {trial_1.right_steps.shape[0]} on the right."
print(msg)

# plot steps
plot_steps(sensor_data=trial_1, left_or_right="right", choose_step=3)

Visualization of footsteps and signals:

**On the first two plots.**
The repeated patterns (colored in light green) correspond to periods when the foot is moving.
During the non-annotated periods, the foot is flat and not moving and the signals are constant.
Generally, steps at the beginning and end of the recording, as well as during the U-turn (approximately in the middle of the signal, see [Protocol](#Protocol-and-equipment)), are a bit different from the other ones.

**On the last two plots.** A close-up on a single footstep.
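Another convenient view of the annotations is a per-sample boolean mask that is true while the foot is moving. The sketch below is one possible conversion, assuming that `end` is exclusive (the convention used when slicing with `signal[start:end]`); the helper `steps_to_mask` and the toy annotations are hypothetical.

```python
import numpy as np


def steps_to_mask(steps, n_samples):
    """Return a boolean array that is True inside annotated footsteps."""
    mask = np.zeros(n_samples, dtype=bool)
    for start, end in steps:
        # assumption: `end` is exclusive, as in `signal[start:end]`
        mask[start:end] = True
    return mask


# Hypothetical annotations on a 10-sample signal, for illustration only.
toy_steps = np.array([[2, 5], [7, 9]])
mask = steps_to_mask(toy_steps, n_samples=10)
print(mask.astype(int))
```

With real data, the same call would take for instance `trial_1.left_steps` and the number of rows of `trial_1.signal`.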

### Locograms

In [None]:
_, (ax_left, ax_right) = plt.subplots(ncols=2, figsize=(15, 5))

locogram = get_locogram(sensor_data=trial_2, left_or_right="left")
_ = sns.heatmap(1 - locogram, ax=ax_left)
ax_left.set_title("Left foot")

locogram = get_locogram(sensor_data=trial_2, left_or_right="right")
_ = sns.heatmap(1 - locogram, ax=ax_right)
_ = ax_right.set_title("Right foot")

In [None]:
_, (ax_left, ax_right) = plt.subplots(ncols=2, figsize=(15, 5))

locogram = get_locogram(sensor_data=trial_1, left_or_right="left")
_ = sns.heatmap(1 - locogram, ax=ax_left)
ax_left.set_title("Left foot")

locogram = get_locogram(sensor_data=trial_1, left_or_right="right")
_ = sns.heatmap(1 - locogram, ax=ax_right)
_ = ax_right.set_title("Right foot")

<div class="alert alert-success" role="alert">
    <p><b>Question</b></p>
    <p>Compare the left and right locograms.</p>
</div>

# Step detection

## Toy example

### Simulate patterns

In [None]:
# fmt: off
# synthetic atoms
atom_1 = np.array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
atom_2 = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.33066907e-16, 6.06060606e-02, 1.21212121e-01, 1.81818182e-01, 2.42424242e-01, 3.03030303e-01, 3.63636364e-01, 4.24242424e-01, 4.84848485e-01, 5.45454545e-01, 6.06060606e-01, 6.66666667e-01, 7.27272727e-01, 7.87878788e-01, 8.48484848e-01, 9.09090909e-01, 9.69696970e-01, 1.03030303e00, 1.09090909e00, 1.15151515e00, 1.21212121e00, 1.27272727e00, 1.33333333e00, 1.39393939e00, 1.45454545e00, 1.51515152e00, 1.57575758e00, 1.63636364e00, 1.69696970e00, 1.75757576e00, 1.81818182e00, 1.87878788e00, 1.93939394e00, 5.55111512e-16, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
# fmt: on

atom_width = atom_1.shape[0]

fig, (ax_left, ax_right) = plt.subplots(nrows=1, ncols=2, figsize=(10, 3))
ax_left.plot(atom_1, "k", lw=1)
ax_left.set_title("Atom 1")
ax_right.plot(atom_2, "k", lw=1)
_ = ax_right.set_title("Atom 2")

### Simulate activations

In [None]:
n_samples = 1000

# random activations
activations_1 = np.random.binomial(n=1, p=0.005, size=n_samples)
activations_2 = np.random.binomial(n=1, p=0.001, size=n_samples)

fig, ax = fig_ax()
_ = ax.plot(activations_1, label="activations 1")
_ = ax.plot(activations_2, label="activations 2")

plt.legend()

signal = np.convolve(activations_1, atom_1, mode="same")
signal += np.convolve(activations_2, atom_2, mode="same")

fig, ax = fig_ax()
_ = ax.plot(signal)

### Sparse coding

The optimization problem to find the activations is:
$$
        \mathbf{Z}^\star = \arg\min_{(\mathbf{z}_k)} \left\| \mathbf{x} - \sum_{k=1}^K (\mathbf{z}_k \star \mathbf{d}_k )\right\|_2^2 \quad + \quad \lambda \sum_{k=1}^K \|\mathbf{z}_k \|_1
$$

where

- $\mathbf{x}=[x_1, x_2, \dots, x_T]$ is a univariate signal with $T$ samples;
- $\mathbf{d}_k$ ($k=1,\dots,K$) are $K$ patterns of length $L$;
- $\mathbf{z}_k$ of length $T-L+1$ is the activation signal of pattern $\mathbf{d}_k$;
- $\lambda>0$ controls the regularization.
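To make the two terms of this objective concrete, the following sketch evaluates it with plain NumPy for a given set of activations. It is only an illustration of the formula: `csc_objective` is a hypothetical helper, the atoms and activations are made up, and the exact scaling used internally by `ConvSparseCoder` may differ.

```python
import numpy as np


def csc_objective(x, atoms, codes, penalty):
    """Evaluate the convolutional sparse coding objective.

    x: signal of shape (T,)
    atoms: array of shape (K, L)
    codes: array of shape (K, T - L + 1)
    """
    # "full" convolution of a code (T - L + 1) with an atom (L) has length T
    reconstruction = sum(
        np.convolve(z_k, d_k, mode="full") for z_k, d_k in zip(codes, atoms)
    )
    data_fit = np.sum((x - reconstruction) ** 2)
    sparsity = penalty * np.abs(codes).sum()
    return data_fit + sparsity


T, L = 200, 20
atoms = np.stack([np.ones(L), np.linspace(0, 1, L)])

# two isolated activations, one per atom
true_codes = np.zeros((2, T - L + 1))
true_codes[0, 30] = 1.0
true_codes[1, 120] = 2.0
x = sum(np.convolve(z_k, d_k, mode="full") for z_k, d_k in zip(true_codes, atoms))

# perfect reconstruction: only the l1 penalty remains
obj_true = csc_objective(x, atoms, true_codes, penalty=10)
# all-zero code: no penalty, but the whole signal energy is left unexplained
obj_zero = csc_objective(x, atoms, np.zeros_like(true_codes), penalty=10)
print(obj_true, obj_zero)
```

The comparison shows the trade-off controlled by $\lambda$: each activation is only worth keeping if it reduces the squared error by more than its $\ell_1$ cost.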

In [None]:
# concatenate atoms
atoms = np.c_[atom_1, atom_2].T
# sparse coding
coder = ConvSparseCoder(atoms=atoms, positive_code=True).fit(
    signal=signal, penalty=10
)
sparse_codes = coder.sparse_codes
reconstruction = coder.predict()

Look at the reconstruction.

In [None]:
fig, ax = fig_ax()

ax.plot(signal, label="Original")
ax.plot(reconstruction, label="Reconstruction")
ax.set_title(f"MSE: {((signal-reconstruction)**2).mean():.3f}")
_ = plt.legend()

Look at activations.

In [None]:
plot_CDL(
    signal,
    codes=sparse_codes,
    atoms=atoms,
)

<div class="alert alert-success" role="alert">
    <p><b>Question</b></p>
    <p>What do you observe when the penalty increases?</p>
</div>

## On real-world data

We apply the same methodology to the gait data set.

We only consider two trials with different walking patterns.
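Templates cut from different footsteps generally have different lengths, whereas the atoms must be stacked into a single array. The sketch below shows one plausible way to standardize and zero-pad such templates; `zero_pad_to_length` is a hypothetical stand-in written for illustration, and is not claimed to match the implementation of `pad_at_the_end` in `utils`.

```python
import numpy as np


def normalize(segment):
    """Return a standardized copy (zero mean, unit variance) of a segment."""
    segment = np.array(segment, dtype=float)  # np.array copies its input
    return (segment - segment.mean()) / segment.std()


def zero_pad_to_length(segment, desired_length):
    """Append zeros so that the segment has `desired_length` samples."""
    n_missing = desired_length - len(segment)
    if n_missing < 0:
        raise ValueError("segment is longer than desired_length")
    return np.pad(segment, (0, n_missing))  # constant (zero) padding


# Toy templates of different lengths, for illustration only.
short = normalize([0, 1, 2, 1, 0])
long = normalize([0, 1, 3, 4, 3, 1, 0, 0])

length = max(len(short), len(long))
toy_atoms = np.stack(
    [zero_pad_to_length(short, length), zero_pad_to_length(long, length)]
)
print(toy_atoms.shape)
```

Normalizing before padding keeps the appended zeros from affecting the mean and standard deviation of each template.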

In [None]:
signal_1 = get_signal(trial_1, dim_name="LRY")
signal_2 = get_signal(trial_2, dim_name="LRY")

# take an arbitrary footstep
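start, end = trial_1.left_steps[4]
# copy the slice: it is a view, so normalizing it in place would alter signal_1
template_1 = signal_1[start:end].copy()
template_1 -= template_1.mean()
template_1 /= template_1.std()

start, end = trial_2.left_steps[7]
# same precaution for the second trial
template_2 = signal_2[start:end].copy()
template_2 -= template_2.mean()
template_2 /= template_2.std()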

# Pad the atoms to the same length
template_length = max(template_1.shape[0], template_2.shape[0])
template_1 = pad_at_the_end(template_1, desired_length=template_length)
template_2 = pad_at_the_end(template_2, desired_length=template_length)

In [None]:
# instantiate the convolutional sparse coder
coder = ConvSparseCoder(
    atoms=np.c_[template_1, template_2].T, positive_code=True
)

plt.plot(coder.atoms[0], label="Atom 1")
plt.plot(coder.atoms[1], label="Atom 2")
_ = plt.legend()

In [None]:
# Sparse coding
reconstruction_1 = coder.fit(signal=signal_1, penalty=20).predict()
# Plot
plot_CDL(
    signal_1,
    codes=coder.sparse_codes,
    atoms=coder.atoms,
)

In [None]:
# Sparse coding
reconstruction_2 = coder.fit(signal=signal_2, penalty=80).predict()
# Plot
plot_CDL(
    signal_2,
    codes=coder.sparse_codes,
    atoms=coder.atoms,
)

<div class="alert alert-success" role="alert">
    <p><b>Question</b></p>
    <p>Comment on the activation patterns.</p>
</div>