# Foot Up (FU) and Foot Down (FD) Preprocessing v1

This notebook loads processed radar data, labels frames based on foot up (FU) and foot down (FD) events, creates 100-frame windows, and saves the preprocessed data for model training.

## Overview

- **Load** processed radar data from the specified directory.
- **Label** frames using event information from the metadata CSV file.
- **Window** the data into 100-frame segments with overlap.
- **Save** the preprocessed data for each participant.


### Load Libraries

In [1]:
import os
os.environ["MKL_NUM_THREADS"] = "1"
os.environ["NUMEXPR_NUM_THREADS"] = "1"
os.environ["OMP_NUM_THREADS"] = "1"

import numpy as np
import pandas as pd
import torch




## Define Helper Functions

### Function: `load_participant_ids`

This function retrieves unique participant IDs from the metadata CSV file by extracting the first two characters of the `RADAR_capture` column.

In [None]:
def load_participant_ids(metadata_csv):
    """
    Get unique participant IDs from the metadata CSV file based on the first two characters of 'RADAR_capture'.
    """
    df = pd.read_csv(metadata_csv)
    participant_ids = df['RADAR_capture'].str[:2].unique()
    return participant_ids
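
As a quick illustration of the prefix extraction, the sketch below runs the same `.str[:2].unique()` expression on an in-memory CSV. The capture names are hypothetical stand-ins, not values from the real metadata file.

```python
import io

import pandas as pd

# Hypothetical capture names; the real CSV values may look different.
csv_text = "RADAR_capture\n01_MNTRL_RR_V1\n01_MNTRR_RR_V1\n02_MNTRL_RR_V1\n"
df = pd.read_csv(io.StringIO(csv_text))

# .str[:2] slices the first two characters of every entry;
# .unique() keeps one copy of each prefix, in order of first appearance.
participant_ids = df["RADAR_capture"].str[:2].unique()
print(list(participant_ids))
```

Because the ID is taken by position, this only works if every capture name starts with a two-character participant code.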

### Function: `label_frames`

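This function assigns a label to every frame of a radar capture, using the event frame indices stored in the metadata. Each range is half-open: the start frame is included and the end frame is not.

- **Label 0**: Foot Up (GOUP), from `frame_foot_up` to `frame_stable`
- **Label 1**: Foot Down (DOWN), from `frame_break` to `frame_end`
- **Label 2**: Stability, assigned to every frame outside those ranges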

In [None]:
def label_frames(event_labels_df, radar_capture, num_frames):
    """
    Label frames based on event timings for the given radar capture.
    """
    labels = np.full(num_frames, 2)  # Default to 'stability' label (2)

    capture_events = event_labels_df[event_labels_df['RADAR_capture'] == radar_capture]
    goup_ranges = []
    down_ranges = []

    for _, event in capture_events.iterrows():
        if not pd.isna(event['frame_foot_up']) and not pd.isna(event['frame_stable']):
            start = int(event['frame_foot_up'])
            end = int(event['frame_stable'])
            labels[start:end] = 0  # GOUP
            goup_ranges.append((start, end))
        if not pd.isna(event['frame_break']) and not pd.isna(event['frame_end']):
            start = int(event['frame_break'])
            end = int(event['frame_end'])
            labels[start:end] = 1  # DOWN
            down_ranges.append((start, end))

    return labels, goup_ranges, down_ranges
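
The labeling rule can be seen on a toy capture of ten frames. This is a minimal sketch that repeats the slicing logic on a hypothetical two-row event table; the frame numbers are made up, and the second row shows that events with missing frame values are skipped.

```python
import numpy as np
import pandas as pd

# Hypothetical events: one complete event and one row with missing values.
events = pd.DataFrame({
    "frame_foot_up": [2, np.nan],
    "frame_stable": [5, np.nan],
    "frame_break": [6, np.nan],
    "frame_end": [9, np.nan],
})

labels = np.full(10, 2)  # every frame starts as 'stability' (2)
for _, ev in events.iterrows():
    if pd.notna(ev["frame_foot_up"]) and pd.notna(ev["frame_stable"]):
        labels[int(ev["frame_foot_up"]):int(ev["frame_stable"])] = 0  # GOUP
    if pd.notna(ev["frame_break"]) and pd.notna(ev["frame_end"]):
        labels[int(ev["frame_break"]):int(ev["frame_end"])] = 1  # DOWN

print(labels.tolist())
```

Frames 2 to 4 become GOUP and frames 6 to 8 become DOWN; frames 5 and 9 keep the stability label because slice ends are exclusive.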

### Function: `create_windows`

This function creates overlapping windows of data and applies the corresponding labels to each window.

- **Parameters:**
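  - `data`: The processed radar data tensor, with frames along the first dimension.
  - `labels`: The per-frame label array returned by `label_frames`.
  - `metadata`: A dictionary of capture metadata; only `frame_range`, a `(start_frame, end_frame)` tuple, is used here.
  - `window_size`: The number of frames in each window (default 100).
  - `overlap`: The number of frames shared by consecutive windows (default 50), so each window starts `window_size - overlap` frames after the previous one.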


In [None]:
def create_windows(data, labels, metadata, window_size=100, overlap=50):
    """
    Create windows of data with specified overlap and apply labels.
    """
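    actuator_start_frame, actuator_end_frame = metadata['frame_range']
    step = window_size - overlap  # Stride between consecutive window starts
    span = actuator_end_frame - actuator_start_frame
    # Ceiling division keeps the trailing partial window so it can be padded below;
    # floor division would drop it and the padding branch would never run.
    num_windows = max(1, 1 + -(-(span - window_size) // step))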

    windows_data = []
    windows_labels = []
    windows_ranges = []

    for w in range(num_windows):
        start = w * (window_size - overlap) + actuator_start_frame
        end = start + window_size
        window_range = {'window_start_frame': start, 'window_end_frame': min(end, actuator_end_frame)}

        if end > actuator_end_frame:
            padding_length = end - actuator_end_frame
            window_data = torch.cat((data[start:actuator_end_frame], torch.zeros(padding_length, *data.shape[1:])), dim=0)
            window_labels = np.pad(labels[start:actuator_end_frame], (0, padding_length), 'constant', constant_values=-1)
        else:
            window_data = data[start:end]
            window_labels = labels[start:end]

        windows_data.append(window_data.unsqueeze(0))
        windows_labels.append(torch.tensor(window_labels).unsqueeze(0))
        windows_ranges.append(window_range)

    # Concatenate to tensors
    windows_data_tensor = torch.cat(windows_data, dim=0)
    windows_labels_tensor = torch.cat(windows_labels, dim=0)

    return windows_data_tensor, windows_labels_tensor, windows_ranges
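
The window arithmetic is easier to check on small numbers. The sketch below uses a hypothetical helper, `window_starts`, which is not part of the notebook; it lists the start frames of the windows that fit entirely inside a frame range, assuming a stride of `window_size - overlap`.

```python
def window_starts(start_frame, end_frame, window_size=100, overlap=10):
    """Start frames of the windows that fit fully inside [start_frame, end_frame)."""
    step = window_size - overlap  # stride between consecutive window starts
    n_full = 1 + (end_frame - start_frame - window_size) // step
    return [start_frame + w * step for w in range(max(n_full, 0))]


# A 280-frame range with 100-frame windows and 10 frames of overlap:
# the stride is 90, so windows cover 0-100, 90-190, and 180-280.
starts = window_starts(0, 280)
print(starts)
```

Note that `overlap` must be smaller than `window_size`; otherwise the stride is zero or negative and the computation is meaningless.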

## Main Preprocessing Function

### Function: `preprocess_data`

This function orchestrates the preprocessing steps:

- Loads the metadata and participant IDs.
- Iterates over each participant and their radar captures.
- Loads processed radar data.
- Labels frames.
- Creates windows.
- Saves the preprocessed data.

In [None]:

def preprocess_data(root_dir, metadata_csv, output_dir, window_size=100, overlap=50):
    """
    Main preprocessing function to load, label, window, and save data for each participant.
    """
    # Load metadata
    participant_ids = load_participant_ids(metadata_csv)
    event_labels_df = pd.read_csv(metadata_csv)

    # Process each participant's data
    for participant_id in participant_ids:
        participant_dir = os.path.join(root_dir, participant_id)
        if not os.path.exists(participant_dir):
            print(f"Participant directory {participant_dir} does not exist. Skipping.")
            continue  # Skip if directory does not exist

        for file in sorted(os.listdir(participant_dir)):
            if file.endswith('.npy'):
                filepath = os.path.join(participant_dir, file)
                radar_capture = "_".join(file.split('_')[:-1])
                channel_number = filepath.split(".")[-2].split("channel")[-1]

                # Exact match, consistent with the equality lookup used for capture_info below
                if (event_labels_df['RADAR_capture'] == radar_capture).any():
                    data = torch.from_numpy(np.load(filepath)).float()

                    # Get number of frames from data shape
                    num_frames = data.shape[0]

                    # Label frames for this capture
                    labels, goup_ranges, down_ranges = label_frames(event_labels_df, radar_capture, num_frames)

                    # Extract metadata
                    capture_info = event_labels_df[event_labels_df['RADAR_capture'] == radar_capture].iloc[0]
                    actuator_start_frame = int(capture_info['RADAR_Start_Frame'])
                    actuator_end_frame = int(capture_info['RADAR_End_Frame'])
                    metadata = {
                        'channel_number': channel_number,
                        'frame_range': (actuator_start_frame, actuator_end_frame),
                        'RADAR_capture': radar_capture,
                        'GOUP_ranges': goup_ranges,
                        'DOWN_ranges': down_ranges
                    }

                    # Create windows
                    windows_data, windows_labels, windows_ranges = create_windows(
                        data, labels, metadata, window_size, overlap
                    )

                    # Save windows and metadata
                    output_participant_dir = os.path.join(output_dir, participant_id)
                    os.makedirs(output_participant_dir, exist_ok=True)
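                    # Include the channel in the file name so that channels of the same capture do not overwrite each other
                    output_file = os.path.join(output_participant_dir, f"{radar_capture}_channel{channel_number}_windows.pt")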
                    torch.save({
                        'data': windows_data,
                        'labels': windows_labels,
                        'ranges': windows_ranges,
                        'metadata': metadata
                    }, output_file)

                    print(f"Saved preprocessed data for {radar_capture} to {output_file}")
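
The capture name and channel number are both parsed from the file name, so the naming convention matters. The sketch below applies the same string operations to a hypothetical file name of the form `<capture>_channel<N>.npy`; the actual names on disk are an assumption here.

```python
import os

# Hypothetical file name; real files are assumed to follow the same pattern.
file = "01_MNTRL_RR_V1_channel3.npy"
filepath = os.path.join("root", "01", file)

# Everything before the last underscore is treated as the capture name.
radar_capture = "_".join(file.split("_")[:-1])

# The text between the last "channel" and the extension is the channel number.
channel_number = filepath.split(".")[-2].split("channel")[-1]

print(radar_capture, channel_number)
```

The channel number stays a string, and a file whose name does not contain `channel` would yield the whole path stem instead of a number.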

## Run the Preprocessing Pipeline

Adjust the paths and parameters as needed to match your data directory structure.

In [None]:
# Paths and parameters
root_dir = "/Volumes/FourTBLaCie/RADARTreePose_Data/processed/radar/RDMs_npy_by_channel_v1"
metadata_csv = "/Volumes/FourTBLaCie/RADARTreePose_Data/metadata/FULL_MNTRR_MNTRL_key_times_frames.csv"
output_dir = "/Volumes/FourTBLaCie/RADARTreePose_Data/preprocessed/radar"
window_size = 100
overlap = 10  # Frames shared by consecutive windows; the function default is 50

# Run preprocessing
preprocess_data(root_dir, metadata_csv, output_dir, window_size, overlap)

## Notes

- **Data Assumptions:**
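  - The number of frames (`num_frames`) is the size of the first dimension of the loaded radar array.
  - Radar captures are stored as `.npy` files in participant-specific directories named by the two-character participant ID. The capture name is taken as everything before the last underscore in the file name, and the channel number as the text after `channel`.
  - The metadata CSV provides the columns used for labeling and windowing: `RADAR_capture`, `frame_foot_up`, `frame_stable`, `frame_break`, `frame_end`, `RADAR_Start_Frame`, and `RADAR_End_Frame`.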

- **Labels:**
  - **Label 0**: Foot Up (GOUP)
  - **Label 1**: Foot Down (DOWN)
  - **Label 2**: Stability
  - **Label -1**: Padding (used when windows extend beyond the capture's end frame)

- **Saving Data:**
  - Preprocessed data is saved as `.pt` files using PyTorch's `torch.save` function.
  - Each participant has their own directory within the `output_dir`.
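
When the saved files are read back for training, the padding label needs to be excluded. The following is a minimal sketch of that round trip using a small stand-in dictionary with the same `data` and `labels` keys; the tensor sizes and the file name are illustrative only.

```python
import os
import tempfile

import torch

# Stand-in for one saved capture: 2 windows of 4 frames, each frame a 3x3 map.
# Real files hold 100-frame windows; these sizes are for illustration only.
sample = {
    "data": torch.zeros(2, 4, 3, 3),
    "labels": torch.tensor([[2, 0, 0, 2], [1, 1, -1, -1]]),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_windows.pt")
    torch.save(sample, path)
    loaded = torch.load(path)

# True for real frames, False for padded frames (label -1).
valid = loaded["labels"] != -1
valid_per_window = valid.sum(dim=1).tolist()
print(valid_per_window)
```

One common option during training is to pass `ignore_index=-1` to `torch.nn.CrossEntropyLoss` so that padded frames do not contribute to the loss.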

---

## Conclusion

This notebook processes the radar data by:

1. Loading processed radar data for each participant.
2. Labeling frames based on foot up/down events.
3. Creating overlapping windows of data.
4. Saving the preprocessed data for use in model training.
