The goal of this notebook is to get the delay between the kinect stream and the other streams in the .xdf file. 

# The problem

Due to a bug in LSL_Kinect, the timestamps in the kinect streams are relative to the computer's startup time, whereas timestamps should be defined by LSL to enable synchronization with other streams.  
As a consequence, the kinect streams are delayed compared to the other streams, and the value of the delay is unknown (but constant).

# The solution
Fortunately, the kinect streams are also saved in .csv files, and the mouse streams are saved in other .csv files.

The solution is to use the .csv files corresponding to the .xdf file to:
- get the kinect-to-csv delay : the delay between the kinect stream and the corresponding csv file
- get the mouse-to-csv delay : the delay between mouse stream and the corresponding csv file  
- compute the kinect-to-mouse delay : the delay between the kinect stream and the mouse stream

Then, we can :
- compute the corrected kinect timestamps : the timestamps of the kinect stream shifted by the kinect-to-mouse delay
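
The arithmetic behind these steps can be sketched as follows. This is only an illustration under an assumed sign convention (each delay is "csv time minus xdf time"); the function name `corrected_kinect_timestamps` and the numeric values are hypothetical and are not taken from the recordings.

```python
import numpy as np


def corrected_kinect_timestamps(kinect_xdf_ts, kinect_to_csv_delay, mouse_to_csv_delay):
    """Shift the kinect xdf timestamps onto the mouse (LSL) time base.

    Assumed convention: delay = csv timestamp - xdf timestamp, in seconds.
    """
    # the csv clock cancels out, leaving the kinect-to-mouse delay
    kinect_to_mouse_delay = kinect_to_csv_delay - mouse_to_csv_delay
    return np.asarray(kinect_xdf_ts, dtype=float) + kinect_to_mouse_delay


# toy values (hypothetical)
kinect_xdf_ts = [100.0, 100.5, 101.0]
corrected = corrected_kinect_timestamps(
    kinect_xdf_ts,
    kinect_to_csv_delay=1_700_000_000.0,
    mouse_to_csv_delay=1_699_990_000.0,
)
print(corrected)
```

If the delays are stored with the opposite sign, or as absolute values, the subtraction has to be adapted accordingly.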



In [None]:
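# doRunTests should be set to False for production use and to True for testing (default).
# It can already be set to False from outside (e.g., by the test script).

if "doRunTests" not in globals():
    doRunTests = True

# do_debug must exist even when doRunTests was defined from outside,
# otherwise the debug branches below raise a NameError
if "do_debug" not in globals():
    do_debug = doRunTests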

## Get the list of the needed csv files for the Reaching and Circle task

We need the list of the .csv files corresponding to each .xdf file.  
We use the list given by `goodFiles.log` in the visit directory.  
Only the Reaching and Circle tasks contain an xdf file with a kinect stream.  

NOTE : for the mouse, we shall not use the data csv files, but only the marker csv files. For the kinect, we shall use the data and marker csv files (again a bug in LSL_Kinect).

The csv files that are expected are: 

- the kinect csv files (both are mandatory) 
    - `*_k.csv`: kinect data file 
    - `*_k_m.csv`: the kinect marker file

- the mouse csv marker files for the Reaching task (at least one is mandatory) 
    - `*_r_l_m_mau_np.csv`: mouse marker file for maximal arm use with the non-paretic arm
    - `*_r_l_m_mau_p.csv`: mouse marker file for maximal arm use with the paretic arm
    - `*_r_l_m_sau_np.csv`: mouse marker file for spontaneous arm use with the non-paretic arm
    - `*_r_l_m_sau_p.csv`: mouse marker file for spontaneous arm use with the paretic arm

- the mouse csv marker files for the Circle task (at least one is mandatory) 
    - `*_c_l_m_np.csv`: mouse marker file for the non-paretic arm
    - `*_c_l_m_p.csv`: mouse marker file for the paretic arm


The minimal set of files that are mandatory **for each task** are:
- the xdf file (the file that contains the kinect stream with wrong timestamps + the mouse stream with correct timestamps) 
- the kinect data csv file (the file that contains the kinect stream with correct timestamps)
- the kinect marker csv file (the file that contains the kinect marker stream with correct timestamps)
- one mouse marker csv file (the file that contains the mouse stream with correct timestamps)
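
As a quick illustration of the mandatory set for the Reaching task, the wildcard patterns can be checked with `fnmatch`. The file names below are hypothetical, and `missing_patterns` is a helper written for this sketch only.

```python
import fnmatch

# mandatory patterns for the Reaching task: kinect data, kinect markers, one mouse file
MANDATORY_REACHING = ["*_r_k.csv", "*_r_k_m.csv", "*_r_l*.csv"]


def missing_patterns(csv_files, patterns=MANDATORY_REACHING):
    """Return the patterns that no file name in csv_files matches."""
    return [
        pattern
        for pattern in patterns
        if not any(fnmatch.fnmatch(fname, pattern) for fname in csv_files)
    ]


# hypothetical file names: the kinect marker file is absent
files = ["S01_V1_r_k.csv", "S01_V1_r_l_m_mau_np.csv"]
print(missing_patterns(files))
```

An empty result means that every mandatory pattern is matched by at least one file.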


In [None]:
# we need the expected file list as created by the checkFilesInVisit notebook

# save current doRunTests value
doRunTestsOld = doRunTests

# run the outside notebooks (to get the functions)
doRunTests = False  # we need this to avoid running the tests in the outside notebooks
%run -i "checkFilesInVisit.ipynb"  # we need this to get the functions

# restore the doRunTests value
doRunTests = doRunTestsOld


In [None]:
import os
import numpy as np


def load_goodFiles(visit_path):
    """load the goodFiles.log file"""

    fullFname_goodFiles = os.path.join(visit_path, "goodFiles.log")
    goodFiles = []
    with open(fullFname_goodFiles, "r") as f:
        for line in f:
            goodFiles.append(line.strip())

    return goodFiles


def get_xdf_files(goodFiles):
    """get the xdf files from the goodFiles list"""
    # should be 2 xdf files: *r.xdf and *c.xdf
    xdf_files = []
    for f in goodFiles:
        if f.endswith(".xdf"):
            xdf_files.append(f)

    return xdf_files


def get_csv_files_for_xdf(xdf_file, goodFiles):
    """get the csv files that are associated with the xdf file"""
    csv_files = []
    xdf_fname_no_ext = os.path.splitext(xdf_file)[0]
    task_token = xdf_fname_no_ext.split("_")[-1]
    for f in goodFiles:
        if f.endswith(".csv"):
            if "_" + task_token + "_" in f:
                csv_files.append(f)

    return csv_files


def get_expected_file_endings(visit_path):
    """get the expected file endings for the visit using the ExpectedRearmVisit class"""
    expected_files = ExpectedRearmVisit(visit_path).expectedFiles
    expected_files = [
        x for x in expected_files if "Reaching" in x[0] or "Circle" in x[0]
    ]

    expected_endings = {"Reaching": [], "Circle": []}
    for expected_task_list in expected_files:
        if "Reaching" in expected_task_list[0]:
            for fname in expected_task_list[1]:
                if fname.endswith(".csv"):
                    expected_endings["Reaching"].append(fname.split("V?")[-1])
        if "Circle" in expected_task_list[0]:
            for fname in expected_task_list[1]:
                if fname.endswith(".csv"):
                    expected_endings["Circle"].append(fname.split("V?")[-1])

    return expected_endings


def check_expected_csv_files(xdf_csv_files):
    """Check if the expected csv files are present for the xdf files
    and return a message with the errors found (if any)"""

    # validate the input first: iterating over None would raise a TypeError
    if xdf_csv_files is None or len(xdf_csv_files) == 0:
        return "No xdf files found\n"

    if len(xdf_csv_files) == 1:
        return "Only one xdf file found (expecting two xdf files) \n"

    # NOTE: visit_path is read from the notebook globals
    expected_endings = get_expected_file_endings(visit_path)
    msg = ""

    for xdf_csv in xdf_csv_files:
        xdf_fname = xdf_csv["xdf"]
        csv_files = xdf_csv["csv"]
        if xdf_fname.endswith("_r.xdf"):
            expected_ends = expected_endings["Reaching"]
        elif xdf_fname.endswith("_c.xdf"):
            expected_ends = expected_endings["Circle"]
        else:
            msg += f"Unexpected xdf file ending for {xdf_fname}\n"
            continue

        for ending in expected_ends:
            found = False
            for csv in csv_files:
                if csv.endswith(ending):
                    found = True
                    break
            if not found:
                msg += f"Expected file ending {ending} not found for {xdf_fname}\n"

    return msg


def check_expected_csv_files_for_one_xdf(xdf_file, csv_files):
    """Check if the expected csv files are present for the xdf file
    and return a message with the errors found (if any)"""

    expected_endings = get_expected_file_endings(visit_path)
    msg = ""

    if xdf_file.endswith("_r.xdf"):
        expected_ends = expected_endings["Reaching"]
    elif xdf_file.endswith("_c.xdf"):
        expected_ends = expected_endings["Circle"]
    else:
        msg += f"Unexpected xdf file ending for {xdf_file}\n"
        return msg

    for ending in expected_ends:
        found = False
        for csv in csv_files:
            if csv.endswith(ending):
                found = True
                break
        if not found:
            msg += f"Expected file ending {ending} not found for {xdf_file}\n"

    return msg


def get_xdf_and_corresponding_csv_files_in_file_list(goodFiles_list):
    """Get the xdf and corresponding csv files in the goodFiles list"""

    # check that goodFiles_list is a list of strings
    if not all(isinstance(x, str) for x in goodFiles_list):
        raise ValueError("goodFiles should be a list of strings")

    xdf_files = get_xdf_files(goodFiles_list)
    xdf_csv_files = []
    for f in xdf_files:
        csv_files = get_csv_files_for_xdf(f, goodFiles_list)
        xdf_csv_files.append({"xdf": f, "csv": csv_files})

    return xdf_csv_files


def check_mandatory_files_for_one_xdf(xdf_rel_fname, csv_files):
    """Check if the mandatory files are present for one xdf file
    and return a message with the errors found (if any)"""

    # I use wildcards in mandatory_patterns => fnmatch is needed
    import fnmatch

    mandatory_patterns = {
        "r": ["*_r_k.csv", "*_r_k_m.csv", "*_r_l*.csv"],
        "c": ["*_c_k.csv", "*_c_k_m.csv", "*_c_l*.csv"],
    }

    msg = ""

    if xdf_rel_fname.endswith("_r.xdf"):
        mandatory_fname_pattern = mandatory_patterns["r"]
    elif xdf_rel_fname.endswith("_c.xdf"):
        mandatory_fname_pattern = mandatory_patterns["c"]
    else:
        msg += f"Unexpected xdf file ending for {xdf_rel_fname}\n"
        return msg

    for pattern in mandatory_fname_pattern:
        found = False
        for fname in csv_files:
            if fnmatch.fnmatch(fname, pattern):
                found = True
                break
        if not found:
            msg += f"Mandatory file {pattern} not found for {xdf_rel_fname}\n"

    return msg


def check_mandatory_files(xdf_csv_files):
    """Check if the mandatory files are present for all xdf files
    and return a message with the errors found (if any)"""

    msg = ""

    if xdf_csv_files is None or len(xdf_csv_files) == 0:
        return "No xdf files found\n"

    if len(xdf_csv_files) == 1:
        return "Only one xdf file found (expecting two xdf files) \n"

    for xdf_csv in xdf_csv_files:
        xdf_file = xdf_csv["xdf"]
        csv_files = xdf_csv["csv"]
        msg += check_mandatory_files_for_one_xdf(xdf_file, csv_files)

    return msg


def get_xdf_and_csv_files_in_visit(visit_path):
    """Get the xdf and corresponding csv files in the visit folder

    Returns:
    - xdf_csv_files: list of dictionaries with xdf and csv files
    - mandatory_files_error: error message if mandatory files are missing
    - expected_files_warning: warning message if expected files are missing
    """

    goodFiles = load_goodFiles(visit_path)
    xdf_csv_files = get_xdf_and_corresponding_csv_files_in_file_list(goodFiles)
    mandatory_files_error = check_mandatory_files(xdf_csv_files)
    expected_files_warning = check_expected_csv_files(xdf_csv_files)

    # return xdf_csv_files, mandatory_files_error, expected_files_warning
    return {
        "xdf_csv_files": xdf_csv_files,
        "error": mandatory_files_error,
        "warning": expected_files_warning,
    }


def get_xdf_and_csv_files_in_visit_by_xdf(visit_path):
    """Get the xdf and corresponding csv files in the visit folder
    
    Returns:
    xdf_and_csv_files: list of dictionaries with xdf and csv files
    
    Each dictionary has the following keys
    - xdf: xdf file name
    - csv: list of csv files
    - mandatory_files_error: error message if mandatory files are missing
    - expected_files_warning: warning message if expected files are missing
    """

    xdf_and_csv_files = []

    goodFiles = load_goodFiles(visit_path)
    xdf_files = get_xdf_files(goodFiles)

    if len(xdf_files) == 0:
        return [
            {
                "xdf": [],
                "csv": [],
                "mandatory_files_error": "No xdf files found",
                "expected_files_warning": "All expected files are missing",
            }
        ]

    for xdf_file in xdf_files:
        csv_files = get_csv_files_for_xdf(xdf_file, goodFiles)
        mandatory_files_error = check_mandatory_files_for_one_xdf(xdf_file, csv_files)
        expected_files_warning = check_expected_csv_files_for_one_xdf(
            xdf_file, csv_files
        )

        xdf_and_csv_files.append(
            {
                "xdf": xdf_file,
                "csv": csv_files,
                "mandatory_files_error": mandatory_files_error,
                "expected_files_warning": expected_files_warning,
            }
        )

    return xdf_and_csv_files


def print_xdf_and_csv_files(xdf_and_csv_files):
    """Print the xdf and csv files list of dictionaries"""
    for xdf_csv_file in xdf_and_csv_files:
        print("")
        for key, value in xdf_csv_file.items():
            if isinstance(value, (list, tuple)):
                print(f"{key}:")
                for elem in value:
                    print(f"  {elem}")
            else:
                print(f"{key}:\n  {value}")
    return


if doRunTests:
    ##########################################################################################
    visit_path = "../dat/ReArm.lnk/ReArm_C1P02/ReArm_C1P02_20210306_V1"
    # visit_path = "../dat/ReArm.lnk/ReArm_C1P02/ReArm_C1P02_20210419_V2"
    # visit_path = "../dat/ReArm.lnk/ReArm_C1P02/ReArm_C1P02_20210715_V3"
    # visit_path = "../dat/ReArm.lnk/ReArm_C1P07/ReArm_C1P07_20210716_V1"
    # visit_path = "../dat/ReArm.lnk/ReArm_C1P07/ReArm_C1P07_20210820_V2"
    # visit_path = "../dat/ReArm.lnk/ReArm_C1P07/ReArm_C1P07_20211116_V3"
    ##########################################################################################

    xdf_and_csv_files = get_xdf_and_csv_files_in_visit_by_xdf(visit_path)
    print_xdf_and_csv_files(xdf_and_csv_files)

## Read a marker csv file

We have to read a marker csv file and return a list of [timestamp, marker] pairs.   
We want to read both kinect and mouse marker csv files, but the kinect markers lack the timestamp column.   
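
The missing timestamp can be rebuilt from the datetime column. The sketch below assumes the `%Y-%m-%d %H:%M:%S.%f` datetime format and a timestamp in milliseconds, as in the mouse marker files; the helper name `add_missing_timestamp` and the marker values are hypothetical.

```python
from datetime import datetime


def add_missing_timestamp(tokens):
    """Insert a millisecond timestamp when a marker line has only 2 tokens.

    tokens is the result of line.split(","): [datetime, marker] for kinect
    markers, [datetime, timestamp, marker] for mouse markers.
    """
    if len(tokens) == 2:
        # naive datetimes are interpreted in the local time zone
        seconds = datetime.strptime(tokens[0], "%Y-%m-%d %H:%M:%S.%f").timestamp()
        return [tokens[0], str(seconds * 1000), tokens[1]]
    return tokens


# hypothetical kinect marker lines (no timestamp column)
first = add_missing_timestamp(["2021-03-06 10:15:30.000", "start\n"])
second = add_missing_timestamp(["2021-03-06 10:15:30.500", "stop\n"])
print(float(second[1]) - float(first[1]))  # elapsed time in milliseconds
```

Because the conversion depends on the local time zone, the rebuilt timestamps are only comparable to the mouse ones if both were recorded on the same machine settings.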


In [None]:
import numpy as np
from datetime import datetime
import tempfile


def readMarkerCsv(fnameMarkerCsv):
    """Read a marker csv file and return a list of [timestamp, marker] pairs

    Parameters
    ----------
    fnameMarkerCsv : str
        Full filename of the marker csv file

    Returns
    -------

    out : list
        List of [timestamp, marker] pairs
            timestamp : float (in seconds)
            marker : str
    """

    if not fnameMarkerCsv.endswith(".csv"):
        raise ValueError("The file must be a csv file")

    with open(fnameMarkerCsv, "r") as fname:
        txt = fname.readlines()

    lines = [line.split(",") for line in txt]

    # PROBLEM: kinect markers lack the timestamp column
    # a marker is a line with 3 tokens : datetime,timestamp,marker \n
    # ACTION : add the timestamp column if we have only 2 columns
    for i in range(len(lines)):
        # [:1] avoids an IndexError when the first token is an empty string
        if len(lines[i]) == 2 and lines[i][0][:1].isdigit():
            # create a timestamp that is in milliseconds (as in mouse markers)
            timestamp = (
                datetime.strptime(lines[i][0], "%Y-%m-%d %H:%M:%S.%f").timestamp()
                * 1000
            )
            lines[i].insert(1, str(timestamp))
            txt[i] = ",".join(lines[i])

    # PROBLEM: mouse markers lack the quotes around the multiline markers
    # find the lines where token 3 is "\n" = start of a multiline marker
    for i in range(len(lines)):
        line = lines[i]
        # find the start of a multiline marker
        if len(line) == 3 and line[2] == "\n":
            # add a " before the end of the line
            txt[i] = txt[i][:-1] + '"\n'
            # find the end of the multiline marker
            for j in range(i + 1, len(lines)):
                # if we have a normal one-line-marker
                if len(lines[j]) == 3:
                    # add a " before the end of the line (of the previous line)
                    txt[j - 1] = txt[j - 1][:-1] + '"\n'
                    break
                # if are at the end of the file
                if j == len(lines) - 1 and len(lines[j]) != 3:
                    # add a " before the end of the line
                    txt[j] = txt[j][:-1] + '"\n'
                    break

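    # write the modified file to a temporary file and load it with np.loadtxt
    # NOTE: the quotechar argument of np.loadtxt requires NumPy >= 1.23
    with tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False) as ftmp:
        ftmp.writelines(txt)
        tempFileName = ftmp.name

    try:
        # ndmin=2 keeps a 2D array even when the file contains a single marker
        lines = np.loadtxt(
            fname=tempFileName,
            skiprows=3,
            delimiter=",",
            quotechar='"',
            dtype=str,
            ndmin=2,
        )
    finally:
        # delete=False was used above, so remove the temporary file ourselves
        os.remove(tempFileName)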

    # remove the first column (we shall need only the timestamp and the marker to compare with the xdf file)
    lines = np.delete(lines, 0, 1)

    # return a list of [timestamp, marker] pairs
    out = []
    for line in lines:
        # CAUTION : the timestamp must be in seconds (as in xdf files)
        timestamp = float(line[0]) / 1000
        marker = line[1]
        out.append([timestamp, [marker]])

    return out


if doRunTests:

    xdf_and_csv_files_in_visit = get_xdf_and_csv_files_in_visit(visit_path)
    xdf_csv_files = xdf_and_csv_files_in_visit["xdf_csv_files"]
    mandatory_files_error = xdf_and_csv_files_in_visit["error"]
    expected_files_warning = xdf_and_csv_files_in_visit["warning"]

    if not mandatory_files_error:
        for xdf_csv in xdf_csv_files:
            for csv in xdf_csv["csv"]:
                if "_k_m" in csv or "_l_m" in csv:
                    print(f"\n----->>>>>>  Reading {csv}")
                    fnameMarkerCSV = os.path.join(visit_path, csv)
                    lines = readMarkerCsv(fnameMarkerCSV)
                    # print the lines
                    for line in lines:
                        print(line)
    else:
        print("At least one mandatory file is missing: \n" + mandatory_files_error)

## Create a list of [timestamp, marker] pairs from all mouse csv files corresponding to one xdf file

A single list of [timestamp, marker] pairs from all mouse csv files corresponding to one xdf file should correspond to the mouse marker stream in the xdf file.

In [None]:
def get_mouse_marker_list(xdf_csv):
    """Get the mouse markers from the csv files list and return them as a list of [timestamp, marker] pairs"""

    mouse_marker_list = []

    for csv in xdf_csv:
        if "_l_m" in csv:
            fnameMarkerCSV = os.path.join(visit_path, csv)
            lines = readMarkerCsv(fnameMarkerCSV)
            mouse_marker_list.extend(lines)

    # sort the markers by timestamp
    mouse_marker_list.sort(key=lambda x: x[0])

    return mouse_marker_list


def get_mouse_marker_list_for_reach_and_circle(xdf_csv_files, mandatory_files_error):
    """Get the mouse marker list for reaching and circle tasks"""

    mouse_markers_r = []
    mouse_markers_c = []

    if not mandatory_files_error:
        for xdf_csv in xdf_csv_files:
            if xdf_csv["xdf"].endswith("_r.xdf"):
                mouse_markers_r = get_mouse_marker_list(xdf_csv["csv"])
            if xdf_csv["xdf"].endswith("_c.xdf"):
                mouse_markers_c = get_mouse_marker_list(xdf_csv["csv"])

    return mouse_markers_r, mouse_markers_c


if doRunTests:

    xdf_and_csv_files_in_visit = get_xdf_and_csv_files_in_visit(visit_path)
    xdf_csv_files = xdf_and_csv_files_in_visit["xdf_csv_files"]
    mandatory_files_error = xdf_and_csv_files_in_visit["error"]
    expected_files_warning = xdf_and_csv_files_in_visit["warning"]

    mouse_markers_r, mouse_markers_c = get_mouse_marker_list_for_reach_and_circle(
        xdf_csv_files, mandatory_files_error
    )

    if mandatory_files_error:
        print("At least one mandatory file is missing: \n" + mandatory_files_error)

    print(f"\nMouse markers reaching ({len(mouse_markers_r)} markers):")
    for marker in mouse_markers_r:
        print(marker)

    print(f"\nMouse markers circle ({len(mouse_markers_c)} markers):")
    for marker in mouse_markers_c:
        print(marker)

## Read the xdf file and get the kinect markers and mouse markers streams



In [None]:
import os
import pyxdf


def load_xdf_kinect_mouse_streams(xdf_fullFname):
    """Load the xdf file and return the kinect and mouse streams"""

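    extension = os.path.splitext(xdf_fullFname)[1]
    if extension != ".xdf":
        # callers unpack three streams, so returning an empty list would only
        # fail later with a less explicit unpacking error
        raise ValueError(f"Not an xdf file: {xdf_fullFname}")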

    # NOTE: There is a bug in the kinect markers stream: the csv is not exactly the same.
    #       So we shall use the MoCap stream to get the delay
    #       (hopefully it is the same, and there is a kinect MoCap csv file for each xdf file).

    # NOTE: do not forget to synchronize the clocks for all streams
    data, header = pyxdf.load_xdf(
        filename=xdf_fullFname,
        select_streams=[
            {"type": "MoCap", "name": "EuroMov-Mocap-Kinect"},
            {"type": "Markers", "name": "EuroMov-Markers-Kinect"},
            {"type": "Markers", "name": "Mouse"},
        ],
        synchronize_clocks=True,
        dejitter_timestamps=False,  # to get the raw timestamps to compare with the CSV
    )
    kinect_mocap = [
        stream for stream in data if stream["info"]["name"][0] == "EuroMov-Mocap-Kinect"
    ][0]
    kinect_markers = [
        stream
        for stream in data
        if stream["info"]["name"][0] == "EuroMov-Markers-Kinect"
    ][0]
    mouse_markers = [stream for stream in data if stream["info"]["name"][0] == "Mouse"][
        0
    ]

    return kinect_mocap, kinect_markers, mouse_markers


def print_markers_stream(stream):
    """Print a markers stream"""

    n_lines = len(stream["time_series"])
    name = stream["info"]["name"][0]
    print(f"\n{name} ({n_lines} markers):")
    for i in range(n_lines):
        print(f"{stream['time_stamps'][i]:.3f}, {stream['time_series'][i]}")

    return


def marker_list(stream):
    """Return a list of markers from a markers stream"""

    markers_list = []
    n_lines = len(stream["time_series"])
    for i in range(n_lines):
        markers_list.append([stream["time_stamps"][i], stream["time_series"][i]])

    return markers_list


def get_all_csv_marker_of_type_list(csv_rel_fnames, marker_name_pattern):
    """Get a list of all markers from the csv files of marker_name_pattern type sorted by timestamp"""

    markers_csv_fname_list = [
        csv for csv in csv_rel_fnames if marker_name_pattern in csv
    ]

    markers_csv_list = []
    for markers_csv_fname in markers_csv_fname_list:
        markers_csv = readMarkerCsv(os.path.join(visit_path, markers_csv_fname))
        for i in range(len(markers_csv)):
            markers_csv_list.append(markers_csv[i])

    # sort markers_csv_list by timestamp (first column)
    markers_csv_list.sort(key=lambda x: x[0])

    return markers_csv_list


def get_marker_csv_all_list(csv_rel_fnames):
    """Get a list of all markers from the csv files sorted by timestamp"""

    markers_csv_fname_list = [
        csv for csv in csv_rel_fnames if "_k_m" in csv or "_l_m" in csv
    ]

    markers_csv_list = []
    for markers_csv_fname in markers_csv_fname_list:
        markers_csv = readMarkerCsv(os.path.join(visit_path, markers_csv_fname))
        for i in range(len(markers_csv)):
            markers_csv_list.append(markers_csv[i])

    # sort markers_csv_list by timestamp (first column)
    markers_csv_list.sort(key=lambda x: x[0])

    return markers_csv_list


def print_marker_csv_files_by_file(csv_rel_fnames, marker_name_pattern):
    """Print all marker csv files, file by file"""

    # print the corresponding markers csv files
    markers_csv_fname_list = [
        csv for csv in csv_rel_fnames if marker_name_pattern in csv
    ]

    for markers_csv_fname in markers_csv_fname_list:
        markers_csv = readMarkerCsv(os.path.join(visit_path, markers_csv_fname))

        n_lines = len(markers_csv)
        print(f"\n{markers_csv_fname} ({n_lines}):")
        for i in range(n_lines):
            print(f"{markers_csv[i][0]:.3f}, {markers_csv[i][1]}")

    return


def print_marker_list(markers_list, title=""):
    """Print a list of markers"""

    n_lines = len(markers_list)
    print(f"\n{title} ({len(markers_list)}):")
    for i in range(n_lines):
        print(f"{markers_list[i][0]:.3f}, {markers_list[i][1]}")

    return


if doRunTests:
    ##########################################################################################
    xdf_csv_file = xdf_csv_files[0]
    # xdf_csv_file = xdf_csv_files[1]
    ##########################################################################################

    xdf_rel_fname = xdf_csv_file["xdf"]  # only one xdf file
    csv_rel_fnames = xdf_csv_file["csv"]  # list of csv files
    kinect_mocap, kinect_markers, mouse_marker_list = load_xdf_kinect_mouse_streams(
        os.path.join(visit_path, xdf_rel_fname)
    )

    # print the xdf file name
    print(f"\nXDF file: {xdf_rel_fname}")

    print_markers_stream(kinect_markers)
    print_markers_stream(mouse_marker_list)
    print_marker_csv_files_by_file(csv_rel_fnames, "_l_m")  # mouse markers
    print_marker_csv_files_by_file(csv_rel_fnames, "_k_m")  # kinect markers

    # mouse_markers_csv_all_list = get_all_csv_marker_of_type_list(csv_rel_fnames, "_l_m")
    # kinect_markers_csv_all_list = get_all_csv_marker_of_type_list(csv_rel_fnames, "_k_m")
    k_m_markers_csv_all_list = get_marker_csv_all_list(csv_rel_fnames)

    print_marker_list(
        k_m_markers_csv_all_list, "Sorted markers from csv files (kinect and mouse) "
    )

## Compute the kinect-to-csv delay

We want to compute the delay between the kinect stream and the kinect csv file. 
This case is the simplest, as we have only one `*_k_m*.csv` file corresponding to only one kinect stream.  
Both contain the same marker sequences, but with different timestamps, and possibly with different markers at the beginning and the end of the sequences (because the xdf and csv recordings may start and stop at different times).

The steps are:
- Define the shortest and the longest marker list from the two marker lists.
- Find the (first) occurrence of the shortest marker list in the longest marker list. 
- Compute the delay as the difference between the timestamps of the two occurrences.
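
The steps above can be sketched on toy data as follows. This is a strict variant that requires the whole shortest list to appear contiguously in the longest one, which may be stricter than what the recordings allow; `find_sublist` and the marker values are hypothetical.

```python
def find_sublist(short, long):
    """Return the index in long where the markers of short first occur, or -1.

    Both arguments are lists of [timestamp, marker] pairs; only markers are compared.
    """
    short_markers = [marker for _, marker in short]
    long_markers = [marker for _, marker in long]
    n = len(short_markers)
    for i in range(len(long_markers) - n + 1):
        if long_markers[i : i + n] == short_markers:
            return i
    return -1


# toy data: the csv record started earlier than the xdf record
xdf_markers = [[10.0, ["start"]], [12.0, ["stop"]]]
csv_markers = [[1000.0, ["init"]], [1005.0, ["start"]], [1007.0, ["stop"]]]

i_beg = find_sublist(xdf_markers, csv_markers)
delays = []
if i_beg != -1:
    # one delay per matched marker: csv timestamp minus xdf timestamp
    delays = [
        csv_markers[i_beg + k][0] - xdf_markers[k][0] for k in range(len(xdf_markers))
    ]
print(i_beg, delays)
```

With a constant delay, all values in `delays` should be (nearly) identical.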


In [None]:

def define_shortest_longest_lists(list1, list2):
    """Define the shortest and the longest list"""

    shortest_list = list1
    longest_list = list2
    if len(list2) < len(list1):
        shortest_list = list2
        longest_list = list1

    return shortest_list, longest_list


def find_first_occurrence_of_short_list_in_long_list(shortest_list, longest_list):
    """Find the first occurrence of the shortest list in the longest list
    the lists are lists of markers, hence comparison is on list[i][1] (the marker)"""

    if do_debug:
        # save the shortest list to a file for debugging
        with open("shortest_list.txt", "w") as f:
            for item in shortest_list:
                marker = item[1]
                f.write(f"{marker}\n")

        # save the longest list to a file for debugging
        with open("longest_list.txt", "w") as f:
            for item in longest_list:
                marker = item[1]
                f.write(f"{marker}\n")

    i_beg = -1
    i_end = -1
    error_msg = ""
    if len(shortest_list) == 0:
        return i_beg, i_end, "The shortest list is empty\n"

    shortest_list_first_marker = shortest_list[0][1]
    for i in range(len(longest_list)):
        if longest_list[i][1] == shortest_list_first_marker:
            # first occurrence of the start marker: compare from here, then stop searching
            i_beg = i
            for j in range(len(shortest_list)):
                # do not read past the end of the longest list
                if i + j >= len(longest_list):
                    break
                if longest_list[i + j][1] == shortest_list[j][1]:
                    i_end = i + j
            break

    if i_beg == -1:
        error_msg += "The shortest list is not found in the longest list\n"
    if i_end == -1:
        error_msg += "The end of the shortest list is not found in the longest list\n"

    return i_beg, i_end, error_msg


def get_timestamps_differences(longest_list, shortest_list, i_beg, i_end):
    """Get the differences between the timestamps of the common part and the shortest list"""

    timestamps_common_part = [x[0] for x in longest_list[i_beg : i_end + 1]]
    timestamps_shortest_list = [x[0] for x in shortest_list]

    # zip stops at the shorter sequence, so a partial match
    # (common part shorter than the shortest list) does not raise an IndexError
    timestamps_differences = [
        t_long - t_short
        for t_long, t_short in zip(timestamps_common_part, timestamps_shortest_list)
    ]

    # NOTE: timestamps_differences MUST be positive values only.
    # This is because csv time is UNIX time (seconds since 1970) and
    # xdf time is LSL time (a local clock whose origin is not the UNIX epoch).
    # abs() is needed because either list (csv or xdf) can be the longest one.

    timestamps_differences = [abs(x) for x in timestamps_differences]

    return timestamps_differences


def get_xdf_to_csv_kinect_marker_delay_dict(xdf_rel_fname, csv_rel_fnames):
    """Get the dictionary of delays between the kinect markers from the xdf file and the csv files
    the dictionary is {error_msg, timestamps_differences}"""

    kinect_mocap, kinect_markers, mouse_markers = load_xdf_kinect_mouse_streams(
        os.path.join(visit_path, xdf_rel_fname)
    )

    kinect_markers_csv_all_list = get_all_csv_marker_of_type_list(
        csv_rel_fnames, "_k_m"
    )

    kinect_markers_xdf_list = marker_list(kinect_markers)

    shortest_list, longest_list = define_shortest_longest_lists(
        kinect_markers_xdf_list, kinect_markers_csv_all_list
    )

    i_beg, i_end, error = find_first_occurrence_of_short_list_in_long_list(
        shortest_list, longest_list
    )

    if error:
        return {
            "error_msg": error,
            "timestamps_differences": [],
        }

    timestamps_differences = get_timestamps_differences(
        longest_list, shortest_list, i_beg, i_end
    )

    return {
        "error_msg": error,
        "timestamps_differences": timestamps_differences,
    }


if doRunTests:
    # ##########################################################################################
    # kinect_mocap, kinect_markers, mouse_markers = load_xdf_kinect_mouse_streams(
    #     os.path.join(visit_path, xdf_rel_fname)
    # )

    # kinect_markers_csv_list = get_all_csv_marker_of_type_list(csv_rel_fnames, "_k_m")
    # print_marker_list(kinect_markers_csv_list, "Sorted kinect markers from csv files")

    # kinect_markers_xdf_list = marker_list(kinect_markers)
    # print_marker_list(kinect_markers_xdf_list, "Kinect markers from " + xdf_rel_fname)

    # shortest_list, longest_list = define_shortest_longest_lists(
    #     kinect_markers_csv_list, kinect_markers_xdf_list
    # )

    # i_beg, i_end, error = find_first_occurrence_of_short_list_in_long_list(
    #     shortest_list, longest_list
    # )

    # if error:
    #     print("An error occurred: process stopped")
    #     print(error)
    # else:
    #     timestamps_difference = get_timestamps_differences(
    #         longest_list, shortest_list, i_beg, i_end
    #     )

    #     print(f"\nFirst occurrence of the shortest list in the longest list: {i_beg}")
    #     print(f"Last occurrence of the shortest list in the longest list: {i_end}")

    #     # print the common part of the longest list
    #     common_part = longest_list[i_beg : i_end + 1]
    #     print_marker_list(common_part, "Common part of the longest list")

    #     # print the timestamps of the common part
    #     timestamps_common_part = [x[0] for x in common_part]
    #     print(f"\nTimestamps of the common part: {timestamps_common_part}")

    #     # print the timestamps of the shortest list
    #     timestamps_shortest_list = [x[0] for x in shortest_list]
    #     print(f"\nTimestamps of the shortest list: {timestamps_shortest_list}")

    #     # get the difference between the timestamps of the common part and the shortest list
    #     timestamps_diff = [
    #         timestamps_common_part[i] - timestamps_shortest_list[i]
    #         for i in range(len(shortest_list))
    #     ]

    #     if longest_list == kinect_markers_csv_list:
    #         for i in range(len(timestamps_diff)):
    #             timestamps_diff[i] = -timestamps_diff[i]
    #         print(
    #             "\nShortest list: kinect markers from xdf: reversed sign of the timestamps difference"
    #         )

    #     print(
    #         f"\nDifference between the timestamps of the common part and the shortest list: {timestamps_diff}"
    #     )
    #     # print the mean and the standard deviation of the timestamps difference
    #     timestamps_diff_mean = np.mean(timestamps_diff)
    #     timestamps_diff_std = np.std(timestamps_diff)
    #     print(f"\nMean of the timestamps difference: {timestamps_diff_mean:.3f}")
    #     print(
    #         f"Standard deviation of the timestamps difference: {timestamps_diff_std:.3f}"
    #     )

    ktc_delays = get_xdf_to_csv_kinect_marker_delay_dict(xdf_rel_fname, csv_rel_fnames)
    print(f"\nKinect to csv delays: {ktc_delays}")

## Compute the mouse-to-csv delay

We want to compute the delay between the mouse stream and the mouse csv files.  
This case is more complex: several `*_l_m*.csv` files correspond to a single mouse stream, and the marker sequences may differ at the beginning and at the end (because the xdf and csv recordings are not started and stopped at the same time).

The steps are:
- Define the shortest and the longest marker list from the two marker lists.
- Find the (first) occurrence of the shortest marker list in the longest marker list. 
- Compute the delay as the difference between the timestamps of the two occurrences.
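
The steps above can be sketched on toy data as follows. This is only an illustration: `find_sublist_start` is a hypothetical helper (not the notebook's `find_first_occurrence_of_short_list_in_long_list`), the marker values are made up, and we assume that each marker is a `[timestamp, label]` pair.

```python
import numpy as np


def find_sublist_start(short_labels, long_labels):
    """Return the first index where short_labels occurs contiguously in long_labels, or -1."""
    n = len(short_labels)
    for start in range(len(long_labels) - n + 1):
        if long_labels[start : start + n] == short_labels:
            return start
    return -1


# toy marker lists of [timestamp, label] pairs (made-up values)
xdf_markers = [[100.0, "start"], [101.5, "target"], [103.0, "stop"]]
csv_markers = [
    [9.0, "calibration"],
    [10.0, "start"],
    [11.5, "target"],
    [13.0, "stop"],
    [14.0, "start"],
]

# step 1: define the shortest and the longest list
shortest, longest = sorted([xdf_markers, csv_markers], key=len)

# step 2: find the first occurrence of the shortest label sequence in the longest one
i_beg = find_sublist_start([m[1] for m in shortest], [m[1] for m in longest])
if i_beg < 0:
    raise ValueError("shortest marker sequence not found in the longest one")
common_part = longest[i_beg : i_beg + len(shortest)]

# step 3: timestamp differences (longest minus shortest) over the matched window
diffs = np.array([c[0] - s[0] for c, s in zip(common_part, shortest)])
print(i_beg, diffs, diffs.mean(), diffs.std())
```

The sign of the differences depends on which list is the longest one, so it has to be normalized before delays from different files are compared.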

In [None]:
def get_xdf_to_csv_mouse_marker_delay_dict(xdf_rel_fname, csv_rel_fnames):
    """Get the dictionary of the delays between the xdf and the mouse csv markers.
    The dictionary is {"error_msg": error_msg, "timestamps_differences": timestamps_differences}
    """

    kinect_mocap, kinect_markers, mouse_markers = load_xdf_kinect_mouse_streams(
        os.path.join(visit_path, xdf_rel_fname)
    )

    mouse_markers_csv_list = get_all_csv_marker_of_type_list(csv_rel_fnames, "_l_m")
    mouse_markers_xdf_list = marker_list(mouse_markers)

    shortest_list, longest_list = define_shortest_longest_lists(
        mouse_markers_xdf_list, mouse_markers_csv_list
    )

    i_beg, i_end, error_msg = find_first_occurrence_of_short_list_in_long_list(
        shortest_list, longest_list
    )

    if error_msg:
        return {
            "error_msg": error_msg,
            "timestamps_differences": [],
        }

    timestamps_differences = get_timestamps_differences(
        longest_list, shortest_list, i_beg, i_end
    )

    return {
        "error_msg": error_msg,
        "timestamps_differences": timestamps_differences,
    }


if doRunTests:
    kinect_mocap, kinect_markers, mouse_markers = load_xdf_kinect_mouse_streams(
        os.path.join(visit_path, xdf_rel_fname)
    )

    mouse_markers_csv_list = get_all_csv_marker_of_type_list(csv_rel_fnames, "_l_m")
    print_marker_list(mouse_markers_csv_list, "Sorted mouse markers from csv files")

    mouse_markers_xdf_list = marker_list(mouse_markers)
    print_marker_list(mouse_markers_xdf_list, "mouse markers from xdf")

    shortest_list, longest_list = define_shortest_longest_lists(
        mouse_markers_csv_list, mouse_markers_xdf_list
    )

    i_beg, i_end, error_msg_find = find_first_occurrence_of_short_list_in_long_list(
        shortest_list, longest_list
    )

    if error_msg_find:
        print("An error occurred when finding list in list: process stopped")
        print(error_msg_find)
    else:

        # same argument order as in get_xdf_to_csv_mouse_marker_delay_dict
        timestamps_differences = get_timestamps_differences(
            longest_list, shortest_list, i_beg, i_end
        )

        print("")
        print(f"First occurrence of the shortest list in the longest list: {i_beg}")
        print(f"Last occurrence of the shortest list in the longest list: {i_end}")

        # print the common part of the longest list
        common_part = longest_list[i_beg : i_end + 1]
        print_marker_list(common_part, "Common part of the longest list")

        # print the timestamps of the common part
        timestamps_common_part = [x[0] for x in common_part]
        print(f"\nTimestamps of the common part: {timestamps_common_part}")

        # print the timestamps of the shortest list
        timestamps_shortest_list = [x[0] for x in shortest_list]
        print(f"\nTimestamps of the shortest list: {timestamps_shortest_list}")

        # get the difference between the timestamps of the common part and the shortest list
        timestamps_differences = [
            timestamps_common_part[i] - timestamps_shortest_list[i]
            for i in range(len(shortest_list))
        ]
        print(
            f"\nDifference between the timestamps of the common part and the shortest list: {timestamps_differences}"
        )
        # print the mean and the standard deviation of the timestamps difference
        timestamps_diff_mean = np.mean(timestamps_differences)
        timestamps_diff_std = np.std(timestamps_differences)
        print(f"\nMean of the timestamps difference: {timestamps_diff_mean:.3f}")
        print(
            f"Standard deviation of the timestamps difference: {timestamps_diff_std:.3f}"
        )

    mtc_delays = get_xdf_to_csv_mouse_marker_delay_dict(xdf_rel_fname, csv_rel_fnames)
    print(f"Mouse to csv delays: {mtc_delays}")

## Compute the kinect-to-mouse delay
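Both delays are measured against csv timestamps, so, assuming the kinect and mouse csv files share the same clock, the csv reference cancels out when the two mean delays are subtracted. A minimal numeric sketch with made-up delay values (the variable names below are illustrative, not the notebook's):

```python
import numpy as np

# made-up per-marker delays in seconds (stream timestamp minus csv timestamp)
kinect_to_csv = np.array([-21500.002, -21500.000, -21499.998])
mouse_to_csv = np.array([-4.901, -4.900, -4.899])

# the csv reference cancels out when the two mean delays are subtracted
kinect_to_mouse_mean = kinect_to_csv.mean() - mouse_to_csv.mean()

# spread of the difference, assuming the two delay series are uncorrelated
kinect_to_mouse_std = np.sqrt(kinect_to_csv.std() ** 2 + mouse_to_csv.std() ** 2)

print(f"{kinect_to_mouse_mean:.3f} s, std {kinect_to_mouse_std:.6f} s")
```

If the two delay series are correlated, the combined standard deviation is only an approximation.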

In [None]:
def get_kinect_to_mouse_delay_dict(xdf_rel_fname, csv_rel_fnames):
    """Get the delay between the kinect and the mouse markers from the xdf and csv files"""

    kinect_to_cvs_delay_dict = get_xdf_to_csv_kinect_marker_delay_dict(
        xdf_rel_fname, csv_rel_fnames
    )
    mouse_to_csv_delay_dict = get_xdf_to_csv_mouse_marker_delay_dict(
        xdf_rel_fname, csv_rel_fnames
    )

    if kinect_to_cvs_delay_dict["error_msg"] or mouse_to_csv_delay_dict["error_msg"]:
        return {
            "kinect_to_cvs_delay_list": [],
            "mouse_to_csv_delay_list": [],
            "kinect_to_mouse_delay_mean": np.nan,
            "kinect_to_mouse_delay_std": np.nan,
            "error_msg": "An error occurred when computing the delays."
            + " "
            + kinect_to_cvs_delay_dict["error_msg"]
            + " "
            + mouse_to_csv_delay_dict["error_msg"],
        }

    kinect_to_cvs_delay_mean = np.mean(
        kinect_to_cvs_delay_dict["timestamps_differences"]
    )
    kinect_to_cvs_delay_std = np.std(kinect_to_cvs_delay_dict["timestamps_differences"])

    mouse_to_csv_delay_mean = np.mean(mouse_to_csv_delay_dict["timestamps_differences"])
    mouse_to_csv_delay_std = np.std(mouse_to_csv_delay_dict["timestamps_differences"])

    # NOTE: the kinect to mouse delay is the difference of the means
    kinect_to_mouse_delay_mean = kinect_to_cvs_delay_mean - mouse_to_csv_delay_mean

    # NOTE: the variance of a difference is the sum of the variances **minus twice the covariance**
    # (e.g. https://en.wikipedia.org/wiki/Propagation_of_uncertainty)
    # If we assume that the two distributions are independent, the covariance is zero,
    # hence computing the variance of the difference as the sum of the variances is correct.
    # If the two distributions are not independent, we should subtract twice the covariance,
    # but we do not have it. We can still use the sum of the variances:
    # it is an **upper bound of the variance of the difference** as long as the covariance is not negative.
    kinect_to_mouse_delay_std = np.sqrt(
        kinect_to_cvs_delay_std**2 + mouse_to_csv_delay_std**2
    )
    return {
        "kinect_to_cvs_delay_list": kinect_to_cvs_delay_dict["timestamps_differences"],
        "mouse_to_csv_delay_list": mouse_to_csv_delay_dict["timestamps_differences"],
        "kinect_to_mouse_delay_mean": kinect_to_mouse_delay_mean,
        "kinect_to_mouse_delay_std": kinect_to_mouse_delay_std,
        "error_msg": "",
    }


def print_kinect_to_mouse_delay_dict(ktm_delays):
    """Print the kinect to mouse delay dictionary"""

    # we need this local import to avoid conflict with global import:
    # from datetime import datetime (for the function check_csv_date)

    from datetime import timedelta

    kinect_to_cvs_delay_list = ktm_delays["kinect_to_cvs_delay_list"]
    mouse_to_csv_delay_list = ktm_delays["mouse_to_csv_delay_list"]
    kinect_to_mouse_delay_mean = ktm_delays["kinect_to_mouse_delay_mean"]
    kinect_to_mouse_delay_std = ktm_delays["kinect_to_mouse_delay_std"]

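    # timedelta cannot represent NaN (the mean is NaN when the delay computation failed)
    if np.isnan(kinect_to_mouse_delay_mean):
        kinect_to_mouse_delay_mean_HMS = "n/a"
    else:
        kinect_to_mouse_delay_mean_HMS = timedelta(
            seconds=abs(kinect_to_mouse_delay_mean)
        )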

    print(f"\nKinect to mouse delays: ")
    print(f"  - kinect to csv: {kinect_to_cvs_delay_list}")
    print(f"  - mouse to csv: {mouse_to_csv_delay_list}")
    print(
        f"  - kinect to mouse delay mean = {kinect_to_mouse_delay_mean:.3f} s ({kinect_to_mouse_delay_mean_HMS})"
    )
    print(f"  - kinect to mouse delay std < {kinect_to_mouse_delay_std:.6f} s")

    return


if doRunTests:

    k_delays = get_xdf_to_csv_kinect_marker_delay_dict(xdf_rel_fname, csv_rel_fnames)

    k_delays_diff = k_delays["timestamps_differences"]
    print(f"\nKinect to csv delays: {np.mean(k_delays_diff)}")

    m_delays = get_xdf_to_csv_mouse_marker_delay_dict(xdf_rel_fname, csv_rel_fnames)
    m_delays_diff = m_delays["timestamps_differences"]
    print(f"\nMouse to csv delays: {np.mean(m_delays_diff)}")

    ktm_delay = np.mean(k_delays_diff) - np.mean(m_delays_diff)
    print(f"\nKinect to mouse delay: {ktm_delay}")

    np.set_printoptions(precision=3)

    kinect_markers_timestamps = kinect_markers["time_stamps"]
    print(f"\nK ts     : {kinect_markers_timestamps}")

    new_kinect_markers_timestamps = kinect_markers["time_stamps"] - ktm_delay
    print(f"\nK ts new : {new_kinect_markers_timestamps}")

    mouse_markers_timestamps = mouse_markers["time_stamps"]
    print(f"\nM ts     : {mouse_markers_timestamps[:5]}...")

    # analysis of the delay distribution
    ################################################################################################
    ktm_delays = get_kinect_to_mouse_delay_dict(xdf_rel_fname, csv_rel_fnames)
    ################################################################################################
    kinect_to_cvs_delay_list = ktm_delays["kinect_to_cvs_delay_list"]
    mouse_to_csv_delay_list = ktm_delays["mouse_to_csv_delay_list"]
    # kinect_to_mouse_delay_mean = ktm_delays["kinect_to_mouse_delay_mean"]
    # kinect_to_mouse_delay_std = ktm_delays["kinect_to_mouse_delay_std"]

    print_kinect_to_mouse_delay_dict(ktm_delays)

    # make a single boxplot figure of the centered distributions of the delays
    # (kinect to csv and mouse to csv) and overlay each individual delay as a dot
    import matplotlib.pyplot as plt

    # center each distribution on its mean (np.asarray also accepts plain Python lists)
    kinect_to_cvs_delay_list_centered = np.asarray(kinect_to_cvs_delay_list) - np.mean(
        kinect_to_cvs_delay_list
    )
    mouse_to_csv_delay_list_centered = np.asarray(mouse_to_csv_delay_list) - np.mean(
        mouse_to_csv_delay_list
    )

    fig, axs = plt.subplots(1, 1, figsize=(5, 5))
    axs.boxplot([kinect_to_cvs_delay_list_centered, mouse_to_csv_delay_list_centered])
    # add a dot for each value
    axs.scatter(
        [1] * len(kinect_to_cvs_delay_list_centered),
        kinect_to_cvs_delay_list_centered,
        color="blue",
        alpha=0.6,
    )
    axs.scatter(
        [2] * len(mouse_to_csv_delay_list_centered),
        mouse_to_csv_delay_list_centered,
        color="blue",
        alpha=0.6,
    )
    # add a horizontal line for the mean and label it with the mean value
    axs.axhline(
        np.mean(kinect_to_cvs_delay_list_centered), color="grey", linestyle="--"
    )
    axs.text(
        1.5,
        np.mean(kinect_to_cvs_delay_list_centered),
        f"mean",
        fontsize=12,
        color="grey",
        ha="center",
        va="center",
        backgroundcolor="w",
    )

    axs.set_title("Kinect to csv and mouse to csv delays")
    axs.set_xticklabels(["Kinect to csv", "Mouse to csv"])
    axs.set_ylabel("Delay - mean (s)")
    plt.show()

In [None]:
if doRunTests:

    ## Check how it goes for the kinect mocap stream

    kinect_mocap_timestamps = kinect_mocap["time_stamps"]
    print(f"\nKMocap ts    : {kinect_mocap_timestamps[:5]}...")
    new_kinect_mocap_timestamps = kinect_mocap["time_stamps"] - ktm_delay
    print(f"\nKMocap ts new: {new_kinect_mocap_timestamps[:5]}...")

## Compare the timestamps of the kinect mocap from the csv and from the xdf file

Sometimes, the kinect markers are not correctly recorded in the xdf file (again due to a bug in LSL_Kinect).  
Consequently, we cannot rely on the kinect markers alone.  

Solution: 
- We can use the timestamps of the kinect mocap from the csv file to get the kinect-to-csv delay.   
- We can also verify that the kinect-to-csv delay from the markers is consistent with the kinect-to-csv delay from the mocap.
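
As a rough sketch of the first point, the delay can be estimated sample by sample from the mocap timestamps. The values below are made up; we assume (as in the csv reading code of this notebook) that the csv timestamps are in milliseconds and the xdf timestamps in seconds, and that both arrays describe the same frames in the same order.

```python
import numpy as np

# made-up mocap timestamps: milliseconds in the csv file, seconds in the xdf stream
csv_timestamps_ms = np.array(
    [1_616_421_533_000.0, 1_616_421_533_033.0, 1_616_421_533_066.0]
)
xdf_timestamps_s = np.array([5000.000, 5000.033, 5000.066])

# convert the csv timestamps to seconds before comparing
csv_timestamps_s = csv_timestamps_ms / 1000.0

# sample-by-sample delay between the xdf stream and the csv file
mocap_delays = xdf_timestamps_s - csv_timestamps_s
print(f"mean = {mocap_delays.mean():.3f} s, std = {mocap_delays.std():.6f} s")
```

A standard deviation close to zero indicates that the delay is constant over the recording, which is what we expect here.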



In [None]:
def create_kinect_mocap_xdf_list(kinect_mocap):
    """Create a list of [timestamp, markers] from the kinect mocap stream,
    where the marker is the first 5 columns values joined as a string
    """

    kinect_mocap_timeseries = kinect_mocap["time_series"]
    kinect_mocap_timestamps = kinect_mocap["time_stamps"]

    kinect_mocap_xdf_list = []
    for i in range(len(kinect_mocap_timestamps)):
        markers = [kinect_mocap_timeseries[i][j] for j in range(5)]
        markers = [str(marker) for marker in markers]
        markers = ",".join(markers)
        timestamp = kinect_mocap_timestamps[i]
        kinect_mocap_xdf_list.append([timestamp, markers])

    return kinect_mocap_xdf_list


if doRunTests:

    kinect_mocap_csv_list_xxx = get_all_csv_marker_of_type_list(
        csv_rel_fnames, "_k.csv"
    )
    kinect_mocap_xdf_list_xxx = marker_list(kinect_mocap)

    # verify that the lists are the same length
    if len(kinect_mocap_csv_list_xxx) != len(kinect_mocap_xdf_list_xxx):
        print("The lists are not of the same length")

    # decode a kinect mocap stream from the xdf file
    kinect_mocap_timeseries = kinect_mocap["time_series"]
    kinect_mocap_timestamps = kinect_mocap["time_stamps"]
    # get the names of the first 5 columns from the kinect_mocap['info']['desc'][0]['channels']
    kinect_mocap_channel_names_list = kinect_mocap["info"]["desc"][0]["channels"][0][
        "channel"
    ]

    print(f"\nKinect mocap stream labels: ")
    for i in range(5):
        label = kinect_mocap_channel_names_list[i]["label"][0]
        print(f"{i}: {label}")

    ################################################################################################
    kinect_mocap_xdf_list = create_kinect_mocap_xdf_list(kinect_mocap)

    # print the first 5 markers
    print(f"\nKinect mocap markers: ")
    for i in range(5):
        print(f"{i}: {kinect_mocap_xdf_list[i]}")

In [None]:
def create_kinect_mocap_csv_list(csv_rel_fnames):
    """Create a list of [timestamp, markers] from the kinect mocap csv file,
    where the marker is the first 5 columns values joined as a string
    """

    kinect_mocap_csv_fname = [csv for csv in csv_rel_fnames if "_k.csv" in csv]
    kinect_mocap_csv_fname = kinect_mocap_csv_fname[0]
    kinect_mocap_csv_fname = os.path.join(visit_path, kinect_mocap_csv_fname)

    with open(kinect_mocap_csv_fname, "r") as fname:
        txt = fname.readlines()

    lines = [line.split(",") for line in txt]

    # remove the first 3 lines (header)
    lines = lines[3:]

    # keep only the first 5 columns
    kinect_mocap_csv_list = []
    for line in lines:
        markers = [line[j] for j in range(5)]
        markers = [float(marker) for marker in markers]  # float (as in xdf)
        markers = [str(marker) for marker in markers]
        markers = ",".join(markers)
        timestamp = float(line[0]) / 1000
        kinect_mocap_csv_list.append([timestamp, markers])

    return kinect_mocap_csv_list


if doRunTests:
    ###########################################################################################
    kinect_mocap_csv_list = create_kinect_mocap_csv_list(csv_rel_fnames)

    # print the first 5 markers
    print(f"\nKinect mocap markers from csv: ")
    for i in range(5):
        print(f"{i}: {kinect_mocap_csv_list[i]}")

### Verify that the kinect-to-csv delay from the markers is consistent with the kinect-to-csv delay from the mocap
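The check boils down to comparing the two mean delays and classifying the absolute gap. The sketch below uses the 1 ms and 100 ms thresholds of this notebook; `classify_delay_agreement` is a hypothetical helper name and the delay values are made up.

```python
def classify_delay_agreement(marker_delay_mean, mocap_delay_mean):
    """Label how well two kinect-to-csv delay estimates (in seconds) agree."""
    # use the absolute gap so that the sign of the difference does not matter
    gap = abs(marker_delay_mean - mocap_delay_mean)
    if gap < 0.001:
        return "consistent (< 1 ms)"
    if gap <= 0.1:
        return "warning (between 1 ms and 100 ms)"
    return "inconsistent (> 100 ms)"


print(classify_delay_agreement(-21495.1230, -21495.1234))
print(classify_delay_agreement(-21495.123, -21495.150))
print(classify_delay_agreement(-21495.123, -21490.000))
```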



In [None]:
def get_kinect_mocap_csv_fullfname(xdf_csv_files):
    """Get the full filename of the kinect mocap csv file"""

    kinect_mocap_csv_fname = [csv for csv in xdf_csv_files if "_k.csv" in csv]
    if len(kinect_mocap_csv_fname) != 1:
        raise ValueError(
            "There should be only one kinect mocap csv file (with '_k.csv' in the name)"
        )

    kinect_mocap_csv_fname = kinect_mocap_csv_fname[0]
    kinect_mocap_csv_fname = os.path.join(visit_path, kinect_mocap_csv_fname)

    return kinect_mocap_csv_fname


def verify_kinect_to_csv_mocap_delay(xdf_rel_fname, csv_rel_fnames):
    """Verify that the kinect-to-csv delay from the markers is consistent with the kinect-to-csv delay from the mocap"""

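    kinect_mocap_csv_full_fname = get_kinect_mocap_csv_fullfname(csv_rel_fnames)

    # load the streams of this xdf file (do not rely on globals set by the test cells)
    kinect_mocap, kinect_markers, mouse_markers = load_xdf_kinect_mouse_streams(
        os.path.join(visit_path, xdf_rel_fname)
    )
    # marker-based kinect-to-csv delays, used below for the consistency check
    k_delays = get_xdf_to_csv_kinect_marker_delay_dict(xdf_rel_fname, csv_rel_fnames)

    kinect_mocap_xdf_list = create_kinect_mocap_xdf_list(kinect_mocap)
    kinect_mocap_csv_list = create_kinect_mocap_csv_list(csv_rel_fnames)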

    # verify that the lists are the same length
    # NOTE: this should not happen, but it is better to check...

    kinect_mocap_timestamps_diff_mean = np.nan
    kinect_mocap_timestamps_diff_std = np.nan
    warning_msg = ""
    error_msg = ""

    kinect_mocap_xdf_list_ok = kinect_mocap_xdf_list
    kinect_mocap_csv_list_ok = kinect_mocap_csv_list
    if len(kinect_mocap_csv_list) != len(kinect_mocap_xdf_list):
        warning_msg += "\n" + (
            f"Different length: mocap-csv={len(kinect_mocap_csv_list)} vs mocap-xdf={len(kinect_mocap_xdf_list)}"
        )
        # print the filename of the csv file and the xdf file
        warning_msg += "\n" + (f"csv: {kinect_mocap_csv_full_fname}")
        warning_msg += "\n" + (f"xdf: {xdf_rel_fname}")

        shortest_list, longest_list = define_shortest_longest_lists(
            kinect_mocap_xdf_list, kinect_mocap_csv_list
        )
        i_beg, i_end, error_in_find = find_first_occurrence_of_short_list_in_long_list(
            shortest_list, longest_list
        )

        warning_msg += "\n" + (
            f"First occurrence of the shortest list in the longest list: {i_beg}"
        )
        warning_msg += "\n" + (
            f"Last occurrence of the shortest list in the longest list: {i_end}"
        )

        if shortest_list == kinect_mocap_xdf_list:
            kinect_mocap_csv_list_ok = kinect_mocap_csv_list[i_beg : i_end + 1]
            kinect_mocap_xdf_list_ok = kinect_mocap_xdf_list
        else:
            kinect_mocap_xdf_list_ok = kinect_mocap_xdf_list[i_beg : i_end + 1]
            kinect_mocap_csv_list_ok = kinect_mocap_csv_list

        if len(kinect_mocap_xdf_list_ok) != len(kinect_mocap_csv_list_ok):
            # this should not happen... but... just in case...
            error_msg += "\n" + (
                f"Could not find a partial match between mocap-csv and mocap-xdf lists"
            )
            error_msg += "\n" + (
                f"mocap-csv={len(kinect_mocap_csv_list_ok)} vs mocap-xdf={len(kinect_mocap_xdf_list_ok)}"
            )

    if not error_msg:

        kinect_mocap_xdf_timestamps = [x[0] for x in kinect_mocap_xdf_list_ok]
        kinect_mocap_csv_timestamps = [x[0] for x in kinect_mocap_csv_list_ok]
        # get the differences between the timestamps
        kinect_mocap_timestamps_diff = [
            kinect_mocap_xdf_timestamps[i] - kinect_mocap_csv_timestamps[i]
            for i in range(len(kinect_mocap_csv_timestamps))
        ]

        # get the mean and the standard deviation of the differences
        kinect_mocap_timestamps_diff_mean = np.mean(kinect_mocap_timestamps_diff)
        kinect_mocap_timestamps_diff_std = np.std(kinect_mocap_timestamps_diff)

        warning_msg += "\n" + (
            f"\nkinect_mocap_timestamps_diff_mean: {kinect_mocap_timestamps_diff_mean:.3f} s"
        )
        warning_msg += "\n" + (
            f"    Standard deviation: {kinect_mocap_timestamps_diff_std:.6f} s"
        )

        warning_msg += "\n" + (
            f"\nkinect_markers_timestamps_diff_mean: {np.mean(k_delays['timestamps_differences']):.3f} s"
        )
        warning_msg += "\n" + (
            f"    Standard deviation: {np.std(k_delays['timestamps_differences']):.6f} s"
        )

        marker_mocap_timestamps_diff_mean = (
            np.mean(k_delays["timestamps_differences"])
            - kinect_mocap_timestamps_diff_mean
        )
        warning_msg += "\n" + (
            f"\nDifference between marker and mocap delays: {marker_mocap_timestamps_diff_mean:.6f} s"
        )
        # compare the absolute difference: a negative difference can be large too
        marker_mocap_abs_diff = abs(marker_mocap_timestamps_diff_mean)
        if marker_mocap_abs_diff < 0.001:
            warning_msg += "\n" + ("The difference is less than 1 ms :-)")
        elif marker_mocap_abs_diff <= 0.1:
            warning_msg += "\n" + (
                "WARNING: The difference is greater than 1 ms, although smaller than 100 ms :-( "
            )
        else:
            warning_msg += "\n" + "WARNING: The difference is greater than 100 ms\n"

    return {
        "kinect_mocap_timestamps_diff_mean": kinect_mocap_timestamps_diff_mean,
        "kinect_mocap_timestamps_diff_std": kinect_mocap_timestamps_diff_std,
        "error_msg": error_msg,
        "warning_msg": warning_msg,
    }


def print_verify_kinect_to_csv_mocap_delay(verify_result):
    """Print the verification result of the kinect to csv delay by comparing the kinect mocap in xdf and csv files"""

    kinect_mocap_timestamps_diff_mean = verify_result[
        "kinect_mocap_timestamps_diff_mean"
    ]
    kinect_mocap_timestamps_diff_std = verify_result["kinect_mocap_timestamps_diff_std"]
    error_msg = verify_result["error_msg"]
    warning_msg = verify_result["warning_msg"]

    print("")
    print(
        "******BEGIN: Verification of kinect to csv delay comparing the kinect mocap in xdf and csv files:"
    )
    print(f"Kinect mocap timestamps diff mean: {kinect_mocap_timestamps_diff_mean:.3f}")
    print(f"Kinect mocap timestamps diff std : {kinect_mocap_timestamps_diff_std:.6f}")
    print("--> Warnings: " + warning_msg)
    print("--> Error: " + error_msg)
    print(
        f"********END: Verification of kinect to csv delay comparing the kinect mocap in xdf and csv files"
    )
    return


if doRunTests:
    ###########################################################################################
    verify_result = verify_kinect_to_csv_mocap_delay(xdf_rel_fname, csv_rel_fnames)
    print_verify_kinect_to_csv_mocap_delay(verify_result)

In [None]:
def check_csv_date(fullFname, msg=""):
    """
    Check the date of a csv file
    The date is extracted from the first timestamp in the file
    """
    # date is set to 1970-01-01 00:00:00 if not found
    # date = datetime.fromtimestamp(0, tz=None)
    # date is set to None if not found
    date = None

    if not fullFname.endswith(".csv"):
        msg += f"{fullFname} is not a csv file."
        return date, msg

    # The timestamps are always in column 1, after 3-4 lines of header
    # but it can be a string or a float (in milliseconds)
    data = np.loadtxt(fullFname, skiprows=4, delimiter=",", max_rows=5, dtype=str)

    # check if the file is empty
    if data.size == 0:
        basename = os.path.basename(fullFname)
        msg += f"{basename} is an empty file."
        return date, msg
    # check if we get the expected number of columns
    if data.shape[1] < 2:
        basename = os.path.basename(fullFname)
        msg += f"{basename} has less than 2 columns."
        return date, msg

    # Sounds good, we have some data to check
    # we only need the first timestamp
    timestamp = data[0, 0]
    try:
        timestamp = float(timestamp) / 1000  # in seconds
    except ValueError:
        # not a number: keep the string and parse it as a date below
        pass
    if isinstance(timestamp, float):
        date = datetime.fromtimestamp(timestamp, tz=None)
    if isinstance(timestamp, str):
        date = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")
    return date, msg


def sort_csv_files_by_date(csv_files):
    """Sort the csv files by date inside the file"""

    csv_files_sorted = []
    for csv_file in csv_files:
        csv_full_fname = os.path.join(visit_path, csv_file)
        date, msg = check_csv_date(csv_full_fname)
        if date is not None:
            csv_files_sorted.append([date, csv_file, msg])

    csv_files_sorted.sort(key=lambda x: x[0])

    # nothing to compare if no file has a readable date
    if not csv_files_sorted:
        return csv_files_sorted

    # append the time difference (in seconds) between each file and the first file
    t_zero_sec = csv_files_sorted[0][0].timestamp()
    for csv_file in csv_files_sorted:
        csv_file.append(csv_file[0].timestamp() - t_zero_sec)

    return csv_files_sorted


def print_csv_files_by_date(csv_files_sorted):
    """Print the csv files sorted by date"""

    # we need this local import to avoid conflict with global import:
    # from datetime import datetime (for the function check_csv_date)

    from datetime import timedelta

    print("\nCSV files sorted by date (date inside the file):")
    for csv_file in csv_files_sorted:
        date = csv_file[0]
        csv_fname = csv_file[1]
        msg = csv_file[2]
        diff_time = csv_file[3]
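        diff_time_HMS = timedelta(seconds=round(diff_time, 3))
        diff_time_HMS_str = str(diff_time_HMS)
        # str(timedelta) prints the microseconds only when they are non-zero:
        # in that case, drop the last 3 digits to keep millisecond precision
        if diff_time_HMS.microseconds:
            diff_time_HMS_str = diff_time_HMS_str[:-3]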
        # print(f"{date}: ({diff_time:11.3f}) {csv_fname} {msg}")
        print(f"{date}: (+{diff_time_HMS_str}) {csv_fname} {msg}")

    return

## Compute the kinect-to-mouse delay for one xdf file
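Once the delay of a file is known, it is used to shift the kinect timestamps onto the time base of the mouse stream by subtracting it, as done in the test cells of this notebook. A minimal sketch with made-up values (the sign convention assumed here is delay = kinect time minus mouse time):

```python
import numpy as np

# made-up timestamps in seconds
kinect_timestamps = np.array([26495.2, 26495.7, 26496.2])  # faulty kinect time base
mouse_timestamps = np.array([5000.1, 5000.6, 5001.1])  # LSL time base
kinect_to_mouse_delay = 21495.1  # made-up mean delay

# shift the kinect timestamps by the kinect-to-mouse delay
corrected_kinect_timestamps = kinect_timestamps - kinect_to_mouse_delay
print(corrected_kinect_timestamps)
print(mouse_timestamps)
```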

In [None]:
def get_kinect_to_mouse_delay_for_xdf_file(xdf_rel_fname, csv_rel_fnames):
    """Get the kinect to mouse delay for an xdf file and the corresponding csv files"""

    ktm_delays = get_kinect_to_mouse_delay_dict(xdf_rel_fname, csv_rel_fnames)

    if ktm_delays["error_msg"]:
        return {
            "error_msg": ktm_delays["error_msg"],
            "warning_msg": "",
            "ktm_delays": [],
        }

    # verify the kinect to csv delay by comparing the kinect mocap in xdf and csv files
    verify_result = verify_kinect_to_csv_mocap_delay(xdf_rel_fname, csv_rel_fnames)

    return {
        "error_msg": verify_result["error_msg"],
        "warning_msg": verify_result["warning_msg"],
        "ktm_delays": ktm_delays,
    }


xdf_and_csv_files_in_visit = get_xdf_and_csv_files_in_visit(visit_path)
xdf_csv_files = xdf_and_csv_files_in_visit["xdf_csv_files"]
mandatory_files_error = xdf_and_csv_files_in_visit["error"]
expected_files_warning = xdf_and_csv_files_in_visit["warning"]

# get one xdf file and the corresponding csv files
xdf_csv_file = xdf_csv_files[0]

print(f"\nVisit path: {visit_path}")
print(
    f"xdf_csv_file: \nxdf: {xdf_csv_file['xdf']} and {len(xdf_csv_file['csv'])} csv files"
)
for csv in xdf_csv_file["csv"]:
    print(f"csv: {csv}")
print(f"\nMandatory files error: {mandatory_files_error}")
print(f"\nExpected files warning: {expected_files_warning}")


xdf_rel_fname = xdf_csv_file["xdf"]  # only one xdf file
csv_rel_fnames = xdf_csv_file["csv"]  # list of csv files

if not mandatory_files_error:  # if there is no error
    # get the kinect to mouse delay
    ktm_delays = get_kinect_to_mouse_delay_dict(xdf_rel_fname, csv_rel_fnames)
    print_kinect_to_mouse_delay_dict(ktm_delays)
    # verify_kinect_to_csv_delay(xdf_rel_fname, csv_rel_fnames)


# get the whole list of csv files
csv_rel_fnames_all = []
for xdf_csv_file in xdf_csv_files:
    csv_rel_fnames_all.extend(xdf_csv_file["csv"])

# sort the csv files by date
csv_files_sorted = sort_csv_files_by_date(csv_rel_fnames_all)
print_csv_files_by_date(csv_files_sorted)


# ReArm_C1P02_20210306_V1_Reaching/ReArm_C1P02_20210322_V1_r.xdf and 6 csv files -21495.123 s --- rkm 2021-03-22 14:58:53
# ReArm_C1P02_20210306_V1_Circle/ReArm_C1P02_20210322_V1_c.xdf and 6 csv files : -22851.839 s --- ckm 2021-03-22 15:18:03

In [None]:
Expected_order = [
    "_r_k_m.csv",
    "_r_k.csv",
    "_r_l_m_sau_p.csv",
    "_r_l_m_sau_np.csv",
    "_r_l_m_mau_p.csv",
    "_r_l_m_mau_np.csv",
    "_c_k_m.csv",
    "_c_k.csv",
    "_c_l_m_p.csv",
    "_c_l_p.csv",
    "_c_l_m_np.csv",
    "_c_l_np.csv",
]


def is_in_expected_order(csv_files_sorted, expected_order):
    """Check if the csv files are in the expected order"""

    csv_files_sorted_names = [csv_file[1] for csv_file in csv_files_sorted]
    # more files than expected suffixes cannot match the expected order
    if len(csv_files_sorted_names) > len(expected_order):
        return False
    for i, csv_file in enumerate(csv_files_sorted_names):
        # each file name must end with the suffix expected at its position
        if not csv_file.endswith(expected_order[i]):
            return False
    return True


if doRunTests:
    if is_in_expected_order(csv_files_sorted, Expected_order):
        print("The csv files are in the expected order")
    else:
        print("The csv files are NOT in the expected order: ")

        csv_files_sorted_names = [csv_file[1] for csv_file in csv_files_sorted]
        for i, csv_file in enumerate(csv_files_sorted_names):
            print(
                f"{i}: {csv_file} -- {csv_file.endswith(Expected_order[i])} -- {Expected_order[i]}"
            )

## Compute the kinect-to-mouse delay for all xdf files in the visit directory



In [None]:
def get_kinect_to_mouse_delay_for_all_xdf_in_visit(visit_path):
    """Compute the kinect-to-mouse delay for all xdf files in the visit directory"""

    xdf_and_csv_files_in_visit = get_xdf_and_csv_files_in_visit(visit_path)
    xdf_csv_files = xdf_and_csv_files_in_visit["xdf_csv_files"]

    # initialize the return dictionary
    mandatory_files_error = xdf_and_csv_files_in_visit["error"]
    expected_files_warning = xdf_and_csv_files_in_visit["warning"]
    xdf_delays = []
    msg = ""

    for xdf_csv_file in xdf_csv_files:
        xdf_rel_fname = xdf_csv_file["xdf"]
        csv_rel_fnames = xdf_csv_file["csv"]
        msg += "\n" + (f"Visit path: {visit_path}")
        msg += "\n" + (
            f"xdf_csv_file: \nxdf: {xdf_rel_fname} and {len(csv_rel_fnames)} csv files"
        )
        for csv in csv_rel_fnames:
            msg += "\n" + (f"csv: {csv}")

        msg += "\n" + (f"\nMandatory files error: {mandatory_files_error}")
        msg += "\n" + (f"\nExpected files warning: {expected_files_warning}")

        if not mandatory_files_error:
            ktm_delays = get_kinect_to_mouse_delay_dict(xdf_rel_fname, csv_rel_fnames)
            # print_kinect_to_mouse_delay_dict(ktm_delays)
            verify = verify_kinect_to_csv_mocap_delay(xdf_rel_fname, csv_rel_fnames)
            # print_verify_kinect_to_csv_mocap_delay(verify)

            if not ktm_delays["error_msg"]:
                xdf_delays.append(
                    {
                        "xdf": xdf_rel_fname,
                        "kinect_to_mouse_delay": ktm_delays[
                            "kinect_to_mouse_delay_mean"
                        ],
                        "kinect_to_mouse_delay_std": ktm_delays[
                            "kinect_to_mouse_delay_std"
                        ],
                        "kinect_to_mouse_delay_list": ktm_delays[
                            "kinect_to_cvs_delay_list"
                        ],
                        "mouse_to_csv_delay_list": ktm_delays[
                            "mouse_to_csv_delay_list"
                        ],
                    }
                )

    return {
        "error_msg": mandatory_files_error,
        "warning_msg": expected_files_warning,
        "xdf_delays": xdf_delays,
        "msg": msg,  # log of the processed files (was built but never returned)
    }


if doRunTests:

    kinect_to_mouse_delay_for_all_xdf_in_visit = (
        get_kinect_to_mouse_delay_for_all_xdf_in_visit(visit_path)
    )
    for xdf_delay in kinect_to_mouse_delay_for_all_xdf_in_visit["xdf_delays"]:
        print(
            "******************************************************************************************************"
        )
        print(f"{xdf_delay['xdf']}:")
        print(
            f"  kinect_to_mouse_delay_mean: {xdf_delay['kinect_to_mouse_delay']:.3f} s"
        )
        print(
            f"  kinect_to_mouse_delay_std : {xdf_delay['kinect_to_mouse_delay_std']:.6f} s"
        )
        print(
            "******************************************************************************************************"
        )


## Compute the kinect-to-mouse delay for all xdf files in the visit directory


# xdf_and_csv_files_in_visit = get_xdf_and_csv_files_in_visit(visit_path)
# xdf_csv_files = xdf_and_csv_files_in_visit["xdf_csv_files"]
# mandatory_files_error = xdf_and_csv_files_in_visit["error"]
# expected_files_warning = xdf_and_csv_files_in_visit["warning"]

# for xdf_csv_file in xdf_csv_files:
#     xdf_rel_fname = xdf_csv_file["xdf"]
#     csv_rel_fnames = xdf_csv_file["csv"]
#     print(f"\nVisit path: {visit_path}")
#     print(
#         f"xdf_csv_file: \nxdf: {xdf_rel_fname} and {len(csv_rel_fnames)} csv files"
#     )
#     for csv in csv_rel_fnames:
#         print(f"csv: {csv}")
#     print(f"\nMandatory files error: {mandatory_files_error}")
#     print(f"\nExpected files warning: {expected_files_warning}")

#     if not mandatory_files_error:
#         ktm_delays = get_kinect_to_mouse_delay_dict(xdf_rel_fname, csv_rel_fnames)
#         print_kinect_to_mouse_delay_dict(ktm_delays)
#         verify = verify_kinect_to_csv_mocap_delay(xdf_rel_fname, csv_rel_fnames)
#         print_verify_kinect_to_csv_mocap_delay(verify)