<a href="https://colab.research.google.com/github/Toshea111/sleap/blob/main/docs/notebooks/Converting_SLEAP_Analysis_HDF5_to_CSV_Updated.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

First, we'll upload an HDF5 file that was generated from within the SLEAP GUI. You can create one by opening a tracked project file (`.slp`) and going to **File** -> **Export Analysis HDF5...**

You can also upload the file through the file browser in the sidebar on the left side of the Colab page.

In [None]:
from google.colab import files

uploaded = files.upload()
h5_filepath = list(uploaded.keys())[0]

print(f"h5_filepath = {h5_filepath}")

Saving Termite Test.h5 to Termite Test (1).h5
h5_filepath = Termite Test.h5


Once the file is uploaded, let's open it, load its contents, and inspect the data.

In [None]:
import numpy as np
import pandas as pd
import h5py

# Open the HDF5 file using h5py.
with h5py.File(h5_filepath, "r") as f:

  # Print a list of the keys available.
  print("Keys in the HDF5 file:", list(f.keys()))

  # Load all the datasets into a dictionary.
  data = {k: v[()] for k, v in f.items()}

  # Here we're just converting string arrays into regular Python strings.
  data["node_names"] = [s.decode() for s in data["node_names"].tolist()]
  data["track_names"] = [s.decode() for s in data["track_names"].tolist()]

  # Reverse the axis order of the tracks array for convenience, so that the
  # frame axis comes first: (frames, nodes, xy, tracks).
  data["tracks"] = np.transpose(data["tracks"])

  # And finally convert the data type of the track occupancy array to boolean.
  # We'll see what this array is used for further down.
  data["track_occupancy"] = data["track_occupancy"].astype(bool)


# Describe the values in the data dictionary we just created.
for key, value in data.items():
  if isinstance(value, np.ndarray):
    print(f"{key}: {value.dtype} array of shape {value.shape}")
  else:
    print(f"{key}: {value}")

Keys in the HDF5 file: ['edge_inds', 'edge_names', 'instance_scores', 'labels_path', 'node_names', 'point_scores', 'provenance', 'track_names', 'track_occupancy', 'tracking_scores', 'tracks', 'video_ind', 'video_path']
edge_inds: int32 array of shape (3, 2)
edge_names: |S10 array of shape (3, 2)
instance_scores: float64 array of shape (5, 979)
labels_path: b'C:/Users/tao213/Videos/Exeter Termite Videos/SLEAP/labels.v001.slp'
node_names: ['Head', 'Thorax', 'Abdomen', 'Tail']
point_scores: float64 array of shape (5, 4, 979)
provenance: b'{}'
track_names: ['track_0', 'track_1', 'track_2', 'track_3', 'track_4']
track_occupancy: bool array of shape (979, 5)
tracking_scores: float64 array of shape (5, 979)
tracks: float64 array of shape (979, 4, 2, 5)
video_ind: 1
video_path: b'C:/Users/tao213/Videos/Termite Test Shortened.mp4'
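
Note that a few entries in the output above (for example `video_path` and `edge_names`) are still byte strings. If you need them as regular Python strings, they can be decoded the same way as `node_names`. The snippet below is a minimal sketch on stand-in values rather than the real file contents; the variable names are illustrative only.

```python
import numpy as np

# Stand-in for a fixed-width byte-string array such as data["node_names"].
node_names_raw = np.array([b"Head", b"Thorax"], dtype="S10")
node_names = [s.decode() for s in node_names_raw.tolist()]

# Stand-in for a scalar byte string such as data["video_path"].
video_path_raw = b"C:/Users/tao213/Videos/Termite Test Shortened.mp4"
video_path = video_path_raw.decode()

# Stand-in for a 2-D byte-string array such as data["edge_names"].
edge_names_raw = np.array([[b"Head", b"Thorax"]], dtype="S10")
edge_names = np.char.decode(edge_names_raw, "utf-8")

print(node_names, video_path, edge_names.dtype)
```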


The `data["tracks"]` array has the raw tracking coordinates, with axes corresponding to `(frames, nodes, xy, tracks)`.
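
To make the axis order concrete, here is a small sketch on a synthetic array (not the real data). It assumes the file stores the array in the reverse order, `(tracks, xy, nodes, frames)`, which is what `np.transpose` with no extra arguments turns into `(frames, nodes, xy, tracks)`.

```python
import numpy as np

# Synthetic array in the assumed on-disk order: (tracks, xy, nodes, frames).
stored = np.zeros((5, 2, 4, 979))

# np.transpose without an axes argument reverses the axis order.
toy_tracks = np.transpose(stored)
print(toy_tracks.shape)  # (979, 4, 2, 5)

# All body part coordinates for track 1 in frame 0: shape (nodes, xy).
pts = toy_tracks[0, :, :, 1]

# x-coordinate of node 0 for track 0 across all frames: shape (frames,).
node0_x = toy_tracks[:, 0, 0, 0]

print(pts.shape, node0_x.shape)
```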

In this case we only tracked a short clip, so we don't have data for every frame. The `data["track_occupancy"]` array, with axes `(frames, tracks)`, records which tracks are present in each frame.
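
As a quick illustration of how an occupancy array can be read, here is a sketch on a hypothetical 5-frame, 2-track array. The values are made up for the example.

```python
import numpy as np

# Hypothetical occupancy: rows are frames, columns are tracks.
toy_occupancy = np.array([
    [True, False],
    [False, False],
    [True, True],
    [False, False],
    [False, True],
])

# A frame is valid if at least one track is present in it.
frame_has_animal = toy_occupancy.any(axis=1)
toy_valid_idxs = np.flatnonzero(frame_has_animal)

# Number of frames in which each track appears.
frames_per_track = toy_occupancy.sum(axis=0)

print(toy_valid_idxs, frames_per_track)
```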

First, let's find all the frames that have at least one animal tracked.

In [None]:
valid_frame_idxs = np.argwhere(data["track_occupancy"].any(axis=1)).flatten()
valid_frame_idxs

array([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,
        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23,  24,  25,
        26,  27,  28,  29,  30,  31,  32,  33,  34,  35,  36,  37,  38,
        39,  40,  41,  42,  43,  44,  45,  46,  47,  48,  49,  50,  51,
        52,  53,  54,  55,  56,  57,  58,  59,  60,  61,  62,  63,  64,
        65,  66,  67,  68,  69,  70,  71,  72,  73,  74,  75,  76,  77,
        78,  79,  80,  81,  82,  83,  84,  85,  86,  87,  88,  89,  90,
        91,  92,  93,  94,  95,  96,  97,  98,  99, 100, 101, 102, 103,
       104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116,
       117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129,
       130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142,
       143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 154, 155,
       156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168,
       169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 18

Great, so now let's build up a `tracks` table where each row contains the detected body part coordinates for a single animal in a single frame.

In [None]:
tracks = []
for frame_idx in valid_frame_idxs:
  # Get the tracking data for the current frame.
  frame_tracks = data["tracks"][frame_idx]

  # Loop over the animals in the current frame.
  for i in range(frame_tracks.shape[-1]):
    pts = frame_tracks[..., i]
    
    if np.isnan(pts).all():
      # Skip this animal if all of its points are missing (i.e., it wasn't
      # detected in the current frame).
      continue
    
    # Let's initialize our row with some metadata.
    detection = {"track": data["track_names"][i], "frame_idx": frame_idx}

    # Now let's fill in the coordinates for each body part.
    for node_name, (x, y) in zip(data["node_names"], pts):
      detection[f"{node_name}.x"] = x
      detection[f"{node_name}.y"] = y

    # Add the row to the list and move on to the next detection.
    tracks.append(detection)

# Once we're done, we can convert this list of rows into a table using Pandas.
tracks = pd.DataFrame(tracks)

tracks.head()

| | track | frame_idx | Head.x | Head.y | Thorax.x | Thorax.y | Abdomen.x | Abdomen.y | Tail.x | Tail.y |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | track_0 | 0 | 592.777161 | 97.110588 | 576.715149 | 100.803192 | 553.217651 | 113.282112 | 532.867493 | 133.286713 |
| 1 | track_1 | 0 | 368.668304 | 308.037445 | 388.087433 | 323.859894 | 404.666016 | 336.483734 | 424.34671 | 355.790924 |
| 2 | track_2 | 0 | 620.325562 | 356.645355 | 612.294312 | 340.940796 | 596.647156 | 317.063782 | 584.482727 | 293.440491 |
| 3 | track_3 | 0 | 659.702698 | 380.244568 | 672.066101 | 368.412384 | 683.910889 | 348.252716 | 692.080139 | 332.240387 |
| 4 | track_4 | 0 | 444.168823 | 436.377838 | 468.142273 | 440.575226 | 492.255096 | 440.866028 | 512.486084 | 444.683167 |
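
If you want to reuse this step, the loop above can be wrapped in a function. The sketch below uses a hypothetical helper name, `tracks_to_dataframe`, and runs it on a tiny synthetic array. Unlike the cell above, it iterates over every frame and relies only on the all-NaN check, which assumes that unoccupied entries are filled with NaN.

```python
import numpy as np
import pandas as pd


def tracks_to_dataframe(tracks_arr, track_names, node_names):
    """Hypothetical helper: one row per (frame, track) with any detected point.

    tracks_arr is expected to have axes (frames, nodes, xy, tracks).
    """
    rows = []
    for frame_idx, frame_tracks in enumerate(tracks_arr):
        for i, track_name in enumerate(track_names):
            pts = frame_tracks[..., i]  # shape (nodes, xy)
            if np.isnan(pts).all():
                continue  # animal not detected in this frame
            row = {"track": track_name, "frame_idx": frame_idx}
            for node_name, (x, y) in zip(node_names, pts):
                row[f"{node_name}.x"] = x
                row[f"{node_name}.y"] = y
            rows.append(row)
    return pd.DataFrame(rows)


# Toy data: 2 frames, 2 nodes, xy, 2 tracks; track_1 is missing in frame 1.
toy = np.ones((2, 2, 2, 2))
toy[1, :, :, 1] = np.nan
toy_df = tracks_to_dataframe(toy, ["track_0", "track_1"], ["Head", "Tail"])
print(toy_df)
```

With the real data, the call would look like `tracks_to_dataframe(data["tracks"], data["track_names"], data["node_names"])`.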


Finally, we can save the table we just generated into a CSV file and download it for further analysis.

In [None]:
tracks.to_csv("tracks.csv", index=False)
files.download("tracks.csv")

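
Note that `files.download` is only available in Colab. As a rough sketch of the same save step elsewhere, the example below writes a small made-up table to a temporary directory and reads it back with `pd.read_csv` to confirm that the columns and missing values survive the round trip. The table contents and the temporary path are illustrative assumptions.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Made-up table with the same column naming scheme as above.
toy_table = pd.DataFrame({
    "track": ["track_0", "track_1"],
    "frame_idx": [0, 0],
    "Head.x": [1.5, np.nan],
    "Head.y": [2.5, np.nan],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "tracks.csv")
    # index=False keeps the row index out of the file.
    toy_table.to_csv(csv_path, index=False)
    reloaded = pd.read_csv(csv_path)

print(reloaded)
```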