Context

@Di-Wang-AIND has suggested the following features (see the full message on Zulip):

- Overlay keypoints on the labeled frames: visualize the difference between the prediction and the ground truth.
- Plot the difference between prediction and ground truth across time (video frames): this would make it easy to locate the specific times in the video where the model makes large errors, helping scientists detect outliers or label more frames for training (see the sketch after this list).
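As a minimal sketch of the second feature, assuming both predicted and labeled positions are available as xarray DataArrays sharing `movement`-style `time`, `keypoints` and `space` dimensions (the dummy data below is purely illustrative; in practice labels would exist only for a subset of frames):

```python
import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

# Dummy stand-ins for predicted and ground-truth positions,
# shaped (time, keypoints, space) as in movement's pose datasets
dims = ("time", "keypoints", "space")
coords = {
    "time": np.arange(100),
    "keypoints": ["snout", "tail_base"],
    "space": ["x", "y"],
}
predicted = xr.DataArray(np.random.rand(100, 2, 2), dims=dims, coords=coords)
labels = xr.DataArray(np.random.rand(100, 2, 2), dims=dims, coords=coords)

# Per-frame Euclidean distance between prediction and ground truth
error = np.sqrt(((predicted - labels) ** 2).sum(dim="space"))

# One line per keypoint; spikes reveal frames with large errors
error.plot.line(x="time")
plt.show()
```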
The above features, as well as similar ones pertaining to model evaluation, would require us to have access to the ground truth used during training, i.e. the labeled (annotated) poses. Currently, `movement` only loads predicted poses, i.e. the positions output by the model during inference. That said, many pose estimation frameworks store labeled and predicted poses in similar (or identical) formats, so writing loader functions for the frameworks we already support should not be hard.
What is needed
- Sample data files containing labeled poses from each supported pose estimation framework (currently DeepLabCut, SLEAP, LightningPose). These would need to be added to our GIN data repository.
- Loading functions that can load these files. At minimum, we need the poses themselves (i.e. the positions of keypoints), as well as the index of each labeled frame within the video. Loading the frames (images) directly could also be done, but I don't think it's strictly necessary. (A rough sketch follows this list.)
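For example, a rough sketch of what a DeepLabCut labels loader could look like. `load_dlc_labels` is a hypothetical name, and the sketch assumes DLC's single-animal `CollectedData_<scorer>.csv` layout, with three header rows (scorer, bodyparts, coords) and image paths such as `labeled-data/<video>/img0042.png` as the index:

```python
import pandas as pd

def load_dlc_labels(file_path):
    """Hypothetical loader for a DeepLabCut CollectedData_<scorer>.csv file."""
    # Three header rows: scorer, bodyparts, coords; index = image paths
    df = pd.read_csv(file_path, header=[0, 1, 2], index_col=0)
    # Recover each labeled frame's index within the video from its filename,
    # e.g. "labeled-data/video1/img0042.png" -> 42
    frame_indices = (
        df.index.to_series().str.extract(r"img(\d+)", expand=False).astype(int)
    )
    return df, frame_indices
```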
Potential interface
I'm still unsure about the form these functions would take. Some ideas/alternatives below (in order of my current preference):
1. Use the existing `load_poses` module as is, but add an optional argument for also loading the labels:
```python
from movement.io import load_poses

ds = load_poses.from_dlc_file(
    poses_file_path, labels_file_path="/path/to/labels.csv", fps=30
)

# If labels_file_path is passed, it should create an extra "labels" data variable
predicted_poses = ds["pose_tracks"]  # to be renamed as "position"
confidence = ds["confidence"]
labels = ds["labels"]
```
Sometimes (e.g. in SLEAP's .slp files) both user-labeled and predicted poses are stored in the same file, so we would have to think a bit more about how to handle such cases.
2. Keep using the `load_poses` module, but add a boolean argument to indicate whether the data are predictions or labels. They would be loaded as separate `xarray.Dataset` objects.
3. Make completely separate functions for predicted vs labeled data, e.g. the sketch below.
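For option 3, the interface might look something like this. `load_labels` and its `from_dlc_file` counterpart are hypothetical names, shown for illustration only:

```python
from movement.io import load_poses
# Hypothetical sibling module for ground-truth data (does not exist yet)
from movement.io import load_labels

predictions = load_poses.from_dlc_file("/path/to/predictions.h5", fps=30)
labels = load_labels.from_dlc_file("/path/to/CollectedData_scorer.csv")
```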
I haven't fully thought through all the implications of each approach, and there may be other alternatives I haven't considered.

Now that I've thought about it a bit more, option 1 doesn't make much sense. When loading predicted poses, they almost always come from a single video (that's the case for DLC, LightningPose, and SLEAP analysis files), while labeled poses are usually defined over a set of frames drawn from multiple videos. We'd probably have to do something close to option 2 or 3.
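To illustrate that structural difference, labeled poses could carry the source video and frame index as coordinates along a generic dimension, instead of the continuous time axis used for predictions. This is purely illustrative, not an agreed design:

```python
import numpy as np
import xarray as xr

# Illustrative only: three labeled images drawn from two different videos
labels = xr.Dataset(
    {"position": (("image", "keypoints", "space"), np.random.rand(3, 2, 2))},
    coords={
        "video": ("image", ["vid_a.mp4", "vid_a.mp4", "vid_b.mp4"]),
        "frame": ("image", [10, 250, 42]),  # frame index within each video
        "keypoints": ["snout", "tail_base"],
        "space": ["x", "y"],
    },
)
```

Either option 2 or 3 could then return such a Dataset alongside the usual predictions one.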