Feature sub-sets in SimBA

INTRODUCTION

SimBA extracts features for building and running downstream machine learning models. At times, however, we may want to use some of the SimBA feature calculators to generate a subset of measurements, either for use in our own downstream applications or as additional, potentially informative features when building classifiers in SimBA. For some of the feature sub-sets available, see the image above.

INSTRUCTIONS

  1. To create feature sub-sets, first import your pose-estimation data into SimBA and follow the instructions up to and including outlier correction, as documented in the Scenario 1 tutorial. Before calculating feature sub-sets in SimBA, ensure that the project_folder/csv/outlier_corrected_movement_location directory of your SimBA project is populated with files.
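As a quick pre-flight check before opening the GUI, the directory can be inspected from Python. This is a minimal sketch and not part of SimBA: the helper name `list_pose_files` is hypothetical, and it assumes the outlier-corrected files are stored as CSV.

```python
from pathlib import Path


def list_pose_files(project_folder):
    """Return the sorted CSV files in the outlier-corrected directory.

    Hypothetical helper (not a SimBA function). Assumes the project
    stores its outlier-corrected pose data as .csv files.
    """
    data_dir = Path(project_folder) / "csv" / "outlier_corrected_movement_location"
    if not data_dir.is_dir():
        raise FileNotFoundError(f"Directory not found: {data_dir}")
    files = sorted(data_dir.glob("*.csv"))
    if not files:
        raise FileNotFoundError(f"No CSV files found in {data_dir}")
    return files
```

Calling `list_pose_files("path/to/project_folder")` either returns the files that the feature sub-set calculation would have to work with or raises an error naming the empty or missing directory.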

  2. Navigate to the [Extract features] tab and click the CALCULATE FEATURE SUBSETS button.

  3. In the SETTINGS frame, begin by selecting a directory where you want to store the output data. In the chosen directory, SimBA will store one CSV file per file in the project_folder/csv/outlier_corrected_movement_location directory. In these files, there will be one column per new feature and one row per video frame. It's a good idea to select an empty directory.

  4. We can choose to append these new feature subsets to our data in the project_folder/csv/features_extracted directory with the aim of creating better classifiers. SimBA will then use these features when generating machine learning classification predictions, and the features will be included when annotating the specific videos. To append the new features to the extracted features data inside the project_folder/csv/features_extracted directory, tick the APPEND RESULTS TO FEATURES EXTRACTED FILES checkbox.

  5. If we have already annotated videos, we may want to include these additional features without having to re-annotate the data. To append the new features to the annotated data inside the project_folder/csv/targets_inserted directory, tick the APPEND RESULTS TO TARGETS INSERTED FILES checkbox.

Note: If we decide to append the results to the features_extracted files, the targets_inserted files, or both, and do not want to store the new features in a separate directory, then tick the relevant checkboxes and leave the SAVE DIRECTORY as `No folder selected`.
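Conceptually, appending means joining the new feature columns onto an existing file frame by frame. The sketch below illustrates that idea with the standard library only. It is not SimBA's implementation: the function name `append_feature_columns` is hypothetical, and it assumes both files are CSVs with a single header row and one row per video frame.

```python
import csv


def append_feature_columns(existing_path, new_features_path, out_path):
    """Join two per-frame CSV files column-wise (row i with row i).

    Hypothetical illustration, not a SimBA function. Assumes a single
    header row and one data row per video frame in both files.
    """
    with open(existing_path, newline="") as f:
        existing = list(csv.reader(f))
    with open(new_features_path, newline="") as f:
        new = list(csv.reader(f))
    if not existing or not new:
        raise ValueError("Both files need at least a header row.")
    if len(existing) != len(new):
        raise ValueError("The files do not contain the same number of frames.")
    duplicated = set(existing[0]) & set(new[0])
    if duplicated:
        raise ValueError(f"Columns already present: {sorted(duplicated)}")
    with open(out_path, "w", newline="") as f:
        # Each output row is the existing row followed by the new feature values.
        csv.writer(f).writerows(a + b for a, b in zip(existing, new))
```

Refusing to write when frame counts differ or column names collide keeps a partially matching file from being silently produced.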

  6. When building classifiers, it's important that every file, each representing one video, contains the same number of features with the same names. Thus, before the new features are appended, we may want to perform some integrity checks to confirm that each file has the same number of features and that the features all have the same names. To do this, tick the INCLUDE INTEGRITY CHECKS BEFORE APPENDING NEW DATA checkbox.

Note: If there are integrity issues, SimBA will print an error message describing them and stop appending the new feature data. Thus, if errors are discovered, the files in project_folder/csv/features_extracted and/or project_folder/csv/targets_inserted are left in their original format, without appended data.
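The same kind of check can be approximated outside the GUI, for example to find out which file triggered an error. The following is a rough sketch under stated assumptions, not SimBA's own integrity check: the helper names are hypothetical, the files are assumed to be CSVs with a single header row, and the first file (in sorted order) is used as the reference.

```python
import csv
from pathlib import Path


def read_header(path):
    """Return the first row (column names) of a CSV file, or [] if it is empty."""
    with open(path, newline="") as f:
        return next(csv.reader(f), [])


def find_header_mismatches(directory):
    """Compare every CSV header in `directory` against the first file's header.

    Hypothetical helper, not a SimBA function. Returns a dict mapping each
    offending file name to (missing_columns, unexpected_columns). A file whose
    columns match but appear in a different order is also reported, with two
    empty lists.
    """
    files = sorted(Path(directory).glob("*.csv"))
    if not files:
        return {}
    reference = read_header(files[0])
    problems = {}
    for path in files[1:]:
        header = read_header(path)
        if header != reference:
            missing = [c for c in reference if c not in header]
            unexpected = [c for c in header if c not in reference]
            problems[path.name] = (missing, unexpected)
    return problems
```

An empty result means all files share identical column names in identical order, which is the condition described in the step above.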

  7. Next, we need to select the features that we want to compute:

Tick the checkboxes for the features that we want to compute. Once complete, click the RUN button.

  8. Once complete, the SAVE DIRECTORY will be filled with one file for every video file represented in your project_folder/csv/outlier_corrected_movement_location directory (if you selected a save directory). In these files, every row represents a frame, and every column represents a feature in one of the selected feature families. The number of columns (features) will depend on the number of body-parts and animals in your SimBA project. If you ticked the APPEND RESULTS TO FEATURES EXTRACTED FILES and/or APPEND RESULTS TO TARGETS INSERTED FILES checkboxes, the files in these directories should also be populated with the new features.
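To confirm that the run produced what this step describes, the save directory can be compared against the input directory. This is only a sketch with hypothetical helper names. It assumes that each output file carries the same file name as its input file and that all files are CSVs with a single header row; neither assumption is confirmed by this tutorial.

```python
import csv
from pathlib import Path


def count_frames(path):
    """Count data rows in a CSV file, excluding one header row."""
    with open(path, newline="") as f:
        return max(sum(1 for _ in csv.reader(f)) - 1, 0)


def compare_directories(input_dir, save_dir):
    """List input files with no matching output or a different frame count.

    Hypothetical helper, not a SimBA function. Assumes output files reuse
    the input file names.
    """
    problems = []
    for src in sorted(Path(input_dir).glob("*.csv")):
        out = Path(save_dir) / src.name
        if not out.is_file():
            problems.append(f"{src.name}: no output file")
        elif count_frames(src) != count_frames(out):
            problems.append(f"{src.name}: frame count differs")
    return problems
```

An empty list indicates one output file per input file, each with one row per video frame.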

For smaller examples of expected output features, see: