Technical Docs

Overview

This page provides detailed instructions on how to use the applications in the repository by explaining how they are currently being used in the context of the DevEx research project, specifically in relation to the first set of experiments. It also includes documentation for the modules and functions that make up libratools and DevExDashboard.

Setting up a Data Collection Pipeline

To start using the applications in this repository, it is assumed that a production-ready data collection pipeline has already been set up. As suggested in Getting Started, such a pipeline may involve generating MP4 recordings using Motif and tracking each video recording using BioTracker (as DevEx does), which yields individual trajectories. These may be stored locally, in the cloud, or on a network-attached storage (NAS) device. It is also assumed that this repository has been cloned and installed locally (see the separate repo-install-instructions).

For illustration, the output of your data collection pipeline may look as follows:

├──loopbio_data/    # <-- root folder on a NAS
  ├──camSN/    # <-- short for camera serial number
      ├──dateOfRecording_startTimeofRecording.camSN/     
          ├──000001.npz     # <-- created by Motif
          ├──000001.mp4     # <-- created by Motif
          ├──000002.npz 
          ├──000002.mp4
          ├──metadata.yaml      # <-- created by Motif
          ├──camSN_dateOfRecording_chunkNumber.csv    # <-- generated by BioTracker (e.g.: 23520258_20210408_02.csv)
          ├──camSN_dateOfRecording_processed.csv    # <-- generated using libratools (e.g.: 23520258_20210408_processed.csv)
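
Given a layout like the one above, the BioTracker chunk files for one camera and date can be found with a glob pattern. The following is a minimal sketch, not the code used by process.py: the function name find_chunk_csvs is hypothetical, and it assumes chunk numbers are zero-padded so that sorting by path gives chunk order.

```python
from pathlib import Path


def find_chunk_csvs(data_dir, cam_sn, date):
    """Return BioTracker chunk CSVs for one camera and date, sorted by path.

    Assumes the layout shown above:
    data_dir/camSN/<recording folder>/camSN_date_chunkNumber.csv
    """
    camera_dir = Path(data_dir) / cam_sn
    # One wildcard level stands in for the recording folder, whose exact
    # name (date, start time, serial number) is not needed here.
    matches = camera_dir.glob(f"*/{cam_sn}_{date}_*.csv")
    # Skip the merged output file, which shares the same prefix.
    return sorted(p for p in matches if not p.stem.endswith("_processed"))
```

Calling find_chunk_csvs("loopbio_data", "23520258", "20210408") would return the chunk files for that camera and date while leaving out 23520258_20210408_processed.csv.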

Creating a Data Processing Workflow

After data collection, a script can be created that relies on the functions provided by libratools to conduct pre-processing and some automated post-processing tasks. In the case of the first DevEx experiment, this script is process.py, which is stored in the Processing/ folder. This script can be configured using the config.ini file contained in the same folder which includes paths, default values for free parameters, and other global variables. The most important variable is DATA_DIR, which tells the script where the data is stored (discussed further in Choosing Default Parameter Settings).
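
To illustrate how a script might read DATA_DIR from an INI file, here is a minimal sketch using Python's standard configparser module. The section name paths and the example path are assumptions made for illustration; the actual layout of config.ini may differ.

```python
import configparser

# Hypothetical config.ini contents. Only the DATA_DIR variable is named
# in the text above; the section name and the path are assumptions.
EXAMPLE_CONFIG = """
[paths]
DATA_DIR = /mnt/nas/loopbio_data
"""


def read_data_dir(config_text):
    """Return the DATA_DIR value from INI-formatted text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser["paths"]["DATA_DIR"]


print(read_data_dir(EXAMPLE_CONFIG))
```

To read from a file on disk instead, parser.read("config.ini") can replace the read_string call.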

In the example folder structure above, the file 23520258_20210408_processed.csv is the final trajectory produced by process.py, which can in turn be visualized using DevExDashboard. Where there are multiple recordings, as in this case, the file is a merger of the individual CSV files generated by BioTracker, each of which corresponds to one MP4 file with a fixed number of frames, or 'chunk'. The number of frames per chunk is set by the StoreChunkSize variable in Motif and stored in the metadata.yaml file (to generate video chunks equivalent to about 33 minutes of video at 5 FPS, it would need to be set to 10000).
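
The relationship between chunk size, frame rate, and chunk duration is simple arithmetic. The sketch below uses hypothetical helper names that are not part of libratools or Motif, and shows how the figure of 10000 frames at 5 FPS corresponds to roughly 33 minutes.

```python
def chunk_duration_minutes(chunk_size_frames, fps):
    """Return the length of one video chunk in minutes."""
    return chunk_size_frames / fps / 60


def chunk_size_for_duration(minutes, fps):
    """Return the number of frames needed for a chunk of the given length."""
    return round(minutes * 60 * fps)


# 10000 frames at 5 FPS is 2000 seconds, i.e. about 33.3 minutes.
print(chunk_duration_minutes(10000, 5))
```

Going the other way, chunk_size_for_duration(30, 5) gives the chunk size needed for 30-minute chunks at 5 FPS.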

As described in the repo-install-instructions, each of the BioTracker-generated CSV files that needs to be processed can be located by its camera serial number. Hence, alongside process.py and config.ini, there is a camera_ids.yaml file in the Processing/ folder. By default, tracks will be processed for all cameras listed in this file (discussed further in Choosing Default Parameter Settings).
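
Because the camera serial number is part of each file name, selecting files for a set of cameras can be done by parsing the name. The following is a rough sketch under stated assumptions: it treats serial numbers and chunk numbers as digits only (as in 23520258_20210408_02.csv), takes the camera list as a plain Python collection because the format of camera_ids.yaml is not shown here, and uses the hypothetical function name select_chunks.

```python
import re

# Pattern for names like 23520258_20210408_02.csv
# (camSN_dateOfRecording_chunkNumber.csv). Assumes digits-only fields.
CHUNK_NAME = re.compile(r"^(?P<cam>\d+)_(?P<date>\d{8})_(?P<chunk>\d+)\.csv$")


def select_chunks(filenames, camera_ids, date):
    """Return sorted (camera, chunk number, filename) tuples for wanted cameras."""
    selected = []
    for name in filenames:
        match = CHUNK_NAME.match(name)
        if match is None:
            continue  # e.g. the *_processed.csv output or non-CSV files
        if match["cam"] in camera_ids and match["date"] == date:
            selected.append((match["cam"], int(match["chunk"]), name))
    return sorted(selected)
```

Sorting the tuples orders the files by camera first and then by numeric chunk number, which is the order in which chunks would need to be merged.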

How libratools can be used

libratools is made up of several modules, each containing functions that handle specific tasks associated with loading, pre-processing, and post-processing trajectories. These functions can be combined in one or more scripts to handle pre-processing and post-processing tasks. For a detailed overview of each function, see the module's source code linked in the summary table in the section libratools modules below.

How to use `process.py`

If you are using a production-ready data pipeline that relies on Motif for recording and BioTracker for tracking, process.py is a ready-made script that processes BioTracker-generated CSV files. Once you have configured the script and decided which cameras to process trajectories for, or chosen to rely on the default settings (see the section Choosing Default Parameter Settings for libratools, process.py and DevExDashboard), running it takes a single command. First open a terminal and change into the directory developing-exploration-behavior/Processing, i.e., into your clone of the repository (if you are using one of the tracking PCs in the office, use PowerShell). Then run:

$ python process.py -d YYYYMMDD

where YYYYMMDD should be the date for which you want to process trajectory files generated by BioTracker. By default, the merged and processed trajectories will be saved to the same folder where the raw files are stored.
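
For readers writing their own script, a date argument of this form can be parsed and validated with the standard argparse module. This is a minimal sketch and not the implementation of get_input_args() in process.py: only the -d flag comes from the command above, while the --date long form and the function names parse_date and build_parser are assumptions.

```python
import argparse
from datetime import datetime


def parse_date(text):
    """Validate a YYYYMMDD string and return it unchanged."""
    # strptime alone accepts some shorter inputs, so check the shape first.
    if len(text) != 8 or not text.isdigit():
        raise argparse.ArgumentTypeError(f"expected YYYYMMDD, got {text!r}")
    try:
        datetime.strptime(text, "%Y%m%d")
    except ValueError:
        raise argparse.ArgumentTypeError(f"not a valid calendar date: {text!r}")
    return text


def build_parser():
    """Return a parser that accepts a required -d YYYYMMDD argument."""
    parser = argparse.ArgumentParser(
        description="Process trajectory files for one recording date."
    )
    parser.add_argument(
        "-d",
        "--date",  # hypothetical long form, not confirmed by process.py
        type=parse_date,
        required=True,
        help="recording date in YYYYMMDD format",
    )
    return parser


args = build_parser().parse_args(["-d", "20210408"])
print(args.date)
```

An invalid value such as 20211340 makes argparse print a usage message and exit instead of passing a malformed date on to the rest of the script.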

Note that process.py is a custom script written specifically for DevEx; hence, it provides just one use case for the methods offered by libratools. It also does not contain detailed docstrings, relying instead on the clarity of the code and on the docstrings of each libratools function. For an overview of the functions implemented in process.py, which should all be self-explanatory, see the table below:

All functions
`main()`	Call date input and run processing pipeline function.
`get_input_args()`	Return camera ids and date for which to process CSV files.
`run_pipeline()`	Run processing pipeline and save processed trajectory to disk.
`locate_data()`	Return file paths to loopbio NPZ files and BioTracker-generated CSV files.
`load_data()`	Return merged trajectory segments as pandas.DataFrame.
`preprocess_data()`	Load merged trajectory and impute missing values.
`postprocess_data()`	Compute metrics and implement outlier detection.
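
To give a sense of what merging trajectory segments involves, the sketch below concatenates CSV chunks that share a header, in the order given. It is an illustration only and does not reproduce load_data(): it uses the standard csv module as a stand-in for pandas, and the column names frame, x, and y are hypothetical rather than BioTracker's actual output columns.

```python
import csv
import io


def merge_chunks(chunk_texts):
    """Concatenate CSV chunks that share a header into one list of row dicts."""
    merged = []
    for text in chunk_texts:
        # DictReader consumes each chunk's header line, so only data rows
        # are appended to the merged result.
        reader = csv.DictReader(io.StringIO(text))
        merged.extend(reader)
    return merged


# Two tiny hypothetical chunks with made-up column names and values.
chunk_1 = "frame,x,y\n0,1.0,2.0\n1,1.5,2.5\n"
chunk_2 = "frame,x,y\n2,2.0,3.0\n"

rows = merge_chunks([chunk_1, chunk_2])
print(len(rows))
```

Values are kept as strings here; a pandas-based version would typically also convert columns to numeric types.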

libratools modules

The package functions are conveniently documented at the package website: https://vincejstraub.github.io/tools-libratools/.
