---
title: 'diff_classifier: Parallelization of multi-particle tracking video analyses'
tags:
  - Python
  - multi-particle tracking
  - Amazon Web Services
  - parallelization
authors:
  - name: Chad Curtis
    orcid: 0000-0001-6312-392X
    affiliation: 1
  - name: Ariel Rokem
    orcid: 0000-0003-0679-1985
    affiliation: 2
  - name: Elizabeth Nance
    orcid: 0000-0001-7167-7068
    affiliation: 1
affiliations:
  - name: Department of Chemical Engineering, University of Washington
    index: 1
  - name: eScience Institute, University of Washington
    index: 2
date: 20 August 2018
bibliography: paper.bib
---

Summary

The diff_classifier package addresses the problem of scale-up in multi-particle tracking (MPT) analyses through parallelization. MPT is a powerful analytical tool that has been used in fields ranging from aeronautics to oceanography to biomedical engineering [@Pulford:2005]. While a variety of tracking algorithms are available to researchers [@Chenouard:2014], a common problem is that data analysis usually depends on graphical user interfaces and relies on human input for accurate tracking. For example, particle detection often relies on the selection of a quality threshold, a numerical cutoff that distinguishes “real” particles from “fake” particles [@Tineves:2017]. If this threshold is too low, the large number of false positives can produce false trajectories that skew results and, in extreme cases, cause the code to crash because the particle-linking step fails to converge. If the threshold is too high, trajectories are cut short, which biases results toward short, fast-moving trajectories and can yield empty datasets [@Wang:2015].

Because experimental conditions and image quality vary, user-selected tracking parameters can differ widely from video to video. Parameter selection also varies from user to user, which raises the issue of reproducibility. diff_classifier addresses these issues with regression tools that predict input tracking parameters and with parallelized, script-based implementations on Amazon Web Services (AWS), using the Simple Storage Service (S3) for data storage and Batch for computing, and relying on the Cloudknot software library to automate these interactions [@cloudknot]. By manually tracking a small subset of the video dataset to be analyzed (5-10 videos per experiment), users can predict tracking parameters for the remaining videos from the intensity distributions of the input images. This can simultaneously reduce time-to-first-result in MPT workflows and provide reproducible MPT results.
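
The regression idea can be illustrated with a minimal sketch. It is not the diff_classifier implementation: the function names, the choice of intensity summary statistics, the linear model, and the synthetic images are all assumptions made for illustration, and only NumPy is used.

```python
import numpy as np

def intensity_features(image):
    """Summarize an image's intensity distribution (hypothetical feature set)."""
    pixels = np.asarray(image, dtype=float).ravel()
    return np.array([pixels.mean(), pixels.std(), np.percentile(pixels, 99)])

def fit_threshold_model(images, thresholds):
    """Least-squares fit of threshold ~ intensity features + intercept."""
    X = np.array([intensity_features(img) for img in images])
    X = np.column_stack([X, np.ones(len(X))])  # intercept column
    y = np.asarray(thresholds, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_threshold(coef, image):
    """Predict a tracking threshold for a video frame that was not hand-tracked."""
    x = np.append(intensity_features(image), 1.0)
    return float(x @ coef)

# Synthetic stand-ins for 8 manually tracked videos (made-up data, not real results).
rng = np.random.default_rng(0)
means = rng.uniform(80, 160, size=8)
spreads = rng.uniform(3, 12, size=8)
train_images = [rng.normal(m, s, size=(64, 64)) for m, s in zip(means, spreads)]
# Pretend the hand-picked threshold was half the mean intensity of each video.
train_thresholds = [0.5 * img.mean() for img in train_images]

coef = fit_threshold_model(train_images, train_thresholds)
new_image = rng.normal(135, 9, size=(64, 64))
predicted = predict_threshold(coef, new_image)
```

In a real workflow the training thresholds would come from the 5-10 manually tracked videos, and the fitted model would supply parameters to each per-video tracking job before those jobs are submitted in parallel.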

diff_classifier also includes downstream MPT analysis tools, including mean squared displacement and feature calculations, visualization tools, and a principal component analysis implementation. MPT is commonly used to calculate and report ensemble-averaged diffusion coefficients of nanoparticles and other objects. We sought to expand the power of MPT analyses by changing the unit of analysis to the individual particle trajectory. By computing a variety of features (e.g. aspect ratio, boundedness, fractal dimension) at trajectory-level resolution, users can apply a range of data science techniques to their MPT datasets.
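
As a rough illustration of trajectory-level analysis, the sketch below computes a time-averaged mean squared displacement for one trajectory and runs a principal component analysis on a small per-trajectory feature table. The function names `msd` and `pca`, the three example features, and the simulated random walks are assumptions for illustration only; they are not the package's API or its feature set.

```python
import numpy as np

def msd(x, y, max_lag=None):
    """Time-averaged mean squared displacement of one 2D trajectory.

    Returns an array whose entry k-1 is the MSD at a lag of k frames.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    max_lag = n - 1 if max_lag is None else min(max_lag, n - 1)
    out = np.empty(max_lag)
    for lag in range(1, max_lag + 1):
        dx = x[lag:] - x[:-lag]
        dy = y[lag:] - y[:-lag]
        out[lag - 1] = np.mean(dx**2 + dy**2)
    return out

def pca(features, n_components=2):
    """PCA via SVD on standardized features (rows are trajectories)."""
    X = np.asarray(features, dtype=float)
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0)
    Xc = Xc / np.where(std > 0, std, 1.0)  # leave constant columns unscaled
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)  # assumes at least one non-constant feature
    scores = Xc @ Vt[:n_components].T
    return scores, explained[:n_components]

# Simulated 2D random walks with different step sizes (made-up data).
rng = np.random.default_rng(1)
tracks = [np.cumsum(rng.normal(0, s, size=(50, 2)), axis=0)
          for s in (0.5, 1.0, 2.0, 4.0)]

# One row per trajectory: MSD at lag 1, MSD at lag 10, net displacement.
features = np.array([
    [msd(t[:, 0], t[:, 1], 10)[0],
     msd(t[:, 0], t[:, 1], 10)[9],
     np.hypot(*(t[-1] - t[0]))]
    for t in tracks
])
scores, explained = pca(features, n_components=2)
```

Keeping one feature row per trajectory, rather than one ensemble average per video, is what lets standard tools such as PCA or clustering operate directly on the tracking output.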

The source code for diff_classifier has been archived on Zenodo with the linked DOI [@zenodo].

Acknowledgements

The authors would like to thank the eScience Institute for the resources and expertise provided through the Incubator Program that made diff_classifier possible. The University of Washington eScience Institute is supported through a grant from the Gordon & Betty Moore Foundation and the Alfred P. Sloan Foundation. The authors also acknowledge funding from the National Institute of General Medical Sciences, grant 1R35 GM124677-01 (E. Nance).

References