JCVenterInstitute/DAFi-gating

User-directed unsupervised identification of cell populations

DAFi: User-directed unsupervised filtering and identification of cell populations from flow cytometry data

Authors:

Yu "Max" Qian, Ph.D., mqian@jcvi.org or qianyu.cs@gmail.com, Ivan Chang, Ph.D., ichang@jcvi.org, and Bob Sinkovits, Ph.D., sinkovit@sdsc.edu

Copyright: Authors and J. Craig Venter Institute

Paper link: https://onlinelibrary.wiley.com/doi/full/10.1002/cyto.a.23371

Lee AJ, Chang I, Burel JG, Lindestam Arlehamn CS, Mandava A, Weiskopf D, Peters B, Sette A, Scheuermann RH, Qian Y. DAFi: A directed recursive data filtering and clustering approach for improving and interpreting data clustering identification of cell populations from polychromatic flow cytometry data. Cytometry A, 2018. 93(6):597-610. PMCID: PMC6030426.

FlowJo Plugin

https://github.com/PedroMilanezAlmeida/DAFi

The plugin was developed by Dr. Pedro Milanez-Almeida of the Tsang group at NIH/NIAID/CHI, with help and support from Dr. Josef Spidlen and Miguel Velazquez-Palafox of BD/FlowJo.

Description:

The DAFi package provides a framework for identifying cell populations in flow cytometry (FCM) data. The framework is compatible with many existing clustering algorithms, such as K-means, K-means++, mini-batch K-means, Gaussian mixture models, k-medoids, and self-organizing maps. It lets the user supply a user-defined gating hierarchy, which turns these unsupervised algorithms into a semi-supervised, automated approach to cell population identification.

First, the data are read and preprocessed; the configuration files are then parsed and applied as user-defined gating directions. The clustering algorithm chosen at initialization is applied to all events in the FCM data at the start of the iterative gating/filtering loop, and optionally again within each specified population subset (reclustering). At each gating step, the user can choose one of three filtering modes:

  • Bisecting: events are filtered by user-defined boundaries, just as in manual gating.
  • Slope-based: events are filtered by user-defined slopes.
  • Cluster-centroid-based: all member events of a cluster are kept or removed according to whether the cluster's centroid falls inside or outside the user-defined gate.

Outputs include a table of per-population event counts and percentages, and an event-level table listing each event's transformed channel values and population membership for external analysis and plotting. Several built-in plotting options are also available, e.g. 2D dot plots of the user-specified gating channels, optionally with the cluster centroids overlaid.
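To make the difference between bisecting and centroid-based filtering concrete, here is a minimal Python sketch. It is not taken from the DAFi source: the function names (`in_gate`, `bisecting_filter`, `centroid_filter`) are hypothetical, the gate is assumed to be a rectangle with inclusive boundaries, and the cluster labels are assumed to come from whichever clustering algorithm was run beforehand.

```python
from collections import defaultdict

def in_gate(x, y, gate):
    """True if the point (x, y) lies inside the rectangular gate (inclusive bounds assumed)."""
    min_x, max_x, min_y, max_y = gate
    return min_x <= x <= max_x and min_y <= y <= max_y

def bisecting_filter(events, gate):
    """Keep the indices of events that individually fall inside the gate."""
    return [i for i, (x, y) in enumerate(events) if in_gate(x, y, gate)]

def centroid_filter(events, labels, gate):
    """Keep the indices of all events whose cluster centroid falls inside the gate."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # label -> [sum_x, sum_y, count]
    for (x, y), label in zip(events, labels):
        acc = sums[label]
        acc[0] += x
        acc[1] += y
        acc[2] += 1
    kept = {
        label
        for label, (sum_x, sum_y, n) in sums.items()
        if in_gate(sum_x / n, sum_y / n, gate)
    }
    return [i for i, label in enumerate(labels) if label in kept]

# Toy data: five events on two channels, pre-assigned to two clusters.
events = [(10, 10), (30, 30), (40, 40), (90, 90), (100, 100)]
labels = [0, 0, 0, 1, 1]
gate = (20, 70, 5, 55)  # Min_X, Max_X, Min_Y, Max_Y

bisected = bisecting_filter(events, gate)           # -> [1, 2]
by_centroid = centroid_filter(events, labels, gate)  # -> [0, 1, 2]
```

In this toy example, event 0 lies outside the gate boundaries, so bisecting drops it. Centroid-based filtering keeps it because the centroid of its cluster falls inside the gate, so cluster membership rather than the hard boundary decides which events survive.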

Table of Folder Contents

  • C - DAFi written in C
  • FCSTrans - DAFi preprocessing of FCS files written in R via the flowCore package
  • Notebooks - Jupyter Notebook templates used for DAFi report generation and data analysis
  • Python - Accessory Python scripts for the DAFi pipeline
  • R - DAFi written in R
  • docker - Docker container build definitions for DAFi-jupyter and R-DAFi
  • docs - R-DAFi documentation generated by pkgdown
  • inst - DAFi installation and testing support files
  • man - R-DAFi manual pages
  • vignettes - R-DAFi vignettes

Supported Environments:

There are two concurrent implementations of the DAFi framework. The HPC implementation uses optimized C binaries to provide extensive parallelization for large datasets. The desktop implementation uses existing R packages such as flowCore, FlowSOM, and ClusterR, which gives flexibility in choosing clustering algorithms and recursive filtering strategies. Source code and binary releases for both versions are available through the GitHub repository, along with Docker images for trouble-free installation.

Input Requirements (see inst/extdata for sample inputs):

  1. A raw FCS file for the R implementation of DAFi or a transformed text-based FCS file for the C implementation of DAFi.

  2. inclusion.config: a 12-column tab-delimited file for recursive data filtering:

Pop_ID DimensionX DimensionY Min_X Max_X Min_Y Max_Y Parent_ID Cluster_Type(0: Clustering; 1: Bisecting; 2: Slope-based) Visualize_or_Not Recluster_or_Not Cell_Phenotype(optional)
1 1 4 20 70 5 55 0 0 0 0 Lymphocyte
2 1 2 30 90 0 110 1 1 0 1 Singlets
3 4 5 100 150 80 140 2 2 1 1 LiveSinglets
4 19 17 76 140 106 200 3 1 0 1 CD4T
5 19 17 76 140 55 105 3 1 0 0 CD8T
6 8 7 81 140 50 120 3 1 0 0 CD4Treg
7 8 7 20 80 25 90 3 1 0 0 CD4Tnonreg
  3. exclusion.config: an 11-column tab-delimited file with the same format, but for reverse filtering:
Pop_ID DimensionX DimensionY Min_X Max_X Min_Y Max_Y Parent_ID Cluster_Type(0: Clustering; 1: Bisecting; 2: Slope-based) Visualize_or_Not Recluster_or_Not
1 1 4 0 85 100 200 0 0 1 0
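As an illustration of the file layout described above, the following Python sketch reads inclusion.config-style rows and follows the Parent_ID links to recover a population's gating lineage. It is not part of the DAFi codebase: the `Gate` class and the `parse_config` and `lineage` helpers are hypothetical names. The sketch assumes that a header row may or may not be present, that phenotype labels contain no whitespace, and that a Parent_ID of 0 denotes the full, ungated set of events.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Gate:
    pop_id: int
    dim_x: int
    dim_y: int
    min_x: float
    max_x: float
    min_y: float
    max_y: float
    parent_id: int
    cluster_type: int  # 0: clustering, 1: bisecting, 2: slope-based
    visualize: bool
    recluster: bool
    phenotype: Optional[str] = None  # 12th column, absent in exclusion.config

def parse_config(text: str) -> List[Gate]:
    """Parse tab-delimited config rows into Gate records."""
    gates = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and a header row, if one is present (assumption).
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        gates.append(Gate(
            int(fields[0]), int(fields[1]), int(fields[2]),
            float(fields[3]), float(fields[4]),
            float(fields[5]), float(fields[6]),
            int(fields[7]), int(fields[8]),
            fields[9] == "1", fields[10] == "1",
            fields[11] if len(fields) > 11 else None,
        ))
    return gates

def lineage(gates: List[Gate], pop_id: int) -> List[str]:
    """Follow Parent_ID links from a population up to the root (assumed to be 0)."""
    by_id = {g.pop_id: g for g in gates}
    path = []
    while pop_id != 0:
        gate = by_id[pop_id]
        path.append(gate.phenotype or str(gate.pop_id))
        pop_id = gate.parent_id
    return list(reversed(path))

# First four rows of the sample inclusion.config, re-joined with tabs.
SAMPLE = "\n".join("\t".join(row.split()) for row in [
    "1 1 4 20 70 5 55 0 0 0 0 Lymphocyte",
    "2 1 2 30 90 0 110 1 1 0 1 Singlets",
    "3 4 5 100 150 80 140 2 2 1 1 LiveSinglets",
    "4 19 17 76 140 106 200 3 1 0 1 CD4T",
])

gates = parse_config(SAMPLE)
cd4t_path = lineage(gates, 4)
# -> ['Lymphocyte', 'Singlets', 'LiveSinglets', 'CD4T']
```

Under the stated assumption about Parent_ID 0, each row gates on the events that survived its parent's filter, so the lineage above is the order in which the sample gates would be applied to reach the CD4T population.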

Get the software and documentation

Check the releases to obtain the latest version.

R

Install into R

The DAFi R implementation requires R version > 3.4 (https://cran.r-project.org/bin/). In addition, please install the following packages:

  1. flowCore (https://www.bioconductor.org/packages/release/bioc/html/flowCore.html)
  2. flowViz (http://bioconductor.org/packages/release/bioc/html/flowViz.html)
  3. ClusterR (https://cran.r-project.org/web/packages/ClusterR/index.html)
  4. FlowSOM (https://bioconductor.org/packages/release/bioc/html/FlowSOM.html)

For an automated build and install, including the dependent packages listed above, first install the devtools library:

install.packages("devtools")

Then start the automated install of the DAFi package:

devtools::install_github("JCVenterInstitute/DAFi-gating", build_vignettes = TRUE)

Then check out the built-in vignette of the DAFi library for more documentation:

library(DAFi)
browseVignettes("DAFi")

C (see details under src)

The icc or gcc compiler is required to compile the binary from source.

For optimal performance, compile with Intel optimization flags:

icc -O3 -xHost -o dafi_gating DAFi-gating_omp.c -lm

In addition, precompiled binaries without enhanced optimizations are available under releases.

DAFi docker images (see details under docker folder)

If you have Docker enabled, you can download the pre-configured Dockerbuild and build the DAFi container, which lets you run a local Jupyter or RStudio server with all necessary packages. The dafi-jupyter and r-dafi containers are also available on Docker Hub.