No description, website, or topics provided.
Switch branches/tags
Nothing to show
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.

Ultrafast clustering of single-cell flow cytometry data using FlowGrid

Authors: Xiaoxin Ye and Joshua W. K. Ho
Copyright © 2018, Victor Chang Cardiac Research Institute

Input data format

Our FlowGrid algorithm could be applied into many format data set but the sample code only accept csv format. In the csv file, the first row is feature name and each columns is seperated by ",". If you have true label file , you could use --l filename to input label file for testing the ARI of FlowGrid result.


Before using the package, we need to install the dependent package sklearn and numpy.

pip install -r requirements.txt --user


pip install sklearn numpy scipy --user


A summary of the argument of sample code is included in the table below.

Argument Usage Required?
--f the input file name required
--n number of bins required
--eps maximun distance between two bins required
--t threshold for high density bin optional (default:40)
--o the output file name optional (default: out.csv)
--l the true label file name optional


After installing all the dependent packages, you could try to use the sample code to run FlowGrid on the sample data.

python --f sample_data.csv --n 4  --eps 1.1 --l sample_label.csv

The predicted label is saved at out.csv and the sample result is as follow.

The number of cells is: 23377
The number of dimensions is: 4
runing time: 0.027