mdelmans edited this page Jan 3, 2019 · 6 revisions

Running a Command Line Tool

To run a D3E analysis, run D3ECmd.py:

python D3ECmd.py InputFile OutputFile Label1 Label2 [-h] [-m {0,1}] [-t {0,1,2}] [-z {0,1}] [-n {1,0}] [-s {1,0}] [-v]

Mandatory arguments:

  • InputFile : path to the read-count table
  • OutputFile : path to the output file
  • Label1 : a common label for the cells of the first type
  • Label2 : a common label for the cells of the second type

Optional arguments:

  • -m (--mode) : run mode (default = 1)
  • -t (--test) : test used for distribution comparison
  • -z (--removeZeros) : if -z is set to 1, all zero entries in the read-count table are removed prior to analysis (default = 0)
  • -n (--normalise) : if -n is set to 1, a normalization routine will be performed before analysis (default = 1)
  • -v (--verbose) : run in verbose mode (default)
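
As an illustration of how these arguments combine, the following Python sketch assembles the command line as an argument list. The helper name `build_d3e_command`, the file names and the labels are hypothetical; only the flags documented above are used.

```python
def build_d3e_command(input_file, output_file, label1, label2,
                      mode=1, test=None, remove_zeros=0, normalise=1,
                      verbose=False):
    """Assemble an argument list for D3ECmd.py (hypothetical helper)."""
    # Mandatory positional arguments come first, in the documented order.
    cmd = ["python", "D3ECmd.py", input_file, output_file, label1, label2,
           "-m", str(mode), "-z", str(remove_zeros), "-n", str(normalise)]
    # -t is only added when a test is chosen explicitly.
    if test is not None:
        cmd += ["-t", str(test)]
    if verbose:
        cmd.append("-v")
    return cmd


# Hypothetical file names and labels, for illustration only.
cmd = build_d3e_command("counts.txt", "results.txt", "TypeA", "TypeB",
                        mode=0, test=1)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains D3ECmd.py.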

Input File Format

D3ECmd.py accepts a tab-separated read-count table, where rows correspond to genes, and columns correspond to individual cells. The file should have a header row which has the following tab-separated format:

"GeneID	Label1	Label2	Label3	..."

where each Labeli is the cell type label of the corresponding column. Differential expression analysis can be performed on two cell types at a time.

Each line should start with a gene ID, followed by a sequence of read-counts. Empty lines are ignored.
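
To make the format concrete, here is a minimal Python sketch that parses a table of this shape and splits the counts of each gene by cell type. It is not the parser used by D3ECmd.py; the function name `read_counts`, the labels `A` and `B` and the sample values are made up for illustration.

```python
import csv
import io


def read_counts(handle, label1, label2):
    """Return {gene_id: (counts_for_label1, counts_for_label2)}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    labels = header[1:]  # header[0] is the "GeneID" column
    table = {}
    for row in reader:
        # Empty lines are ignored, as described above.
        if not row or not any(cell.strip() for cell in row):
            continue
        gene_id, values = row[0], row[1:]
        first = [float(v) for v, lab in zip(values, labels) if lab == label1]
        second = [float(v) for v, lab in zip(values, labels) if lab == label2]
        table[gene_id] = (first, second)
    return table


# Made-up example: two cells labelled A and two cells labelled B.
sample = "GeneID\tA\tA\tB\tB\n\nGene1\t0\t5\t12\t7\nGene2\t3\t0\t0\t1\n"
counts = read_counts(io.StringIO(sample), "A", "B")
print(counts["Gene1"])
```

Columns carrying any other label are skipped, which mirrors the restriction that two cell types are compared at a time.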

Run Modes

D3E can run in two modes, specified by the -m (--mode) option:

  • Mode 0 : the method of moments is used to estimate parameters (fast but less accurate)
  • Mode 1 : Bayesian inference is used to estimate parameters (slow but more accurate, default)
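
For intuition about mode 0, the sketch below shows a method-of-moments estimate under the assumption that the bursting model is a Poisson-beta distribution, with counts drawn as Poisson(g·p) and p drawn from Beta(a, b). Both the model assumption and the function names are illustrative; the code in D3E itself may differ in detail.

```python
def factorial_moment_ratios(counts):
    """Ratios of successive sample factorial moments: e1, e2/e1, e3/e2."""
    n = len(counts)
    e1 = sum(counts) / n
    e2 = sum(x * (x - 1) for x in counts) / n
    e3 = sum(x * (x - 1) * (x - 2) for x in counts) / n
    # Raises ZeroDivisionError when e1 or e2 is zero (too few non-zero counts).
    return e1, e2 / e1, e3 / e2


def poisson_beta_from_ratios(r1, r2, r3):
    """Solve r_k = g * (a + k - 1) / (a + b + k - 1), k = 1..3, for (a, b, g)."""
    denom = r1 * r2 - 2 * r1 * r3 + r2 * r3
    curv = r1 - 2 * r2 + r3
    a = 2 * r1 * (r3 - r2) / denom
    b = 2 * (r2 - r1) * (r1 - r3) * (r3 - r2) / (denom * curv)
    g = -denom / curv
    return a, b, g


# Exact ratios for a = 1, b = 1, g = 6 are 3, 4 and 4.5,
# so the solver should recover those parameters.
a, b, g = poisson_beta_from_ratios(3.0, 4.0, 4.5)
print(a, b, g)
```

With real data the ratios come from `factorial_moment_ratios(counts)`. Sample moments are noisy, so the estimates can be negative or undefined for sparse genes, which is consistent with this mode being described as fast but less accurate.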

Distribution comparison methods

D3E uses one of the following distribution comparison tests, specified by the -t (--test) option:

  • Mode 0: Cramér-von Mises test
  • Mode 1: Kolmogorov-Smirnov test
  • Mode 2: Anderson-Darling test
  • Mode 3: Likelihood ratio test
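
As an example of what such a comparison measures, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, the largest gap between the empirical cumulative distribution functions of two samples. It is a plain-Python illustration with a hypothetical function name, it returns the statistic only (no p-value), and it is not taken from the D3E source.

```python
from bisect import bisect_right


def ks_statistic(sample1, sample2):
    """Two-sample KS statistic: max |F1(x) - F2(x)| over all observed x."""
    s1, s2 = sorted(sample1), sorted(sample2)
    n1, n2 = len(s1), len(s2)
    d = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in set(s1) | set(s2):
        f1 = bisect_right(s1, x) / n1  # fraction of sample1 <= x
        f2 = bisect_right(s2, x) / n2  # fraction of sample2 <= x
        d = max(d, abs(f1 - f2))
    return d


print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

A value near 0 means the two samples have similar distributions; a value of 1 means they do not overlap at all.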

Output

D3E outputs a table with the following columns:

  • Gene id : ID of the gene, matching the ID in the input file.
  • a, b, g : Parameter values of the fitted transcriptional bursting model.
  • gof : Goodness of fit of the transcriptional bursting model.
  • s, f, d : Average burst size, expression frequency and duty cycle of a gene.
  • Rs, Rf, Rd : Log2 fold-change of the corresponding parameters.
  • p-value : p-value for the null hypothesis that the gene is not differentially expressed between the two cell types.
  • μ, cv : mean and coefficient of variation of the expression level in each corresponding cell type.
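
The summary columns can be reproduced by hand for a quick sanity check. The sketch below computes a mean, a coefficient of variation and a log2 fold-change in Python. Two choices here are assumptions rather than documented D3E behaviour: the population standard deviation is used for cv, and the fold-change is taken as the second cell type over the first.

```python
import math
import statistics


def mean_and_cv(values):
    """Mean and coefficient of variation (standard deviation / mean)."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    return mu, sd / mu


def log2_fold_change(value1, value2):
    """log2(value2 / value1); the direction of the ratio is an assumption."""
    return math.log2(value2 / value1)


# Made-up expression values for one gene in one cell type.
mu, cv = mean_and_cv([2, 4, 4, 4, 5, 5, 7, 9])
print(mu, cv)
print(log2_fold_change(2.0, 8.0))
```

Both helpers require strictly positive inputs where a division or logarithm is taken, so genes with a zero mean need separate handling.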