Update quant.rst
correct link to paper
rob-p committed Jun 1, 2022
1 parent 4483e62 commit 4b0a086
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/quant.rst
@@ -50,7 +50,7 @@ output

The output of the ``quant`` command consists of 5 files: ``quants_mat_rows.txt``, ``quants_mat.mtx`` (or ``counts.eds.gz`` if run with the ``--use-eds`` flag), ``quants_mat_cols.txt``, ``quant.json``, and ``featureDump.txt``. The ``quant.json`` file contains information about the quantification run, such as the method used for UMI resolution. The ``featureDump.txt`` file contains cell-level information designed to be useful in post-quantification cell filtering (better determining "true" cells from background, noise, doublets etc.). The other three files all correspond to quantification information.

-If ``quant`` was executed in USA mode, then the resulting count matrix will be of dimension ``C``x``3G`` where ``C`` is the number of quantified cells (barcodes) and ``G`` is the number of genes. This is because, in USA mode, ``alevin-fry`` quantifies the UMI count attributable to each splicing state of each gene in each cell, where the splicing state is one of spliced (S), unspliced (U) or ambiguous (A). If ``quant`` was run with a two-column transcript-to-gene map (not in USA-mode), then the resulting count matrix will be a ``C``x``G`` matrix, as splicing status is not tracked. For more details on USA mode and its uses, please read the ``alevin-fry`` `paper https://www.nature.com/articles/s41592-022-01408-3`__ or `preprint <https://www.biorxiv.org/content/10.1101/2021.06.29.450377v1>`__, or the `corresponding tutorial <https://combine-lab.github.io/alevin-fry-tutorials/2021/improving-txome-specificity/>`__.
+If ``quant`` was executed in USA mode, then the resulting count matrix will be of dimension ``C``x``3G`` where ``C`` is the number of quantified cells (barcodes) and ``G`` is the number of genes. This is because, in USA mode, ``alevin-fry`` quantifies the UMI count attributable to each splicing state of each gene in each cell, where the splicing state is one of spliced (S), unspliced (U) or ambiguous (A). If ``quant`` was run with a two-column transcript-to-gene map (not in USA-mode), then the resulting count matrix will be a ``C``x``G`` matrix, as splicing status is not tracked. For more details on USA mode and its uses, please read the ``alevin-fry`` `paper <https://www.nature.com/articles/s41592-022-01408-3>`__ or `preprint <https://www.biorxiv.org/content/10.1101/2021.06.29.450377v1>`__, or the `corresponding tutorial <https://combine-lab.github.io/alevin-fry-tutorials/2021/improving-txome-specificity/>`__.

The ``quants_mat.mtx`` is a matrix market `coordinate format <https://math.nist.gov/MatrixMarket/formats.html>`__ file (or if running with ``--use-eds`` then ``counts.eds.gz`` is a gzipped file in EDS_ format) that stores the gene-by-cell expression matrix. The two other files provide the labels for the rows and columns of this matrix. The ``quants_mat_cols.txt`` file is a text file that contains the names of the rows of the matrix, in the order in which it is written, with one gene name written per line. The ``quants_mat_rows.txt`` file is a text file that contains the names of the columns of the matrix, in the order in which it is written, with one barcode name written per line.
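
As a reader's illustration of the ``C``x``3G`` shape described in this hunk (not part of the commit), here is a minimal sketch of how a USA-mode count matrix could be split into three ``C``x``G`` blocks. The helper name ``split_usa`` is hypothetical, and the column ordering (all spliced columns first, then unspliced, then ambiguous) is an assumption that the text above does not state; it should be checked against ``quants_mat_cols.txt`` before relying on it.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def split_usa(counts: Matrix, num_genes: int) -> Tuple[Matrix, Matrix, Matrix]:
    """Split a C x 3G matrix into three C x G blocks.

    ASSUMPTION (not confirmed by the documentation text): columns are grouped
    as G spliced, then G unspliced, then G ambiguous.
    """
    for i, row in enumerate(counts):
        if len(row) != 3 * num_genes:
            raise ValueError(
                f"row {i} has {len(row)} columns, expected {3 * num_genes}"
            )
    spliced = [row[:num_genes] for row in counts]
    unspliced = [row[num_genes:2 * num_genes] for row in counts]
    ambiguous = [row[2 * num_genes:] for row in counts]
    return spliced, unspliced, ambiguous


if __name__ == "__main__":
    # Toy data: C = 2 cells, G = 2 genes, so each row has 3 * G = 6 columns.
    toy = [
        [1, 0, 2, 0, 0, 1],
        [0, 3, 0, 1, 4, 0],
    ]
    s, u, a = split_usa(toy, num_genes=2)
    print("spliced:", s)
    print("unspliced:", u)
    print("ambiguous:", a)
```

The sketch uses plain nested lists to stay dependency-free; with a real run, the matrix would first have to be loaded from ``quants_mat.mtx`` by a Matrix Market reader.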
