Skip to content

Latest commit

 

History

History
48 lines (34 loc) · 2.49 KB

File metadata and controls

48 lines (34 loc) · 2.49 KB

Overview

We expect an adt_quality_control HDF5 group at the root of the file, containing information about the quality control metrics and filters derived from the ADT counts. The group itself contains the parameters and results subgroups.

No ADT data was available prior to version 2.0 of the format, so the adt_quality_control group may be absent in such files.

Definitions:

  • adt_available: whether ADTs are present in the dataset. This is typically determined by examining the inputs step.
  • num_cells: number of cells in the dataset, prior to any filtering. This is typically determined from the inputs step.
  • num_blocks: number of blocks in the dataset. This is typically determined from the inputs step.

Parameters

parameters should contain:

  • igg_prefix: a scalar string containing the expected prefix for IgG features.
  • nmads: a scalar float specifying the number of MADs to use to define the QC thresholds.
  • min_detected_drop: a scalar float specifying the minimum relative drop in the number of detected features before a cell is considered to be low-quality.

Results

If adt_available = false, results should be empty.

If adt_available = true, results should contain:

  • metrics, a group containing per-cell QC metrics derived from the RNA count data. This contains:
    • sums: a float dataset of length equal to num_cells, containing the total count for each cell.
    • detected: an integer dataset of length equal to num_cells, containing the total number of detected features for each cell.
    • igg_total: a float dataset of length equal to num_cells, containing the total count in IgG features.
  • thresholds, a group containing thresholds on the metrics for each block. This group contains:
    • detected: a float dataset of length equal to num_blocks, containing the threshold on the total number of detected features for each block.
    • igg_total: a float dataset of length equal to num_blocks, containing the threshold on the total counts in IgG features for each block.
  • discards: an integer dataset of length equal to num_cells. Each value is interpreted as a boolean and specifies whether the corresponding cell would be discarded by the ADT-based filter thresholds.

History

Updated in version 3.0, with the following changes from the previous version:

  • The calculation of the QC metrics and thresholds can no longer be skipped; the application of those filters is determined in the later cell_filtering step.