ArthurDondi edited this page Sep 22, 2024 · 7 revisions

Welcome to the LongSom Wiki!

LongSom: detection of somatic variants in long-read scRNA-seq data.

LongSom is a tool for detecting somatic single-nucleotide variants (SNVs, including mitochondrial ones, "mtSNVs"), fusions, and copy number alterations (CNAs) in high-quality (PacBio, Nanopore R10.4) long-read scRNA-seq data from cancer biopsies, and for subsequently reconstructing subclonal heterogeneity based on those variants.

How it works

LongSom takes an aligned BAM file and a barcode-to-cell-type file as input. It detects variants in both cancer and non-cancer cells, then calls somatic variants, i.e. variants found only in cancer cells.
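The "unique to cancer cells" criterion can be illustrated with a minimal Python sketch. This is not LongSom's code: the function name `call_somatic`, the `"Cancer"` label, and the tuple layout `(chrom, pos, ref, alt)` are assumptions made for illustration, and the real pipeline applies statistical filters rather than a plain set difference.

```python
def call_somatic(variants_by_celltype, cancer_label="Cancer"):
    """Return variants seen in cancer cells but in no other cell type.

    variants_by_celltype maps a cell type name to a set of
    (chrom, pos, ref, alt) tuples. Hypothetical layout, for illustration.
    """
    cancer = set(variants_by_celltype.get(cancer_label, ()))
    non_cancer = set()
    for cell_type, variants in variants_by_celltype.items():
        if cell_type != cancer_label:
            non_cancer.update(variants)
    # Anything also present in non-cancer cells is treated as germline.
    return cancer - non_cancer


# Toy input: one variant is shared with non-cancer cells, two are not.
calls = {
    "Cancer": {
        ("chr1", 100, "A", "G"),
        ("chr2", 200, "C", "T"),
        ("chrM", 50, "G", "A"),
    },
    "T_cell": {("chr2", 200, "C", "T")},
    "Fibroblast": {("chr2", 200, "C", "T"), ("chr3", 300, "T", "C")},
}
somatic = call_somatic(calls)
print(sorted(somatic))
```

In this toy input, the chr2 variant is dropped because it also appears in T cells and fibroblasts, while the chr1 and chrM variants are kept as somatic.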

LongSom first corrects the cell type annotation based on the mutational burden of each cell. Cancer cells misannotated as non-cancer would otherwise cause somatic variants to be detected in non-cancer cells and wrongly filtered out as germline variants (false negatives). To avoid this, LongSom detects cancer cells misannotated as non-cancer and reannotates them as cancer cells.
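A rough sketch of burden-based reannotation, under stated assumptions: each cell records whether it shows the mutant or the reference allele at a set of candidate cancer-specific loci, and a non-cancer cell is relabelled when enough of its covered loci are mutated. The function `reannotate`, the data layout, and both thresholds are hypothetical and are not taken from LongSom.

```python
def reannotate(cells, candidate_loci, min_covered=3, min_mutated_fraction=0.5):
    """Relabel non-cancer cells that carry a high burden of candidate variants.

    cells maps a barcode to {"label": str, "alleles": {locus: "mut" or "ref"}}.
    Loci missing from "alleles" are treated as not covered in that cell.
    Thresholds are illustrative assumptions, not LongSom defaults.
    """
    new_labels = {}
    for barcode, cell in cells.items():
        label = cell["label"]
        if label != "Cancer":
            covered = [loc for loc in candidate_loci if loc in cell["alleles"]]
            mutated = [loc for loc in covered if cell["alleles"][loc] == "mut"]
            # Require enough covered loci before trusting the mutated fraction.
            if (len(covered) >= min_covered
                    and len(mutated) / len(covered) >= min_mutated_fraction):
                label = "Cancer"
        new_labels[barcode] = label
    return new_labels


loci = ["v1", "v2", "v3", "v4"]
cells = {
    "AAAC": {"label": "Cancer", "alleles": {"v1": "mut", "v2": "mut"}},
    # Annotated as non-cancer but mutated at 3 of 4 covered loci.
    "CCCG": {"label": "Fibroblast",
             "alleles": {"v1": "mut", "v2": "mut", "v3": "mut", "v4": "ref"}},
    # Non-cancer cell with no mutant alleles.
    "GGGT": {"label": "T_cell",
             "alleles": {"v1": "ref", "v2": "ref", "v3": "ref"}},
    # Too few covered loci to make a decision, so the label is kept.
    "TTTA": {"label": "T_cell", "alleles": {"v1": "mut"}},
}
labels = reannotate(cells, loci)
print(labels)
```

The minimum-coverage check reflects the sparsity of single-cell data: a cell covering only one locus gives too little evidence either way, so its original label is left unchanged.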

After reannotation, LongSom calls SNVs using a modified version of SComatic and fusions using ctat-LR-fusion. It then uses BnpC, a Bayesian non-parametric clustering method, to cluster cells into subclones based on those somatic SNVs and fusions. In parallel, LongSom uses inferCNV to call CNAs and cluster cells into subclones based on them.

For more information, see our workflow below or read Dondi et al. 2024.

Workflow

Flowchart
