Running InferCNV

InferCNV can be run via a simple 2-step protocol, or can be run step-by-step with customization for more exploratory purposes.

InferCNV 2-step execution:

Create an InferCNV object from the three required inputs: the read count matrix, the cell type annotations file, and the gene ordering file:

# create the infercnv object
infercnv_obj = CreateInfercnvObject(raw_counts_matrix="singleCell.counts.matrix",
                                    annotations_file="cellAnnotations.txt",
                                    delim="\t",
                                    gene_order_file="gene_ordering_file.txt",
                                    ref_group_names=c("normal"))

where the ref_group_names parameter lists the normal (non-tumor) cell types, using the group names defined in the cellAnnotations.txt file. See File-Definitions for more details.
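Before creating the object, it can help to confirm that every name passed to ref_group_names actually occurs in the annotations file. The following is a minimal base-R sketch, not part of infercnv itself. It assumes a headerless, tab-delimited annotations file with two columns (cell name, group label); the cell and group names are made up for illustration.

```r
# Hypothetical annotations content: cell name <tab> group label.
# In practice this would be read from cellAnnotations.txt.
annotations_text <- paste(
  "cell_1\tnormal",
  "cell_2\tnormal",
  "cell_3\ttumor_patientA",
  "cell_4\ttumor_patientB",
  sep = "\n"
)

annotations <- read.table(text = annotations_text, sep = "\t",
                          header = FALSE, stringsAsFactors = FALSE,
                          col.names = c("cell", "group"))

ref_group_names <- c("normal")

# Every reference group name should appear among the annotation labels.
missing_refs <- setdiff(ref_group_names, unique(annotations$group))
if (length(missing_refs) > 0) {
  stop("Reference groups not found in annotations: ",
       paste(missing_refs, collapse = ", "))
}

# Flag which cells would serve as the normal reference.
is_reference <- annotations$group %in% ref_group_names
table(is_reference)
```

A misspelled group name would stop here with a readable message rather than surfacing later during object creation.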

After creating the infercnv_obj, you can then run the standard infercnv procedure via the built-in 'infercnv::run()' method like so:

# perform infercnv operations to reveal cnv signal
infercnv_obj = infercnv::run(infercnv_obj,
                             cutoff=1,  # use 1 for smart-seq, 0.1 for 10x-genomics
                             out_dir="output_dir",  # dir is auto-created for storing outputs
                             cluster_by_groups=TRUE,  # cluster tumor cells separately per annotation group
                             include.spike=TRUE       # inject a spike-in used to scale the final data
                             )

The cutoff value determines which genes will be used for the infercnv analysis. Genes with a mean number of counts across cells below this cutoff will be excluded. For smart-seq (full-length transcript sequencing, typically using cell plate assays rather than droplets), a value of 1 works well. For 10x (and potentially other 3'-end sequencing and droplet assays, where the count matrix tends to be more sparse), a value of 0.1 generally works well.
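To make the effect of the cutoff concrete, here is a small base-R sketch of a mean-count filter. It illustrates the idea only and is not infercnv's internal code; the toy matrix is invented, and treating a gene whose mean equals the cutoff as retained (>=) is an assumption.

```r
# Toy count matrix: genes in rows, cells in columns (values are made up).
counts <- matrix(
  c(0, 0, 1, 0,    # gene_low:  mean 0.25
    2, 0, 3, 1,    # gene_mid:  mean 1.5
    9, 7, 8, 12),  # gene_high: mean 9
  nrow = 3, byrow = TRUE,
  dimnames = list(c("gene_low", "gene_mid", "gene_high"),
                  paste0("cell_", 1:4))
)

cutoff <- 1  # the smart-seq value suggested above

# Mean count per gene across all cells.
gene_means <- rowMeans(counts)

# Keep genes at or above the cutoff (boundary handling is an assumption).
filtered <- counts[gene_means >= cutoff, , drop = FALSE]
rownames(filtered)
```

With cutoff set to 1, gene_low is dropped and the other two genes are kept. With the 10x value of 0.1, all three genes in this toy matrix would be retained, which shows why a lower cutoff suits sparser count matrices.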

The out_dir parameter names the output directory. If the directory doesn't exist, it will be created automatically.

The 'cluster_by_groups' setting tells infercnv to cluster the tumor cells separately for each patient or sample type, as defined in the cell annotations file.
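The idea of clustering within each annotation group, rather than across all tumor cells at once, can be sketched in base R as follows. This is an illustration under stated assumptions, not infercnv's implementation: the expression values and group labels are invented, and the default Euclidean distance and complete linkage of dist() and hclust() are used only because the text above does not specify which infercnv applies.

```r
set.seed(1)

# Toy expression matrix: 5 genes x 6 tumor cells (random values).
expr <- matrix(rnorm(30), nrow = 5,
               dimnames = list(paste0("gene_", 1:5),
                               paste0("cell_", 1:6)))

# Hypothetical group label for each cell, as it would come from
# the cell annotations file.
groups <- c(cell_1 = "tumor_patientA", cell_2 = "tumor_patientA",
            cell_3 = "tumor_patientA", cell_4 = "tumor_patientB",
            cell_5 = "tumor_patientB", cell_6 = "tumor_patientB")

# Split cell names by group, then build one dendrogram per group.
cells_by_group <- split(names(groups), groups)
trees <- lapply(cells_by_group, function(cells) {
  # dist() works on rows, so transpose to put cells in rows.
  hclust(dist(t(expr[, cells, drop = FALSE])))
})

names(trees)
```

Each element of trees holds a dendrogram built only from the cells of one group, so cells from different patients are never merged into the same cluster.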

If 'include.spike' is enabled, infercnv will artificially inject a spike-in with defined gain/loss thresholds based on the normal cells. The spiked-in data are tracked throughout the various infercnv data manipulations, and finally used to scale the data to a range from complete loss (0x) to a gain of 2x. The spiked-in data are then removed before generating the final outputs.

InferCNV step-by-step exploratory execution:

The general infercnv workflow as performed via the above infercnv::run() method operates as follows:

This process can be executed step-by-step as outlined in this Rmd: example.html

Interactive Data Exploration

To interactively explore the inferCNV heatmap, see our documentation here.
