Running InferCNV
InferCNV can be run via a simple 2-step protocol, or can be run step-by-step with customization for more exploratory purposes.
First, create an InferCNV object from your three required inputs: the read count matrix, the cell type annotations, and the gene ordering file:
    # create the infercnv object
    infercnv_obj = CreateInfercnvObject(raw_counts_matrix="singleCell.counts.matrix",
                                        annotations_file="cellAnnotations.txt",
                                        delim="\t",
                                        gene_order_file="gene_ordering_file.txt",
                                        ref_group_names=c("normal"))
where the ref_group_names parameter is set to the names of the normal (non-tumor) cell types, as defined in the cellAnnotations.txt file. See File-Definitions for more details.
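Before creating the object, it can help to confirm that every name passed to ref_group_names actually appears in the annotations file. The sketch below uses base R only and assumes the annotations file is a two-column, tab-delimited table (cell name, group label) with no header; the toy file contents and the variable names are made up for illustration and are not part of infercnv.

```r
# Toy annotations file: cell name <TAB> group label, no header (assumed layout).
annotations_path <- tempfile(fileext = ".txt")
writeLines(c("cell_1\tnormal",
             "cell_2\tnormal",
             "cell_3\ttumor_patientA",
             "cell_4\ttumor_patientB"),
           annotations_path)

annotations <- read.table(annotations_path, sep = "\t", header = FALSE,
                          col.names = c("cell", "group"),
                          stringsAsFactors = FALSE)

# Count cells per group to see which labels are available.
group_sizes <- table(annotations$group)
print(group_sizes)

# Reference groups we intend to pass as ref_group_names.
ref_group_names <- c("normal")

# Stop early if any reference name is absent from the annotations.
missing_refs <- setdiff(ref_group_names, names(group_sizes))
if (length(missing_refs) > 0) {
  stop("Reference groups not found in annotations: ",
       paste(missing_refs, collapse = ", "))
}
```

A check like this catches misspelled group labels before they reach CreateInfercnvObject().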
After creating the infercnv_obj, you can then run the standard infercnv procedure via the built-in 'infercnv::run()' method like so:
    # perform infercnv operations to reveal cnv signal
    infercnv_obj = infercnv::run(infercnv_obj,
                                 cutoff=1,               # use 1 for smart-seq, 0.1 for 10x-genomics
                                 out_dir="output_dir",   # dir is auto-created for storing outputs
                                 cluster_by_groups=TRUE, # cluster tumor cells separately per annotation group
                                 include.spike=TRUE      # inject a spike-in used for scaling (see below)
                                 )
The cutoff value determines which genes are used for the infercnv analysis: genes whose mean count across cells falls below the cutoff are excluded. For smart-seq (full-length transcript sequencing, typically using plate-based assays rather than droplets), a value of 1 works well. For 10x (and potentially other 3'-end sequencing and droplet assays, where the count matrix tends to be sparser), a value of 0.1 generally works well.
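To get a feel for how the cutoff affects gene selection, the following base R sketch applies a mean-count filter to a toy matrix. It is an illustration under stated assumptions, not infercnv's own code: the matrix values and the helper name filter_by_cutoff are hypothetical, and which cells the package averages over and how it treats values exactly at the cutoff are not covered here.

```r
# Toy count matrix: genes in rows, cells in columns (values are made up).
counts <- matrix(c(0, 0, 1, 0,    # geneA: mean 0.25
                   5, 3, 4, 8,    # geneB: mean 5
                   0, 0, 0, 0,    # geneC: mean 0
                   2, 1, 1, 1),   # geneD: mean 1.25
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                                 paste0("cell_", 1:4)))

# Hypothetical helper: keep genes whose mean count across cells reaches the cutoff.
filter_by_cutoff <- function(mat, cutoff) {
  mat[rowMeans(mat) >= cutoff, , drop = FALSE]
}

# Stricter cutoff suggested for smart-seq data.
kept_smartseq <- rownames(filter_by_cutoff(counts, cutoff = 1))

# Looser cutoff suggested for sparser 10x data.
kept_10x <- rownames(filter_by_cutoff(counts, cutoff = 0.1))

print(kept_smartseq)
print(kept_10x)
```

With the stricter cutoff only geneB and geneD survive, while the looser cutoff also retains the sparsely expressed geneA.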
The out_dir parameter names the output directory. If the directory doesn't exist, it is created automatically.
The 'cluster_by_groups' setting tells infercnv to cluster the tumor cells separately within each group (for example, each patient), as defined in the cell annotations file.
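The idea of clustering within groups, rather than across all cells at once, can be sketched in base R as follows. This is only an illustration: the random toy matrix, the group labels, and the use of hclust() with Euclidean distance are assumptions, and the clustering that infercnv performs internally may differ.

```r
set.seed(1)

# Toy expression matrix: 5 genes x 6 cells (random values for illustration).
expr <- matrix(rnorm(5 * 6), nrow = 5,
               dimnames = list(paste0("gene_", 1:5), paste0("cell_", 1:6)))

# Hypothetical group label for each cell, in column order.
groups <- c("patientA", "patientA", "patientA",
            "patientB", "patientB", "patientB")

# Cluster the cells of each group on their own and record the dendrogram order.
per_group_order <- lapply(split(colnames(expr), groups), function(cells) {
  hc <- hclust(dist(t(expr[, cells, drop = FALSE])))
  cells[hc$order]
})

print(per_group_order)
```

Each list element holds the cells of one group in their own dendrogram order, so cells from different groups are never interleaved.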
If 'include.spike' is enabled, infercnv artificially injects a spike-in with defined gain/loss thresholds based on the normal cells. The spiked-in data are tracked throughout the various infercnv data manipulations and are finally used to scale the data to a range from complete loss (0x) to a gain of 2x. The spiked-in data are then removed before the final outputs are generated.
The general infercnv workflow as performed via the above infercnv::run() method operates as follows:
![](images/InferCNV_procedure.png)
This process can be executed step-by-step as outlined in this Rmd: example.html
To interactively explore the inferCNV heatmap, see our documentation here.