# Identifying copy number variants in single-cell samples
Cancer cells usually carry a relatively higher number of copy number variants (CNVs) than normal cells, including immune cells. Inferring CNVs therefore helps us identify likely cancer clusters and validate our marker- and pathway-based annotations.


1. [Giunti et al. Genome-wide copy number analysis in pediatric glioblastoma multiforme. Am J Cancer Res. 2014 May 26;4(3):293–303.](https://pmc.ncbi.nlm.nih.gov/articles/PMC4065410/)

- <b>Literature Results</b>
   - Recurrent 9p21.3 and 16p13.3 deletions and 1q32.1-q44 duplication play a crucial role in tumorigenesis and/or progression.
   - The A2BP1 gene (16p13.3) is one possible culprit of the disease.
- <b>Results discussed from other literature</b>
   - Two common regions of loss of heterozygosity (LOH), 9p24.3-9p13.1 and 17p13.3, are present in both pediatric and adult GBMs [Ref](https://pmc.ncbi.nlm.nih.gov/articles/PMC2940568/).
   - Recurrent duplications of 1q, 3q, 2q and 17q as well as losses of chromosomal regions in 6q, 8q, 13q, and 17p have been described [Ref](https://pmc.ncbi.nlm.nih.gov/articles/PMC1891902/).
   - [Paugh et al.](https://pmc.ncbi.nlm.nih.gov/articles/PMC2903336/) also suggested that pediatric GBMs are clearly distinguished from adult GBMs by frequent gain of chromosome 1q and a lower frequency of chromosome 7 gain and 10q loss.

In [None]:
source("../Utility/infercnv")

### Load annotated Seurat data

In [None]:
seurat_obj <- readRDS("../Step5_Clustering/out/seurat_clustered.rds")

### Generate the input files for the inferCNV run

In [None]:
create_infercnv_input(
    seurat_obj,
    gencode_path="../Utility/Data/inferCNV_inputs/gencode_v19_gene_pos.txt",
    output_dir="./out/infercnv_input"
)
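
For orientation, inferCNV reads three tab-delimited files: a raw counts matrix (genes in rows, cells in columns), a cell annotation file (cell name and group label, no header) and a gene order file (gene, chromosome, start, end, no header). The sketch below writes toy versions of these files with base R. It is not the implementation of `create_infercnv_input`; the gene names, coordinates, cell labels, file names and output directory are all made up for illustration.

```r
# Toy illustration of the three tab-delimited inputs inferCNV reads.
# All gene names, coordinates, cell labels and file names are hypothetical.
set.seed(1)
genes <- c("GENE_A", "GENE_B", "GENE_C")
cells <- c("cell_1", "cell_2", "cell_3", "cell_4")

# 1. Raw counts: genes in rows, cells in columns
counts <- matrix(
    rpois(length(genes) * length(cells), lambda = 5),
    nrow = length(genes),
    dimnames = list(genes, cells)
)

# 2. Annotations: one row per cell (cell name, group label)
annotations <- data.frame(
    cell = cells,
    group = c("NK cells", "NK cells", "cluster_0", "cluster_1")
)

# 3. Gene order: gene, chromosome, start, end
gene_order <- data.frame(
    gene = genes,
    chr = c("chr1", "chr1", "chr9"),
    start = c(1000, 5000, 2000),
    end = c(2000, 6000, 3000)
)

# Write everything to a temporary demo directory
out_dir <- file.path(tempdir(), "infercnv_input_demo")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

write.table(counts, file.path(out_dir, "gene_matrix.txt"),
            sep = "\t", quote = FALSE)
write.table(annotations, file.path(out_dir, "annotation.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)
write.table(gene_order, file.path(out_dir, "gene_order.txt"),
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)

# Sanity checks worth repeating on the real inputs
stopifnot(all(colnames(counts) %in% annotations$cell))
stopifnot(all(rownames(counts) %in% gene_order$gene))
stopifnot("NK cells" %in% annotations$group)
```

The last three checks are the ones that tend to matter in practice: every cell in the matrix needs an annotation, every gene needs a genomic position, and the reference group label must match the annotation file exactly.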

### Final inferCNV run
Here we use NK cells as the reference. Immune cells are one of the safest choices for a copy-number-neutral baseline, so clusters whose inferred CNV profile deviates strongly from them are likely cancer cell clusters.

In [None]:
infercnv_obj <- run_infercnv(
                            raw_counts_matrix="./in/inferCNV_inputs/merged_gene_matrix.txt",
                            annotations_file="./in/inferCNV_inputs/merged_annotation.txt",
                            gene_order_file="./in/inferCNV_inputs/inferCNV_gene_order.txt",
                            ref_group_names=c("NK cells"),
                            cutoff=0.1,
                            num_threads=10,
                            output_dir="./out/inferCNV_output"
                ) 
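
One simple way to turn the inferCNV output into a per-cluster ranking is to score each cell by how far its relative expression deviates from the copy-neutral value of 1, then compare groups against the NK reference. The sketch below illustrates the idea on a simulated matrix with base R only. It is an assumption, not part of the pipeline above: with real data the matrix would come from the inferCNV result (for example the `expr.data` slot, if `run_infercnv` returns the standard inferCNV object), and the group labels and the scoring formula are illustrative choices.

```r
# Simulated stand-in for an inferCNV relative-expression matrix
# (genes x cells, values centred on 1 = copy-neutral). All data are made up.
set.seed(42)
n_genes <- 200
groups <- rep(c("NK cells", "cluster_0", "cluster_1"), each = 20)
cell_ids <- paste0("cell_", seq_along(groups))

expr <- matrix(
    rnorm(n_genes * length(groups), mean = 1, sd = 0.02),
    nrow = n_genes,
    dimnames = list(paste0("gene_", seq_len(n_genes)), cell_ids)
)

# Pretend cluster_1 carries a gain spanning the first 60 genes
is_c1 <- groups == "cluster_1"
expr[1:60, is_c1] <- expr[1:60, is_c1] + 0.2

# Per-cell CNV score: mean squared deviation from the neutral value 1
cnv_score <- colMeans((expr - 1)^2)

# Median score per group; groups far above the reference are candidates
score_by_group <- tapply(cnv_score, groups, median)
print(sort(score_by_group, decreasing = TRUE))
```

In this toy setup only `cluster_1` should stand out from the NK reference. On real data the score is best read alongside the inferCNV heatmap and the marker-based annotations rather than used as a hard threshold.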