CNVrd2: A package for measuring gene copy number, identifying SNPs tagging copy number variants, and detecting copy number polymorphic genomic regions

Install and use

Download the file


Install the package

R CMD INSTALL CNVrd2_1.9.1.tar.gz

Please see the file CNVrd2.pdf

Windows users can install the package from the Bioconductor Project:

To detect CNVRs, please use the function SRBreak in:

Notes: using the 1000 Genomes data

Please read the information below or see the file using1000Genome.

Please see QuestionsAndAnswers for a quick look at questions that have been asked about the package.


This note describes some simple steps for using the data from the 1000 Genomes Project

Bam files

Download an index file


Obtain a list of bam files (the first column)

cat 20130502.low_coverage.alignment.index |awk '{print $1}'|grep '\.mapped' > listbam.txt
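The filtering step above can be tried on a small test file before running it on the real index. The following is a minimal sketch: the file `demo.index`, its paths and the `md5-placeholder` column are made up for illustration, and a tab-separated layout is assumed. It folds the column selection and the `.mapped` filter into a single awk call.

```shell
# Build a tiny, made-up index file (assumed tab-separated; paths are hypothetical).
workdir=$(mktemp -d)
printf '%s\t%s\n' \
  'data/S1/alignment/S1.mapped.ILLUMINA.bwa.MXL.low_coverage.bam'   'md5-placeholder' \
  'data/S1/alignment/S1.unmapped.ILLUMINA.bwa.MXL.low_coverage.bam' 'md5-placeholder' \
  'data/S2/alignment/S2.mapped.ILLUMINA.bwa.GBR.low_coverage.bam'   'md5-placeholder' \
  > "$workdir/demo.index"

# Keep column 1 only when it contains the literal string ".mapped".
# ".unmapped" does not match because the dot must be followed directly by "mapped".
awk -F'\t' '$1 ~ /\.mapped/ {print $1}' "$workdir/demo.index" > "$workdir/listbam.txt"

# Count the selected files as a quick sanity check.
n_mapped=$(wc -l < "$workdir/listbam.txt" | tr -d ' ')
echo "mapped bam files: $n_mapped"
```

On the real index file, the same count can be compared with the number of bam files expected.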

There are 2535 bam files on the page. We can therefore make a list of these bam files with their full links.

awk '{print ""$0}' < listbam.txt > listbamAndFullLinks.txt

We can choose a population (or multiple populations) in which to find tagSNPs. For example, here we choose the Mexican Ancestry in Los Angeles (MXL) population and find tagSNPs for the FCGR3B gene.

cat listbamAndFullLinks.txt|grep "MXL" > listMXL.txt 
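To select several populations at once, one option is an extended regular expression. The sketch below assumes that the population code appears as a dot-delimited field in the file name. The list entries and the output name `listSelected.txt` are made up for illustration.

```shell
workdir=$(mktemp -d)
# Made-up list entries (hypothetical paths, one bam link per line).
cat > "$workdir/listbamAndFullLinks.txt" <<'EOF'
site/data/S1/alignment/S1.mapped.ILLUMINA.bwa.MXL.low_coverage.bam
site/data/S2/alignment/S2.mapped.ILLUMINA.bwa.GBR.low_coverage.bam
site/data/S3/alignment/S3.mapped.ILLUMINA.bwa.PEL.low_coverage.bam
EOF

# Select two populations; the escaped dots ensure that only a whole
# dot-delimited field such as ".MXL." matches, not a substring elsewhere.
grep -E '\.(MXL|PEL)\.' "$workdir/listbamAndFullLinks.txt" > "$workdir/listSelected.txt"

n_selected=$(wc -l < "$workdir/listSelected.txt" | tr -d ' ')
echo "selected bam files: $n_selected"
```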

The gene is at chr1:161592986-161601753, so we will use samtools (Li et al. 2009) to download a 1 Mb region around the gene: chr1:161100000-162100000.

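while read line
do
#####Use the last component of the link as the local file name
tempName=$(echo "$line"|awk -F"/" '{print $NF}')
#####Download only the 1 Mb region around the gene
samtools view -hb "$line" 1:161100000-162100000 > "$tempName"
done < listMXL.txt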
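The 1 Mb window used above can be reproduced from the gene coordinates with integer arithmetic. This is only one possible way to arrive at the window, not necessarily how it was chosen. The variables `flank` and `round_to` are assumptions introduced for the sketch.

```shell
gene_start=161592986
gene_end=161601753
flank=500000      # assumed half-width of the window (0.5 Mb on each side)
round_to=100000   # assumed rounding unit for the window centre

# Midpoint of the gene (integer division).
mid=$(( (gene_start + gene_end) / 2 ))
# Round the midpoint to the nearest multiple of round_to.
center=$(( (mid + round_to / 2) / round_to * round_to ))
win_start=$(( center - flank ))
win_end=$(( center + flank ))

# Region string in the form accepted by samtools view.
region="1:${win_start}-${win_end}"
echo "$region"   # prints 1:161100000-162100000
```

Changing `gene_start` and `gene_end` gives the corresponding window for another gene.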

After downloading, we can use samtools to keep only mapped reads:

for file in *.bam
do

#####Keep only mapped reads
samtools view -F 4 "$file" -b > temp.bam

mv temp.bam "$file"

#####Index the filtered bam file
samtools index "$file"
done
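The option `-F 4` excludes every read whose FLAG has the bit with value 4 ("read unmapped") set. The sketch below shows the same bit test in shell arithmetic on a few example FLAG values. The helper `is_mapped` is a hypothetical name used only here.

```shell
# Return 0 (true) if bit 4 of the FLAG is not set, i.e. the read is mapped.
is_mapped() {
  [ $(( $1 & 4 )) -eq 0 ]
}

kept=0
for flag in 0 4 16 77 99 141 147; do
  if is_mapped "$flag"; then
    kept=$(( kept + 1 ))
    echo "FLAG $flag: kept"
  else
    echo "FLAG $flag: dropped by -F 4"
  fi
done
echo "kept $kept of 7"
```

Of these values, 4, 77 and 141 have the unmapped bit set, so four of the seven example reads would be kept.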

A good website to understand SAM/BAM flags is

VCF files

We can use samtools to obtain a vcf file (Danecek et al. 2011) for all MXL samples above, or we can use the SNP data of the 1000 Genomes Project.

Here, we use the data from the 1000 Genomes Project.

All VCF files can be obtained from this page

We choose a region, for example chr1:161400000-161700000 flanking the FCGR3 gene to identify tagSNPs.

Download and index the vcf file by using the "tabix" and "bgzip" commands of the tabix tool (Li, 2011):

tabix -h 1:161400000-161700000  > chr1.161400000.161700000.vcf

Compress and index the file:

bgzip chr1.161400000.161700000.vcf -c > chr1.161400000.161700000.vcf.gz

tabix -p vcf chr1.161400000.161700000.vcf.gz
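Before using the extracted file, it can help to check how many records it holds and whether their positions lie inside the requested region. The following is a minimal sketch on a made-up, heavily shortened VCF-like file. The file `demo.vcf` and the IDs `rsA` and `rsB` are hypothetical.

```shell
workdir=$(mktemp -d)
# Made-up, minimal VCF-like file (tab-separated); real files have more columns.
printf '##fileformat=VCFv4.1\n' > "$workdir/demo.vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\n' >> "$workdir/demo.vcf"
printf '1\t161400500\trsA\tA\tG\n' >> "$workdir/demo.vcf"
printf '1\t161650000\trsB\tC\tT\n' >> "$workdir/demo.vcf"

# Count records, i.e. lines that are not header lines.
n_records=$(grep -vc '^#' "$workdir/demo.vcf")

# Count records whose POS (column 2) falls outside the region of interest.
n_outside=$(awk -F'\t' -v s=161400000 -v e=161700000 \
  '!/^#/ && ($2 < s || $2 > e) {n++} END {print n + 0}' "$workdir/demo.vcf")

echo "records: $n_records, outside region: $n_outside"
```

For the compressed file, the same commands could read from `gunzip -c` instead of the plain file.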
  • P. Danecek, A. Auton, G. Abecasis, C.A. Albers, E. Banks, M.A. DePristo, R.E. Handsaker, G. Lunter, G.T. Marth, S.T. Sherry, et al. (2011) The variant call format and VCFtools. Bioinformatics 27 (15) 2156-2158
  • H. Li, B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, et al. (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25 (16) 2078-2079
  • H. Li (2011) Tabix: fast retrieval of sequence features from generic TAB-delimited files. Bioinformatics 27 (5) 718-719