Skip to content

COrrecting CO-Occuring multi-Nucleotide variation

Notifications You must be signed in to change notification settings

ding-lab/COCOONS

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 

Repository files navigation

COrrecting CO-Occuring multi-Nucleotide variant (COCOON) in Mutation Annotation Format (MAF) file

Usage: perl cocoon.pl MAF OUTPUT <OPTIONS>


REQUIRED:
	MAF			MAF file to be inspected
	OUTPUT			Output file name (used as prefix for merged file in MERGING mode)

OPTIONS:
	--distance		Maximum distance between two mutations to be considered [default: 5]
	--mapq			Minimum read mapping quality; used if given bam [default: 10]
	--snvonly		Only SNVs are considered [default: no]
	--freq_cooccur		Minimum cooccuring frequency of the adjacent mutations; used if given bam [default: 0.8]
	--vaf_diff		Maximum variant allele frequency (VAF) difference to define DNP; used if not given bam [default:10]
	--bam			File containing bam file locations for the samples used in MAF (format:SAMPLE<TAB>BAM_LOCATION in each line)
	--merge			Merge the adjacent mutations (MERGING mode) or not [default: no]
	--genome		Human genome sequence (fasta) [required in MERGING mode]
	--gtf			Ensembl human annotation (gtf) [required in MERGING mode]
	--help			Print this help message

*** Questions & Bug Reports: Qingsong Gao (qingsong.gao@wustl.edu)


EXAMPLES:

1) report MNVs based on read depth in MAF file (SNVs only)

  command: perl cocoon.pl Test_datasets/test_dataset1.maf Test_datasets/test_dataset1.out --snvonly


2) report MNVs based on read depth in MAF file (including indels by default) 

  command: perl cocoon.pl Test_datasets/test_dataset2.maf Test_datasets/test_dataset2.out


3) report MNVs based on read depth in MAF file and merge them 

  note: --genome and --gtf are required in MERGING mode (--merge)
	--distance 2 is used since variants with distance > 2 will not affect the same codon; if not specified, variants with distance > 2 will still be ignored for merging
      # Test_datasets/test_dataset3.fasta is human chr10, which can be downloaded from ftp://ftp.ensembl.org/pub/release-75/fasta/homo_sapiens/dna/Homo_sapiens.GRCh37.75.dna.chromosome.10.fa.gz

  command: perl cocoon.pl Test_datasets/test_dataset3.maf Test_datasets/test_dataset3.output --distance 2 --snvonly --vaf_diff 10 --merge --genome Test_datasets/test_dataset3.fasta --gtf Test_datasets/test_dataset3.gtf


4) report MNVs based on mapping reads bam files

  command: perl cocoon.pl Test_datasets/test_dataset4.maf Test_datasets/test_dataset4.output --bam Test_datasets/test_dataset4.bamlist

About

COrrecting CO-Occuring multi-Nucleotide variation

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages