leofountain/Weaver
Weaver

Allele-specific, base-pair resolution quantification of structural variations in cancer genomes

yangli9@illinois.edu
leofountain@gmail.com

Version 0.20

----------------------------
INSTALL
----------------------------

Requirements:

Bamtools (https://github.com/pezmaster31/bamtools) libraries are needed; they are included in Weaver_SV/lib and Weaver_SV/inc
    export LD_LIBRARY_PATH=<PREFIX>/Weaver/Weaver_SV/lib/:$LD_LIBRARY_PATH
libz is required (-lz flag)
Parallel::ForkManager perl package (http://search.cpan.org/~szabgab/Parallel-ForkManager-1.06/lib/Parallel/ForkManager.pm)
Bedtools (https://github.com/arq5x/bedtools)
Samtools (http://samtools.sourceforge.net/)
BOOST C++ library (http://www.boost.org/)
BWA (http://bio-bwa.sourceforge.net/)
Bowtie (http://bowtie-bio.sourceforge.net/index.shtml)

Steps:

1 Modify the required BOOST directory in src/Makefile
2 ./INSTALL.sh

-----------------------------
DATA
-----------------------------

wget http://bioen-compbio.bioen.illinois.edu/weaver/Weaver_data.tar.gz

-----------------------------
EXAMPLE DATA
-----------------------------

wget http://bioen-compbio.bioen.illinois.edu/weaver/Weaver_example.tar.gz

RUN:

    Weaver PLOIDY -f SIMU.fa -S FINAL_SV -s SNP -g REGION -w X.bam.wig -r 0 -m map100mer.bd -p 64
    solo_ploidy TARGET 2
    Weaver LITE -f SIMU.fa -S FINAL_SV -s SNP -g REGION -w X.bam.wig -r 0 -m map100mer.bd -p 64 -t 20 -n 0

----------------------------
Weaver_SV.pl
----------------------------

SV finding

Input: BAM file from BWA
Output: VCF file for SV

----------------------------
Weaver_pipeline.pl
----------------------------

Master program:

1 Generate SV
2 Generate other inputs needed for Weaver

INPUTS

DATA package: 1000 Genomes Project Phase 1 haplotypes

----------------------------
Weaver
----------------------------

Core PGM program

INPUTS:

1 SV

Outputs:

1 Purity and haploid-level sequencing coverage
2 Allele-specific copy number of genomic regions
3 Allele-specific copy number of structural variations
4 Relative timing of structural variations
5 Cancer scaffolds
6 Phasing of germline SNPs in CNV regions

----------------------------
Weaver_lite
----------------------------

Core PGM program, with SNP phasing disabled to speed up the run

INPUTS:

1 SV
2 reference
3 Mappability (available for hg19)
4 Region (available for hg19)
5 wig (from bam)

----------------------------
Weaver PLOIDY
----------------------------

    Weaver PLOIDY -f -S -s ../SNP_dens -g GAP_20140416_num -w -r 1 -m -p 16

INPUTS:

-f  reference file (fasta). It should match the reference used for the original bam file. In particular, for most TCGA datasets the alignment was performed on //www.broadinstitute.org/ftp/pub/seq/references/Homo_sapiens_assembly19.fasta, which does not have the "chr" prefix [MANDATORY]
-S  SV file, with format consistent with Weaver_SV [MANDATORY]
-s  SNP file, with ref and alt mappings [MANDATORY]
-w  wig file from bam, storing the coverage information [MANDATORY]
-r  1 if running for the first time (generates temp files); 0 to reuse existing temp files [default 1]
-m  mappability file, download from http://bioen-compbio.bioen.illinois.edu/weaver/Weaver_data.tar.gz [MANDATORY]
-p  number of cores [default 1]

-----------------------------
FILE FORMAT DECLARATIONS
-----------------------------

Wiggle file:

Wiggle files need to be declared with fixedStep, step 1 and span 1:

    fixedStep chrom=chr1 start=9994 step=1 span=1

If a chromosome has multiple declaration lines, they need to be sorted by position. The following is not allowed:

    fixedStep chrom=chr1 start=9994 step=1 span=1
    X
    X
    X
    fixedStep chrom=chr1 start=100 step=1 span=1
    X
    X
    X

Bam file:

Must be sorted and indexed.

SNP file:

NGS SNP link file
1KGP SNP link

SV:

Genome region file:

GAP regions in the assembly are annotated.
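The wiggle requirements above (fixedStep declarations, step 1, span 1, declarations of one chromosome sorted by position) can be checked before a run. The sketch below is not part of Weaver: the function name check_wiggle_declarations is hypothetical, and treating a missing step or span as an error is an assumption based on the wording "need to be declared".

```python
def check_wiggle_declarations(lines):
    """Return a list of problems found in the declaration lines of a wiggle file.

    Hypothetical helper, not part of Weaver. It only inspects declaration
    lines; data values and any other lines are skipped.
    """
    problems = []
    last_start = {}  # chrom -> start of the most recent declaration seen
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        if line.startswith("variableStep"):
            problems.append(f"line {lineno}: variableStep found, fixedStep expected")
            continue
        if not line.startswith("fixedStep"):
            continue  # data value, track line, or blank line
        # Collect the key=value tokens that follow the fixedStep keyword.
        fields = dict(tok.split("=", 1) for tok in line.split()[1:] if "=" in tok)
        # Assumption: step and span must both be written out and equal to 1.
        if fields.get("step") != "1" or fields.get("span") != "1":
            problems.append(f"line {lineno}: step and span must both be 1")
        chrom = fields.get("chrom")
        start_text = fields.get("start", "")
        if chrom is None or not start_text.isdigit():
            problems.append(f"line {lineno}: missing chrom or start")
            continue
        start = int(start_text)
        if chrom in last_start and start < last_start[chrom]:
            problems.append(
                f"line {lineno}: {chrom} start={start} follows start={last_start[chrom]}"
            )
        last_start[chrom] = start
    return problems


# The disallowed ordering from the example above, with made-up data values.
unsorted_example = [
    "fixedStep chrom=chr1 start=9994 step=1 span=1",
    "7",
    "fixedStep chrom=chr1 start=100 step=1 span=1",
    "7",
]
for problem in check_wiggle_declarations(unsorted_example):
    print(problem)
```

For a real file, the same function can be given an open file handle, since it only iterates over lines.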

###################
Output:
###################

REGION_CN_PHASE: stores the phased allele-specific copy number of genomic regions

    CHR BEGIN END ALLELE_1_CN ALLELE_2_CN

SV_CN_PHASE: structural variation copy number, phasing, and category

    CHR_1 POS_1 ORI_1 ALLELE_ CHR_2 POS_2 ORI_2 ALLELE_ CN germline/somatic_post_aneuploidy/somatic_pre_aneuploidy

###############
CONTACT
###############

Yang Li
Ma Lab
Bioengineering Dept., University of Illinois at Urbana-Champaign
yangli9@illinois.edu
https://github.com/leofountain/Weaver
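
As an illustration of the REGION_CN_PHASE layout listed in the Output section (CHR BEGIN END ALLELE_1_CN ALLELE_2_CN), here is a minimal reading sketch. It assumes whitespace-separated columns, no header line, and numeric copy numbers; none of this is confirmed by the README. The names RegionCN and parse_region_cn_phase are hypothetical, and the sample rows are invented.

```python
from typing import Iterable, List, NamedTuple


class RegionCN(NamedTuple):
    """One REGION_CN_PHASE row (hypothetical container, not part of Weaver)."""
    chrom: str
    begin: int
    end: int
    allele_1_cn: float
    allele_2_cn: float


def parse_region_cn_phase(lines: Iterable[str]) -> List[RegionCN]:
    """Parse rows of the form: CHR BEGIN END ALLELE_1_CN ALLELE_2_CN.

    Assumes whitespace-separated columns; rows with fewer than five
    columns (for example blank lines) are skipped.
    """
    records = []
    for raw in lines:
        parts = raw.split()
        if len(parts) < 5:
            continue
        chrom, begin, end, cn1, cn2 = parts[:5]
        records.append(RegionCN(chrom, int(begin), int(end), float(cn1), float(cn2)))
    return records


# Invented rows, for illustration only.
sample_rows = [
    "1 10000 250000 2 1",
    "1 250001 900000 1 0",
]
for region in parse_region_cn_phase(sample_rows):
    total_cn = region.allele_1_cn + region.allele_2_cn
    print(region.chrom, region.begin, region.end, total_cn)
```

Copy numbers are read as floats so that both integer and fractional values are accepted; if the file always holds integers, int would be the tighter choice.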