Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
Analysis code accompanying the paper "Illumina TruSeq synthetic long-reads empower de novo assembly and resolve complex, highly-repetitive transposable elements." PLoS One. (2014).
Fetching latest commit…
Cannot retrieve the latest commit at this time.
|Type||Name||Latest commit message||Commit time|
|Failed to load latest commit information.|
# de novo genome assembly run_wgs.sge -- command used to run CA wgs_specfile.spec -- specfile upon which CA depends mol-combined-6.frg -- frg file that gives additional parameters and points to fastq used for assembly run_minimus2.sh -- script used to run minimus2 # analysis of long-read data extract_deletions.sh -- script that reformats CIGAR string and calls perl script to calculate positions of deletions in alignments generate_deletion_profile.pl -- script required by extract_deletions.sh to get deletion positions from CIGAR string # analysis of assembly alignment_stats.sh -- commands used to generate alignment results for Table 2 te_glm.R -- R script to generate GLM and GLMM to test features of TEs important for assembly te_data.csv -- data from FlyTE database used as predictors in GLM/GLMM # assemble down-sampled datasets downsample_melanogaster.sh -- script to perform downsampling, generating a fastq file and frg file for input to the Celera Assembler # see assess-assembly repository for script to assess presence/absence of genomic features (i.e. the response variable for the GLMM)