splitPrimariesHaplotigs.sh myUnzipAsm.fasta
Default setting is to remove contigs with fewer than 50% of their bases polished. During arrow polishing, base positions with fewer than 5 raw reads mapped will not have an updated consensus base called, but rather, will have the reference base returned in lowercase. This script filters out contigs that have low raw read support.
trimLowercaseContigs.py myPolishedAsm.fasta myoutfile.fasta
For Unzip Assemblies
alignHaplotigs2Primary.sh myUnzipAsm.fasta primaryContigID
For FALCON Assemblies
alignAssoc2Primary.sh myFALCONAsm.fasta primaryContigID
Assembly file must contain both primary contigs and haplotigs or associated contigs.
Primary contig ID formatted as: 000123F.
view filtered delta file (primaryContigID.g.delta) in assemblytics or mummerplot.
python2
removeNestedHaplotigs.sh myUnzipAsm.fasta
This shell script aligns all haplotigs against their primary contig using nucmer, then identifies haplotigs that are nested within other haplotigs using nestedHaplotigs.py. Understanding why these small haplotigs are assembled is an area of ongoing research, but users may wish to remove these haplotigs from their assembly.
Outputs contig ID, mean quality across all bases, and contig length
fastq2meanQ.py myAsm.fastq
Fastq sequence and quality lines must not be wrapped
python2.7.x
python get_homologs.py --nproc 24 /path/to/primary_contigs.fasta
The output is a directory of mummerplot 'Qfiles' listing homologous contigs, along with a plots.sh script that can be run independently to draw dotplots for each reference contig with all homologous queries aligned to it.