Utilities to assemble an organelle genome using ABySS.
The following tools must be installed and available on your PATH:
- ABySS
- samtools
- bwa
- bioawk
- bedtools
- R
The following scripts are included:
- abyss-organelle.mk (main script): runs a standard ABySS assembly and then extracts the organelle scaffolds
- classify.mk: splits the contigs file into 'organelle' and 'genome' contigs using k-means clustering on coverage, %GC content, and length
- classify.r: R script that does the k-means clustering for classify.mk
- cov-hist-to-mean: converts 'bedtools genomecov' output to mean coverage per contig
- fastx2gc: computes %GC content for each sequence in a FASTA/FASTQ file
- smartcat: runs zcat, bzcat, or cat depending on file type (thanks to David Bartle)
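To make the per-contig features concrete, here is a minimal Python sketch of two of the quantities described above: mean coverage derived from the default 'bedtools genomecov' histogram output, and %GC content of a sequence. It is an illustration only and not the code in cov-hist-to-mean or fastx2gc. The function names `mean_coverage` and `gc_percent` are hypothetical, and ignoring non-ACGT characters in the %GC calculation is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable


def mean_coverage(genomecov_lines: Iterable[str]) -> Dict[str, float]:
    """Mean depth per contig from default `bedtools genomecov` histogram rows.

    Each tab-separated row holds: contig, depth, number of bases at that
    depth, contig length, fraction of bases at that depth.
    """
    weighted = defaultdict(int)  # contig -> sum(depth * bases at that depth)
    length = {}                  # contig -> contig length
    for line in genomecov_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and the whole-genome summary rows.
        if len(fields) < 4 or fields[0] == "genome":
            continue
        contig = fields[0]
        depth, bases, size = int(fields[1]), int(fields[2]), int(fields[3])
        weighted[contig] += depth * bases
        length[contig] = size
    return {contig: weighted[contig] / length[contig] for contig in length}


def gc_percent(seq: str) -> float:
    """Percent G+C among A/C/G/T bases (assumption: N and other codes are ignored)."""
    s = seq.upper()
    gc = s.count("G") + s.count("C")
    acgt = gc + s.count("A") + s.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


# Small made-up histogram: two contigs plus one genome-wide summary row.
hist = [
    "contig1\t0\t2\t10\t0.2",
    "contig1\t5\t8\t10\t0.8",
    "contig2\t30\t4\t4\t1",
    "genome\t0\t2\t14\t0.142857",
]
print(mean_coverage(hist))
print(gc_percent("ACGTNNGG"))
```

A contig with much higher mean coverage than the rest of the assembly is the kind of outlier the clustering step is meant to separate, since organelle genomes are typically present in many copies per cell.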
The main assembly script is abyss-organelle.mk. Usage of abyss-organelle.mk is the same as abyss-pe (see the ABySS README.md), except that it requires an additional parameter:
- readfiles: a list of FASTA/FASTQ/SAM/BAM files to align to the scaffolds in order to determine read coverage during the classification step
$ abyss-organelle.mk name=arabidopsis k=50 readfiles='reads1.fa.gz reads2.fa.gz' lib=readfiles
- classify: specify whether to do the classification at the contig stage or the scaffold stage. Possible values are classify=scaffolds or classify=contigs [classify=scaffolds]
- j: specify number of threads
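When abyss-organelle.mk is called from a larger pipeline, it can help to build the argument list in one place. The following Python sketch assembles the same kind of command line as the example above, with the classify and j parameters added. The helper name `organelle_command` and the value j=8 are assumptions chosen for illustration; only the parameter names come from this README.

```python
import shlex
from typing import List, Optional, Sequence


def organelle_command(name: str, k: int, readfiles: Sequence[str],
                      classify: Optional[str] = None,
                      j: Optional[int] = None) -> List[str]:
    """Build an argv list for abyss-organelle.mk (hypothetical helper)."""
    if classify not in (None, "scaffolds", "contigs"):
        raise ValueError("classify must be 'scaffolds' or 'contigs'")
    argv = [
        "abyss-organelle.mk",
        f"name={name}",
        f"k={k}",
        # One argv element, so no shell quoting is needed for the spaces.
        "readfiles=" + " ".join(readfiles),
        "lib=readfiles",
    ]
    # Optional parameters are only passed when set explicitly.
    if classify is not None:
        argv.append(f"classify={classify}")
    if j is not None:
        argv.append(f"j={j}")
    return argv


cmd = organelle_command("arabidopsis", 50, ["reads1.fa.gz", "reads2.fa.gz"],
                        classify="contigs", j=8)
# Print a shell-quoted form for inspection; nothing is executed here.
print(shlex.join(cmd))
```

To actually start the assembly, the list can be passed to `subprocess.run(cmd, check=True)`, assuming abyss-organelle.mk is on your PATH.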