I had a request to support CodingQuarry: https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-015-1344-4. I've gotten it installed, which was fairly straightforward, although it required a version of GCC > 4.8. I've also run into a few segmentation faults for unknown reasons, but I have managed to run it on a small number of tests thus far. It requires RNA-seq, or rather transcripts aligned to the genome in GFF format, and the documentation isn't clear about what it is doing or whether it works for higher eukaryotes or only fungi, so I'm hesitant to include it in funannotate. I would also like to limit dependencies, as funannotate is hard enough to install as it is. Of course, if it turns out to be useful and users would like it included, it would be relatively straightforward to have funannotate run it when an RNA-seq BAM file is passed to funannotate predict.
I did, however, run it, and the results can be passed to funannotate as follows.
- I aligned PE stranded RNA-seq data using hisat2 and then converted and sorted the BAM file using samtools.
hisat2 -x genome -p 8 -1 reads_R1.fq -2 reads_R2.fq | samtools view -bS - | samtools sort -o sorted.bam -
- then ran StringTie, which will output a GTF file.
stringtie -o string_out.gtf -p 8 sorted.bam
- convert GTF to GFF3 using funannotate script
funannotate util stringtie2gff3 -i string_out.gtf > string_out.gff3
- then run CodingQuarry using default settings
CodingQuarry -t string_out.gff3 -f genome.fasta -p 8
- Convert malformed GFF3 output to proper GFF3 format.
funannotate util quarry2gff3 -i out/PredictedPass.gff3 > coding-quarry.gff3
- finally, pass this to funannotate predict as follows; note that I gave it a weight of 5 for EVM with the :5 suffix on the file name:
funannotate predict -i genome.fasta -o test \
--other_gff coding-quarry.gff3:5 \
--cpus 12 -s "Awesome genome"
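The steps above could be chained into a single script along the following lines. This is only a rough sketch: the function names run and quarry_pipeline, the DRY_RUN switch, and the positional arguments are made up for illustration and are not part of funannotate or CodingQuarry. The commands and file names are the ones from the steps above, with the paired-end reads passed to hisat2 via -1/-2 and one CPU count used for every step. It assumes none of the paths contain spaces.

```shell
#!/usr/bin/env bash
# Sketch only: chains the steps listed above. Names and DRY_RUN are
# illustrative assumptions, not funannotate features.
set -euo pipefail

# Print a command line; execute it unless DRY_RUN=1.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        eval "$*"
    fi
}

# Usage: quarry_pipeline <genome.fasta> <hisat2_index> <reads_R1> <reads_R2> [cpus]
quarry_pipeline() {
    local genome=$1 index=$2 r1=$3 r2=$4 cpus=${5:-8}

    # 1. align paired-end reads, convert to BAM, sort
    run "hisat2 -x $index -p $cpus -1 $r1 -2 $r2 | samtools view -bS - | samtools sort -o sorted.bam -"
    # 2. assemble transcripts (GTF output)
    run "stringtie -o string_out.gtf -p $cpus sorted.bam"
    # 3. GTF -> GFF3
    run "funannotate util stringtie2gff3 -i string_out.gtf > string_out.gff3"
    # 4. CodingQuarry with default settings (writes out/PredictedPass.gff3)
    run "CodingQuarry -t string_out.gff3 -f $genome -p $cpus"
    # 5. fix up the CodingQuarry GFF3
    run "funannotate util quarry2gff3 -i out/PredictedPass.gff3 > coding-quarry.gff3"
    # 6. hand the models to funannotate predict with EVM weight 5
    run "funannotate predict -i $genome -o test --other_gff coding-quarry.gff3:5 --cpus $cpus -s \"Awesome genome\""
}

# Only run when called with at least the four required arguments.
if [ "$#" -ge 4 ]; then
    quarry_pipeline "$@"
fi
```

Saved as, say, quarry_pipeline.sh (a hypothetical name), running it with DRY_RUN=1 set in the environment prints the six commands without executing them, which is a cheap way to check paths before committing to a long alignment.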