🌲 Annotate the plastid genome of Engelmann spruce (Picea engelmannii)

bcgsc/picea-engelmannii-plastid

Plastid genome of Engelmann spruce (Picea engelmannii), genotype Se404-851

🌲 Annotation of the plastid genome of Engelmann spruce (Picea engelmannii), genotype Se404-851 https://www.ncbi.nlm.nih.gov/nuccore/MK241981

Se404-851 chloroplast genome

GeSeq

The assembled FASTA file was submitted to GeSeq for annotation.

EMBOSS Seqret

The .gb file produced by GeSeq was converted to a .gff file using EMBOSS Seqret.
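The exact command used for this step is not shown here. As a minimal sketch, the conversion could be scripted as below, assuming EMBOSS is installed and that the `-sequence`, `-feature`, `-osformat`, and `-outseq` qualifiers behave as in a standard EMBOSS release. The file names are hypothetical placeholders.

```python
import shlex


def seqret_command(genbank_path, gff_path):
    """Build an EMBOSS seqret argument list that converts GenBank to GFF."""
    return [
        "seqret",
        "-sequence", genbank_path,  # input GenBank file (e.g. from GeSeq)
        "-feature",                 # carry the feature table into the output
        "-osformat", "gff",         # write the output in GFF format
        "-outseq", gff_path,        # output file
    ]


# Hypothetical file names, for illustration only.
cmd = seqret_command("Se404-851cp.gb", "Se404-851cp.gff")
print(shlex.join(cmd))

# To actually run the conversion (requires seqret on PATH):
# import subprocess; subprocess.run(cmd, check=True)
```

Building the argument list separately from running it makes the command easy to inspect or log before execution.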

Manual Annotation

Duplicate annotations made by GeSeq were removed. Most conflicts arose from the use of multiple reference annotations, as the whole taxon was selected as the reference set. tRNAs with annotated anticodons were selected over those without them. Third-party tRNA scanners such as ARAGORN and tRNAscan-SE were also run, but they did not yield any additional supported annotations.
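The curation above was done by hand. The selection rule it describes (one annotation per locus, preferring a tRNA with an anticodon) can be sketched in code as follows. The `Feature` record, the overlap test, and the coordinates are all assumptions made for illustration; they are not taken from the actual annotation.

```python
from collections import namedtuple

# Hypothetical minimal feature record; anticodon is None when not annotated.
Feature = namedtuple("Feature", "name start end strand anticodon")


def resolve_duplicates(features):
    """Keep one feature per overlapping same-name locus, preferring anticodons."""
    kept = []
    for feat in sorted(features, key=lambda f: (f.name, f.strand, f.start)):
        prev = kept[-1] if kept else None
        same_locus = (
            prev is not None
            and prev.name == feat.name
            and prev.strand == feat.strand
            and feat.start <= prev.end  # overlaps the previously kept feature
        )
        if not same_locus:
            kept.append(feat)
        elif feat.anticodon and not prev.anticodon:
            kept[-1] = feat  # replace with the anticodon-annotated copy
    return sorted(kept, key=lambda f: f.start)


# Made-up coordinates, for illustration only.
candidates = [
    Feature("trnH-GUG", 100, 173, "+", None),
    Feature("trnH-GUG", 100, 173, "+", "GUG"),
    Feature("psbA", 500, 1561, "-", None),
    Feature("psbA", 500, 1561, "-", None),
]
curated = resolve_duplicates(candidates)
for feat in curated:
    print(feat)
```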

Four genes had to be annotated manually: rps12, petB, petD, and rpl16. rps12 is a trans-spliced gene, while the other three have initial exons so short that GeSeq could not annotate them. GeSeq also failed to annotate some mRNAs, and these were added manually. The final annotation shows that all 114 genes found in other Picea chloroplast genomes are conserved: 74 protein-coding genes (CDS), 4 rRNAs, and 36 tRNAs. It also contains 15 introns (9 in CDS, 6 in tRNAs).
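The reported totals can be checked with simple arithmetic: 74 + 4 + 36 = 114 genes and 9 + 6 = 15 introns. A small sketch of such a consistency check follows, assuming a hypothetical list with one type label per gene; the placeholder list stands in for values that would be extracted from the real GFF.

```python
from collections import Counter

# Counts stated in the text above.
EXPECTED_GENES = {"CDS": 74, "rRNA": 4, "tRNA": 36}
EXPECTED_INTRONS = {"CDS": 9, "tRNA": 6}


def count_differences(gene_types):
    """Return observed minus expected gene counts per type (0 means a match)."""
    observed = Counter(gene_types)
    return {kind: observed.get(kind, 0) - n for kind, n in EXPECTED_GENES.items()}


total_genes = sum(EXPECTED_GENES.values())
total_introns = sum(EXPECTED_INTRONS.values())

# Placeholder input: one label per gene, not parsed from the real annotation.
gene_types = ["CDS"] * 74 + ["rRNA"] * 4 + ["tRNA"] * 36
diff = count_differences(gene_types)
print(total_genes, total_introns, diff)
```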

MUMmer and minidot

MUMmer and minidot were used to search for inverted repeats. The repeats found were ultimately not included in the final annotation.
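To illustrate what an inverted repeat is, the sketch below uses a naive k-mer lookup. It is a simple stand-in, not MUMmer's or minidot's algorithm. It reports position pairs where one k-mer equals the reverse complement of a later k-mer. The toy sequence is invented for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def inverted_kmer_hits(seq, k):
    """Return (i, j) with i < j where seq[i:i+k] == revcomp(seq[j:j+k])."""
    index = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], []).append(i)
    hits = []
    for j in range(len(seq) - k + 1):
        rc = reverse_complement(seq[j:j + k])
        for i in index.get(rc, []):
            if i < j:
                hits.append((i, j))
    return hits


# Toy sequence: a 7-mer, a spacer, then the 7-mer's reverse complement.
left_arm = "ACGGTCA"
toy = left_arm + "TTTTT" + reverse_complement(left_arm)
hits = inverted_kmer_hits(toy, k=7)
print(hits)
```

On a full plastid genome, a dedicated aligner is the practical choice; this exact-match lookup only demonstrates the idea on short sequences.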

BLASTn

BLASTn was used to align Se404-851cp genes to PG29cp genes. Every subpar alignment was then inspected in IGV, together with whole chloroplast genome alignments and read-to-assembly alignments. Most discrepancies between PG29cp and Se404-851cp were supported by the reads; those that were not were corrected to match the read consensus.
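The criteria for a "subpar" alignment are not stated. As a sketch under one possible definition (any hit with less than 100% identity, a mismatch, or a gap opening), the following parses BLAST tabular output (`-outfmt 6`, default twelve columns) and lists the query genes worth inspecting. The threshold, the gene identifiers, and the example rows are assumptions, not real results.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def flag_subpar(blast_tsv, min_identity=100.0):
    """Return query IDs whose hit is imperfect (assumed definition of subpar)."""
    flagged = []
    reader = csv.DictReader(io.StringIO(blast_tsv), fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        imperfect = (
            float(row["pident"]) < min_identity
            or int(row["mismatch"]) > 0
            or int(row["gapopen"]) > 0
        )
        if imperfect:
            flagged.append(row["qseqid"])
    return flagged


# Invented rows, for illustration only.
example = "\n".join([
    "\t".join(["rbcL", "PG29_rbcL", "100.000", "1428", "0", "0",
               "1", "1428", "1", "1428", "0.0", "2638"]),
    "\t".join(["matK", "PG29_matK", "99.804", "1530", "3", "0",
               "1", "1530", "1", "1530", "0.0", "2810"]),
])
to_inspect = flag_subpar(example)
print(to_inspect)
```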

table2asn_GFF

The final GFF annotation was validated using table2asn_GFF.

OGDraw

The .gbf file generated by table2asn_GFF was then run through OGDraw separately.

Citation

Lin D, Coombe L, Jackman SD, Gagalova KK, Warren RL, Hammond SA, McDonald H, Kirk H, Pandoh P, Zhao Y, Moore RA, Mungall AJ, Ritland C, Doerksen T, Jaquish B, Bousquet J, Jones SJM, Bohlmann J, Birol I. 2019. Complete Chloroplast Genome Sequence of an Engelmann Spruce (Picea engelmannii, Genotype Se404-851) from Western Canada. Microbiol Resour Announc 8:e00382-19. doi: 10.1128/MRA.00382-19.
