comaecliptic/pogonophora

De novo assembly and analysis of the transcriptome of the pogonophore Siboglinum fiordicum at different larval stages


Aim of the project:

To study the molecular basis for the segmentation of Siboglinum fiordicum using transcriptomic data from different stages of the life cycle.

Objectives

  • De novo transcriptome assembly of a non-model organism
  • Expression analysis at different stages of development

Methods

  • RNAseq libraries from 3 trochophores (before and after septum formation) and an adult organism were analyzed
  • Primary quality control and raw data preparation were performed with FastQC, Karect and fastp
  • De novo assembly with Trinity
  • Sequence clustering (CD-HIT-EST)
  • Estimation of the completeness (BUSCO) and quality (TransRate) of the assembly, filtering out contigs with low scores
  • Detection of possible contamination by analyzing ribosomal subunit sequences (RNAmmer from the Trinotate pipeline) and transcriptome composition (BlobTools), filtering out ribosomal, prokaryotic (bacterial and archaeal), protist and vertebrate sequences
  • Expression quantification with Salmon
  • Prediction of encoded amino acid sequences using the two-step TransDecoder analysis
  • Annotation (NCBI nt, NCBI nr, SwissProt, PfamA and eggNOG databases)
  • Construction of co-expression clusters (Clust)
  • Construction of orthogroups using OrthoFinder and filtered reference sets of proteins from two other Annelida species: Capitella teleta (UniProt ID: UP000014760) and Helobdella robusta (UP000015101)
  • Pathway enrichment analysis (GeneOntology, using topGO) of “genes” with predominant expression at a particular stage of the cycle
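The enrichment step at the end of the list above can be illustrated with a one-sided hypergeometric (Fisher-type) test. This is only a minimal sketch, not the topGO implementation: topGO offers several algorithms and test statistics, and this README does not state which were used. The function names `enrichment_p_value` and `score_terms`, the input dictionaries and the toy identifiers are hypothetical; the `min_term_size=10` default mirrors the ">=10 sequences" filter mentioned in the Results.

```python
from math import comb


def enrichment_p_value(n_background, n_term, n_stage, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background: sequences in the reference set
    n_term:       reference sequences annotated with the GO term
    n_stage:      reference sequences predominantly expressed at the stage
    n_overlap:    sequences that are both annotated and stage-specific
    """
    upper = min(n_term, n_stage)
    total = comb(n_background, n_stage)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_stage - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def score_terms(term_to_seqs, background, stage_set, min_term_size=10):
    """Return {term: p-value} for terms with enough annotated sequences.

    Terms annotated to fewer than min_term_size background sequences
    are skipped (assumed reading of the README's size filter).
    """
    stage_in_background = stage_set & background
    results = {}
    for term, seqs in term_to_seqs.items():
        members = seqs & background
        if len(members) < min_term_size:
            continue
        overlap = len(members & stage_in_background)
        results[term] = enrichment_p_value(
            len(background), len(members), len(stage_in_background), overlap
        )
    return results


if __name__ == "__main__":
    # Toy data with hypothetical identifiers, for illustration only.
    background = {f"s{i}" for i in range(20)}
    terms = {
        "term_A": {f"s{i}" for i in range(10)},  # 10 members, kept
        "term_B": {f"s{i}" for i in range(5)},   # 5 members, skipped
    }
    stage = {f"s{i}" for i in range(5)}
    for term, p in score_terms(terms, background, stage).items():
        print(term, p)
```

In practice the resulting p-values would usually also be corrected for multiple testing; whether and how that was done here is not stated in this README.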

Results

  • We prepared a reference set of 29032 protein-coding sequences with significant expression (defined as sequences that have >=1 transcripts per million (TPM) in at least one library and encode a protein of >=100 amino acid residues)
  • After the expression analysis, 18222 (63%) sequences were found at all stages, while 2914 (10%), 2067 (7%), 8373 (29%) and 2596 (9%) sequences showed predominant expression at trochophore stages 3, 4 and 5 and in adults, respectively (figure: Venn diagram of expression)
  • 11 co-expression clusters ranging from 114 to 3193 in size were constructed (figure: co-expression clusters visualization)
  • 13745 orthogroups were built (figure: heat map of orthogroups)
  • 903 GO-terms were “enriched”, selecting only terms with >=10 sequences with significant expression, among them:

| GO-term | Life cycle stage | p-value |
|---|---|---|
| animal organ development | aT4, aT5 | 0.00608, 0.00141 |
| mesoderm morphogenesis | aT3 | 0.00131 |
| regionalization | aT5 | 0.00332 |
| cell proliferation | aT5 | 5.4e-06 |
| response to bacterium | Adult | 2.6e-06 |
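The reference-set criterion given in the Results (at least 1 TPM in at least one library, and an encoded protein of at least 100 amino acid residues) could be applied along these lines. This is a sketch under assumptions: the function names, the dictionary layout and the library labels are hypothetical and are not taken from the repository code.

```python
def passes_reference_filter(tpm_by_library, protein_length,
                            min_tpm=1.0, min_length=100):
    """True if the sequence reaches min_tpm in at least one library
    and its predicted protein has at least min_length residues."""
    expressed = any(tpm >= min_tpm for tpm in tpm_by_library.values())
    return expressed and protein_length >= min_length


def build_reference_set(tpm_table, protein_lengths,
                        min_tpm=1.0, min_length=100):
    """Return IDs of sequences that pass both thresholds.

    tpm_table:       {sequence_id: {library_name: TPM}}
    protein_lengths: {sequence_id: predicted protein length in residues}
    Sequences without a predicted protein are dropped.
    """
    return [
        seq_id
        for seq_id, tpms in tpm_table.items()
        if seq_id in protein_lengths
        and passes_reference_filter(
            tpms, protein_lengths[seq_id], min_tpm, min_length
        )
    ]


if __name__ == "__main__":
    # Toy values with hypothetical library labels, for illustration only.
    tpm_table = {
        "seq1": {"lib_a": 0.2, "lib_b": 1.5, "lib_c": 0.0, "lib_d": 0.1},
        "seq2": {"lib_a": 0.3, "lib_b": 0.9, "lib_c": 0.0, "lib_d": 0.4},
        "seq3": {"lib_a": 7.0, "lib_b": 2.0, "lib_c": 3.1, "lib_d": 5.0},
    }
    protein_lengths = {"seq1": 120, "seq2": 300, "seq3": 99}
    print(build_reference_set(tpm_table, protein_lengths))
```

Keeping the thresholds as parameters makes it easy to check how sensitive the size of the reference set is to the chosen cut-offs.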
