This is the repo dedicated to Fasciola transposable elements
DeSeq2_res
Repeat_modeller_out
Repeat_modeller_stats
kallisto_b
sleuth_res
some useful pictures
DeSeq2.R
ERR577157_1_fastqc.html
ERR577157_2_fastqc.html
README.md
SraRunInfo.csv
all_same_3.csv
fastqc_report_5_1P.html
fastqc_report_5_2P.html
fastqc_report_6_1P.html
fastqc_report_6_2P.html
lab_notebook.md
same.csv
sleuth.R


Identifying differentially expressed transposons across four life-cycle stages of Fasciola hepatica

Introduction

Transposable elements (TEs) are highly repetitive mobile sequences that play diverse roles in genome regulation. TEs are also expected to contribute to the function of long non-coding RNAs (lncRNAs). In trematodes, lncRNAs might be involved in developmental processes and life-cycle regulation. It is therefore important for future studies to explore the connection between TE expression and developmental stages.

Mission

Our study is devoted to detecting transposons in transcriptomes of four different life-cycle stages of F. hepatica and analyzing their differential expression across stages.

Goals

  1. Map RNA-seq reads onto a set of transposon sequences
  2. Quantify the expression of each transposon from the mappings
  3. Normalize TE expression data using a statistical approach
  4. Identify differentially expressed TEs at each stage of the F. hepatica life cycle

Methods

F. hepatica RNA-seq data

We used public data available in the NCBI Sequence Read Archive (SRA).
Accessions: ERX535560, ERX535563, ERX535561, ERX535360, ERX535363, ERX535362, ERX535371, ERX535370.
Run information table: SraRunInfo.csv
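
To see which sequencing runs belong to each experiment accession, the run information table can be grouped by experiment. The sketch below is illustrative only and is not part of the repository's scripts. It assumes the table uses the `Run` and `Experiment` column names of a standard SRA RunInfo export; the function name `runs_by_experiment` and the placeholder rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def runs_by_experiment(handle):
    """Group run accessions by experiment accession from a RunInfo CSV."""
    grouped = defaultdict(list)
    for row in csv.DictReader(handle):
        # Skip incomplete rows and header lines repeated in concatenated exports.
        if not row.get("Run") or row["Run"] == "Run":
            continue
        grouped[row["Experiment"]].append(row["Run"])
    return dict(grouped)


# Placeholder rows standing in for the real table; these are not real accessions.
sample = io.StringIO(
    "Run,Experiment,LibraryLayout\n"
    "RUN_A,EXP_1,PAIRED\n"
    "RUN_B,EXP_1,PAIRED\n"
    "RUN_C,EXP_2,PAIRED\n"
)
grouped = runs_by_experiment(sample)
```

With the real file, the same function could be called on `open("SraRunInfo.csv", newline="")` instead of the in-memory sample.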

List of F. hepatica TEs

We also used a FASTA file of repetitive elements from the F. hepatica genome, generated earlier with the RepeatMasker tool.
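
Before mapping, it can be useful to check how many TE sequences the FASTA file contains and how long they are. The following is a minimal sketch under stated assumptions, not the project's own code: the function name `read_fasta` and the toy records are hypothetical, and only the first whitespace-delimited token of each header is kept as the sequence name.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


# Hypothetical toy records, for illustration only.
toy_fasta = io.StringIO(">TE_1 example header\nACGT\nAC\n>TE_2\nGGG\n")
te_lengths = {name: len(seq) for name, seq in read_fasta(toy_fasta)}
```

The resulting `te_lengths` dictionary maps each TE name to its sequence length, which is also the information needed for length-aware quantification.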

Project workflow

Short outline of our work: workflow

First, raw reads from 4 developmental stages were downloaded, filtered with Trimmomatic, and quality-checked with FastQC. Differential expression analysis was performed with two separate methods: kallisto + sleuth and TEtools + DESeq2. To find TEs whose expression changes across the different stages, a likelihood ratio test (LRT) was used. Finally, lists of significant TEs (FDR < 0.05) that are differentially expressed across developmental stages were obtained. All steps are recorded in the lab notebook (lab_notebook.md).
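
Selecting TEs by false discovery rate usually means adjusting the test p-values for multiple testing and keeping those below a cutoff. The sketch below illustrates the Benjamini-Hochberg adjustment, which is the default adjustment reported by both sleuth and DESeq2, with a cutoff of 0.05. It is not the repository's code: the function name `benjamini_hochberg` and the toy p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values, for illustration only.
toy_p = {"TE_1": 0.001, "TE_2": 0.04, "TE_3": 0.03, "TE_4": 0.8}
names = list(toy_p)
qvals = benjamini_hochberg([toy_p[n] for n in names])
significant = [n for n, q in zip(names, qvals) if q < 0.05]
```

In this toy example, TE_2 and TE_3 have raw p-values below 0.05 but adjusted values above it, so only TE_1 passes the cutoff.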

System requirements

  • python v.3.6
  • R v.3.4.4
  • Trimmomatic v.0.36
  • FastQC v.0.11.7
  • kallisto v.0.44.0
  • sleuth R package v.0.29.0
  • TEtools v.3
  • DESeq2 R package v.1.18.1

Results

Both approaches to DE analysis show similar results. All results can be found in:
sleuth
DeSeq2

For example, PCA
PCA shows that all stages clearly differ from each other, while the two biological replicates of each stage differ only slightly.

Sample-to-sample distance
The sample-to-sample distance heatmap, based on Euclidean distance, illustrates the relations between samples.
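
For readers unfamiliar with this kind of plot, the values behind such a heatmap are pairwise Euclidean distances between the expression vectors of the samples. The sketch below is only an illustration of that calculation, not the code that produced the figure: the function names, sample names and expression values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length expression vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(samples):
    """Return a nested dict of pairwise distances between all samples."""
    names = list(samples)
    return {s: {t: euclidean(samples[s], samples[t]) for t in names} for s in names}


# Hypothetical expression values for three TEs in three samples.
toy = {
    "stage1_rep1": [0.0, 3.0, 1.0],
    "stage1_rep2": [0.0, 3.0, 2.0],
    "stage2_rep1": [4.0, 0.0, 1.0],
}
dist = distance_matrix(toy)
```

Here the two hypothetical replicates of the same stage end up much closer to each other than to the sample from the other stage, which is the pattern such a heatmap makes visible.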

heatmap
The heatmap shows the differences in expression pattern among the most significant TEs.

boxplot
For an individual TE, the difference in expression level between stages is also noticeable.

Supervisors

Anna Solovyeva, Nickolay Panuyshev

References

  1. Vasconcelos EJR et al. The Schistosoma mansoni genome encodes thousands of long non-coding RNAs predicted to be functional at different parasite life-cycle stages. Sci Rep. 2017 Sep 5;7(1):10508. doi: 10.1038/s41598-017-10853-6.