TeXP is a pipeline to evaluate the transcription level of transposable elements in short read RNA-seq data

About

TeXP is a pipeline for quantifying the abundance of transposable element transcripts from RNA-seq data. TeXP is based on the assumption that RNA-seq reads overlapping transposable elements are a composition of pervasive transcription signal and autonomous transcription of transposable elements.

How to quickly run TeXP

docker run -it fnavarro/texp:latest /bin/bash

Download a fastq file from an RNA-seq experiment, for example, MCF-7 from the ENCODE project

  wget -c -t0 "" -O file.fastq.gz

Run TeXP

  ./ -f file.fastq.gz -t 1 -o process/example/ -n quick_texp_run

The output files will be generated at:

  ls process/example/quick_texp_run
	*.L1HS_hg38.count (Naive counts) 
	*.L1HS_hg38.count.corrected (Corrected counts)
	*.L1HS_hg38.count.rpkm (Naive RPKM)
	*.L1HS_hg38.count.rpkm.corrected (Corrected RPKM)

Tip: if your fastq files are stored locally, you can start the container with

docker run -it -v ~/Desktop/:/texp fnavarro/texp:latest /bin/bash

to mount ~/Desktop at /texp inside the docker container.


TeXP depends on the following software:

  • Bowtie2 (2.3+)
  • Bedtools (2.26+)
  • Fastx-toolkit (0.0.14+)
  • perl (5.24+)
  • python (2.7)
  • R (3.3+)
  • Penalized package (0.49+)
  • samtools (1.3+)
  • wgsim (a12da33 on Oct 17, 2011)
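
Before installing anything, it can help to see which of these tools are already available. The following is a minimal sketch, not part of TeXP: the function name check_tools is hypothetical, and the executable names passed to it are assumptions based on the usual package names. It only tests whether each command is on PATH and does not verify versions. Fastx-toolkit and the R penalized package are left out because this document does not say which executables TeXP calls.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with TeXP): report which tools are on PATH.
# Returns 0 if every tool was found and 1 if at least one is missing.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names are assumptions; adjust them to your installation.
check_tools bowtie2 bedtools samtools perl python R wgsim \
    || echo "Install the missing tools before running TeXP."
```

Version requirements still have to be checked by hand, for example with bowtie2 --version or samtools --version.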

Download TeXP

$> git clone

Edit and update the INSTALL_DIR variable to the path where TeXP was cloned

Installing TeXP dependencies

apt-get update

  • Install binaries dependencies

apt-get install -y

  • Install Wgsim

mkdir -p /src;
cd /src;
git clone;
cd wgsim;
gcc -g -O2 -Wall -o wgsim wgsim.c -lz -lm;
mv wgsim /usr/bin/;
cd /;

  • Download Libraries

Change the path (/data/library) to a proper location in your computing environment

mkdir -p /data/library/rep_annotation;
cd /data/library/rep_annotation;
wget -c -t0 "" -O rep_annotation.hg38.tar.bz2;
tar xjvf rep_annotation.hg38.tar.bz2;
rm -Rf rep_annotation.hg38.tar.bz2

mkdir -p /data/library/bowtie2;
cd /data/library/bowtie2;
wget -c -t0 "" -O bowtie2.hg38.tar.bz2;
tar xjvf bowtie2.hg38.tar.bz2;
rm -Rf bowtie2.hg38.tar.bz2

  • Install R packages dependencies

echo 'install.packages(c("penalized"), repos="", dependencies=TRUE)' > /tmp/packages.R \
&& Rscript /tmp/packages.R

TeXP config

A few parameters must be set up so that TeXP can work properly outside a docker environment. Parameters are set on and the user MUST properly set them up.

  • LIBRARY_PATH: Absolute path pointing to the TeXP library; generally this is the path where you downloaded TeXP
  • EXT_LIBRARY_PATH: Absolute path containing the bowtie2 reference index and the transposable element annotation bed file, downloaded as instructed above
  • EXE_DIR: If all binaries are found in a single path, EXE_DIR can be used to generalize the binary location. For example, if bowtie2, bedtools, etc. are located at /usr/bin/, you should set EXE_DIR := /usr
  • Dependencies installed in different paths should be defined manually. For example, if wgsim is installed in the home folder, the user must set:
    • WGSIM_BIN := ~/wgsim/bin/wgsim
  • Finally, the user must set CONFIGURED := TRUE

Docker image

Alternatively, a docker image containing all dependencies and libraries can be used. The TeXP docker image is also pre-configured to work out of the box. Check for further instructions: docker pull fnavarro/texp

Running TeXP

$> ./ -f [FILE_NAME] -t [INT] -o [OUTPUT_PATH] -n [SAMPLE_ID]

-f: Input file (fastq,fastq.gz,sra)

-t: Number of threads

-o: Output path (e.g. ./ or ./processed)

-n: Sample name (e.g. SAMPLE01)
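
When several input files need to be processed, the sample name for -n can be derived from each file name. The sketch below is an illustration only: sample_name is a hypothetical helper, and TEXP_CMD is a placeholder for the TeXP launcher script in your checkout, whose name is not filled in here. The loop only prints the commands it would run, so they can be inspected first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a sample name from an input file name by
# dropping the directory and a trailing .fastq, .fastq.gz, or .sra extension.
sample_name() {
    local base
    base=$(basename "$1")
    base=${base%.gz}
    base=${base%.fastq}
    base=${base%.sra}
    printf '%s\n' "$base"
}

# TEXP_CMD is a placeholder; set it to the TeXP launcher script.
# The thread count and output path are example values for -t and -o.
for fq in "$@"; do
    echo "${TEXP_CMD:-TEXP_SCRIPT}" -f "$fq" -t 1 -o ./processed -n "$(sample_name "$fq")"
done
```

Saved as a script and called with a list of fastq files, this prints one command line per file. Replace echo with direct execution once the printed commands look right.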

FAQ - Frequently Asked Questions

  1. Does TeXP work for paired end data?

TeXP is implemented to process one fastq file at a time. Empirically, we find that if the RNA-seq library is of good quality, the two mates (P1 and P2) yield very similar estimates. Therefore, for paired-end RNA-seq data, we recommend running TeXP on each mate file separately and taking the mean of the two estimates.
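
The averaging step can be sketched with awk. This is an assumption-laden illustration, not part of TeXP: mean_counts is a hypothetical helper, and it assumes each result file is tab-separated with the element name in column 1 and the estimate in column 2. Check the actual layout of your output files before using it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mean_counts FILE_P1 FILE_P2
# Assumes tab-separated input with name in column 1 and estimate in column 2.
# Prints name and mean for every name present in both files, in FILE_P2 order.
mean_counts() {
    awk -F'\t' -v OFS='\t' '
        NR == FNR { first[$1] = $2; next }                # store mate 1 values
        ($1 in first) { print $1, (first[$1] + $2) / 2 }  # average with mate 2
    ' "$1" "$2"
}

# Example call with hypothetical file names:
# mean_counts sample_P1.count.corrected sample_P2.count.corrected
```

Names that appear in only one of the two files are skipped, and an empty first file is not handled.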

  2. Does TeXP work for unstranded data?


  3. Can I use other aligners?

In figure 15 we show that the choice of aligner does not drastically change TeXP estimates. Therefore, while you could use other aligners, we suggest using bowtie2, since all TeXP parameterization has been done with bowtie2.


