PIDFE, P-element Insertion Detector and Frequency Estimator, is a pipeline to detect P-element insertions and estimate their insertion frequencies from paired-end reads.
Copyright (c) 2019 Kelleher Lab at the University of Houston. If you used PIDFE in your study, please cite: Zhang S, Pointer B, Kelleher E. 2020. Rapid evolution of piRNA-mediated silencing of an invading transposable element was driven by abundant de novo mutations. Genome Res. 30: 566-575
Current version v2.0
PIDFE identifies P-element insertions by detecting split reads (read-1 in the figure above): reads in which one part maps to the P-element and the remaining part maps to the reference genome.
The following commands install PIDFE in your home directory. To install it elsewhere, replace '~' with the desired directory.
cd ~
git clone
chmod 755 -R ~/PIDFE/scripts
echo 'export PATH=$PATH:~/PIDFE/scripts' >> ~/.bash_profile
PIDFE is designed to run on a high-performance computing platform running Linux. The following software packages are required to run PIDFE.
sh [-i inDir] [-o output] <-g refGenome> <-p refP>
Required arguments:
<-g refGenome> defines the reference genome where P-elements reside.
<-p refP> defines the consensus P-element sequence.
Optional arguments:
[-i inDir] is the directory containing the paired-end reads, i.e., *_R1.fastq and *_R2.fastq. The asterisk (*) matches zero or more characters. Default: current working directory
[-o output] is the output file name. Default: p_insertions.txt
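Because the input directory is expected to hold matching *_R1.fastq and *_R2.fastq files, it can help to check the pairing before starting a run. The following is a minimal sketch, not part of PIDFE: the function name `list_fastq_pairs` is hypothetical, and it assumes that the two mates of a sample share the same prefix and differ only in the `_R1.fastq` / `_R2.fastq` suffix.

```shell
# list_fastq_pairs DIR  (hypothetical helper, not shipped with PIDFE)
# Prints the prefix of every sample in DIR that has both <prefix>_R1.fastq
# and <prefix>_R2.fastq. R1 files without a mate are reported on stderr.
# The body runs in a subshell so its variables do not leak to the caller.
list_fastq_pairs() (
    dir="${1:-.}"                          # default: current working directory
    for r1 in "$dir"/*_R1.fastq; do
        [ -e "$r1" ] || continue           # the glob matched nothing
        r2="${r1%_R1.fastq}_R2.fastq"      # expected mate file
        name="${r1##*/}"                   # strip the directory part
        prefix="${name%_R1.fastq}"         # strip the read suffix
        if [ -e "$r2" ]; then
            printf '%s\n' "$prefix"
        else
            printf 'warning: no R2 mate for %s\n' "$r1" >&2
        fi
    done
)
```

For example, `list_fastq_pairs ./reads` (where `./reads` is a placeholder directory) prints one sample prefix per complete pair, so an empty result indicates that no usable paired-end input was found.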
The output file lists each identified P-element insertion (1st column), the number of reads supporting it (2nd column), and its estimated frequency (3rd column). Here is an example:
insertion | supporting_reads | frequency |
chr2L:12822166:+ | 27 | 0.844 |
chr2L:20311155:- | 38 | 1 |
chr2L:3632292:+ | 9 | 0.4285 |
chr2L:3632292:- | 13 | 0.52 |
Each insertion is reported as the chromosome, the insertion breakpoint, and the orientation of the insertion, separated by colons. For example, "chr2L:12822166:+" indicates a sense P-element insertion on chromosome 2L at genomic coordinate 12822166. Sense ('+') means the P-element is inserted in the same orientation as the plus strand of the reference genome, whereas antisense ('-') means it is inserted in the same orientation as the minus strand.
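For downstream analysis it is often convenient to split the insertion identifier into separate columns. The sketch below is an illustration rather than part of PIDFE: the function name `parse_insertions` is hypothetical, and because the exact column delimiter of the output file is not stated here, it assumes the columns are separated by either '|' or tab characters. It also assumes chromosome names contain no colons.

```shell
# parse_insertions FILE [MIN_FREQ]  (hypothetical helper, not shipped with PIDFE)
# Prints chromosome, breakpoint, orientation, supporting reads, and frequency
# as tab-separated columns, keeping only rows with frequency >= MIN_FREQ
# (default 0). Assumes '|' or tab as the column delimiter.
parse_insertions() {
    awk -v min="${2:-0}" '
        BEGIN { FS = "[|\t]" }
        # Remove leading and trailing blanks from a field.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        {
            id = trim($1); reads = trim($2); freq = trim($3)
            # Skip the header and any line that is not chrom:position:strand.
            if (split(id, part, ":") != 3) next
            if (part[3] != "+" && part[3] != "-") next
            if (freq + 0 < min + 0) next
            printf "%s\t%s\t%s\t%s\t%s\n", part[1], part[2], part[3], reads, freq
        }
    ' "$1"
}
```

For example, `parse_insertions p_insertions.txt 0.5` would keep only insertions with an estimated frequency of at least 0.5, which can then be passed to tools that expect one field per column.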
If you find any bugs or have difficulties using PIDFE, please feel free to contact Shuo Zhang (