Skip to content

Latest commit

 

History

History
171 lines (153 loc) · 7.69 KB

Impact of Parameters.md

File metadata and controls

171 lines (153 loc) · 7.69 KB

Impact of Parameters

Among the nine tools we investigated, parameters of five tools, including Dapars, ChangePoint, APAtrap, IsoSCM, and GETUTR, are adjustable. Here we used the human MAQC data set to evaluate the influence of parameters of these tools on the prediction accuracy of poly(A) sites. The accuracy of poly(A) site prediction using different tools with different parameter combinations is shown in Table 1.
Two parameters are adjustable in APAtrap, the minimum distance between two poly(A) sites (option: -a, default is 100) and the average read coverage for a 3’UTR (option: -c, default is 20). Here we test five combinations of -a and -c. The accuracy under different parameter combinations ranges from 28% to 32% with the best at the combination of -a=50 and -c=20.
For DaPars, although two parameters are provided, Coverage_cutoff (default is 30) and PDUI_cutoff (default is 0.5), the accuracy remains the same (27.87%) under six parameter combinations examined (PDUI_cutoff=0.5 and Coverage_cutoff=10/20/30; Coverage_cutoff=30 and PDUI_cutoff=0.1/0.3/0.7).
ChangePoint has two parameters, -n (at least how many reads support each region, default is 20) and -a (mixed directional FDR level, default is 0.05). We test nine parameter combinations and the accuracy ranges from 23% to 26%.
IsoSCM has relatively more parameters adjustable, here we assessed the impact of two main parameters, -min terminal (terminal segments are extended by this amount before segmentation) and -min fold (the minimum fold change between neighboring segments expressed as a ratio of coverage downstream divided bycoverage upstream). Because the computation time of IsoSCM is relatively long, here we only test two parameter combinations and obtained similar accuracy (~13%).
GETUTR has one parameter -m (index for the smoothing method, 10 for PAVA, 0 for Max.fit and 1 for Min.fit). Due to the speed limit of GETUTR, here we used the Arabidopsis data instead to assess the parameter influence. The accuracy ranges from 24% to 27% with the best when using the PAVA smoothing method (-m 10).
Generally, these methods are quite robust to different parameters with slight variation of performance in terms of accuracy.

Table 1 Accuracy of poly(A) site prediction using different tools with different parameter combinations.

Tool Parameter combinations Accuracy (cutoff=100 bp)
APAtrap -a 100 –c 20 28.05%
APAtrap -a 100 –c 30 28.54%
APAtrap -a 100 –c 50 29.96%
APAtrap -a 50 –c 20 32.22%
APAtrap -a 150 –c 20 27.39%
DaPars -PDUI_cutoff 0.5 -Coverage_cutoff 10 27.87%
DaPars -PDUI_cutoff 0.5 -Coverage_cutoff 20 27.87%
DaPars -PDUI_cutoff 0.5 -Coverage_cutoff 30 27.87%
DaPars -PDUI_cutoff 0.1 -Coverage_cutoff 30 27.87%
DaPars -PDUI_cutoff 0.3 -Coverage_cutoff 30 27.87%
DaPars -PDUI_cutoff 0.7 -Coverage_cutoff 30 27.87%
ChangePoint -n 20 –a 0.05 25.85%
ChangePoint -n 20 –a 0.1 23.98%
ChangePoint -n 20 –a 0.3 23.98%
ChangePoint -n 20 –a 0.5 23.98%
ChangePoint -n 20 –a 0.05 24.93%
ChangePoint -n 10 –a 0.05 24.41%
ChangePoint -n 15 –a 0.05 24.16%
ChangePoint -n 30 –a 0.05 23.76%
ChangePoint -n 50 –a 0.05 23.17%
IsoSCM -min terminal 300 -min fold 0.5 13.90%
IsoSCM -min terminal 50 -min fold 0.08 13.38%
GETUTR -m 10 27.32%
GETUTR -m 1 24.97%
GETUTR -m 0 24.29%

The Arabidopsis data set is used for GETUTR, while the human MAQC data set is used for other methods.
If a predicted poly(A) site is within 100 bp of any annotated poly(A) site, then it is considered as a true positive.

Running

  • APAtrap

    • APA site detection using different parameters
    predictAPA.pl –a 100 –c 20 -i Brain.bedgraph Brain_control.bedgraph -g 2 -n 1 1 -u hg19.utr.bed -o output.txt -d 0.1 
    
    predictAPA.pl –a 100 –c 30 -i Brain.bedgraph Brain_control.bedgraph -g 2 -n 1 1 -u hg19.utr.bed -o output.txt -d 0.1 
    
    predictAPA.pl –a 100 –c 50 -i Brain.bedgraph Brain_control.bedgraph -g 2 -n 1 1 -u hg19.utr.bed -o output.txt -d 0.1 
    
    predictAPA.pl –a 50 –c 20 -i Brain.bedgraph Brain_control.bedgraph -g 2 -n 1 1 -u hg19.utr.bed -o output.txt -d 0.1 
    
    predictAPA.pl –a 150 –c 20 -i Brain.bedgraph Brain_control.bedgraph -g 2 -n 1 1 -u hg19.utr.bed -o output.txt -d 0.1 
    
  • DaPars

    • APA site detection using different parameters
    python DaPars_main.py configure_file
     configure_file: 
     Annotated_3UTR=hg19_refseq_extracted_3UTR.bed
     Group1_Tophat_aligned_Wig=Brain.bedgraph
     Group2_Tophat_aligned_Wig=Brain_control.bedgraph
     Output_directory=DaPars_Brain/
     Output_result_file=DaPars_Brain
     Num_least_in_group1=1
     Num_least_in_group2=1
     Coverage_cutoff=30
     FDR_cutoff=0.05
     PDUI_cutoff=0.5
     Fold_change_cutoff=0.59
    

    Change the Coverage_cutoff and PDUI_cutoff parameters

     Coverage_cutoff=10
     PDUI_cutoff=0.5
    
     Coverage_cutoff=20
     PDUI_cutoff=0.5
    
     Coverage_cutoff=30
     PDUI_cutoff=0.1
    
     Coverage_cutoff=30
     PDUI_cutoff=0.3
    
     Coverage_cutoff=30
     PDUI_cutoff=0.7
    
  • ChangePoint

    • APA dynamics detection using different parameters
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g 3utr.bed -d s -n 20 –a 0.05
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g 3utr.bed -d l -n 20 –a 0.05
    
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g 3utr.bed -d s -n 20 –a 0.1
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g 3utr.bed -d l -n 20 –a 0.1
    
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g 3utr.bed -d s -n 20 –a 0.3
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g 3utr.bed -d l -n 20 –a 0.3
    
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g 3utr.bed -d s -n 20 –a 0.5
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g 3utr.bed -d l -n 20 –a 0.5
    
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g utr.bed -d s -n 5 –a 0.05
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g utr.bed -d l -n 5 –a 0.05
    
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g utr.bed -d s -n 10 –a 0.05
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g utr.bed -d l -n 10 –a 0.05
    
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g utr.bed -d s -n 15 –a 0.05
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g utr.bed -d l -n 15 –a 0.05
    
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g utr.bed -d s -n 30 –a 0.05
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g utr.bed -d l -n 30 –a 0.05
    
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g utr.bed -d s -n 50 –a 0.05
    perl change_point.pl -t Brain.bam -c Brain_control.bam -g utr.bed -d l -n 50 –a 0.05
    
  • IsoSCM

    • APA site detection using different parameters
     java -Xmx102400m -jar IsoSCM-2.0.11.jar assemble -min terminal 300 -min fold 0.5 -coverage false -bam Brain.bam -base Brain -s unstranded  -jnct alpha 0.05
    
     java -Xmx102400m -jar IsoSCM-2.0.11.jar assemble -min terminal 50 -min fold 0.08 -coverage false -bam Brain.bam -base Brain -s unstranded  -jnct alpha 0.05
    
  • GETUTR

    • APA site detection using different parameters
    python GETUTR.py -i Brain.bam -o Brain.3UTR -m 10 -r refFlat.txt
    
    python GETUTR.py -i Brain.bam -o Brain.3UTR -m 0 -r refFlat.txt
    
    python GETUTR.py -i Brain.bam -o Brain.3UTR -m 1 -r refFlat.txt