
4.2. Workaround if you do not have biological replicates (version 2.0.2)

Qingqing Wang edited this page Sep 6, 2018 · 3 revisions

If you do not have biological replicates for your samples, here is a workaround. The strategy is to simulate, for each condition, a biological replicate read count file for the junctions based on the real dataset:

  1. When running JUM_A.sh, set the --Condition1_fileNum_threshold and --Condition2_fileNum_threshold parameters to 1. For example:

    $ bash /user/home/JUM_2.0.2/JUM_A.sh --Folder /user/home/JUM_2.0.2 --JuncThreshold 5 --Condition1_fileNum_threshold 1 --Condition2_fileNum_threshold 1 --IRthreshold 5 --Readlength 100 --Thread 3 --Condition1SampleName ctrl --Condition2SampleName treat

    Since you have only one sample per condition, it is recommended to filter junctions more stringently in JUM_A.sh, because there are no additional replicates to help confirm junction quality. For example, you can set --JuncThreshold to 10 instead of the 5 shown in the command above, to help ensure that the junctions passed to JUM are real biological ones rather than random noise. You can set this number even higher, depending on your datasets.

  2. Before you run the Rscript (step 2):

    • Run the script vary_for_replicate.pl in the JUM package as follows:
     $ perl /user/home/JUM_2.0.2/vary_for_replicate.pl ctrl_combined_count.txt > temp_ctrl_count.txt
     $ perl /user/home/JUM_2.0.2/vary_for_replicate.pl treat_combined_count.txt > temp_treat_count.txt
     $ cut -f1,3 temp_ctrl_count.txt > ctrlRep_combined_count.txt
     $ cut -f1,3 temp_treat_count.txt > treatRep_combined_count.txt
     $ rm temp_ctrl_count.txt
     $ rm temp_treat_count.txt
    • You also need to edit the experiment_design.txt file as follows:
          condition
    ctrl control
    ctrlRep control
    treat treatment
    treatRep treatment
  3. When running JUM_B.sh, set the --TotalFileNum parameter to the total number of samples, including the simulated ones (4 in this example). Also set --Condition1_fileNum_threshold and --Condition2_fileNum_threshold to 1.

    For example:

    $ bash /user/home/JUM_2.0.2/JUM_B.sh --Folder /user/home/JUM_2.0.2 --Test pvalue --Cutoff 0.05 --TotalFileNum 4 --Condition1_fileNum_threshold 1 --Condition2_fileNum_threshold 1 --Condition1SampleName ctrl --Condition2SampleName treat
  4. When running JUM_C.sh, set the --TotalCondition1FileNum parameter to the total number of control samples, including the simulated ones (2 in this example). Similarly, set the --TotalCondition2FileNum parameter to the total number of treated samples, including the simulated ones (2 in this example).

    For example:

    $ bash /user/home/JUM_2.0.2/JUM_C.sh --Folder /user/home/JUM_2.0.2 --Test pvalue --Cutoff 0.05 --TotalCondition1FileNum 2 --TotalCondition2FileNum 2 --REF refFlat.txt
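
The counts passed to JUM_B.sh and JUM_C.sh have to agree with the rows in experiment_design.txt. The following is a minimal sketch, not part of the JUM package, that derives them from a design file laid out like the example in step 2. It assumes the file has a header line followed by whitespace-separated sample and condition columns; the temporary file and the count_condition helper are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: derive the sample counts for JUM_B.sh and JUM_C.sh from the design file.
# Assumption: a header line, then "<sample> <condition>" rows separated by whitespace.
set -euo pipefail

# Stand-in for experiment_design.txt, written to a temporary file for the demo.
design=$(mktemp)
printf '\tcondition\n'         >  "$design"
printf 'ctrl\tcontrol\n'       >> "$design"
printf 'ctrlRep\tcontrol\n'    >> "$design"
printf 'treat\ttreatment\n'    >> "$design"
printf 'treatRep\ttreatment\n' >> "$design"

# count_condition FILE LABEL: print the number of sample rows whose condition is LABEL.
# The header line has a single field, so the NF == 2 check skips it.
count_condition() {
  awk -v label="$2" 'NF == 2 && $2 == label { n++ } END { print n + 0 }' "$1"
}

cond1_total=$(count_condition "$design" control)
cond2_total=$(count_condition "$design" treatment)
total=$((cond1_total + cond2_total))

# Values to pass to JUM_B.sh (--TotalFileNum) and JUM_C.sh (the other two).
echo "--TotalFileNum $total"
echo "--TotalCondition1FileNum $cond1_total"
echo "--TotalCondition2FileNum $cond2_total"

rm -f "$design"
```

For the four-row design file above, this gives 4 for --TotalFileNum and 2 for each of --TotalCondition1FileNum and --TotalCondition2FileNum, matching the example commands.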