Run rMATS for 3 samples/conditions at once #96

mariakondili · 2021-03-05T13:09:34Z

Hello! This is not an issue but more of a question on functionalities or rMATS.
I d like to know if it's possible to run for comparing the splicing events of WT vs Treatment1 vs Treatment2, all at once,
and obtain a table with Incl_Level_WT, Incl_Level_Treatment1, Incl_Level_Treatment_2, and the 2 respective Incl_Differences.
Something like having a --b3 option, for .bam files.

I saw that when I run separately WT-vs-Treat1, WT-vs-Treat2 , the Incl_Level_WT, of these two tables is not same for some events.
Moreover, I have to do data mining to find the genes and events that are common on 1st and 2nd comparison, but we are more interested in having the values of splicing for each exon along all 3 conditions.
Is there a way to run rMATS with existing options in a way that I obtain the table I want , by using only --b1, --b2 ?
e.g --b1 = WT_1,WT_2, WT_3 , --b2 Treat1a, Treat1b, Treat1c ==>and then,
the results of these over --b2 Treat2a, Treat2b, Treat2c ?
Thanks in advance for any suggestion

EricKutschera · 2021-03-05T13:38:43Z

You can first process all of the input bams with one or more --task prep steps. Then combine all of the prep output into a single --task post step. Finally do two separate runs of --task stat, one for WT vs Treatment1 and another for WT vs Treatment 2. You'll get a set of output files for each run of --task stat, but they will be based on the same set of event definitions from the --task post step so you can match up the rows easily

Here is the section of the README that is related to this: https://github.com/Xinglab/rmats-turbo/tree/v4.1.1#running-the-statistical-model-separately

mariakondili · 2021-04-07T13:05:16Z

Thanks for the help ! I got it ! For the record, briefly this is what I ran:
#PREP_step_1 : python rmats.py --b1 ${group1} --b2 ${group2} --tmp ${outDir_prep}/tmp_1/ --task prep
#PREP_step_2: python rmats.py --b1 ${group1} --b2 ${group3} --tmp ${outDir_prep}/tmp_2/ --task prep
#POST_step : python rmats.py --b1 ${group123} --tmp ${outDir_prep}/combo_tmp/ --task post
#STATS_step_1 :
python prepare_stat_inputs.py --new-output-dir ${outDir_post}/out_gr1_gr2 --old-output-dir ${outDir_post} --group-1-indices 0,1,2 --group-2-indices 3,4,5
python rmats.py --od ${outDir_post}/out_gr1_gr2 --tmp ${outDir_post}/out_gr1_gr2/tmp --task stat

#STATS_step_2 :
python prepare_stat_inputs.py --new-output-dir ${outDir_post}/out_gr1_gr3 --old-output-dir ${outDir_post} --group-1-indices 0,1,2 --group-2-indices 6,7,8
python rmats.py --od ${outDir_post}/out_gr1_gr3 --tmp ${outDir_post}/out_gr1_gr3/tmp --task stat

The ID column for the post is same in Stat-step so, we can really nicely follow up the events and their values.

macsalvin · 2021-05-24T20:53:54Z

Hi @EricKutschera,
I would like to compare three groups (as you mention here ), in my case KT vs KV and KT vs QT.
As a final result I would like to have the total number of SE events (I mention just SE as an example) for KT and then the differential SE events for both KT vs KV and KT vs QT. As you can see in Fig.1 and Fig.2 the rows for KT are different for the two comparisons and so the sum for getting the total events (Why there are different rows?). How can I get the total SE events for KT? Should I get it before running the statistical part?

Fig.1 (row 7 is missing)

Fig.2 (rows 0,4,5,6 are missing)

Here all my steps:

Running prep and post separately

Running prep
./run_rmats --b1 ALLSAMPLE.txt --gtf Mus_musculus.GRCm38.92.chr.gtf --readLength 150 --nthread 28 --od output --tmp tmp_output_prep --task prep

Running post
./run_rmats --b1 ALLSAMPLE.txt --gtf Mus_musculus.GRCm38.92.chr.gtf --readLength 150 --nthread 28 --od outputPOST --tmp tmp_output_prep --statoff --task post

Extract information for a specific comparison

KTKV
python rMATS_P/prepare_stat_inputs.py --new-output-dir outputKTKV --old-output-dir outputPOST --group-1-indices 0,1,2,3,4 --group-2-indices 5,6,7
KTQT
python rMATS_P/prepare_stat_inputs.py --new-output-dir outputKTQT --old-output-dir outputPOST --group-1-indices 0,1,2,3,4 --group-2-indices 8,9,10,11,12,13

Running the statistical part

KTKV
./run_rmats --od KTKV --tmp tmpKTKV --task stat

QTQV
./run_rmats --od QTQV --tmp tmpQTQV --task stat

Thank you so much for your help

EricKutschera · 2021-05-25T12:59:30Z

When the statistical model is run (--task stat), the count files are first filtered to remove rows where there is not enough information (zero counts). Here is the code: https://github.com/Xinglab/rmats-turbo/blob/v4.1.1/rmats.py#L322

It filters out a row unless:

group 1 has at least one count (either inclusion or skipping)
group 2 has at least one count (either inclusion or skipping)
at least one of the two groups has at least one inclusion count
at least one of the two groups has at least one skipping count

The count files were not filtered in the post step since you ran that step with --statoff. You can look at the count files in outputPOST to see the rows that were filtered out in --task stat

macsalvin · 2021-05-25T13:39:12Z

Thank you very much @EricKutschera

mariakondili closed this as completed Apr 7, 2021

EricKutschera mentioned this issue Feb 22, 2022

analysis for multiple grouping and samples with mixed single/paired end reads #180

Open

EricKutschera mentioned this issue May 27, 2022

Behaviour of fixed-event-set argument #202

Closed

orbitalse mentioned this issue Apr 6, 2023

Running rMATS for more than 2 samples for pairwise differential splicing analysis #282

Closed

EricKutschera mentioned this issue Sep 8, 2023

Events not all matched up when all bam files included in same group for prep step #319

Open

EricKutschera mentioned this issue Apr 24, 2024

Comparison of multiple groups in rMATS #398

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Run rMATS for 3 samples/conditions at once #96

Run rMATS for 3 samples/conditions at once #96

mariakondili commented Mar 5, 2021 •

edited

Loading

EricKutschera commented Mar 5, 2021

mariakondili commented Apr 7, 2021

macsalvin commented May 24, 2021

EricKutschera commented May 25, 2021

macsalvin commented May 25, 2021

Run rMATS for 3 samples/conditions at once #96

Run rMATS for 3 samples/conditions at once #96

Comments

mariakondili commented Mar 5, 2021 • edited Loading

EricKutschera commented Mar 5, 2021

mariakondili commented Apr 7, 2021

macsalvin commented May 24, 2021

EricKutschera commented May 25, 2021

macsalvin commented May 25, 2021

mariakondili commented Mar 5, 2021 •

edited

Loading