[Bug Fix] IsoSCM identification output should be from compare step #429

faricazjj · 2022-09-19T15:15:28Z

Currently, we take the results for the identification challenge from one of the output files of the assemble step. At first glance, judging by its contents, this file appears to contain the identified changepoints.

But as I read further, beyond the assemble step, the second step they describe is the compare command. The compare command "reports the differential usage of each identified change-point", so I expected its output to show the usage of the same sites reported by the assemble step. Instead, the compare step output file contains fewer sites, and some of them differ from the assemble step sites.

This led me to look further into whether we should be extracting sites for the identification challenge from the assemble step or the compare step.
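
To make the "fewer and also different sites" observation concrete, here is a minimal sketch of how the two site sets could be compared once they have been parsed. The `Site` tuple, the `diff_sites` helper, and the toy coordinates are all hypothetical and only for illustration; how each file is parsed into sites is not covered here.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Site(NamedTuple):
    """Hypothetical representation of one changepoint site."""
    chrom: str
    position: int
    strand: str


def diff_sites(assemble_sites: Iterable[Site],
               compare_sites: Iterable[Site]) -> Dict[str, Set[Site]]:
    """Split two collections of sites into shared and step-specific sets."""
    a, c = set(assemble_sites), set(compare_sites)
    return {
        "shared": a & c,          # reported by both steps
        "assemble_only": a - c,   # only in the assemble step output
        "compare_only": c - a,    # only in the compare step output
    }


# Made-up coordinates, not real IsoSCM output.
assemble = [Site("chr1", 100, "+"), Site("chr1", 250, "+"), Site("chr2", 40, "-")]
compare = [Site("chr1", 100, "+"), Site("chr2", 55, "-")]
result = diff_sites(assemble, compare)
```

A non-empty `compare_only` set would confirm that the compare step reports sites that are absent from the assemble step file, not just a subset of them.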

As additional context, the steps to run IsoSCM are:

1. Run the assemble step, which creates a tmp folder containing isoscm/tmp/{sample}.cp.filtered.gtf.
2. Run the compare step. This requires the XML file output by the assemble step for two samples, but since we want site usage per sample, I passed the same sample twice as input. The output is written to isoscm/compare/{sample}.txt.

Even though the first file, isoscm/tmp/{sample}.cp.filtered.gtf, contains changepoint locations, I'm not entirely sure we should obtain the identification output sites from it, since it sits in a tmp folder and the GitHub README doesn't explain what the files in the tmp folder are. The README only explains the assemble step files outside of the tmp folder, i.e. they explained the files from step 2 above but not the files from step 1. I think the sites for identification might have to be obtained from the compare step output (i.e. isoscm/compare/{sample}.txt).

After reading their README, I checked their paper and saw that it doesn't mention the assemble step as the place to get the identified changepoints. They write: "...Using the “assemble” keyword IsoSCM will assemble the mapped reads in a BAM file into a splice graph, identify nested terminal exons boundaries using the constrained segmentation procedure, and report the resulting models in GTF format....Pairwise comparison of tandem isoform usage can be performed using the “compare” keyword, which reports the relative usage of change points in each sample in a tabular format."

Hence, it sounds like the compare step outputs the identified change points (or PAS) that we want to report.
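
A minimal sketch of the two candidate sources discussed above, assuming the path templates quoted in this issue (isoscm/tmp/{sample}.cp.filtered.gtf and isoscm/compare/{sample}.txt). The function name, the dictionary keys, and the `base` parameter are hypothetical and not part of IsoSCM or of our workflow.

```python
from pathlib import PurePosixPath
from typing import Dict


def isoscm_outputs(sample: str, base: str = "isoscm") -> Dict[str, PurePosixPath]:
    """Return the two candidate files for identification output.

    Paths follow the templates quoted in this issue; the key names
    are made up for illustration.
    """
    root = PurePosixPath(base)
    return {
        # Undocumented file in the assemble step's tmp folder.
        "assemble_tmp_gtf": root / "tmp" / f"{sample}.cp.filtered.gtf",
        # Tabular output of the compare step (same sample passed twice).
        "compare_table": root / "compare" / f"{sample}.txt",
    }


# Proposed change: read identification sites from the compare step table.
identification_source = isoscm_outputs("sampleA")["compare_table"]
```

Keeping both paths in one place makes the switch from the assemble step file to the compare step file a one-line change.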

mrgazzara · 2022-09-28T14:14:03Z

See my full comment on PR #431, where I describe why the compare step output is the correct source for grabbing PAS and suggest how to also get the dPAS coordinates (the approach above only grabs the pPAS coordinates, which are the changepoints).

faricazjj self-assigned this Sep 27, 2022

faricazjj changed the title ~~[Bug Fix] IsoSCM identification output should be from assemble step~~ [Bug Fix] IsoSCM identification output should be from compare step Sep 27, 2022

faricazjj mentioned this issue Sep 27, 2022

[Bug Fix] IsoSCM identification output should be from assemble step #431

Merged

8 tasks

faricazjj linked a pull request Sep 27, 2022 that will close this issue

[Bug Fix] IsoSCM identification output should be from assemble step #431


faricazjj closed this as completed in #431 Sep 30, 2022
