Currently, we get results for the identification challenge from one of the output files of the assemble step. At first glance, judging by its content, this file appears to contain the identified changepoints.
But as I read further beyond the assemble step, the second step they describe is the compare command. The compare command "reports the differential usage of each identified change-point", so I expected it to show usage for the same sites as the assemble step output shown in the screenshot above. Instead, I see fewer, and also different, sites in the compare step output file.
This led me to look into whether we should be extracting sites for the identification challenge from the assemble step or the compare step.
As additional context, the steps to run IsoSCM are:
1. Run the assemble step, which creates a tmp folder: isoscm/tmp/{sample}.cp.filtered.gtf
2. Run the compare step. This requires the XML file output from the assemble step for two samples, but since we're getting site usage per sample, I passed the same sample twice as input to obtain the following output:
Even though the first file, isoscm/tmp/{sample}.cp.filtered.gtf, contains changepoint locations, I'm not sure we should take the identification output sites from there: it sits in a tmp folder, and the GitHub README doesn't explain what the files in the tmp folder are. It only explains the assemble step files outside of the tmp folder, i.e. it explains the files from step 2 above but not the files from step 1. I think the sites for identification might have to be obtained from the compare step output (i.e. isoscm/compare/{sample}.txt).
After reading the README, I checked the paper and saw that it doesn't name the assemble step as the place to get the identified changepoints. It says: "...Using the “assemble” keyword IsoSCM will assemble the mapped reads in a BAM file into a splice graph, identify nested terminal exons boundaries using the constrained segmentation procedure, and report the resulting models in GTF format....Pairwise comparison of tandem isoform usage can be performed using the “compare” keyword, which reports the relative usage of change points in each sample in a tabular format."
Hence, it sounds like the compare step outputs the identified change points (or PAS) that we want to report.
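To quantify how far the two outputs disagree, the site sets can be loaded and diffed. Below is a minimal sketch, not part of the pipeline. It assumes the assemble file follows the standard 9-column GTF layout and that the compare table is tab-separated with a header row. The column names `chr`, `strand`, `start` and `end` and the example rows are made up for illustration; the real compare output columns need to be checked against an actual isoscm/compare/{sample}.txt file.

```python
import csv
import io


def gtf_sites(handle):
    """Collect (chrom, strand, start, end) tuples from standard 9-column GTF lines."""
    sites = set()
    for line in handle:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        sites.add((fields[0], fields[6], int(fields[3]), int(fields[4])))
    return sites


def table_sites(handle, chrom_col, strand_col, start_col, end_col):
    """Collect sites from a tab-separated table with a header row.

    The column names are passed in because the real compare output
    header is an assumption here, not something taken from IsoSCM docs.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {
        (row[chrom_col], row[strand_col], int(row[start_col]), int(row[end_col]))
        for row in reader
    }


def diff_sites(assemble, compare):
    """Split two site sets into shared, assemble-only and compare-only."""
    return {
        "shared": assemble & compare,
        "assemble_only": assemble - compare,
        "compare_only": compare - assemble,
    }


# Made-up in-memory data standing in for the two output files.
example_gtf = io.StringIO(
    'chr1\tIsoSCM\tchangepoint\t100\t101\t.\t+\t.\tid "a";\n'
    'chr1\tIsoSCM\tchangepoint\t500\t501\t.\t+\t.\tid "b";\n'
)
example_table = io.StringIO(
    "chr\tstrand\tstart\tend\n"
    "chr1\t+\t500\t501\n"
    "chr2\t-\t900\t901\n"
)

result = diff_sites(
    gtf_sites(example_gtf),
    table_sites(example_table, "chr", "strand", "start", "end"),
)
for name, sites in result.items():
    print(name, sorted(sites))
```

With real files, the two `io.StringIO` objects would be replaced by `open(...)` handles on the assemble GTF and the compare table. Non-empty `assemble_only` and `compare_only` sets would confirm that the two steps report different sites rather than the same sites with usage added.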
faricazjj changed the title from "[Bug Fix] IsoSCM identification output should be from assemble step" to "[Bug Fix] IsoSCM identification output should be from compare step" on Sep 27, 2022
See the full comment on PR 431 (#431 (comment)), where I describe why the compare step output is correct for grabbing PAS and suggest how to also get the dPAS coordinates (the above only grabs the pPAS coordinates, which are the changepoints).