Salmon strand inference is often wrong #1185
Comments
Hi there, I'm seeing behaviour, which could be similar to your issue, in data I received from several collaborators. The logs show for example: When I look in detail at the
Results like these are not covered by the examples in the rseqc docs on infer_experiment.py. I was wondering:
Regards
Hi, thank you for posting these questions! I would love to hear from either of you if you have any further thoughts on the issue. I too am getting similar warnings across the board. In all cases, Salmon marks the experiment samples as "reverse" and infer_experiment.py identifies something like 28% reverse and 75% undetermined. Do you think I should be concerned, or should I proceed and trust the Salmon identification? Thank you!
This comes up because the auto strand setting comes from Salmon, based on its pseudo-alignment against transcript sequences, while the final strandedness check is based on genomic alignments and RSeQC's assessment. The main source of the discrepancy is the reads of undetermined strand in RSeQC, which play a part in the assessment the pipeline makes based on those statistics, and (possibly) shouldn't. I've opened the above PR to discuss and/or address this.
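To illustrate why the undetermined reads matter, here is a minimal Python sketch, not the pipeline's actual code. The function name `call_strandedness`, the two threshold values, and the example fractions are all assumptions made up for illustration. It classifies a sample from the forward, reverse, and undetermined read fractions that infer_experiment.py reports, either counting the undetermined reads in the denominator or excluding them.

```python
def call_strandedness(forward, reverse, undetermined,
                      exclude_undetermined=True,
                      stranded_threshold=0.8,     # assumed value, for illustration only
                      unstranded_threshold=0.1):  # assumed value, for illustration only
    """Classify strandedness from read fractions (hypothetical helper).

    forward, reverse, undetermined: fractions of reads explained by the
    forward orientation, the reverse orientation, or neither.
    """
    total = forward + reverse
    if not exclude_undetermined:
        # Keep undetermined reads in the denominator.
        total += undetermined
    if total <= 0:
        return "undetermined"

    fwd = forward / total
    rev = reverse / total

    if fwd >= stranded_threshold:
        return "forward"
    if rev >= stranded_threshold:
        return "reverse"
    if abs(fwd - rev) <= unstranded_threshold:
        return "unstranded"
    return "undetermined"


# Hypothetical sample where most reads cannot be assigned a strand.
with_undetermined = call_strandedness(0.02, 0.28, 0.70, exclude_undetermined=False)
without_undetermined = call_strandedness(0.02, 0.28, 0.70, exclude_undetermined=True)
print(with_undetermined, without_undetermined)
```

With these made-up numbers, the call that keeps undetermined reads in the denominator cannot reach a stranded verdict, while the call that excludes them agrees with a "reverse" assignment. Whether excluding them is the right choice for real data is exactly the question under discussion.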
Tackled in #1306
Description of feature
Hi,
Thanks for all the work on this pipeline.
I have had to analyse several public datasets, and I plan to analyse more. Since the strandedness of such datasets is usually not provided, I use the "strandedness: auto" option in the pipeline to guess it.
Quite often (apologies for not having statistics about that, I could try to get some if needed) I get "WARNING: Fail Strand Check" messages, and I find that Salmon had set the strandedness to "reverse" when infer_experiment.py found it to be "unstranded".
When this happens, I set the strandedness to "unstranded" and rerun the pipeline.
Would it make sense for the pipeline to just reset the strandedness and rerun automatically?
Thanks again