Incorrectly detects non-extendible sequences as adapter #2
Comments
The automatic adapter detection is the problem here, because the data is already adapter trimmed. If possible, I strongly advise using untrimmed datasets for training. It shouldn't produce the out_of_range error, and I'll check whether I can spot the reason for it, but without adapters you need to turn off the automatic detection by specifying decoy adapters:
I could not reproduce the crash, but looking through the code I found a minor bug that might in rare cases lead to this error. I pushed the bugfix to the devel branch. If the crash persists, please reopen this issue and provide a dataset that triggers it, so that I can reproduce and fix it.
Hi there,
ReSeq uses the adapters for two things (independent of whether you specified them with
So if you have adapter-trimmed reads (which are not recommended but work) or a large fragment length, so that the adapter content is basically non-existent, you can specify whatever you want as long as it is not part of the reference. I suggested
If you do have adapters and they are not detected by the automatic detection, it is recommended to specify the correct ones. In this case I would be happy if you could share the dataset, so I could check why the automatic detection fails. Please let me know if you have further questions or something is unclear.
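The one condition stated above for decoy adapters is that they must not be part of the reference. A minimal sketch of such a check in plain Python follows. It is not part of ReSeq; the function names (`occurs_in_reference`, `read_fasta`) and the toy sequences are made up for illustration, and checking the reverse complement as well is an extra precaution assumed here, not something stated in the thread.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (other characters are kept)."""
    return seq.upper().translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Use the first word of the header as the sequence name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def occurs_in_reference(candidate, fasta_lines):
    """List reference sequences containing the candidate on either strand."""
    fwd = candidate.upper()
    rev = reverse_complement(fwd)
    return [name for name, seq in read_fasta(fasta_lines)
            if fwd in seq or rev in seq]


# Toy reference, invented for this example (not from the dataset in this thread).
demo_reference = [">chr_demo", "ACGTACGTTTGACC", "GGATCCA"]

for decoy in ("TTGACCGG", "AAAAAAAA"):
    hits = occurs_in_reference(decoy, demo_reference)
    print(decoy, "found in" if hits else "not found in reference", hits)
```

For a real reference, the lines would come from the opened FASTA file instead of the toy list. Holding each sequence fully in memory is fine for small genomes, but a large reference would call for a dedicated tool.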
Thank you very much for offering to take a look at the dataset. Here you can find the files with which I am having issues: https://www.dropbox.com/sh/7w5mpsxvcs4ukkg/AADroiOBFN3WiOYNbuROHUYla?dl=0 And this is the command used:
Looking at the read length distribution with
Checking the unmapped reads for adapters (normally you observe adapters that ligate directly to each other) with the following command:
Finally, I checked the read parts that exceed the mapped fragment (assuming the forward-reverse order of reads):
This means two things:
Finally, a few remarks about your command:
Since the reference file was missing in your dropbox, I could not run ReSeq on the data. Does it crash (
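The idea behind the last check can be sketched as follows: if a read is longer than the fragment it was sequenced from, the bases beyond the fragment end are where adapter sequence would appear, so a single dominant prefix among those overhangs hints at an adapter. This is an illustrative Python sketch, not the command used above. The helper names and the toy data are invented, and it assumes the fragment length per read is already known (for example from the mapping).

```python
from collections import Counter


def overhang(read_seq, fragment_len):
    """Return the part of the read that extends past the fragment end."""
    if fragment_len <= 0:
        # Fragment length unknown (assumption: encoded as 0 or negative).
        return ""
    return read_seq[fragment_len:]


def common_overhang_prefixes(pairs, k=8, top=3):
    """Count the first k bases of every overhang that is at least k long."""
    counts = Counter()
    for read_seq, fragment_len in pairs:
        tail = overhang(read_seq, fragment_len)
        if len(tail) >= k:
            counts[tail[:k]] += 1
    return counts.most_common(top)


# Toy (read, fragment length) pairs, invented for illustration only.
toy_pairs = [
    ("ACGTACGTAC" + "AGATCGGAAG", 10),  # 10 bases past the fragment end
    ("TTTTGGGGCC" + "AGATCGGAAG", 10),  # same overhang as above
    ("ACGTACGTACGTACGTACGT", 50),       # fragment longer than read: no overhang
    ("ACGTACGTACGGGGGGGGGG", 12),       # a different 8-base overhang
]

for prefix, count in common_overhang_prefixes(toy_pairs):
    print(prefix, count)
```

With adapter-trimmed reads, few or no overhangs would be expected at all, so an empty or very flat result would be consistent with trimming.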
Thank you very much for examining the dataset. When I run Reseq I get the second error. "Using only --vcfIn means you are not introducing any variants (except if you do something in your [REFBIASFILE]). For simulating variants you would need --vcfSim, but given that you also don't simulate any errors, no variants might be exactly what you want." - You are correct, no simulated variants is exactly what I want in this case. I will rerun Reseq as you've specified and let you know if I run into any more issues.
Quick question: neither of the files present in Reseq/adapters/ (TruSeq_single.fa and TruSeq_single.mat) is named simply
OK, good, so it does not crash and exits as it is supposed to when the automatic detection fails.
Hi,
I am trying out ReSeq with a WGS sequencing dataset (TruSeq) and got the following error message (using the suggested command in the README):
(1) the reads used to generate the bam file are already adapter-trimmed.
(2) GCTCTTCCG is just part of the reads, not an adapter (as shown in the following screenshot).
Any idea what causes this error message? Thank you!
Best
Chunyu