Read names do not match with dual UMIs #46
Comments
To keep read names identical for R1/R2 #46
I just updated the behaviour of UMI preprocessing for the per_index and per_read modes.
Could you please try building fastp with the latest code on master, or download http://opengene.org/fastp/fastp to test?
Thanks for the fast response! I'll hit it on Monday and let you know. Sorry for the poor spelling in the name of the bug, that's a bit embarrassing =P Thanks again for the wonderful tool, loving it so far =)
Any update?
Tried out If so, I'm still just getting the UMI from each read put on that read, not shared across both (e.g. umi1_umi2). Used version 0.12.6, the one at http://opengene.org/fastp/fastp
Can you paste some reads here?
Sure! Here are the input reads, the output reads, and what I was hoping to see:
As an example, I was hoping that the first forward read would have come out with the UMI of both itself and the reverse read, delimited in some way:
That way the read names of the forward and the reverse read would be the same (except for the 1:N:0 part) and BWA would still take it.
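A quick way to confirm that the two output files really do carry identical read names is to compare the part of each header before the first space. The following is only a minimal sketch: the function names `read_ids` and `first_mismatch` are hypothetical helpers (not part of fastp or BWA), and it assumes plain four-line FASTQ records with the mates in the same order in both files.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read ID (header text before the first whitespace) of each record.

    Assumes four-line FASTQ records; gzip is used when the path ends in .gz.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            # Header lines are every fourth line, starting with the first.
            if i % 4 == 0 and line.strip():
                yield line.split()[0]


def first_mismatch(r1_path, r2_path):
    """Return (pair_number, r1_id, r2_id) for the first differing pair, else None.

    A missing mate (files of different length) shows up as None on one side.
    """
    pairs = zip_longest(read_ids(r1_path), read_ids(r2_path))
    for n, (id1, id2) in enumerate(pairs, start=1):
        if id1 != id2:
            return n, id1, id2
    return None
```

If `first_mismatch("out_R1.fq.gz", "out_R2.fq.gz")` returns `None` (file names here are placeholders), the names agree up to the `1:N:0` / `2:N:0` comment, which is the condition described above.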
Could you please confirm you used the latest Seems like your result was obtained with an old version of With
The run above should have used 0.12.6. Sure, downloaded again:
Yup, output looks mostly as you described. Thank you! Checked a handful of reads, here is the output. Of the first four read pairs, two pairs gave output. Were the others just low quality? This sample did lose a great many of the reads due to that...
Ah, yes, it was quality: when I turned off quality filtering, they all came back. Looks like this issue is closed, thank you again!
Thank you for making your wonderful tool!
For dual-UMI experiments, there may/should be different UMI tags on the forward and reverse read of a pair. Is there an option (now or in development) to remove the UMI tags from each read and place them on both of the resultant reads? Downstream tools require that the read names be the same so if there are different UMI tags on the forward and reverse of a pair, it will fail. Instead it should have the read name, followed by a delimiter between the forward and reverse UMI tags.
For instance, in fastq_1.fq.gz
read_1_name:etc:etc:etc:etc:etc:etc:read_1_tagread_2_tag
And in the pair, fastq_2.fq.gz
read_2_name:etc:etc:etc:etc:etc:etc:read_1_tagread_2_tag
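The requested behaviour can be sketched in a few lines. This is an illustration of the idea only, not fastp's implementation: the function name `tag_pair_with_umis`, the fixed `umi_len`, the `_` delimiter, and the assumption that each UMI sits at the start of its read are all choices made for the example.

```python
def tag_pair_with_umis(name1, seq1, qual1, name2, seq2, qual2, umi_len, sep="_"):
    """Move the leading UMI of each mate into a shared tag on both read names.

    Returns two (name, sequence, quality) tuples, one per mate. The UMI bases
    and their quality values are trimmed from the reads.
    """
    umi1, umi2 = seq1[:umi_len], seq2[:umi_len]
    tag = f"{umi1}{sep}{umi2}"  # same tag for both mates: forward UMI first

    def add_tag(name):
        # Keep the comment (e.g. "1:N:0:...") after the space untouched so
        # only the read ID itself gains the UMI tag.
        read_id, _, comment = name.partition(" ")
        tagged = f"{read_id}:{tag}"
        return f"{tagged} {comment}" if comment else tagged

    return (
        (add_tag(name1), seq1[umi_len:], qual1[umi_len:]),
        (add_tag(name2), seq2[umi_len:], qual2[umi_len:]),
    )
```

Because both mates receive the same `umi1_umi2` tag, the read IDs stay identical across the pair and only the comment after the space differs.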