Non-anchored linked adapters #224
The next version of cutadapt will allow anchoring the 3'-half of a linked adapter by appending a |
That's good, but creates an inconsistency in the user experience. For the 5' adapter, it is anchored without any special characters because;
Would it be better to have unanchored adapters by default, unless the user specifies Also, have you considered the case where neither adapter is anchored to the beginning or end of a read, while still being able to filter on both adapters being matched? I notice that |
I cannot change the default because the program needs to be backwards compatible. Linked adapters were added to cutadapt in version 1.10 already and some people surely rely on the current behavior. Regarding the inconsistency, there is a secret plan behind it: I intended to move away from I have to admit I still don’t know why non-anchored 5' adapters appear at all. I also have no experience with CRISPR data. Perhaps you can write a few sentences about what CRISPR reads look like? For example, when you want a non-anchored 5' adapter, is it because the 5' adapter has some bases preceding it, or is it because it may appear in a degraded form? |
Here is an idea that is backwards compatible: I could add the syntax |
I like that idea. The reason both can be unanchored is that they're not really adapters but some complicated plasmid construct. The basic structure of a CRISPR GeCKO library short read is So, the flanking sequences (i.e. adapters in So, you might get a lot more of these requests next year. It is not the purpose that |
Thanks for the explanation! I need a handful of reads from a real dataset for testing. Do you know of a suitable (publicly available) dataset or would you be willing yourself to "donate" a few reads? They would become part of cutadapt’s test suite so it must be ok for them to become public. |
I have sent the reads by e-mail. |
I’ve implemented this now. The syntax is as described above ( This will be part of cutadapt 1.13, but it would be great if you could test it before the release. Let me know if you don’t know how to install from the Git repo. I’ve also marked the feature as 'tentative', which means that I may change it in a backwards-incompatible way in 1.14. But after that, it will be backwards compatible. Let me know what you think! P.S: I just saw that the statistics are a bit weird. I’ll fix that later. |
Yes, it works as expected on a complete input file. I don't mind updating my scripts for software interface changes. |
But, when I also use |
I see. This is not caused by There is no way to change this at the command-line level, yet. One possibility to deal with this - until I have implemented a solution - is to run cutadapt twice: In the first run, use |
Let’s continue discussion in #256 since the original issue here - as I understood it originally - was to implement non-anchored linked adapters, which has been done. |
I have CRISPR reads where there is a 5' promoter sequence and a 3' plasmid sequence. I was interested in the --untrimmed-output option, but wasn't sure how it works exactly from the information in the user guide. I want to retain only reads which had both 5' and 3' trimming done, not one flanking sequence but not the other. If both -g and -a are provided, what does --untrimmed-output really mean? Could there also be a way for -a to enforce that both adapters must be found if the linked adapter pattern is used? Currently, "... the 3’ adapter is optional".