How to handle reads with alternative adapters or primer sequences #744

Closed
fxfish opened this issue Nov 30, 2023 · 4 comments

Comments

fxfish commented Nov 30, 2023

Hi,
It's a powerful tool for data analysis.
I have a question about trimming primers from paired-end amplicon reads. The sequencing data (e.g. the forward reads) is a mix of reads that look like AAATTTaaaaaaaaCCCGGG and GGGCCCtttttttttTTTAAA. I want all output reads to look like aaaaaaaa, which means recognizing the ‘GGGCCC...TTTAAA’ reads and reverse-complementing them.
Currently I need three steps to achieve this: one run using -g ^AAATTT -G ^CCCGGG, another using -g ^CCCGGG -G ^AAATTT, and finally combining the R1.fq from the first run and the R2.fq from the second run into the final R1.fq.
Can I do this with a single command, e.g. -g -g or -g XXX|XXX?
Thanks.

Owner

marcelm commented Dec 4, 2023

This functionality doesn’t exist yet, but I think this is actually issue #561; that is, it would be solved by implementing --revcomp for paired-end data. --revcomp already works for single-end reads, but I wasn’t sure whether it makes sense to implement it for paired-end reads.

It would work as follows. The idea is that you write -g ^AAATTT -G ^CCCGGG --revcomp. Then, for each read pair, the following is done:

  1. Normal adapter search is done as if --revcomp had not been given
  2. R1 and R2 are swapped and the search is done again
  3. The results from both searches are compared: if the swapped version resulted in a better match, the swapped version is output; otherwise, the unswapped version is output.

Note that swapping R1 and R2 is equivalent to swapping the -g and -G options.
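The three steps above can be sketched in Python. This is only an illustration of the idea, not Cutadapt's actual code: the function names are hypothetical, matching is exact (Cutadapt itself tolerates errors), and "better match" is assumed here to mean "more adapter bases matched across the pair".

```python
def anchored_match_length(read: str, adapter: str) -> int:
    """Return len(adapter) if the read starts with the adapter, else 0.

    Simplification: exact, case-insensitive prefix match only.
    """
    return len(adapter) if read.upper().startswith(adapter.upper()) else 0


def trim_pair(r1: str, r2: str, adapter1: str, adapter2: str):
    """Trim anchored 5' adapters from a pair.

    adapter1 plays the role of -g (searched in r1),
    adapter2 plays the role of -G (searched in r2).
    Returns (score, trimmed_r1, trimmed_r2).
    """
    n1 = anchored_match_length(r1, adapter1)
    n2 = anchored_match_length(r2, adapter2)
    return n1 + n2, r1[n1:], r2[n2:]


def trim_pair_revcomp(r1: str, r2: str, adapter1: str, adapter2: str):
    """Return (out_r1, out_r2, was_swapped) following the three steps."""
    # Step 1: normal search, as if --revcomp had not been given.
    normal = trim_pair(r1, r2, adapter1, adapter2)
    # Step 2: swap R1 and R2 and search again.
    swapped = trim_pair(r2, r1, adapter1, adapter2)
    # Step 3: output the swapped version only if it matched better
    # (assumed scoring: total number of matched adapter bases).
    if swapped[0] > normal[0]:
        return swapped[1], swapped[2], True
    return normal[1], normal[2], False


# Toy pairs in both orientations, using the adapters from this thread.
pair_a = ("AAATTTaaaaaaaa", "CCCGGGtttttttt")
pair_b = ("CCCGGGtttttttt", "AAATTTaaaaaaaa")
result_a = trim_pair_revcomp(*pair_a, "AAATTT", "CCCGGG")
result_b = trim_pair_revcomp(*pair_b, "AAATTT", "CCCGGG")
```

In this toy setup, both orientations end up with the same trimmed R1 and R2; only the swapped flag differs.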

Can you say whether that would give you the result you want?

Author

fxfish commented Dec 6, 2023

Thank you very much. The --revcomp option does not work for paired-end reads. I can get the results I want by running cutadapt in several steps.

marcelm added a commit that referenced this issue Dec 6, 2023
marcelm added a commit that referenced this issue Dec 6, 2023
marcelm closed this as completed in b4bc46e Dec 6, 2023
Owner

marcelm commented Dec 6, 2023

I have implemented this now. Please update to Cutadapt 4.6. Then you can use --revcomp with paired-end reads.
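As a rough sketch, a paired-end run with --revcomp could be assembled from Python as below. The file names and the helper function are placeholders invented for illustration; the primers are the ones from the original question, and the options used (-g, -G, --revcomp, -o, -p) are standard Cutadapt options. This assumes Cutadapt 4.6 or later is installed.

```python
import shlex


def build_revcomp_command(r1_in, r2_in, r1_out, r2_out,
                          fwd_primer="^AAATTT", rev_primer="^CCCGGG"):
    """Build a cutadapt argument list for a paired-end --revcomp run.

    Hypothetical helper: it only assembles the arguments and runs nothing.
    """
    return [
        "cutadapt",
        "-g", fwd_primer,   # anchored 5' primer expected in R1
        "-G", rev_primer,   # anchored 5' primer expected in R2
        "--revcomp",        # also try the pair with R1 and R2 swapped
        "-o", r1_out,       # output file for R1
        "-p", r2_out,       # output file for R2
        r1_in, r2_in,
    ]


# Placeholder file names, not taken from the thread.
cmd = build_revcomp_command("in.R1.fq", "in.R2.fq", "out.R1.fq", "out.R2.fq")
print(shlex.join(cmd))  # shell-quoted command line for inspection
```

To actually run it, the list can be passed to subprocess.run(cmd, check=True).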

(The release will be available on Bioconda in probably one or two hours.)

Author

fxfish commented Dec 6, 2023

wow, nice, I will update it on conda, thanks.
