Mate mismatch on multi-file mode #28

Closed
billytaj opened this issue Oct 16, 2018 · 6 comments

@billytaj

Hi, I ran into an issue using AdapterRemoval on a dataset, in a very odd situation:

  • AdapterRemoval was run in multi-file mode
    I was led to believe that paired-end mode would automatically treat --file1 and --file2 as the forward and reverse reads, as it has in all other cases, but it seems to be choking.
    What would cause something like this?

The data in question are part of the NCBI SRA:
project: PRJNA389280
SRR5947944_1.fastq
SRR5947944_2.fastq

The read IDs are the same in each file, so the only way to tell forward from reverse is the filename. That has been fine for all other data up to now, just not for this set.

@MikkelSchubert
Owner

MikkelSchubert commented Oct 16, 2018 via email

@billytaj
Author

Trimming paired end reads ...
Error in thread:
Pair contains reads with mismatching names:

  • 'SRR5947945.100791'
  • 'SRR5947945.101693'

Note that AdapterRemoval by determines the mate numbers as
the digit found at the end of the read name, if this is
preceeded by the character '/'; if these data makes use of a
different character to separate the mate number from the
read name, then you will need to set the --mate-separator
command-line option to the appropriate character.
Error in thread:
Pair contains reads with mismatching names:

  • 'SRR5947945.101400'
  • 'SRR5947945.100533'

Note that AdapterRemoval by determines the mate numbers as
the digit found at the end of the read name, if this is
preceeded by the character '/'; if these data makes use of a
different character to separate the mate number from the
read name, then you will need to set the --mate-separator
command-line option to the appropriate character.

I got this error message when running AdapterRemoval in multi-file mode.
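
For reference, the rule described in the note above can be sketched in a few lines of Python. This is only an illustration of the stated behavior, not AdapterRemoval's actual code: `split_mate` and `names_match` are hypothetical names, and the sketch assumes the read name has already been separated from any trailing description.

```python
def split_mate(name, mate_separator="/"):
    """Return (base_name, mate_number); mate_number is None if no suffix is found."""
    # The mate number is the final digit, but only when it is directly
    # preceded by the separator character (default '/').
    if len(name) >= 2 and name[-1].isdigit() and name[-2] == mate_separator:
        return name[:-2], int(name[-1])
    return name, None


def names_match(name1, name2, mate_separator="/"):
    """Compare two read names after stripping any mate suffix."""
    base1, _ = split_mate(name1, mate_separator)
    base2, _ = split_mate(name2, mate_separator)
    return base1 == base2
```

Under this sketch, the names in the error above carry no '/' suffix, so they are compared whole, and 'SRR5947945.100791' and 'SRR5947945.101693' do not match.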

@MikkelSchubert
Owner

MikkelSchubert commented Oct 16, 2018 via email

@billytaj
Author

I am using --file1 and --file2, where each entry is an absolute path to a file (forward and reverse, respectively).

@billytaj billytaj reopened this Oct 16, 2018
@MikkelSchubert
Owner

MikkelSchubert commented Oct 16, 2018 via email

@billytaj
Author

billytaj commented Oct 16, 2018

I produced the files using fasterq-dump. I also keep only the inner join of the two files, so both files contain exactly the same read IDs.
Please don't treat the exact output above as a guidepost; I received two "Error in thread" messages about mismatching pairs.
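
A minimal sketch of such an inner join, for illustration only: `read_fastq` and `inner_join` are hypothetical helpers, not the script I actually ran. It assumes well-formed four-line FASTQ records and holds both files in memory. Both outputs are emitted in the order of the first file, so mates end up at the same position in each file.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (read_id, record_lines) from an iterable of FASTQ lines."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))  # header, sequence, '+', qualities
        if len(record) < 4:
            break
        # Read ID: header without the leading '@', up to the first whitespace.
        read_id = record[0][1:].split()[0]
        yield read_id, record


def inner_join(lines1, lines2):
    """Keep only records whose ID occurs in both inputs, in file-1 order."""
    recs1 = dict(read_fastq(lines1))
    recs2 = dict(read_fastq(lines2))
    shared = [read_id for read_id in recs1 if read_id in recs2]
    return [recs1[r] for r in shared], [recs2[r] for r in shared]
```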

I solved the problem by appending /1 and /2 to the read names before passing the files to AdapterRemoval, though I feel this is an unnecessary step.
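
The renaming step can be sketched roughly as below. `add_mate_suffix` is a hypothetical helper written for illustration, and it assumes four-line FASTQ records with a non-empty header line. The suffix is attached to the read ID itself, before any description that follows it.

```python
def add_mate_suffix(lines, mate):
    """Yield FASTQ lines with '/<mate>' appended to each read ID."""
    for index, line in enumerate(lines):
        if index % 4 == 0:  # first line of each four-line record is the header
            fields = line.rstrip("\n").split(maxsplit=1)
            fields[0] += f"/{mate}"  # '@name' becomes '@name/1' or '@name/2'
            yield " ".join(fields) + "\n"
        else:
            yield line
```

Applied with mate 1 to the forward file and mate 2 to the reverse file, this yields names ending in /1 and /2, which is the form the error message says AdapterRemoval looks for by default.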
