Error! not matched in input files #21

Closed
sroener opened this issue Jun 16, 2021 · 2 comments

sroener commented Jun 16, 2021

Hi,

I am reprocessing some samples and want to use NGmerge to merge the paired-end (PE) reads. To do this, I convert an existing .bam file to FASTQ files and use them as input for NGmerge. I run the program like this:

NGmerge -w resources/qual_profile.txt -u 41 -n 8 -z -1 FILE_R1.fastq.gz -2 FILE_R2.fastq.gz -o FILE_merged.fastq.gz -f FILE_nonmerged -l FILE.log

For most samples, everything works like a charm, but for some I get errors like this:
Error! @HISEQ_172:2:2211:1315:83788 BC:Z:NAGCGTTANGAGTCAA: not matched in input files

Any idea what the problem might be?

jsh58 (Owner) commented Jun 17, 2021

This error occurs when the paired reads' headers do not match. As the README states:

The input files must list the reads in the same order. The program requires that the paired reads' headers match, at least up to the first space character (or whatever alternative character is specified by -t).
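
To locate where two input files first fall out of step, something like the following could help. It is only a minimal sketch and not part of NGmerge: the function names `read_names` and `first_mismatch` are made up for illustration, the `sep` parameter merely mirrors the idea of the `-t` character from the README quote, and it assumes plain four-line FASTQ records (optionally gzipped).

```python
import gzip
from itertools import zip_longest


def read_names(path, sep=" "):
    """Yield each FASTQ header truncated at the first `sep`.

    Assumes four lines per record (hypothetical helper, not NGmerge code).
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 == 0:  # header line of each record
                yield line.rstrip("\n").split(sep, 1)[0]


def first_mismatch(path_r1, path_r2, sep=" "):
    """Return (record_index, name_r1, name_r2) for the first pair of
    headers that differ, or None if both files agree record by record.

    A file that ends early shows up as a name paired with None.
    """
    pairs = zip_longest(read_names(path_r1, sep), read_names(path_r2, sep))
    for index, (name1, name2) in enumerate(pairs):
        if name1 != name2:
            return index, name1, name2
    return None
```

Calling `first_mismatch("FILE_R1.fastq.gz", "FILE_R2.fastq.gz")` would return the zero-based record index at which the headers diverge, which can then be inspected in both files.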

sroener (Author) commented Jun 17, 2021

Thanks for the fast reply and for pointing out when the error occurs.

Apparently, I had the misconception that these reads would also be written to the file specified with the -f flag.

Interestingly, if I look at the _R1 and _R2 files and search for the read headers mentioned in the error message, I find identical headers in both files.

The log files show what I would consider normal NGmerge behaviour, and it takes a while for the error to occur, so I assume that only a few reads have non-matching headers. Any suggestions on how to investigate this further?

Edit:
Additional filtering for singleton reads solved the problem. I am still confused why some of the read IDs from the error messages are present in both R1 and R2 and can be found by grep.
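
One plausible explanation, not confirmed in this thread, is that a single singleton shifts every later record out of step, so a header can exist in both files and still fail a record-by-record comparison. The toy sketch below illustrates that idea with made-up read names; `lockstep_mismatches` and `drop_singletons` are hypothetical helpers, not NGmerge functions.

```python
def lockstep_mismatches(names_r1, names_r2):
    """Compare names position by position and list the differing pairs."""
    return [
        (index, name1, name2)
        for index, (name1, name2) in enumerate(zip(names_r1, names_r2))
        if name1 != name2
    ]


def drop_singletons(names_r1, names_r2):
    """Keep only names present in both lists, preserving each list's order."""
    shared = set(names_r1) & set(names_r2)
    kept_r1 = [name for name in names_r1 if name in shared]
    kept_r2 = [name for name in names_r2 if name in shared]
    return kept_r1, kept_r2


# Made-up example: the mate of "read2" is missing from R2.
r1 = ["read1", "read2", "read3", "read4"]
r2 = ["read1", "read3", "read4"]

# "read3" occurs in both lists (so a grep would find it in both files),
# yet it takes part in a mismatch because the lists are out of step.
print(lockstep_mismatches(r1, r2))

# After removing the singleton, the lists line up again.
filtered_r1, filtered_r2 = drop_singletons(r1, r2)
print(lockstep_mismatches(filtered_r1, filtered_r2))
```

This would be consistent with singleton filtering resolving the error, although the thread itself does not establish the cause.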

sroener closed this as completed Jun 23, 2021