potential output problem #126

Closed
retrogenomics opened this issue May 27, 2015 · 5 comments

Comments

@retrogenomics

Hi,
I'm running cutadapt (1.9dev1) like this:
cutadapt -e 0.12 -m 25 --info-file="info.tab" --too-short-output="discarded_tooshort.fastq" --untrimmed-output="discarded_missing_adapter.fastq" -g $linker "trimmed_bc.fastq" -o "trimmed_linker.fastq" > "info.log"

gives me this output:
=== Summary ===

Total reads processed: 5,022
Reads with adapters: 4,999 (99.5%)
Reads that were too short: 392 (7.8%)
Reads written (passing filters): 4,630 (92.2%)

Total basepairs processed: 908,639 bp
Total written (filtered): 693,472 bp (76.3%)

However, when I count the reads in "trimmed_linker.fastq", I actually get 4,607 (4,999 - 392). What are the extra 23 reads counted under 'Reads written (passing filters)'?
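For anyone who wants to reproduce the count, here is a minimal sketch of one way to count the reads in an output file. It assumes an uncompressed FASTQ file with exactly four lines per record (no wrapped sequence lines); the function name `count_fastq_reads` is hypothetical and not part of cutadapt.

```python
def count_fastq_reads(path):
    """Count records in an uncompressed FASTQ file.

    Assumes the common layout of exactly 4 lines per record
    (header, sequence, '+', qualities), with no wrapped lines.
    """
    with open(path) as handle:
        line_count = sum(1 for _ in handle)
    if line_count % 4 != 0:
        raise ValueError(
            f"{path}: {line_count} lines is not a multiple of 4; "
            "the file may be truncated or use wrapped records"
        )
    return line_count // 4
```

Calling `count_fastq_reads("trimmed_linker.fastq")` on the file from the command above would give the number to compare against the summary.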

@marcelm
Owner

marcelm commented May 27, 2015

The 23 reads are the (5022-4999) untrimmed ones. Since they are also written to an output file (--untrimmed-output), they are also counted under the "Reads written (passing filters)" heading.

I agree this is confusing and should be changed, in particular because the reads in --too-short-output are not counted here. Perhaps my recent simplification of the statistics summary simplified things a bit too much. I’ll try to change this soon. Until then, you can calculate "Reads written (passing filters)" minus ("Total reads processed" minus "Reads with adapters") to get the number of reads written to the output file.
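The workaround above is simple arithmetic on three numbers from the summary. As a rough sketch (the function and parameter names are hypothetical, and it only applies to the case described here, where --untrimmed-output is in use):

```python
def reads_in_main_output(total_processed, with_adapters, written_passing_filters):
    """Estimate the reads written to the -o file when --untrimmed-output is used.

    Untrimmed reads are included in "Reads written (passing filters)",
    so they are subtracted again here.
    """
    untrimmed = total_processed - with_adapters
    return written_passing_filters - untrimmed


# Numbers from the summary in this issue:
# 5,022 processed, 4,999 with adapters, 4,630 written (passing filters).
print(reads_in_main_output(5022, 4999, 4630))  # 4607
```

The result matches the 4,607 reads counted in "trimmed_linker.fastq".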

@retrogenomics
Author

Actually, I still get the same number of reads (4630) whether or not I use the --untrimmed-output (or --too-short-output) option.

@marcelm
Owner

marcelm commented May 28, 2015

If you don’t use --untrimmed-output, then both trimmed and untrimmed reads are written to the regular output file, so the number does not change (you should now have 4630 reads in trimmed_linker.fastq). The number should change when you use --discard-untrimmed.

@retrogenomics
Author

Yes, indeed! Thank you for these explanations.

@marcelm
Owner

marcelm commented May 28, 2015

I’ve opened issue #128 to track this. Thanks for reporting.
