Filter FASTQ by length - report #35

hoelzer · 2021-01-26T11:42:32Z

As an initial step, FASTQ files are filtered by length, and if the file size is too small, the FASTQ is not processed any further. It would be good to have this reported somehow.

E.g. I just tested a FAST5 run (V3 primers) that resulted in 24 barcoded FASTQ files, and it seems 9 of them were sorted out and not processed any further.

It would be good to have a TSV with e.g. all IDs and a column that states which were sorted out due to a low number of reads after filtering.
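A rough sketch of what such a report could look like, in Python. This is not how the pipeline does it: the function names, the column names, and the three thresholds (`min_len`, `max_len`, `min_reads`) are all made up for illustration, and it decides by read count rather than by file size. It also assumes plain 4-line FASTQ records, optionally gzipped.

```python
import csv
import gzip
from pathlib import Path


def count_reads_in_length_range(fastq_path, min_len, max_len):
    """Count FASTQ records whose sequence length lies within [min_len, max_len]."""
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    kept = 0
    with opener(fastq_path, "rt") as handle:
        for line_no, line in enumerate(handle):
            # Assumption: 4-line records, so the sequence is the 2nd line of each record.
            if line_no % 4 == 1 and min_len <= len(line.strip()) <= max_len:
                kept += 1
    return kept


def write_filter_report(fastq_paths, report_path, min_len, max_len, min_reads):
    """Write one TSV row per FASTQ: ID, reads left after filtering, and status.

    All thresholds are hypothetical parameters, not the pipeline's real values.
    """
    rows = []
    for path in sorted(fastq_paths, key=str):
        kept = count_reads_in_length_range(path, min_len, max_len)
        status = "processed" if kept >= min_reads else "sorted_out"
        rows.append((Path(path).name, kept, status))

    with open(report_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow(["sample_id", "reads_after_length_filter", "status"])
        writer.writerows(rows)
    return rows
```

Called with the list of barcoded FASTQ files, this would list all 24 barcodes in one table, with the 9 dropped ones marked as `sorted_out` instead of silently missing.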

replikation · 2021-01-26T15:41:11Z

The size threshold for "removal" is so small that it should usually only remove barcodes that were falsely assigned via "guppy demultiplex" with the "one barcode only" option. So I'm not sure if reporting this would confuse more than it helps?

hoelzer · 2021-01-26T15:44:55Z

Okay, I see. Let me do some checks; maybe you are right and we don't need this. It might just confuse people as well when they used 10 barcodes but only get 9 consensus sequences out (e.g. because one barcode did not work well and produced only a handful of reads).

But in such a case, one can also go back to the pycoQC and check the assigned barcode distribution, ...

hoelzer · 2021-01-26T15:45:33Z

I set this to wontfix for now and will check some more data to decide whether we really need such an output table.

hoelzer added the "enhancement" label (New feature or request) Jan 26, 2021

hoelzer added the "wontfix" label (This will not be worked on) Jan 26, 2021

replikation closed this as completed Apr 23, 2021
