fix: simpler three prime QuantSeq cutadapt setup #78

dlaehnemann · 2023-08-17T15:21:09Z

I have tested this on actual QuantSeq data, and the cutadapt setup in this PR yields largely the same results as the more complicated three-step setup previously recommended by Lexogen. It seems to be a bit more aggressive in removing poly-A tails, but this should not affect results much, and it greatly simplifies the workflow setup.

manuelphilip

I have a question about the cutadapt step, which I raised in a review comment on the corresponding section.

dlaehnemann · 2023-09-14T11:12:25Z

We have now tested this on some current QuantSeq data, and the results seem to add up, even though the cutadapt log file does not report any specific --poly-a statistics. We used cutadapt 4.4 with Python 3.10.12 and these new recommended settings (via the standard cutadapt workflow rule):

--cores 8 -a r1adapter=AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC;min_overlap=7;max_error_rate=0.005 --minimum-length 33 --nextseq-trim=20 --poly-a

In this instance, we get:

=== Summary ===

Total reads processed:              11,901,529
Reads with adapters:                 1,680,520 (14.1%)

== Read fate breakdown ==
Reads that were too short:             461,240 (3.9%)
Reads written (passing filters):    11,440,289 (96.1%)

Total basepairs processed: 1,202,054,429 bp
Quality-trimmed:              10,535,736 bp (0.9%)
Total written (filtered):  1,035,645,758 bp (86.2%)
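As a quick sanity check on the read-level numbers in this summary, the following minimal Python sketch parses the read counts and confirms that discarded and written reads add up to the total. The `parse_counts` helper and the `SUMMARY` string are illustrative names introduced here, not part of cutadapt or the workflow; the values are copied from the log above.

```python
import re

# Read-count lines copied from the cutadapt summary above.
SUMMARY = """\
Total reads processed:              11,901,529
Reads with adapters:                 1,680,520 (14.1%)
Reads that were too short:             461,240 (3.9%)
Reads written (passing filters):    11,440,289 (96.1%)
"""


def parse_counts(text):
    """Map each 'label: 1,234' summary line to an integer count (hypothetical helper)."""
    counts = {}
    for line in text.splitlines():
        match = re.match(r"^(.+?):\s+([\d,]+)", line)
        if match:
            counts[match.group(1)] = int(match.group(2).replace(",", ""))
    return counts


counts = parse_counts(SUMMARY)
total = counts["Total reads processed"]
too_short = counts["Reads that were too short"]
written = counts["Reads written (passing filters)"]

# Every processed read is either discarded as too short or written.
assert too_short + written == total

for label in ("Reads with adapters", "Reads that were too short",
              "Reads written (passing filters)"):
    print(f"{label}: {100 * counts[label] / total:.1f}%")
```

The recomputed percentages match the ones printed by cutadapt (14.1%, 3.9% and 96.1%).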

Even when accounting for the adapter-trimmed basepairs recorded in the log, this leaves the following number of basepairs unaccounted for:

Total basepairs processed:  1,202,054,429 bp
Quality-trimmed:          -    10,535,736 bp
adapter trimmed:          -    48,066,270 bp
Total written (filtered): - 1,035,645,758 bp (86.2%)
============================================
--poly-a + too short          107,806,665 bp (8.97%)

As noted, these must be basepairs lost by discarding reads that were too short, plus possibly basepairs trimmed by the --poly-a option. So I guess these numbers are OK and we can go ahead and merge this PR.
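The subtraction in the table above can be reproduced with a minimal Python sketch. The variable names are illustrative, and the values are copied from the log and the table. The final bound rests on an assumption that is not confirmed by the log: that the quality-trimmed and adapter-trimmed totals also include bases trimmed from reads that were later discarded, so that each too-short read contributes at most 32 remaining bases (one less than `--minimum-length 33`) to the gap.

```python
# Values copied from the cutadapt log and the table above.
total_bp = 1_202_054_429
quality_trimmed_bp = 10_535_736
adapter_trimmed_bp = 48_066_270
written_bp = 1_035_645_758
too_short_reads = 461_240
minimum_length = 33  # from --minimum-length 33

# Basepairs not explained by quality trimming, adapter trimming or output.
gap_bp = total_bp - quality_trimmed_bp - adapter_trimmed_bp - written_bp
gap_pct = 100 * gap_bp / total_bp
print(f"unaccounted: {gap_bp:,} bp ({gap_pct:.2f}%)")

# Assumption (see lead-in): a discarded read has at most minimum_length - 1
# bases left when it is filtered out, which caps the too-short contribution.
max_too_short_bp = too_short_reads * (minimum_length - 1)
min_other_bp = gap_bp - max_too_short_bp
print(f"too-short reads explain at most {max_too_short_bp:,} bp")
print(f"other trimming (presumably --poly-a) explains at least {min_other_bp:,} bp")
```

Under that assumption, discarded reads can explain only a small part of the gap, which would be consistent with the --poly-a option removing most of the remaining basepairs.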

🤖 I have created a release *beep* *boop*

---

## [2.5.2](v2.5.1...v2.5.2) (2023-09-14)

### Bug Fixes

* simpler three prime QuantSeq cutadapt setup ([#78](#78)) ([ecc9ab7](ecc9ab7))
* update samtools.yaml to latest `1.17` and update github actions ([#75](#75)) ([0fe7948](0fe7948))

### Performance Improvements

* bump datavzrd wrapper to 2.6.0 and general bug fixes ([#80](#80)) ([657c465](657c465))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

dlaehnemann added 8 commits August 17, 2023 14:56

clean up config/README.md

e21ddbc

fix: replace 3' specific cutadapt rules with a specific cutadapt setup in the config.yaml for the 3' test

8e8a290

fix: amend config/README.md to document QuantSeq setup.

d02c057

snakefmt

84244d1

correct config/README.md header levels

3f6a33c

remove import of deleted trim_3prime.smk from Snakefile

71b8743

fix quoting of cutadapt command-line arguments

0fcd88b

update to latest cutadapt wrapper to get cutadapt v4.4 with `--poly-a` flag available

0cbe87f

manuelphilip reviewed Aug 31, 2023

workflow/rules/trim.smk (resolved)

dlaehnemann commented Aug 31, 2023

config/README.md (resolved)

confirm deletion of workflow/rules/trim_3prime.smk while merging in main branch

295eb43

manuelphilip approved these changes Sep 7, 2023

dlaehnemann merged commit ecc9ab7 into main Sep 14, 2023
6 checks passed

dlaehnemann deleted the fix-simpler-three-prime-cutadapt branch September 14, 2023 11:12

github-actions bot mentioned this pull request Sep 1, 2023

chore(main): release 2.5.2 #76

Merged