make duplicate handling configurable #47

@sreichl


Regarding more regions: I checked the alignment step (https://github.com/epigen/atacseq_pipeline/blob/main/workflow/rules/processing.smk) and, as you suspect, we flag duplicates but don't remove them. I can't recall a specific argument for or against this, so I cannot claim it was intentional; on the other hand, most of the commands are quite specific and deliberate. From what I understand and have read, duplicate removal has pros and cons (pro: technical bias is removed; con: real biological signal may be removed; the usual tradeoff). My hope is that the signal stays the same either way, especially with many samples, filtering, and normalization.
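To make the "flag but don't remove" distinction concrete, here is a minimal sketch of how a configurable switch could translate into a read filter. It relies only on the SAM specification's duplicate bit (0x400, i.e. 1024), which is what duplicate-marking tools set. The function names and the `remove_duplicates` switch are hypothetical and are not taken from the pipeline.

```python
# SAM flag bit for "PCR or optical duplicate" (0x400 in the SAM specification).
DUPLICATE_FLAG = 0x400


def is_duplicate(flag: int) -> bool:
    """Return True if a read's SAM flag has the duplicate bit set."""
    return bool(flag & DUPLICATE_FLAG)


def exclude_flags(remove_duplicates: bool, base_exclude: int = 0) -> int:
    """Build the bitmask of flags to exclude, e.g. for `samtools view -F`.

    `remove_duplicates` is a hypothetical config switch; `base_exclude`
    holds any bits that are excluded regardless (e.g. 4 for unmapped reads).
    """
    if remove_duplicates:
        return base_exclude | DUPLICATE_FLAG
    return base_exclude
```

With the switch off, flagged duplicates stay in the BAM file (the current behaviour described above); with it on, the returned mask would drop them at filtering time.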

MACS2 removes duplicates automatically during peak calling, up to a threshold: by default (--keep-dup 1) it keeps one read per location (same coordinates and strand). The duplicates MACS2 removes are consistent with those marked by samblaster.
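If duplicate handling became configurable, the MACS2 side could be exposed through its existing `--keep-dup` option, which accepts `auto`, `all`, or an integer (default 1). The helper below is only a sketch: the function name and the idea of feeding it from a config value are assumptions, and rejecting integers below 1 is a choice made here, not documented MACS2 behaviour.

```python
def macs2_keep_dup_args(keep_dup="1"):
    """Build the `--keep-dup` arguments for a `macs2 callpeak` command line.

    `keep_dup` is a hypothetical config value: "auto", "all", or a
    positive integer giving the maximum number of reads kept per location.
    """
    value = str(keep_dup)
    if value not in ("auto", "all"):
        # Assumption of this sketch: only positive integers are meaningful.
        if not value.isdigit() or int(value) < 1:
            raise ValueError(f"unsupported keep_dup value: {keep_dup!r}")
    return ["--keep-dup", value]
```

Passing `"all"` would disable MACS2's own duplicate removal, which only makes sense if duplicates were already handled (or deliberately kept) upstream.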

Labels: enhancement
