Marking, removing, or keeping duplicates #47

@gh-byounggreenwald

Hello again --

I have been reading through the DeepSomatic paper again and don't see any mention of marking or discarding duplicates. What is the recommended way to handle duplicates in DeepSomatic? If we mark them, is that annotation used? If we drop them, are we hamstringing the algorithm because it was trained with duplicates present and its CNN has therefore learned to leverage them in certain ways?

I would assume it works the same way as in DeepVariant, where duplicates are excluded (and make little difference; see google/deepvariant#384), but I wanted to confirm since the paper doesn't mention best practices.
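
To be concrete about what I mean by "marking" versus "dropping": in SAM/BAM, a marked duplicate is a read whose FLAG has bit 0x400 set, while a dropped duplicate is removed from the file entirely. Here is a minimal sketch of the two options. The helper names and the example flag values are made up for illustration, and this says nothing about what DeepSomatic itself does with the flag.

```python
# SAM FLAG bit 0x400 (decimal 1024) means "PCR or optical duplicate" per the SAM spec.
DUPLICATE_FLAG = 0x400


def is_duplicate(flag: int) -> bool:
    """Return True if the duplicate bit is set in a SAM FLAG value."""
    return bool(flag & DUPLICATE_FLAG)


def mark_duplicate(flag: int) -> int:
    """'Marking': set the duplicate bit but keep the read in the file."""
    return flag | DUPLICATE_FLAG


def drop_duplicates(flags):
    """'Dropping': remove reads whose duplicate bit is set."""
    return [f for f in flags if not is_duplicate(f)]


# Hypothetical FLAG values: two ordinary paired reads and their marked duplicates.
flags = [99, 147, mark_duplicate(99), mark_duplicate(147)]

print([is_duplicate(f) for f in flags])
print(drop_duplicates(flags))
```

With marking, all four reads stay in the BAM and a caller can choose whether to skip the flagged ones. With dropping, that choice is no longer available downstream.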

Thanks :)
