Chunk specific runtime artefacts in clusterrmdup #229

Closed
pontushojer opened this issue Sep 1, 2020 · 3 comments · Fixed by AfshinLab/BLR#39

Comments

@pontushojer
Collaborator

The runtime of the clusterrmdup step (find_clusterdups once PR AfshinLab/BLR#30 merges) is much longer for chrY than for other chromosomes. See the graphs below, generated from data in /proj/uppstore2018173/private/pontus/runs/200819._synchronise-merges_rerun. I have also seen this phenomenon in other runs.

Screenshot 2020-09-01 at 14 39 12

Runtime vs mean coverage for each chromosome. For chrM this is the sum of all small contigs that make up this "chunk".

Screenshot 2020-09-01 at 14 39 30

Runtime vs total contig length for each chromosome. For chrM this is the sum of all small contigs that make up this "chunk".

From these figures it is clear that chrY, for some reason, takes longer than would be predicted from coverage and contig length. Chr16 also breaks this pattern somewhat.

What could be the reason for this?
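To make "longer than predicted" a bit more concrete, here is a minimal sketch of one way to flag such chunks. It assumes runtime is roughly proportional to a single size measure (for example contig length, or length times coverage) and flags chunks whose runtime per unit of size is well above the median. The function name `flag_slow_chunks`, the `factor` cutoff and all numbers are made up for illustration; they are not taken from the BLR pipeline or from the run above.

```python
from statistics import median


def flag_slow_chunks(runtimes, sizes, factor=2.0):
    """Return the chunks whose runtime per unit of size is unusually high.

    runtimes: dict mapping chunk name -> runtime (any unit, e.g. seconds)
    sizes:    dict mapping chunk name -> size measure (e.g. contig length)
    factor:   hypothetical cutoff; a chunk is flagged when its rate exceeds
              factor times the median rate over all chunks
    """
    # Runtime per unit of size for every chunk.
    rates = {name: runtimes[name] / sizes[name] for name in runtimes}
    # The median is used as the "typical" rate because it is not pulled
    # upwards by the few slow chunks we are trying to detect.
    typical = median(rates.values())
    return sorted(name for name, rate in rates.items() if rate > factor * typical)


# Made-up placeholder values, NOT measurements from any real run.
runtimes = {"chrA": 100.0, "chrB": 52.0, "chrC": 24.0, "chrD": 90.0}
sizes = {"chrA": 200.0, "chrB": 100.0, "chrC": 50.0, "chrD": 30.0}

print(flag_slow_chunks(runtimes, sizes))
```

With these placeholder values only chrD is flagged. A plain ratio ignores any fixed per-chunk overhead, so for very small chunks (such as the chrM chunk of small contigs) a fit with an intercept might be more appropriate.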

@marcelm
Collaborator

marcelm commented Sep 1, 2020

chr1 also doesn’t follow the pattern. Or is this an artifact?

@pontushojer
Collaborator Author

Yeah, it's true, chr1 also takes longer than expected. It was not as striking as chrY, but it might still relate to a shared issue.

@pontushojer
Collaborator Author

I checked a separate dataset and compared against all other rules that are run per chunk. Here chr1, chr16, chr21 (somewhat less so) and chrY stick out from the rest (see the red trace for clusterrmdup).

Screenshot 2020-09-03 at 17 50 42

@pontushojer pontushojer changed the title chrY takes long to process in step clusterrmdup Chunk specific runtime artefacts in clusterrmdup Sep 3, 2020