performance: more aggressive time scaling in sample_qc #297

Merged
jaamarks merged 2 commits into default from issue-282-improve-time-allocation on Jun 4, 2024


jaamarks
Collaborator

@jaamarks jaamarks commented Jun 4, 2024

This PR makes two improvements to resource allocation for rules in the sample_qc sub-workflow.

  • Updated the mem_mb logic to scale memory allocation based on the input file size (input.size_mb).
  • Implemented a more aggressive scaling mechanism for wall-time allocation, so that the workflow completes successfully sooner when processing large datasets (>90K samples).

Fixes #282

This commit addresses the need for increased resource allocation,
particularly wall-time, for the concordance rules when processing large
samples.

- Scaling Time Allocation: We've implemented a more aggressive scaling
  mechanism for the time allocation in the BIG_TIME dictionary, which
  gets applied to the resource section of concordance rules. This
  ensures sufficient resources for processing large datasets
  (e.g., exceeding 90K samples).

- Consistent application: We've also applied this updated scaling
  to the sample_concordance_king rule to ensure consistent resource
  allocation across all concordance rules. In particular, we added the
  more aggressive memory and time allocation to this rule.
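
The escalation described above can be sketched in plain Python. This is only an illustration under stated assumptions: the commit message names a `BIG_TIME` dictionary but does not show its keys or values, so the table below (`EXAMPLE_BIG_TIME`, keyed by retry attempt, with hours as values) and the helper `big_time_hr` are hypothetical.

```python
# Hypothetical wall-time table: retry attempt -> hours.
# The real BIG_TIME values are not shown in this PR; these are placeholders.
EXAMPLE_BIG_TIME = {1: 8, 2: 24, 3: 48}


def big_time_hr(attempt: int) -> int:
    """Return wall-time (hours) for a retry attempt.

    Attempts beyond the last table entry reuse the largest value,
    and attempts below 1 are treated as the first attempt.
    """
    last_attempt = max(EXAMPLE_BIG_TIME)
    clamped = min(max(attempt, 1), last_attempt)
    return EXAMPLE_BIG_TIME[clamped]


# In a Snakemake rule, a function like this could be wired in through a
# resource callable (the resource name `time_hr` is an assumption):
#
#   resources:
#       time_hr=lambda wildcards, attempt: big_time_hr(attempt),

if __name__ == "__main__":
    for attempt in (1, 2, 3, 4):
        print(attempt, big_time_hr(attempt))
```

Steeper steps between attempts mean a job that times out on its first try reaches a workable limit after fewer resubmissions, which is the motivation stated for large datasets.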

Updates `mem_mb` logic to allocate memory based on input file size
(`input.size_mb`) for more efficient resource utilization.

This data-driven approach avoids over-allocation compared to the
previously used dictionary with fixed, potentially excessive values.

By dynamically adjusting memory based on input size, we ensure sufficient
resources for processing while avoiding unnecessary memory requests, since
wall-time (not memory) is typically the bottleneck for these specific rules.
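
A minimal sketch of size-based memory allocation follows. Snakemake exposes `input.size_mb` and `attempt` to resource callables; everything else here is assumed for illustration: the helper name `mem_mb_from_input`, the multiplier of 2, the 1024 MB floor, and the per-attempt scaling are hypothetical and not taken from this PR.

```python
import math


def mem_mb_from_input(size_mb: float, attempt: int = 1,
                      factor: float = 2.0, floor_mb: int = 1024) -> int:
    """Estimate memory (MB) from total input size.

    The request grows with input size and with each retry attempt,
    and never drops below `floor_mb`. All defaults are illustrative.
    """
    estimate = math.ceil(size_mb * factor * max(attempt, 1))
    return max(floor_mb, estimate)


# In a Snakemake rule this could be used as a resource callable:
#
#   resources:
#       mem_mb=lambda wildcards, input, attempt:
#           mem_mb_from_input(input.size_mb, attempt),

if __name__ == "__main__":
    print(mem_mb_from_input(100))              # small input: floor applies
    print(mem_mb_from_input(4000, attempt=2))  # larger input, second attempt
```

Compared with a fixed lookup table, the request tracks the data actually being processed, so small inputs are not assigned the memory sized for the largest case.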
@jaamarks jaamarks changed the title from "Improve resource allocation" to "performance: more aggressive time scaling in sample_qc subworkflow" Jun 4, 2024
@jaamarks jaamarks changed the title from "performance: more aggressive time scaling in sample_qc subworkflow" to "performance: more aggressive time scaling in sample_qc" Jun 4, 2024
@jaamarks jaamarks force-pushed the issue-282-improve-time-allocation branch from 11e2d2a to f672fba Compare June 4, 2024 15:30
@jaamarks jaamarks merged commit 986271f into default Jun 4, 2024
2 checks passed
@jaamarks jaamarks deleted the issue-282-improve-time-allocation branch September 16, 2024 18:51
Development

Successfully merging this pull request may close these issues.

v1.4.0 testing on 90k sample dataset: rule sample_concordance_king exceeds time limit