Fix memory access violation in partition_with_ram_budget #654
Reference Issues/PRs
What does this implement/fix? Briefly explain your changes.
Fixes a potential crash in the `partition_with_ram_budget` function when the `k_base` parameter exceeds the number of partitions created.

Issue:
When the partitioning algorithm creates fewer partitions than the requested `k_base` value, the code attempts to access cluster centers that don't exist, leading to memory access violations and crashes.

Example (intermittent):
The crash occurs when `k_base=5` is passed but the k-means algorithm converges to only 3 clusters, causing out-of-bounds access in the cluster size estimation step. The issue is intermittent because k-means clustering is non-deterministic, so the number of clusters produced can differ between runs.

Solution:
Added a bounds check to ensure `k_base` never exceeds the actual number of partitions. This prevents out-of-bounds access during cluster size estimation while preserving the intended functionality.
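The idea behind the bounds check can be illustrated with a small, language-neutral sketch. It is written in Python only for readability and is not the code from this PR; the names `clamp_k_base` and `closest_centers` are hypothetical, and the real function may perform the check differently.

```python
def clamp_k_base(k_base: int, num_parts: int) -> int:
    """Limit k_base to the number of partitions that actually exist."""
    if num_parts < 1:
        raise ValueError("need at least one partition")
    # Never request more centers than were created, and never fewer than one.
    return max(1, min(k_base, num_parts))


def closest_centers(distances: list[float], k_base: int) -> list[int]:
    """Return the indices of the k_base nearest centers for one point.

    `distances[i]` is the distance from the point to center i
    (hypothetical input for illustration).
    """
    k = clamp_k_base(k_base, len(distances))
    # Center indices sorted from nearest to farthest.
    order = sorted(range(len(distances)), key=distances.__getitem__)
    # Explicit indexing: without the clamp, range(k_base) could run past
    # the end of `order` when fewer centers exist than requested.
    return [order[i] for i in range(k)]
```

With three centers and `k_base=5`, the clamped call returns all three center indices instead of reading past the end of the list. In a language without bounds checking, the same unclamped read would be a memory access violation rather than an exception.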
Any other comments?
This is a minimal, safe fix that maintains backward compatibility. The change ensures robust behavior when the partitioning algorithm determines that fewer partitions are needed than initially requested via `k_base`.