Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-46401][CORE] Use !isEmpty() on RoaringBitmap instead of getCardinality() > 0 in RemoteBlockPushResolver #44347

Closed
wants to merge 1 commit into from

Conversation

LuciferYang
Copy link
Contributor

@LuciferYang LuciferYang commented Dec 14, 2023

What changes were proposed in this pull request?

This pr use !mapTracker.isEmpty() instead of mapTracker.getCardinality() > 0 in RemoteBlockPushResolver to reduce some computational costs, as the getCardinality() method triggers a recount every time it is called.

  /**
   * Returns the number of distinct integers added to the bitmap (e.g., number of bits set).
   *
   * @return the cardinality
   */
  @Override
  public long getLongCardinality() {
    long size = 0;
    for (int i = 0; i < this.highLowContainer.size(); i++) {
      size += this.highLowContainer.getContainerAtIndex(i).getCardinality();
    }
    return size;
  }

  @Override
  public int getCardinality() {
    return (int) getLongCardinality();
  }
 /**
   * Checks whether the bitmap is empty.
   *
   * @return true if this bitmap contains no set bit
   */
  @Override
  public boolean isEmpty() {
    return highLowContainer.size() == 0;
  }

This change is refer to RoaringBitmap/RoaringBitmap#239 | RoaringBitmap/RoaringBitmap#684

Why are the changes needed?

Use a more appropriate API to reduce the computational cost of empty-checking for RoaringBitmap.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Pass GitHub Actions

Was this patch authored or co-authored using generative AI tooling?

No

@LuciferYang LuciferYang marked this pull request as draft December 14, 2023 04:58
@github-actions github-actions bot added the CORE label Dec 14, 2023
@LuciferYang LuciferYang changed the title [SPARK-46401][CORE] Use !isEmpty() on RoaringBitmap instead of getCardinality() > 0 [SPARK-46401][CORE] Use !isEmpty() on RoaringBitmap instead of getCardinality() > 0 in RemoteBlockPushResolver Dec 14, 2023
@@ -809,7 +809,7 @@ public MergeStatuses finalizeShuffleMerge(FinalizeShuffleMerge msg) {
msg.shuffleMergeId, partition.reduceId);
// This can throw IOException which will marks this shuffle partition as not merged.
partition.finalizePartition();
if (partition.mapTracker.getCardinality() > 0) {
if (!partition.mapTracker.isEmpty()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should only be this one place where getCardinality() > 0 is used for empty-checking on RoaringBitmap

@LuciferYang LuciferYang changed the title [SPARK-46401][CORE] Use !isEmpty() on RoaringBitmap instead of getCardinality() > 0 in RemoteBlockPushResolver [SPARK-46401][CORE] Use !isEmpty() on RoaringBitmap instead of getCardinality() > 0 in RemoteBlockPushResolver Dec 14, 2023
Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch @LuciferYang !
Looks good to me.

Why is this still DRAFT ? :-)

@mridulm
Copy link
Contributor

mridulm commented Dec 14, 2023

+CC @otterc as well

@LuciferYang LuciferYang marked this pull request as ready for review December 14, 2023 06:08
@LuciferYang
Copy link
Contributor Author

LuciferYang commented Dec 14, 2023

Nice catch @LuciferYang ! Looks good to me.

Why is this still DRAFT ? :-)

Thanks @mridulm ~
Set to ready to review :)

Copy link
Member

@yaooqinn yaooqinn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@otterc otterc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this change. LGTM

@LuciferYang
Copy link
Contributor Author

Merged into master for Spark 4.0. Thanks @mridulm @yaooqinn @otterc ~

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants