[SPARK-32872][CORE] Prevent BytesToBytesMap at MAX_CAPACITY from exceeding growth threshold #29744
Conversation
core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
```java
} else if (numKeys >= growthThreshold && longArray.size() / 2 >= MAX_CAPACITY) {
  // The map cannot grow because doing so would cause it to exceed MAX_CAPACITY. Instead, we
  // prevent the map from accepting any more new elements.
  canGrowArray = false;
```
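For context on the arithmetic behind this guard, here is a small standalone sketch. The constants mirror my reading of BytesToBytesMap (`MAX_CAPACITY` is 2^29, and `longArray` holds two longs per slot, so `longArray.size()` is twice the capacity); treat them as assumptions of this sketch rather than part of the diff:

```java
// Standalone sketch: why the guard fires once the map is at MAX_CAPACITY.
public class GrowthGuardSketch {
    // Assumed to mirror BytesToBytesMap.MAX_CAPACITY = 1 << 29.
    static final long MAX_CAPACITY = 1L << 29;

    public static void main(String[] args) {
        long capacity = MAX_CAPACITY;        // the map has grown to its maximum
        long longArraySize = capacity * 2;   // two longs per slot (assumption)
        long growthThreshold = capacity / 2; // load factor 0.5
        long numKeys = growthThreshold;      // insertions just hit the threshold

        boolean wantsToGrow = numKeys >= growthThreshold;       // true
        boolean cannotGrow = longArraySize / 2 >= MAX_CAPACITY; // true: doubling would exceed the max
        // Before the fix, this state left canGrowArray == true, so inserts
        // continued past the threshold; the fix sets canGrowArray = false here.
        System.out.println(wantsToGrow && cannotGrow); // prints "true"
    }
}
```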
I think we won't exceed `MAX_CAPACITY` at all; we have some restrictions to prevent it. The root cause, I think, is that once the map reaches `MAX_CAPACITY`, we won't grow it even when `numKeys` exceeds `growthThreshold`. So we can keep adding elements to the map beyond its load factor (0.5). The map then cannot be reused for sorting later in UnsafeKVExternalSorter. Although UnsafeKVExternalSorter will allocate a new array in that case, the allocation could hit OOM and fail the query.

This change favors reusing the array for sorting, but it makes the map stop accepting new elements earlier: once the map capacity reaches `MAX_CAPACITY`, half of the array goes unused, so we will also spill earlier.

When I was fixing the customer application before, we didn't see the OOM even when the payload reached `MAX_CAPACITY`. Could that be due to the memory settings on your payload/cluster?
I agree. Once we reach `MAX_CAPACITY`, we have a choice: we can either (a) exceed the load factor, or (b) spill.

The code currently does (a). If the number of groups ultimately turns out to be less than `MAX_CAPACITY`, this may avoid spilling. (Performance will degrade as the load factor approaches 1.0, but it will still be better than spilling.) However, if the number of groups is larger than `MAX_CAPACITY`, or if we need more memory to store the records, then we will be unable to spill and the query will fail.

It seems to me that it would be better to prevent queries from failing, as (b) does. This does mean we would spill earlier for a narrow range of queries.
Regarding the memory setting: I agree that whether or not an OOM occurs depends on the memory settings, but I would argue that (1) the user should not be required to tune memory settings to avoid failing queries, and (2) many reasonable settings will cause an OOM.

The problem arises when there are between 2^28+1 and 2^29 keys. If we need to spill, UnsafeKVExternalSorter will attempt to allocate between 8 and 16 GiB. Therefore the user must either increase the memory per task by approximately this amount so that the allocation succeeds, or decrease the memory per task enough that the map never reaches `MAX_CAPACITY` in the first place. That leaves quite a wide range of problematic settings.
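The 8 to 16 GiB figure follows from back-of-the-envelope arithmetic. Assuming the replacement pointer array costs on the order of 32 bytes per record (four 8-byte longs; this per-record constant is an assumption consistent with the numbers above, not something stated elsewhere in the thread):

```java
// Sketch: size of the pointer-array allocation UnsafeKVExternalSorter would
// need at spill time, under the assumed cost of 32 bytes per record.
public class SpillAllocSketch {
    public static void main(String[] args) {
        final long bytesPerRecord = 32;                 // assumption: 4 longs per record
        long low  = (1L << 28) * bytesPerRecord;        // 2^28 keys -> 2^33 bytes
        long high = (1L << 29) * bytesPerRecord;        // 2^29 keys -> 2^34 bytes
        System.out.println(low  / (1L << 30) + " GiB"); // prints "8 GiB"
        System.out.println(high / (1L << 30) + " GiB"); // prints "16 GiB"
    }
}
```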
core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
Force-pushed from 00e4431 to ec51e65.
Test build #128606 has finished for PR 29744 at commit
Test build #128605 has finished for PR 29744 at commit
Test build #128607 has finished for PR 29744 at commit
Test build #128608 has finished for PR 29744 at commit
Test build #128609 has finished for PR 29744 at commit
core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java
Thank you for pinging me, @ankurdave. And thank you for providing the test example.
+1, LGTM (except @cloud-fan's comment).
Test build #128626 has finished for PR 29744 at commit
Scratch that.
Test build #128661 has finished for PR 29744 at commit
Thanks all for the quick reviews! A couple of questions for maintainers of core:
@ankurdave Great fix! I am curious whether the test you mentioned in the PR description should actually be checked in? The only downside seems to be that it would take 1.5 minutes to pass. Perhaps there is a mode for "slow tests". As for rerunning the GH tests: unfortunately the only way currently is to reopen the PR. Close it in GitHub and reopen it; that will force all the GH tests to be rerun.
I think we can rerun GitHub Actions without closing the PR. Just triggered the rerun.
@agrawaldevesh Not every CI has enough memory; it could lead to more flaky situations. You can add the test in your downstream CI environment.
Merged to master/3.0/2.4. Thank you, @ankurdave and all.
[SPARK-32872][CORE] Prevent BytesToBytesMap at MAX_CAPACITY from exceeding growth threshold

### What changes were proposed in this pull request?

When BytesToBytesMap is at `MAX_CAPACITY` and reaches its growth threshold, `numKeys >= growthThreshold` is true but `longArray.size() / 2 < MAX_CAPACITY` is false. This correctly prevents the map from growing, but `canGrowArray` incorrectly remains true. Therefore the map keeps accepting new keys and exceeds its growth threshold. If we attempt to spill the map in this state, the UnsafeKVExternalSorter will not be able to reuse the long array for sorting. By this point the task has typically consumed all available memory, so the allocation of the new pointer array is likely to fail.

This PR fixes the issue by setting `canGrowArray` to false in this case. This prevents the map from accepting new elements when it cannot grow to accommodate them.

### Why are the changes needed?

Without this change, hash aggregations will fail when the number of groups per task is greater than `MAX_CAPACITY / 2 = 2^28` (approximately 268 million), and when the grouping aggregation is the only memory-consuming operator in its stage.

For example, the final aggregation in `SELECT COUNT(DISTINCT id) FROM tbl` fails when `tbl` contains 1 billion distinct values and when `spark.sql.shuffle.partitions=1`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Reproducing this issue requires building a very large BytesToBytesMap. Because this is infeasible to do in a unit test, this PR was tested manually by adding the following test to AbstractBytesToBytesMapSuite. Before this PR, the test fails in 8.5 minutes. With this PR, the test passes in 1.5 minutes.

```java
public abstract class AbstractBytesToBytesMapSuite {
  // ...

  @Test
  public void respectGrowthThresholdAtMaxCapacity() {
    TestMemoryManager memoryManager2 = new TestMemoryManager(
      new SparkConf()
        .set(package$.MODULE$.MEMORY_OFFHEAP_ENABLED(), true)
        .set(package$.MODULE$.MEMORY_OFFHEAP_SIZE(), 25600 * 1024 * 1024L)
        .set(package$.MODULE$.SHUFFLE_SPILL_COMPRESS(), false)
        .set(package$.MODULE$.SHUFFLE_COMPRESS(), false));
    TaskMemoryManager taskMemoryManager2 = new TaskMemoryManager(memoryManager2, 0);
    final long pageSizeBytes = 8000000 + 8; // 8 bytes for end-of-page marker
    final BytesToBytesMap map =
      new BytesToBytesMap(taskMemoryManager2, 1024, pageSizeBytes);

    try {
      // Insert keys into the map until it stops accepting new keys.
      for (long i = 0; i < BytesToBytesMap.MAX_CAPACITY; i++) {
        if (i % (1024 * 1024) == 0) System.out.println("Inserting element " + i);
        final long[] value = new long[]{i};
        BytesToBytesMap.Location loc = map.lookup(value, Platform.LONG_ARRAY_OFFSET, 8);
        Assert.assertFalse(loc.isDefined());
        boolean success =
          loc.append(value, Platform.LONG_ARRAY_OFFSET, 8, value, Platform.LONG_ARRAY_OFFSET, 8);
        if (!success) break;
      }

      // The map should grow to its max capacity.
      long capacity = map.getArray().size() / 2;
      Assert.assertTrue(capacity == BytesToBytesMap.MAX_CAPACITY);

      // The map should stop accepting new keys once it has reached its growth
      // threshold, which is half the max capacity.
      Assert.assertTrue(map.numKeys() == BytesToBytesMap.MAX_CAPACITY / 2);

      map.free();
    } finally {
      map.free();
    }
  }
}
```

Closes #29744 from ankurdave/SPARK-32872.

Authored-by: Ankur Dave <ankurdave@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 72550c3)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Oops. @ankurdave, sorry, but could you make a backporting PR to branch-2.4 once more?
Opened #29753 to backport to branch-2.4.