Analyze & Fix Errors due to tez config changes #21

tanishq-chugh · 2024-07-25T08:05:32Z

groupby3_map_multi_distinct Analysis:

We increased the Tez container size from 128 MB to 256 MB to address OOM errors. This qtest sets the property set hive.map.aggr=true; . When this property is true, a check named checkMapSideAggregation() runs first to verify that there is enough memory to hold the hash table required for the aggregation. The memory allotted for this aggregation is half of the container size. Half of 128 MB was not enough to hold the generated table, but half of 256 MB is sufficient, so map-side aggregation now takes place. With this aggregation, hashes are generated and stored only for the 307 distinct rows out of 500, and duplicate rows are mapped to these hashes. The resulting change in statistics is therefore expected.
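The memory check described above can be sketched roughly as follows. This is a simplified illustration, not Hive's actual implementation: the function names are hypothetical, the 0.5 fraction is taken from the "half of container size" statement above, and the 100 MB hash table estimate is an invented value chosen only so that it falls between the two budgets.

```python
MB = 1024 * 1024


def hash_table_budget_bytes(container_mb, fraction=0.5):
    """Memory available to the map-side hash table (hypothetical helper).

    Assumes the budget is a fixed fraction of the container size,
    as stated in the analysis above.
    """
    return int(container_mb * MB * fraction)


def map_side_aggregation_fits(estimated_hash_table_bytes, container_mb):
    """Return True if the estimated hash table fits in the budget."""
    return estimated_hash_table_bytes <= hash_table_budget_bytes(container_mb)


# Invented estimate for illustration only; the real size is not given in this PR.
estimated_hash_table = 100 * MB

for container_mb in (128, 256):
    budget_mb = hash_table_budget_bytes(container_mb) // MB
    fits = map_side_aggregation_fits(estimated_hash_table, container_mb)
    print(f"container={container_mb} MB, budget={budget_mb} MB, fits={fits}")
```

Under these assumptions the estimate exceeds the 64 MB budget of a 128 MB container but fits in the 128 MB budget of a 256 MB container, which matches the behavior change observed in the test.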

[Image: groupBy_3_map_multi_distinct_proof_128_vs_256]

mm_all Analysis:
We increased the Tez container size from 128 MB to 256 MB to address OOM errors. The total memory allocated to the LLAP daemon is 4096 MB, and with each container now sized at 256 MB, available slots = 4096/256 = 16.
With the increased container size, the split size increases and each task has more resources. As a result, each task processes a larger number of rows and corresponds to one Hive-side file. The total amount of data processed remains the same; only the amount processed by each task increases. Thus only 16 Hive files are generated.
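The slot arithmetic above can be checked with a short sketch. The function name is hypothetical, and the one-file-per-slot relationship is the simplification used in this analysis rather than a general guarantee.

```python
def available_slots(daemon_memory_mb, container_mb):
    """Number of containers that fit in the LLAP daemon's memory."""
    return daemon_memory_mb // container_mb


DAEMON_MEMORY_MB = 4096  # total LLAP daemon memory stated above

for container_mb in (128, 256):
    slots = available_slots(DAEMON_MEMORY_MB, container_mb)
    print(f"container={container_mb} MB -> slots={slots}")
```

Doubling the container size halves the number of slots, so the same data is spread over fewer, larger tasks.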

mm_dp Analysis:
The error in this test case arose only because of a difference in the random numbers generated. Random number generation depends not only on the seed value passed but also on the available task resources. As described above for mm_all, the task resources have increased and each task processes more rows, so each task generates more random numbers. As a result, the random numbers generated differ between container sizes of 128 MB and 256 MB.
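One way to see why the same seed can still yield different values per row is the simplified model below. It assumes that every task creates its own generator from the same seed and draws one number per row; this is an assumption made for illustration, not a description of Hive's actual code, and the function name and row counts are hypothetical.

```python
import random


def assign_random_values(num_rows, rows_per_task, seed=42):
    """Assign one random value per row, restarting the sequence in each task.

    Assumption: each task builds its own generator from the same seed,
    so the value a row receives depends on its position within its task.
    """
    values = []
    for start in range(0, num_rows, rows_per_task):
        rng = random.Random(seed)  # fresh generator per task
        count = min(rows_per_task, num_rows - start)
        values.extend(rng.random() for _ in range(count))
    return values


# Same 8 rows and same seed, but different task sizes.
small_tasks = assign_random_values(8, rows_per_task=2)
large_tasks = assign_random_values(8, rows_per_task=4)

print(small_tasks == large_tasks)
```

In this model the third row is the first draw of its task when tasks hold two rows, but the third draw when tasks hold four rows, so the per-row values differ even though the seed is unchanged.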

Analyze & Fix Errors due to tez config changes

39c4c1c

tanishq-chugh merged commit 36f8f07 into jdk-17 Jul 25, 2024
7 checks passed

akshat0395 pushed a commit that referenced this pull request Aug 7, 2024

Analyze & Fix Errors due to tez config changes (#21)

dc99fe5

akshat0395 pushed a commit that referenced this pull request Aug 8, 2024

Analyze & Fix Errors due to tez config changes (#21)

5757eb7

tanishq-chugh added a commit that referenced this pull request Aug 22, 2024

Analyze & Fix Errors due to tez config changes (#21)

c9d22c3

kokila-19 pushed a commit that referenced this pull request Aug 28, 2024

Analyze & Fix Errors due to tez config changes (#21)

bc5beef

kokila-19 pushed a commit that referenced this pull request Sep 17, 2024

Analyze & Fix Errors due to tez config changes (#21)

aa002c7
