Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aggregation with structural type causes GC pressure and excessive cross region reference #9553

Closed
wenleix opened this issue Dec 14, 2017 · 6 comments · Fixed by #9721
Closed
Labels

Comments

@wenleix
Copy link
Contributor

wenleix commented Dec 14, 2017

We have seen the following two issues in production when performing aggregations with structural type with huge number of groups. There are two separate cases:

  • We see a lot of SingleMapBlock when the aggregation function is like MAX_BY(map, x) or ARBITRARY(map), since an SingleMapBlock will be used for each group state.

    This issue can also happen for SingleRowBlock (for row type) or XArrayBlock (for array type). We have also seen a massive number of Slice in heap for MAX(varchar).

  • We see a lot of MapBlockBuilder for ARRAY_AGG(map).

    In general, this doesn't restricts to map type, since we could see a lot of BlockBuilder for ARRAY_AGG(x) on any type, as long as the aggregation produces many groups.

@wenleix wenleix changed the title Aggregation with structural type cause GC pressure Aggregation with structural type causes GC pressure Dec 14, 2017
@wenleix
Copy link
Contributor Author

wenleix commented Dec 27, 2017

For array_agg, one idea is store one huge flattened ArrayBlockBuilder for all groups, and we don't have per group ArrayBlockBuilder. We need to have a linked list for each group to construct the aggregated array per group. This is similar to the histogram aggregation improvement in #8918.

This idea should also be applicable to the following aggregation functions: map_agg, map_unoin and multimap_agg

@wenleix
Copy link
Contributor Author

wenleix commented Dec 27, 2017

Similarly, for ARBITRARY , one idea is to store a huge flattened ArrayBlockBuilder for all groups. Another integer array needs to be maintained for the mapping from group id to the position in the ArrayBlockBuilder. Since for ARBITRARY, element only doesn't need to be updated per group, we don't need to handle update case.

While MIN_BY and MAX_BY can also use this method to store the flattened array for all groups, they need to handle updates for group. One way would be to always append to the huge flattened ArrayBlockBuilder and update the mapping from group id to position. However, this requires periodic compaction when the ArrayBlockBuilder wastes too much memory.

@wenleix
Copy link
Contributor Author

wenleix commented Jan 9, 2018

Per discussion with @highker and @haozhun . There is an easier solution for ARBITRARY, MIN_BY, MAX_BY.

The issue is currently we use BlockState to store the aggregation state, which stores a Block for a structural type. The SingleMapBlock are crated via type.getObject(block, position)

public static void input(Type type, BlockState state, Block block, int position)
{
if (state.getBlock() != null) {
return;
}
state.setBlock((Block) type.getObject(block, position));
}

Instead of slice the large block into SingleMapBlock, we should store the parent block and the position into BlockState. The same applies to Slice as well.

In a long term, we might want to be able to perform automatic compaction.

@wenleix
Copy link
Contributor Author

wenleix commented Feb 13, 2018

Update: We have been able to reproduce the issue by a query contains a lot of array_agg and billions of groups. The symptom is we see excessive native memory usage (up to 20G). We find this is due to Remember Set with Native Memory Tracking enabled:

[0x00007f6340c1ea05] ArrayAllocator<unsigned long, (MemoryType)7>::allocate(unsigned long)+0x175
[0x00007f6340c1d2eb] BitMap::resize(unsigned long, bool)+0x6b
[0x00007f6340f183aa] OtherRegionsTable::add_reference(void*, int)+0x3ea
[0x00007f63411f1416] ObjArrayKlass::oop_oop_iterate_nv_m(oopDesc*, FilterOutOfRegionClosure*, MemRegion)+0x186
                             (malloc=7045856KB +7045856KB #880732 +880732)


[0x00007f6340c1ea05] ArrayAllocator<unsigned long, (MemoryType)7>::allocate(unsigned long)+0x175
[0x00007f6340c1d2eb] BitMap::resize(unsigned long, bool)+0x6b
[0x00007f6340f183aa] OtherRegionsTable::add_reference(void*, int)+0x3ea
[0x00007f6340f357ff] InstanceKlass::oop_oop_iterate_nv(oopDesc*, FilterOutOfRegionClosure*)+0x12f
                             (malloc=10433488KB +10433488KB #1304186 +1304186)

@nezihyigitbasi and @highker helped to find mitigations by tuning G1RSetSparseRegionEntries, as illustrated in https://medium.com/@milan.mimica/everybody-leaks-f210631f13ef . It can lower the native memory usage down to around 10G.

@wenleix wenleix changed the title Aggregation with structural type causes GC pressure Aggregation with structural type causes GC pressure and excessive cross region reference Feb 17, 2018
@wenleix wenleix reopened this Feb 23, 2018
@stale
Copy link

stale bot commented Feb 23, 2020

This issue has been automatically marked as stale because it has not had any activity in the last 2 years. If you feel that this issue is important, just comment and the stale tag will be removed; otherwise it will be closed in 7 days. This is an attempt to ensure that our open issues remain valuable and relevant so that we can keep track of what needs to be done and prioritize the right things.

@stale stale bot added the stale label Feb 23, 2020
@stale stale bot closed this as completed Mar 1, 2020
@tdcmeehan tdcmeehan reopened this Mar 2, 2020
@stale stale bot removed the stale label Mar 2, 2020
@stale
Copy link

stale bot commented Mar 2, 2022

This issue has been automatically marked as stale because it has not had any activity in the last 2 years. If you feel that this issue is important, just comment and the stale tag will be removed; otherwise it will be closed in 7 days. This is an attempt to ensure that our open issues remain valuable and relevant so that we can keep track of what needs to be done and prioritize the right things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants