Default Compaction strategy is sub-optimal #1033

Closed
keith-turner opened this issue Mar 13, 2019 · 1 comment

Comments

@keith-turner
Contributor

commented Mar 13, 2019

Consider a tablet with the following files. If the compaction ratio is 3, then all files would meet the criteria for compaction. However, if the max files to compact is 10, then the files C4 and F[5-d] will be selected for compaction. This is very suboptimal over time. It would be much better if a subset of files that met the compaction ratio criteria were returned. For example, C[2-4] and F[5-b] could be selected, which is 10 files that meet the ratio criteria. Another possibility is selecting only the F files, which meet the criteria and number fewer than the max files.

File Size
C1 100M
C2 100M
C3 100M
C4 100M
F5 1M
F6 1M
F7 1M
F8 1M
F9 1M
Fa 1M
Fb 1M
Fc 1M
Fd 1M
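
To make the arithmetic concrete, here is a small sketch that checks the candidate sets against the ratio. It assumes the criterion is that the total size of a set is at least the ratio times the largest file in that set. The helper `meets_ratio` and the sizes-in-MB dictionary are made up for illustration and are not Accumulo code.

```python
# Sizes in MB for the example tablet above.
files = {"C1": 100, "C2": 100, "C3": 100, "C4": 100,
         "F5": 1, "F6": 1, "F7": 1, "F8": 1, "F9": 1,
         "Fa": 1, "Fb": 1, "Fc": 1, "Fd": 1}


def meets_ratio(names, ratio=3):
    """Assumed criterion: sum of sizes >= ratio * largest size in the set."""
    sizes = [files[n] for n in names]
    return sum(sizes) >= ratio * max(sizes)


f_files = [n for n in files if n.startswith("F")]

candidates = {
    "all files": list(files),
    "C4 + F[5-d]": ["C4"] + f_files,                    # the 10 smallest files
    "C[2-4] + F[5-b]": ["C2", "C3", "C4"] + f_files[:7],
    "F only": f_files,
}

for label, names in candidates.items():
    total = sum(files[n] for n in names)
    print(f"{label}: {len(names)} files, {total}M, meets ratio: {meets_ratio(names)}")
```

Under that assumed criterion, the 10 smallest files total 109M against a 100M largest file, so that set does not itself meet a ratio of 3, while the other three sets do.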

@keith-turner keith-turner self-assigned this Mar 13, 2019

@keith-turner

Contributor Author

commented Mar 13, 2019

The problem is that the code finds a set of files that meets the compaction ratio criteria and then takes the 10 smallest files from that set. It would be much better if the code searched for a set of files that meets the ratio criteria and contains no more than the max files. I think doing this could result in much less work over time.
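
A rough sketch of the difference, not the actual Accumulo code: `current_selection` mimics the behavior described above, and `bounded_selection` is one possible way to search for a qualifying set within the file limit. All function names are hypothetical. How the first step finds its qualifying set (dropping the largest file until the ratio is met) is an assumption, as is the criterion that the total size must be at least the ratio times the largest file.

```python
def meets_ratio(sizes, ratio):
    """Assumed criterion: sum of sizes >= ratio * largest size in the set."""
    return sum(sizes) >= ratio * max(sizes)


def current_selection(sizes, ratio, max_files):
    """Find a qualifying set, then keep only the smallest max_files of it."""
    candidates = sorted(sizes, reverse=True)
    # Assumption: drop the largest file until the remaining set qualifies.
    while candidates and not meets_ratio(candidates, ratio):
        candidates.pop(0)
    return sorted(candidates)[:max_files]


def bounded_selection(sizes, ratio, max_files):
    """Search for a set that qualifies and has at most max_files files."""
    ordered = sorted(sizes, reverse=True)
    for start in range(len(ordered)):
        # For a fixed largest file, the next-largest files give the biggest
        # total, so this window qualifies whenever any bounded set does.
        window = ordered[start:start + max_files]
        if len(window) > 1 and meets_ratio(window, ratio):
            return window
    return []


sizes = [100] * 4 + [1] * 9   # the example tablet, sizes in MB
old = current_selection(sizes, ratio=3, max_files=10)
new = bounded_selection(sizes, ratio=3, max_files=10)
print(sum(old), meets_ratio(old, 3))
print(sum(new), meets_ratio(new, 3))
```

On the example tablet this sketch picks all four C files plus six F files, which is a different set from C[2-4] and F[5-b] but also meets the ratio within the limit. Which qualifying set to prefer is a separate policy choice that this sketch does not try to settle.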

keith-turner added a commit to keith-turner/accumulo that referenced this issue Mar 15, 2019

keith-turner added a commit to keith-turner/accumulo that referenced this issue Mar 26, 2019

keith-turner added a commit to keith-turner/accumulo that referenced this issue Mar 26, 2019

@ctubbsii ctubbsii added the v2.0.0 label Jun 14, 2019

@ctubbsii ctubbsii added this to To do in 2.0.0 via automation Jun 14, 2019

@ctubbsii ctubbsii moved this from To do to Done in 2.0.0 Jun 14, 2019

2 participants