[IOTDB-3189] Fix compaction is not well-distributed across sgs #6324

Merged (2 commits) on Jun 19, 2022

@THUMarkLau THUMarkLau commented Jun 17, 2022

See IOTDB-3189.

In IoTDB, the order in which compaction tasks are submitted is currently determined by an ICompactionTaskComparator. This interface has only one implementation, DefaultCompactionTaskComparator, which compares the priority of two compaction tasks on the following aspects:

  • Whether the compaction priority of the system is Balance. If it is not, the configured priority decides whether the inner space task or the cross space task goes first.
  • For inner space compaction, compare whether the tasks are in the sequence space, the compaction task level, the version number of the compaction files, the total number of compaction files, and the total size of the compaction files.
  • For cross space compaction, compare the total number of files and their total size.
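
The comparison described above could be sketched roughly as follows. This is a minimal illustration, not the actual DefaultCompactionTaskComparator: the class, enum, and field names are hypothetical, and the direction of each key (for example, "more files first") and the tie returned under Balance are assumptions that the description does not state.

```java
import java.util.Comparator;

// All names here are hypothetical stand-ins, not the real IoTDB classes.
class DefaultOrderSketch {
  // Assumed system-wide setting; the constant names are illustrative.
  enum Priority { BALANCE, INNER_FIRST, CROSS_FIRST }

  // Holds only the attributes mentioned in the description.
  static final class Task {
    final boolean crossSpace; // true for a cross space task
    final boolean sequence;   // inner space only: task is in the sequence space
    final int level;          // inner space only: compaction task level
    final long version;       // inner space only: file version number
    final int fileCount;      // total number of files
    final long totalSize;     // total size of files

    Task(boolean crossSpace, boolean sequence, int level, long version,
         int fileCount, long totalSize) {
      this.crossSpace = crossSpace;
      this.sequence = sequence;
      this.level = level;
      this.version = version;
      this.fileCount = fileCount;
      this.totalSize = totalSize;
    }
  }

  // Inner space keys in the listed order; every direction is an assumption.
  static final Comparator<Task> INNER =
      Comparator.comparing((Task t) -> !t.sequence) // sequence tasks first
          .thenComparingInt(t -> t.level)           // lower level first
          .thenComparingLong(t -> t.version)        // older version first
          .thenComparingInt(t -> -t.fileCount)      // more files first
          .thenComparingLong(t -> t.totalSize);     // smaller size first

  // Cross space keys in the listed order; directions are assumptions.
  static final Comparator<Task> CROSS =
      Comparator.comparingInt((Task t) -> -t.fileCount)
          .thenComparingLong(t -> t.totalSize);

  // A negative result means task a should run before task b.
  static int compare(Priority priority, Task a, Task b) {
    if (a.crossSpace != b.crossSpace) {
      if (priority == Priority.BALANCE) {
        return 0; // assumption: mixed task types tie under Balance
      }
      boolean innerFirst = priority == Priority.INNER_FIRST;
      return (a.crossSpace == innerFirst) ? 1 : -1;
    }
    return a.crossSpace ? CROSS.compare(a, b) : INNER.compare(a, b);
  }

  public static void main(String[] args) {
    Task inner = new Task(false, true, 0, 1L, 5, 100L);
    Task cross = new Task(true, false, 0, 0L, 5, 100L);
    System.out.println(compare(Priority.INNER_FIRST, inner, cross));
    System.out.println(compare(Priority.CROSS_FIRST, inner, cross));
  }
}
```

Note that nothing in this sketch refers to the storage group a task belongs to, which is the gap this pull request addresses.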

Storage groups are not considered in this comparison. As a result, when tasks from different storage groups are compared against each other, compaction can easily become unevenly distributed across storage groups. For example, if two storage groups (SGs) submit a large number of compaction tasks in the sequence space with the same number of files, the same file size, and the same file level, the system may compact tasks from only one storage group, simply because of the different submission timing.

To balance compaction tasks between different storage groups when all other things are equal, each compaction task is assigned a unique serial number that is counted separately by each virtual storage group and increases monotonically from zero. If two compaction tasks are from different storage groups, their serial numbers are compared; the task with the smaller serial number has the higher priority. After this adjustment, the compaction priority comparison becomes:

  • Whether the compaction priority of the system is Balance. If it is not, the configured priority decides whether the inner space task or the cross space task goes first.
  • For inner space compaction, compare whether the tasks are in the sequence space, the compaction task level, the version number of the compaction files, the total number of compaction files, the serial number of the compaction tasks, and the total size of the files.
  • For cross space compaction, compare the total number of compaction files, the serial number of the compaction tasks, and the total size of the files.
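
The serial number mechanism can be illustrated with the following sketch for the cross space case. It is an assumption-laden illustration rather than the merged code: the class and method names, the use of a string as the storage group key, and the "more files first, smaller size first" directions are all hypothetical. Only the position of the serial number between file count and total size, and the rule that it applies to tasks from different storage groups, come from the description above.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// All names here are hypothetical stand-ins, not the real IoTDB classes.
class SerialIdSketch {
  static final class Task {
    final String storageGroup; // assumed key identifying the virtual storage group
    final int fileCount;
    final long totalSize;
    final long serialId;       // per-storage-group serial number, starting at zero

    Task(String storageGroup, int fileCount, long totalSize, long serialId) {
      this.storageGroup = storageGroup;
      this.fileCount = fileCount;
      this.totalSize = totalSize;
      this.serialId = serialId;
    }
  }

  // One independent counter per storage group.
  private static final Map<String, AtomicLong> COUNTERS = new ConcurrentHashMap<>();

  // Creates a task and stamps it with the next serial number of its storage group.
  static Task newTask(String storageGroup, int fileCount, long totalSize) {
    long id = COUNTERS.computeIfAbsent(storageGroup, k -> new AtomicLong())
        .getAndIncrement();
    return new Task(storageGroup, fileCount, totalSize, id);
  }

  // Cross space order after the change: file count, serial number, total size.
  // A negative result means task a should run before task b.
  static final Comparator<Task> ORDER = (a, b) -> {
    if (a.fileCount != b.fileCount) {
      return Integer.compare(b.fileCount, a.fileCount); // assumed: more files first
    }
    // Serial numbers are compared only across different storage groups.
    if (!a.storageGroup.equals(b.storageGroup) && a.serialId != b.serialId) {
      return Long.compare(a.serialId, b.serialId); // smaller serial number first
    }
    return Long.compare(a.totalSize, b.totalSize); // assumed: smaller size first
  };

  public static void main(String[] args) {
    // sg-a submits two identical tasks before sg-b submits its first one.
    Task a0 = newTask("sg-a", 5, 100L);
    Task a1 = newTask("sg-a", 5, 100L);
    Task b0 = newTask("sg-b", 5, 100L);
    System.out.println("b0 vs a1: " + ORDER.compare(b0, a1));
    System.out.println("a0 vs b0: " + ORDER.compare(a0, b0));
  }
}
```

With otherwise identical tasks, the first task of a storage group that submits late still sorts ahead of the second and later tasks of a storage group that submitted early, so submission timing alone no longer decides which storage group gets compacted.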

@qiaojialin qiaojialin merged commit e34850c into apache:master Jun 19, 2022
@qiaojialin qiaojialin deleted the IOTDB-3189 branch June 19, 2022 02:00