Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: add Compactor trait to abstract the compaction #4097

Merged
merged 33 commits into from
Jun 17, 2024

Conversation

zyy17
Copy link
Collaborator

@zyy17 zyy17 commented Jun 3, 2024

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

The main purpose of the PR is to create a small module to encapsulate the logic of compaction without initializing the whole Mito engine.

It's very useful for creating the cli compact tool and making compaction a service.

For the compact tool that I'm working based on the PR, we can use the Compactor as the following code(greptime cli compact --region-id 4398046511104) :

...
#[async_trait]
impl Tool for Compact {
    async fn do_work(&self) -> Result<()> {
        let object_store_manager = self.create_object_store_manager();
        let compactor = DefaultCompactor {};
        let compaction_region =
            open_compaction_region(&self.request, self.data_home.as_str(), object_store_manager)
                .await
                .context(OpenCompactRegionSnafu)?;

        compactor
            .compact(compaction_region, self.request.compaction_options.clone())
            .await
            .context(CompactRegionSnafu)?;

        Ok(())
    }
}
...

Add the Compactor trait

The new file src/mito2/src/compaction/compactor.rs creates the new trait Compactor to encapsulate the compaction. The trait will split the compaction into multiple steps:

  • merge_ssts()
  • update_manifest()
  • compact(): The default implementation will combine the merge_ssts() and update_manifest().

The implementation of merge_ssts() and update_manifests() are from the original handle_compaction().

The PR also provides the DefaultCompactor, which is the default implementation of Compactor. The original compaction will be replaced by DefaultCompactor.

Refactor the Picker trait

To better collaborate with Compactor, the Picker output PickerOutput instead of creating the internal compaction task:

pub trait Picker: Debug + Send + Sync + 'static {
    fn pick(&self, current_version: VersionRef) -> Option<PickerOutput>;
}

Add open_compaction_region()

It's heavy to use RegionOpener::open() to open the region, so I decided to create a lightweight version: open_compaction_region(). The function will accept CompactionRequest and other necessary arguments and return CompactionRegion, which is a subset of MitoRegion.

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.
  • This PR requires documentation updates.

@github-actions github-actions bot added the docs-not-required This change does not impact docs. label Jun 3, 2024
@zyy17 zyy17 marked this pull request as ready for review June 4, 2024 07:46
@zyy17
Copy link
Collaborator Author

zyy17 commented Jun 4, 2024

@evenyag I'm unsure if it's appropriate to place the open_compaction_region()(maybe use another name) in RegionOpener? I try it before but finally still use current version in compactor.rs.

Copy link

codecov bot commented Jun 4, 2024

Codecov Report

Attention: Patch coverage is 60.00000% with 158 lines in your changes missing coverage. Please review.

Project coverage is 84.60%. Comparing base (2faa6d6) to head (f8d88a2).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4097      +/-   ##
==========================================
- Coverage   84.94%   84.60%   -0.34%     
==========================================
  Files        1018     1020       +2     
  Lines      178688   178846     +158     
==========================================
- Hits       151780   151310     -470     
- Misses      26908    27536     +628     

src/mito2/src/lib.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/task.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/task.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
@killme2008
Copy link
Contributor

@v0y4g3r Please take a look

src/mito2/src/compaction.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Show resolved Hide resolved
src/mito2/src/compaction/picker.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/picker.rs Show resolved Hide resolved
src/mito2/src/compaction/picker.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/task.rs Show resolved Hide resolved
@zyy17 zyy17 force-pushed the refactor/add-compactor branch 3 times, most recently from 8354f52 to 71191af Compare June 6, 2024 08:51
@zyy17
Copy link
Collaborator Author

zyy17 commented Jun 6, 2024

Note: I add some new important refactors beyond the code review comments:

  1. Move build_compaction_task() in CompactionScheduler;
  2. Use CompactionRegion as an argument of Picker;
  3. Put CompactionRegion into CompactionTaskImpl and make it simple;

@zyy17 zyy17 requested review from killme2008 and evenyag June 6, 2024 12:29
Copy link
Contributor

@killme2008 killme2008 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks GTM. It's better that let @v0y4g3r take a look.

src/mito2/src/compaction/task.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Outdated Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Show resolved Hide resolved
src/mito2/src/compaction/compactor.rs Show resolved Hide resolved
@v0y4g3r
Copy link
Contributor

v0y4g3r commented Jun 7, 2024

Regarding the interaction between database instance and compactor, it's better to put the pick and manifest update work in database instance, and the compactor focuses on merging SST files.

@zyy17 zyy17 requested review from evenyag and v0y4g3r June 7, 2024 10:55
@zyy17
Copy link
Collaborator Author

zyy17 commented Jun 7, 2024

@v0y4g3r @evenyag the latest version syncs the main branch, PTAL.

Copy link
Contributor

@evenyag evenyag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zyy17
Copy link
Collaborator Author

zyy17 commented Jun 7, 2024

Regarding the interaction between database instance and compactor, it's better to put the pick and manifest update work in database instance, and the compactor focuses on merging SST files.

I add the PickerOutput in CompactorRequest in the latest version.

For the real compactor service, the logic can be:

for rece_compactor_request().await {
    let compaction_region = open_compaction_region(compactor_request, ...);
    let merge_output = DefaultCompactor{}.merge_ssts(compaction_region, compactor_request);
    send_to_dn(merge_output);
}

When the datanode receives the response, it can use the merge_output and call update_manifest().

@v0y4g3r PTAL

Copy link
Contributor

@v0y4g3r v0y4g3r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Maybe there's no need to specifically add a send_to_dn method. The compactor just finish merging sst files and report the results and datanode can poll this result in its RemoteJobScheduler.

@v0y4g3r v0y4g3r enabled auto-merge June 16, 2024 16:15
@v0y4g3r v0y4g3r added this pull request to the merge queue Jun 17, 2024
Merged via the queue into GreptimeTeam:main with commit 5390603 Jun 17, 2024
49 checks passed
@zyy17 zyy17 deleted the refactor/add-compactor branch June 17, 2024 03:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-not-required This change does not impact docs.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants