Eliminates a no-op compaction upon snapshot release when disabling auto compactions #7267

Closed
wants to merge 6 commits into from

@Connor1996 Connor1996 commented Aug 17, 2020

Summary:

After a snapshot is released, RocksDB checks whether it is suitable to trigger bottommost-level compactions.
When auto compactions are disabled, releasing a snapshot may still schedule a compaction. However, no compaction job is actually executed, so the state of the LSM tree does not change and a compaction is scheduled again every time a snapshot is released.

Overly frequent compaction scheduling leads to high CPU usage and heavy db_mutex contention, which ultimately increases foreground write latency.
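
To make the mechanism above concrete, here is a minimal sketch of it as a toy model. It is not RocksDB code: `ToyDB`, `gate_on_release`, `bottommost_files_pending`, and `CountNoOpJobs` are hypothetical names invented for illustration, and the real scheduling logic is considerably more involved. The sketch only shows why a job that never changes the LSM state gets scheduled on every snapshot release, and how checking the flag at release time avoids that.

```cpp
#include <cassert>

// Toy model only; none of these names are RocksDB internals.
struct ToyDB {
  bool disable_auto_compactions = false;
  bool gate_on_release = false;      // true models the behavior after the fix
  int bottommost_files_pending = 1;  // files that look eligible once no snapshot pins them
  int scheduled = 0;                 // background jobs scheduled
  int noop_jobs = 0;                 // jobs that found nothing to do

  // Models the background job: it only does work when auto compactions are on.
  void RunBackgroundCompaction() {
    if (!disable_auto_compactions) {
      bottommost_files_pending = 0;  // real work: the LSM state changes
    } else {
      ++noop_jobs;  // corresponds to a "Compaction nothing to do" log line
    }
  }

  // Models the check performed when a snapshot is released.
  void ReleaseSnapshot() {
    if (gate_on_release && disable_auto_compactions) {
      return;  // assumed shape of the fix: do not schedule at all
    }
    if (bottommost_files_pending > 0) {
      ++scheduled;
      RunBackgroundCompaction();
    }
  }
};

// Releases `releases` snapshots with auto compactions disabled and
// returns how many no-op jobs were run.
int CountNoOpJobs(bool gate_on_release, int releases) {
  assert(releases >= 0);
  ToyDB db;
  db.disable_auto_compactions = true;
  db.gate_on_release = gate_on_release;
  for (int i = 0; i < releases; ++i) {
    db.ReleaseSnapshot();
  }
  return db.noop_jobs;
}
```

In this model, `CountNoOpJobs(false, n)` grows linearly with the number of snapshot releases, while `CountNoOpJobs(true, n)` stays at zero. With auto compactions enabled, the first release does the real work and later releases schedule nothing.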

Test Plan:

  • make check
  • manual test

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@Connor1996 Connor1996 changed the title Fix trigger compaction endlessly when disable auto compactions Fix trigger compaction endlessly when disabling auto compactions Aug 17, 2020
@Connor1996

@ajkr PTAL

@ajkr ajkr left a comment

Nice find! Can you mention it in "Bug Fixes" section in HISTORY.md?

@facebook-github-bot facebook-github-bot left a comment

@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ajkr ajkr left a comment

Hm, actually upon looking further I'm not able to repro the problem using the provided test (without the fix). The test causes one CFD to be enqueued for compaction unnecessarily. But no compaction gets picked due to `if (!mutable_cf_options->disable_auto_compactions && !cfd->IsDropped()) {`. Then I see one occurrence of "Compaction nothing to do" in the LOG, and then nothing further gets scheduled.

@Connor1996
Connor1996 commented Aug 21, 2020

> Hm, actually upon looking further I'm not able to repro the problem using the provided test (without the fix). The test causes one CFD to be enqueued for compaction unnecessarily. But no compaction gets picked due to `if (!mutable_cf_options->disable_auto_compactions && !cfd->IsDropped()) {`. Then I see one occurrence of "Compaction nothing to do" in the LOG, and then nothing further gets scheduled.

Yes, this unit test only ensures that no compaction is scheduled when auto compactions are disabled. To hit the actual problem, the workload has to get and release snapshots continuously, which is typical for scans and produces a lot of "Compaction nothing to do" log entries. That is hard to do in a unit test, so I only verified it manually.

Do you have any suggestions?

@facebook-github-bot

@Connor1996 has updated the pull request. You must reimport the pull request before landing.

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@facebook-github-bot

@Connor1996 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot

@Connor1996 has updated the pull request. You must reimport the pull request before landing.

@ajkr
ajkr commented Aug 21, 2020

> > Hm, actually upon looking further I'm not able to repro the problem using the provided test (without the fix). The test causes one CFD to be enqueued for compaction unnecessarily. But no compaction gets picked due to `if (!mutable_cf_options->disable_auto_compactions && !cfd->IsDropped()) {`. Then I see one occurrence of "Compaction nothing to do" in the LOG, and then nothing further gets scheduled.
>
> Yes, this unit test only ensures that no compaction is scheduled when auto compactions are disabled. To hit the actual problem, the workload has to get and release snapshots continuously, which is typical for scans and produces a lot of "Compaction nothing to do" log entries. That is hard to do in a unit test, so I only verified it manually.
>
> Do you have any suggestions?

I would just suggest making it clear in the title/description/release note that this eliminates a no-op compaction upon snapshot release. "trigger compaction endlessly" sounds like there was an infinite loop somewhere, but my understanding now is that's not the case. Besides that, the fix and test LGTM!

@ajkr
ajkr commented Aug 21, 2020

Also regarding the unit test, you can add a change like this (https://gist.github.com/ajkr/7681bf61b1cdcf76e5510dc9ead2d482) to verify the fix is effective in preventing any no-op compaction.

@facebook-github-bot

@Connor1996 has updated the pull request. You must reimport the pull request before landing.

@Connor1996 Connor1996 changed the title Fix trigger compaction endlessly when disabling auto compactions Eliminates a no-op compaction upon snapshot release when disabling auto compactions Aug 24, 2020
@Connor1996

Thanks for your advice. PTAL again @ajkr

@ajkr ajkr left a comment

LGTM. It looks like a lot of HISTORY.md changes related to the markdown.

@facebook-github-bot facebook-github-bot left a comment

@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ajkr
ajkr commented Aug 24, 2020

> LGTM. It looks like a lot of HISTORY.md changes related to the markdown.

Regarding the markdown changes, if you did intend to include them, would you mind separating them into another PR? Also, we've recently decided to adopt ISO 8601 date format (YYYY-MM-DD). So if you do decide to format all of HISTORY.md, it'd be nice to also change the dates (e.g., "6/12/2020" -> "2020-06-12").
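
As an illustration of the date conversion mentioned above, here is a minimal sketch that turns an "M/D/YYYY" string into ISO 8601 "YYYY-MM-DD". `ToIso8601` is a hypothetical helper written for this example, not something that exists in the repository, and it assumes month-first input as in the "6/12/2020" example. It only range-checks the fields and does not validate days per month.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: converts "M/D/YYYY" to "YYYY-MM-DD".
// Returns an empty string if the input does not parse or is out of range.
std::string ToIso8601(const std::string& us_date) {
  int month = 0;
  int day = 0;
  int year = 0;
  char trailing = '\0';
  // The trailing %c catches garbage after the year: a clean match
  // assigns exactly three fields.
  if (std::sscanf(us_date.c_str(), "%d/%d/%d%c", &month, &day, &year,
                  &trailing) != 3) {
    return "";
  }
  if (month < 1 || month > 12 || day < 1 || day > 31 || year < 0 ||
      year > 9999) {
    return "";
  }
  char buf[32];
  std::snprintf(buf, sizeof(buf), "%04d-%02d-%02d", year, month, day);
  return buf;
}
```

For example, `ToIso8601("6/12/2020")` yields `"2020-06-12"`, matching the conversion described in the comment.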

@facebook-github-bot

@Connor1996 has updated the pull request. You must reimport the pull request before landing.

Signed-off-by: Connor1996 <zbk602423539@gmail.com>
@facebook-github-bot

@Connor1996 has updated the pull request. You must reimport the pull request before landing.

@Connor1996

@ajkr I didn't intend to do that; markdown-lint changed it automatically. I've already reverted it.

@facebook-github-bot facebook-github-bot left a comment

@ajkr has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot

@ajkr merged this pull request in 416943b.

codingrhythm pushed a commit to SafetyCulture/rocksdb that referenced this pull request Mar 5, 2021
…to compactions (facebook#7267)

Summary:
After releasing a snapshot, it checks whether it is suitable to trigger bottom compactions.
When disabling auto compactions, it may still schedule compaction when releasing a snapshot. Whereas no compaction job will be actually handled, so the state of LSM is not changed and compaction will be triggered again and again every time releasing a snapshot.

Too frequent compactions lead to high CPU usage and high db_mutex lock contention which affects foreground write duration finally.

Pull Request resolved: facebook#7267

Test Plan:
- make check
- manual test

Reviewed By: akankshamahajan15

Differential Revision: D23252880

Pulled By: ajkr

fbshipit-source-id: 4431e071a35d9912a2a3592875db27bae521434b