Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: recompute stats after mvcc gc #83194

Merged
merged 1 commit into from
Jul 22, 2022

Conversation

lunevalex
Copy link
Collaborator

@lunevalex lunevalex commented Jun 22, 2022

Touched #82920

There is at least one known issue in MVCC stats calculation and
there maybe more. This could lead to the MVCC GC Queue spinning on
ranges with bad stats. To prevent the queue from spinning it should
recompute the stats if it detects that they are wrong. The easiest
mechanism to do that is to check if the GC score wants to queue this
range again after finishing GC, if it does it likely indicates something
fishy with the stats.

Release note: Change the MVCC GC queue to recompute MVCC stats on a
range, if after doing a GC run it still thinks there is garbage in
the range.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@lunevalex lunevalex force-pushed the alex/mvcc-stats branch 3 times, most recently from bea7534 to 8557a29 Compare July 10, 2022 21:57
@lunevalex lunevalex marked this pull request as ready for review July 10, 2022 22:01
@lunevalex lunevalex requested a review from a team as a code owner July 10, 2022 22:01
@lunevalex
Copy link
Collaborator Author

@erikgrinaker I cleaned this up and added a test, do you want to take a look?

@erikgrinaker erikgrinaker self-requested a review July 12, 2022 16:55
Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for handling this! Might deserve a brief release note?

Reviewed 3 of 3 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @lunevalex)


pkg/kv/kvserver/mvcc_gc_queue.go line 616 at r1 (raw file):

	scoreAfter := makeMVCCGCQueueScore(
		ctx, repl, repl.store.Clock().Now(), lastGC, conf.TTL(), canAdvanceGCThreshold)

We're reusing the old lastGC and canAdvanceGCThreshold here, so the score/decision isn't what it would be if the GC queue reprocessed it. That seems mostly beneficial, since it avoids issues with e.g. cooldown timers, but it might be worth calling out.


pkg/kv/kvserver/mvcc_gc_queue.go line 627 at r1 (raw file):

	// them then to spin the GC queue.
	if scoreAfter.ShouldQueue {
		log.Infof(ctx, "triggering stats re-computation")

Would be good to elaborate slightly here, and include the previous and current GC scores for debugging. Something like "GC still needed following GC, recomputing MVCC stats (old score %s; new score %s)". The score formatting is pretty verbose though, so see how it looks and consider dumping it on multiple lines or sth.


pkg/kv/kvserver/mvcc_gc_queue.go line 635 at r1 (raw file):

		err := repl.store.db.Run(ctx, &b)
		if err != nil {
			log.Errorf(ctx, "Failed to recompute stats with error=%s", err)

nit: start with lowercase.

Touched cockroachdb#82920

There is at least one known issue in MVCC stats calculation and
there maybe more. This could lead to the MVCC GC Queue spinning on
ranges with bad stats. To prevent the queue from spinning it should
recompute the stats if it detects that they are wrong. The easiest
mechanism to do that is to check if the GC score wants to queue this
range again after finishing GC, if it does it likely indicates something
fishy with the stats.

Release note: Change the MVCC GC queue to recompute MVCC stats on a
range, if after doing a GC run it still thinks there is garbage in
the range.
@lunevalex
Copy link
Collaborator Author

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Jul 22, 2022

Build succeeded:

@craig craig bot merged commit d7cf6d2 into cockroachdb:master Jul 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants