Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changefeedccl: scope backfills to only the tables affected by a schema change #55135

Merged
merged 1 commit into from
Oct 9, 2020

Conversation

HonoreDB
Copy link
Contributor

@HonoreDB HonoreDB commented Oct 1, 2020

changefeedccl: scope backfills to only the tables affected by a schema change

Release note (bug fix):

Previously, if a changefeed was defined on multiple targets, backfills (if configured) would be run for every target after a schema change on just some.
This is unnecessary work.
This patch has the kv_feed identify affected spans after a schema change, and only kick off backfills on those spans. It also starts tracking the distinction in the changeFrontier in order to enable future enhancements (I don't need those changes yet so I can drop them if they're problematic).

@HonoreDB HonoreDB requested review from a team and pbardea and removed request for a team October 1, 2020 18:27
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@HonoreDB HonoreDB linked an issue Oct 1, 2020 that may be closed by this pull request
@dt dt changed the title Feature/scoped backfill changefeedccl: scope backfills to only the tables affected by a schema change Oct 1, 2020
@HonoreDB
Copy link
Contributor Author

HonoreDB commented Oct 1, 2020

The test failures here look like flakes, although I'll look into it, so I think this is actually ready for review.

Copy link
Contributor

@pbardea pbardea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 1 of 3 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB and @pbardea)


pkg/ccl/changefeedccl/changefeed_processors.go, line 415 at r2 (raw file):

}

func (scb *schemaChangeBoundary) SetForSpan(span roachpb.Span, boundary hlc.Timestamp) {

Do these methods need to be exported?

Ah, this was updated in the following commit - let's move this change to the first commit?


pkg/ccl/changefeedccl/changefeed_processors.go, line 413 at r3 (raw file):

type schemaChangeBoundary struct {
	timestampsBySpan map[string]hlc.Timestamp

I think a more descriptive name here would help. Would something like resolvedTimestampBySpan work here?

Also I think that a brief comment about why we want to keep track of the resolved timestamp on a per-span basis would be useful here.


pkg/ccl/changefeedccl/changefeed_test.go, line 846 at r1 (raw file):

// Test schema changes that require a backfill on only some watched tables within a changefeed.
func TestChangefeedSchemaChangeBackfillScope(t *testing.T) {

We usually aim for each commit to have all tests passing (this makes it easier to do things like bisect when searching for a regression). Perhaps we could either move this to the last commit, or merge it in with the commit that allows this test to pass?


pkg/ccl/changefeedccl/kvfeed/kv_feed.go, line 226 at r3 (raw file):

		spansToBackfill = f.spans
	} else if len(events) > 0 {
		// Only backfill for the tables

Nit: i think that this comment needs re-wrapping (we try to wrap to 80 chars) since it was edited.

@pbardea pbardea requested a review from ajwerner October 6, 2020 15:46
@pbardea
Copy link
Contributor

pbardea commented Oct 6, 2020

@ajwerner I also added you since you're much more familiar with this area -- a look would be greatly appreciated!

@HonoreDB
Copy link
Contributor Author

HonoreDB commented Oct 6, 2020

Comments addressed, unrelated test failures have mysteriously vanished.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the slow review. The kvfeed stuff looks good. The scan boundary refactor I have more concerns about. I'd prefer if you separated that into a separate PR so we can get the primary fix here merged and have a more focused discussion.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @HonoreDB, and @pbardea)


pkg/ccl/changefeedccl/changefeed_processors.go, line 412 at r4 (raw file):

)

type schemaChangeBoundary struct {

I'd love to understand the motivation behind this change? A reason to have more fine-grained tracking of these boundaries would be if we could avoid tearing down rangefeeds for tables not involved in backfills. That feels like a heavier lift than you are taking on here. I feel like the previous representation of a single boundary as the earliest boundary at least revealed in a pretty opaque way that when we have a multi-table kvfeed, we stop the processing of rangefeed events and checkpoints at the earliest of any boundaries of any tables.

I think that it's worth communicating a bit about why we have a scanBoundary concept all the way up in changefeed_processors to begin with. This was added in the 20.1 release cycle as a way to make the shutdown on schema change work properly such that we ensure that this higher level is able to checkpoint its progress before failing. We want to ensure that the kvfeed stops but that the higher level processing continues to gracefully record its progress before shutting down. See

var scErr schemaChangeDetectedError


pkg/ccl/changefeedccl/changefeed_processors.go, line 708 at r4 (raw file):

			// of which tables changed by decoding the span keys in the schemaChangeBoundary.
			cf.MoveToDraining(pgerror.Newf(pgcode.SchemaChangeOccurred,
				"schema change occurred at %v", cf.schemaChangeBoundary.arbitraryBoundary().Next().AsOfSystemTime()))

This should return the earliest boundary as that is the boundary leading to this error. If anything, this is the reason for re-abstracting the boundary, perhaps


pkg/ccl/changefeedccl/changefeed_processors.go, line 773 at r4 (raw file):

	boundaryForThisSpan := cf.schemaChangeBoundary.forSpan(resolved.Span)
	if resolved.BoundaryReached && (boundaryForThisSpan.IsEmpty() || resolved.Timestamp.Less(boundaryForThisSpan)) {
		cf.schemaChangeBoundary.setForSpan(resolved.Span, resolved.Timestamp)

unfortunately, I don't think this is sound given the logic of what produces these events. The idea with the boundary reached is that we're indicating that all spans underneath the kvfeed have reached the span boundary of one of the tables. It could be that the final checkpoint we propagate here corresponds to a different table than the one to which the scan boundary was due.


pkg/ccl/changefeedccl/changefeed_processors.go, line 777 at r4 (raw file):

	// If we've moved past a schemaChangeBoundary, make sure to clear it.
	if !resolved.BoundaryReached && !boundaryForThisSpan.IsEmpty() && boundaryForThisSpan.Less(resolved.Timestamp) {
		cf.schemaChangeBoundary.clearForSpan(resolved.Span)

Here I think we'd need to clear all of the scan boundaries. In general I'm just not sure that this refactor properly captures what's going on.

…a change

Previously, if a changefeed was defined on multiple targets, backfills (if
configured) would be run for every target after a schema change on just some.
This is unnecessary work.  This patch has the kv_feed identify affected spans
after a schema change, and only kick off backfills on those spans.

Release note (bug fix): Changefeeds defined on multiple tables will now
only backfill affected tables after a schema change.
@HonoreDB
Copy link
Contributor Author

HonoreDB commented Oct 8, 2020

Thanks @ajwerner! I've pulled the refactor out of this PR; will likely try it again once my understanding improves.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @HonoreDB and @pbardea)

@HonoreDB
Copy link
Contributor Author

HonoreDB commented Oct 9, 2020

bors r=ajwerner

@craig
Copy link
Contributor

craig bot commented Oct 9, 2020

Build succeeded:

@craig craig bot merged commit 357c20c into cockroachdb:master Oct 9, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cdc: only backfill tables which experience schema changes
4 participants