Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed race when multiple clients attempt to connect to the same branc… #5330

Merged
merged 18 commits into from
Feb 9, 2023

Conversation

zachmu
Copy link
Member

@zachmu zachmu commented Feb 7, 2023

…h the first time a replica fetches it

@zachmu
Copy link
Member Author

zachmu commented Feb 9, 2023

This breaks a couple of the async push bats tests. I didn't change anything directly related to that, and adding sleeps makes it work again. Which makes me think that clean shutdown logic has been broken for some time, and we just passed some threshold where we are now losing the race.

@zachmu
Copy link
Member Author

zachmu commented Feb 9, 2023

It looks like graceful server shutdown on kill is just broken... there's no signal handling to invoke the shutdown logic, it just relies on a defer:

	sqlEngine, err := engine.NewSqlEngine(
		ctx,
		mrEnv,
		engine.FormatTabular,
		config,
	)
	if err != nil {
		return err, nil
	}
	defer sqlEngine.Close()

Edit: no, this works fine, confirmed with logging. Background threads are allegedly being canceled.

Copy link
Contributor

@max-hoffman max-hoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the changes are not incorrect, but I don't understand how the edge cases are triggered, and i'm not seeing tests that try to trigger optimistic lock failures.


// loop on optimistic lock failures
OPTIMISTIC_RETRY:
for {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be equivalent ditch the gotos and just break out of the for when we don't have a lock failure? otherwise continue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that break breaks out of the switch, not the loop. I'll see if I can eliminate the switch and make this easier to read.

if err != nil {
return "", false, err
}

if branchExists {
err = createLocalBranchFromRemoteTrackingBranch(ctx, srcDB.DbData(), ddb, branchName, remoteRef)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oof

case ref.BranchRefType:
err := rrd.createNewBranchFromRemote(ctx, remoteRef, trackingRef)
if errors.Is(err, datas.ErrOptimisticLockFailed) {
continue OPTIMISTIC_RETRY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a different rrd for every client? i'm confused how two clients could trigger this create retry, since the limiter is supposed to collapse the parallel calls. same for pullLocalBranch but that one seems more reasonable. What are the concurrent events that trigger retries? a write overlapping a pull?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any manifest update can cause an optimistic lock write failure. There are writes that take place outside a limiter run (specifically, during pullBranchesAndUpdateWorkingSet)

Actually the work in CreateLocalBranch needs similar protection for this reason, but I'll follow up for that.

@zachmu zachmu merged commit e13275b into main Feb 9, 2023
@tbantle22 tbantle22 deleted the zachmu/remote-ref-replication branch March 9, 2023 00:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants