Add consensus operation to restart shard transfer #3703

Merged: timvisee merged 3 commits into dev from consensus-restart-shard-transfer on Feb 29, 2024

Member

@timvisee timvisee commented Feb 27, 2024

Tracked in: #3477

Add a consensus operation to restart a shard transfer with a different configuration.

This makes falling back to a different shard transfer method a whole lot nicer, because we properly clean up and prepare for the new transfer method. Without it, we're manually managing and updating shard replica set states to make transfers happen, which is very fragile. Restarting this way also makes sure that the actual current transfer method and progress are properly reported.

The operation ensures that:

  • a transfer is currently ongoing
  • the transfer configuration has changed (likely a different transfer method)

It is similar to calling abort/start right after each other, but in a single consensus operation.
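
To make the two checks concrete, here is a minimal, self-contained sketch of the idea. It is not the code from this PR: `Transfer`, `Method`, `Transfers` and `RestartError` are hypothetical stand-ins, and the real operation goes through consensus rather than a plain in-memory map.

```rust
use std::collections::HashMap;

/// Hypothetical transfer method, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Method {
    StreamRecords,
    Snapshot,
}

/// Hypothetical transfer description; field names are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Transfer {
    shard_id: u32,
    from: u64,
    to: u64,
    sync: bool,
    method: Method,
}

type TransferKey = (u32, u64, u64);

impl Transfer {
    /// A transfer is identified by shard and peers, not by its configuration.
    fn key(&self) -> TransferKey {
        (self.shard_id, self.from, self.to)
    }
}

#[derive(Debug, PartialEq, Eq)]
enum RestartError {
    NoOngoingTransfer,
    ConfigUnchanged,
}

/// Hypothetical in-memory registry of active transfers.
#[derive(Default)]
struct Transfers {
    active: HashMap<TransferKey, Transfer>,
}

impl Transfers {
    fn start(&mut self, transfer: Transfer) {
        self.active.insert(transfer.key(), transfer);
    }

    fn abort(&mut self, key: &TransferKey) -> Option<Transfer> {
        self.active.remove(key)
    }

    /// Restart = validate, then abort and start as one step.
    fn restart(&mut self, new: Transfer) -> Result<(), RestartError> {
        let key = new.key();
        // Check 1: a transfer with this key must currently be ongoing.
        let old = self
            .active
            .get(&key)
            .ok_or(RestartError::NoOngoingTransfer)?;
        // Check 2: the configuration must actually differ.
        if *old == new {
            return Err(RestartError::ConfigUnchanged);
        }
        let _ = self.abort(&key);
        self.start(new);
        Ok(())
    }
}
```

In this sketch, restarting a transfer that was never started, or restarting with an identical configuration, returns an error and leaves the registry untouched.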

If we merge this in 1.8, we'll be able to make use of it in Qdrant 1.9.

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
  3. Have you checked your code using cargo clippy --all --all-features command?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

Comment on lines 358 to 368
// Abort and start transfer
self.handle_transfer(
    collection_id.clone(),
    ShardTransferOperations::Abort {
        transfer: transfer.key(),
        reason: "restart transfer".into(),
    },
)
.await?;
self.handle_transfer(collection_id, ShardTransferOperations::Start(transfer))
    .await?;
Member Author

Here I just call abort and start right after each other, as part of a single operation. It keeps the implementation very simple.

Maybe there's more to it though.

@timvisee timvisee marked this pull request as ready for review February 28, 2024 09:43
Contributor

@ffuugoo ffuugoo left a comment

Looks simple enough. LGTM.

Comment on lines +376 to +379
let mut new_transfer = transfer.clone();
// Preserve sync flag from the old transfer
new_transfer.sync = old_transfer.sync;

Member

This is new from my side, so it might need an extra look.

Member Author

Shall we do an OR, to sync if the old OR the new one has sync=true?

Suggested change
- let mut new_transfer = transfer.clone();
- // Preserve sync flag from the old transfer
- new_transfer.sync = old_transfer.sync;
+ // Preserve sync flag from the old transfer
+ let mut new_transfer = transfer.clone();
+ new_transfer.sync |= old_transfer.sync;

Contributor

new_transfer.sync = old_transfer.sync || new_transfer.sync. Don't use bitwise ops for logic.
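
For reference, a small sketch comparing the two spellings discussed here. The helper names are hypothetical and not part of this PR. In Rust, `|` and `|=` are defined for `bool` and produce the same value as `||`; the only difference is that they always evaluate both operands, which does not matter for two plain field reads.

```rust
/// Hypothetical helper: sync if the old OR the new transfer asks for it,
/// written with the short-circuiting logical operator.
fn merge_sync_logical(old_sync: bool, new_sync: bool) -> bool {
    old_sync || new_sync
}

/// Same merge written with `|=`. On `bool` this is a non-short-circuiting OR.
fn merge_sync_assign(old_sync: bool, mut new_sync: bool) -> bool {
    new_sync |= old_sync;
    new_sync
}
```

Both helpers return `true` whenever either input is `true`, so the choice between them is about readability rather than behavior.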

Member Author

I've accidentally hit the green button.

I'm handling this here: #3728

@timvisee timvisee merged commit e39c481 into dev Feb 29, 2024
17 checks passed
@timvisee timvisee deleted the consensus-restart-shard-transfer branch February 29, 2024 16:03
@timvisee timvisee restored the consensus-restart-shard-transfer branch February 29, 2024 16:05
@timvisee timvisee deleted the consensus-restart-shard-transfer branch February 29, 2024 16:08
timvisee added a commit that referenced this pull request Mar 5, 2024
* Add consensus operation to restart shard transfer with different config

* Require shard transfer restart to have a changed configuration

* implement api

---------

Co-authored-by: generall <andrey@vasnetsov.com>