Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add per version set timestamp for when it was last made default in queue #4524

Merged
merged 1 commit into from Jun 21, 2023

Conversation

bergundy
Copy link
Member

What changed?

馃挜 Breaking Change (for a not yet released piece of code though)

Removed the default_update_timestamp on versioning data in favor of a per version set field queue_default_update_timestamp that denotes when it was last made default in the queue.

Why?

During testing, I discovered that we may mark a previously default build ID (vP) completely unreachable workflows immediately after a new queue default is added. This happens while there are no open workflows for vP's set and workflow tasks are currently being processed by vP workers. When those tasks complete, vP will be reachable again.

Having a per set timestamp for when it was last made queue default will allow us to be more careful when reporting build IDs as unreachable in the reachability API and the build ID scavenger.

How did you test it?

Refactored exiting tests.

Potential risks

Need to delete all user data from existing cluster deployments for any 1.21 rc release.

@bergundy bergundy requested a review from a team as a code owner June 21, 2023 16:44
// Deletions are performed by a background process which verifies build IDs are no longer in use and safe to delete (not yet implemented).
//
// Update may fail with FailedPrecondition if it would cause exceeding the supplied limits.
func UpdateVersionSets(clock hlc.Clock, data *persistencespb.VersioningData, req *workflowservice.UpdateWorkerBuildIdCompatibilityRequest, maxSets, maxBuildIds int) (*persistencespb.VersioningData, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this guy move?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closer to the implementation

return nil, serviceerror.NewInvalidArgument(fmt.Sprintf("version %s already exists and is not default in set", targetedVersion))
}
// Make the operation idempotent
return existingData, nil
return data, nil
}
return nil, serviceerror.NewInvalidArgument(fmt.Sprintf("%s requested to be made compatible with %s but both versions exist and are incompatible", targetedVersion, compatVer))
}

// First duplicate the build IDs to avoid mutation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is wrong now that it's not being cloned

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

//nolint:revive // cyclomatic complexity
func updateImpl(timestamp hlc.Clock, existingData *persistencespb.VersioningData, req *workflowservice.UpdateWorkerBuildIdCompatibilityRequest) (*persistencespb.VersioningData, error) {
func updateImpl(timestamp hlc.Clock, data *persistencespb.VersioningData, req *workflowservice.UpdateWorkerBuildIdCompatibilityRequest) (*persistencespb.VersioningData, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should mention that it now mutates data, and possibly shouldn't return a pointer any more since that can imply it's creating a new thing

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It still needs to return a pointer because it might recreate the entire structure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or data would have to be a double pointer.

@bergundy bergundy force-pushed the set-timestamp branch 2 times, most recently from 809960e to 724af38 Compare June 21, 2023 17:19
Copy link
Member

@yiminc yiminc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rubber stamp here. :)

@bergundy bergundy merged commit cdf158e into temporalio:master Jun 21, 2023
9 checks passed
@bergundy bergundy deleted the set-timestamp branch June 23, 2023 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants