New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proto and persistence definitions for replicated versioning #4172
Proto and persistence definitions for replicated versioning #4172
Conversation
go.mod
Outdated
@@ -142,3 +142,8 @@ require ( | |||
modernc.org/strutil v1.1.3 // indirect | |||
modernc.org/token v1.1.0 // indirect | |||
) | |||
|
|||
replace ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE that I'm merging this PR into the worker-versioning feature branch, we should keep these replacements until this Go SDK change is published and we can merge the API changes.
I know I'll need to provide a schema upgrade script for adding these new tables but for now let's just make sure that the definitions are correct. |
// If the request's hash matched, this variant is set (and will be true). | ||
bool matched_req_hash = 2; | ||
temporal.server.api.persistence.v1.TaskQueueUserData user_data = 1; | ||
int64 user_data_version = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to change this to a bool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove this file from the PR, I'll add in a follow up PR though.
proto/internal/temporal/server/api/persistence/v1/task_queues.proto
Outdated
Show resolved
Hide resolved
|
||
// Container for all persistent user provided data for a task queue. | ||
message TaskQueueUserData { | ||
// Updated every time a new timestamp is generated to ensure the semantics of the Hybrid Logical Clock. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit confusing. "every time a new timestamp is generated" for what purpose?
Something like "Updated whenever any contained data is modified" is more straightforward, unless that's inaccurate in some way I don't follow.
schema/cassandra/temporal/schema.cql
Outdated
task_queue_name text, | ||
data binary, -- temporal.server.api.persistence.v1.TaskQueueUserData as proto binary blob | ||
version int64, -- Version of this row, used for optimistic concurrency | ||
updated_at timestamp, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need the timestamp here if it's in the blob? For sorting or something?
Including the version column, it's duplicative of the HLC in the blob. Might we only have it here, and not need it in the blob?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It can't be used for sorting really..
I initially wanted to use this when seeding the a new cluster with replication tasks to verify the table isn't updated in the meantime but I think it could also be useful as general metadata on the table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might we remove it from the blob itself then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not the same time, here's it's wall clock and on the struct it's a cluster local HLC.
Also note that the HLC only progresses when values are generated due to user interaction, not when applying replication events.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also confused at why there are both and what the function of each one is
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll just leave this out and see if we need it later.
I was thinking we could use this timestamp to verify that no new items are added after we seed the replication queue when a new cluster comes online.
Co-authored-by: Spencer Judge <sjudge@hey.com>
7bcf918
to
a96b861
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's lots of unrelated stuff in the diff.. can you rebase the branch manually (push without a PR) and then rebase this PR on top of that to have only the versioning-related stuff?
temporal.server.api.clock.v1.HybridLogicalClock state_update_timestamp = 3; | ||
} | ||
|
||
// An internal represenation of https://github.com/temporalio/api/blob/9206f1556469515d8e5a82c4d4bf21f2dd9730ce/temporal/api/taskqueue/v1/message.proto#L92 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write the fully qualified name instead of linking to a random commit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, will do.
// An internal represenation of https://github.com/temporalio/api/blob/9206f1556469515d8e5a82c4d4bf21f2dd9730ce/temporal/api/taskqueue/v1/message.proto#L92 | ||
message CompatibleVersionSet { | ||
// Set IDs are used internally by matching. | ||
// The first set ID for a given set is the first added build ID. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no it's not, or at least not directly. just leave this out? it's not required to describe the semantics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I'll just leave it out.
I think it should be the first build ID, but let's discuss off PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can be a hash of the first build id, perhaps. but no need to describe that here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this line
// HLC timestamp representing when the state was updated or the when build ID was originally inserted. | ||
// (-- api-linter: core::0142::time-field-type=disabled | ||
// aip.dev/not-precedent: Using HLC instead of wall clock. --) | ||
temporal.server.api.clock.v1.HybridLogicalClock state_update_timestamp = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't use these timestamps for anything other than delete, right? should we consider having this be a dual-purpose field: "if absent, id is active. if present, it's deleted with this timestamp." ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because if a cluster rejects the deletion it may "resurrect" the build ID by setting a newer timestamp and state to ACTIVE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, okay. I guess you can do it by having two fields for an update timestamp and a delete timestamp, but there's no real advantage of that over this
// Updated whenever user data is directly updated due to a user action but not when applying replication events. | ||
// The clock is referenced when new timestamps are generated to ensure it produces monotonically increasing | ||
// timestamps. | ||
temporal.server.api.clock.v1.HybridLogicalClock clock = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what exactly is this one used for again? I thought the timestamps within versioning_data are enough to merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This keeps the last clock in order to produce a new clock on update.
We need this to ensure that the clock is monotonically increasing in a single cluster context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh right. when I read reading above I was wondering where it stores the last used timestamp to avoid duplicates, but I didn't connect this to that. to be nitpicky: the comment above should say it's the last recorded cluster-local HLC timestamp for this task queue. another task queue might have a later one recorded (due to clock skew) but we never compare them across task queues so it doesn't matter.
schema/cassandra/temporal/schema.cql
Outdated
CREATE TABLE task_queue_user_data ( | ||
namespace_id uuid, | ||
task_queue_name text, | ||
data binary, -- temporal.server.api.persistence.v1.TaskQueueUserData as proto binary blob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add data_encoding
and change the above comment to not say it's a proto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did this for the sql tables and forgot cassandra, thanks!
schema/cassandra/temporal/schema.cql
Outdated
task_queue_name text, | ||
data binary, -- temporal.server.api.persistence.v1.TaskQueueUserData as proto binary blob | ||
version int64, -- Version of this row, used for optimistic concurrency | ||
updated_at timestamp, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also confused at why there are both and what the function of each one is
Yeah, I updated the PR branch and forgot to update the base branch... first time trying this method... |
d05d728
to
0b71548
Compare
// HLC timestamp representing when the state was updated or the when build ID was originally inserted. | ||
// (-- api-linter: core::0142::time-field-type=disabled | ||
// aip.dev/not-precedent: Using HLC instead of wall clock. --) | ||
temporal.server.api.clock.v1.HybridLogicalClock state_update_timestamp = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, okay. I guess you can do it by having two fields for an update timestamp and a delete timestamp, but there's no real advantage of that over this
// An internal represenation of https://github.com/temporalio/api/blob/9206f1556469515d8e5a82c4d4bf21f2dd9730ce/temporal/api/taskqueue/v1/message.proto#L92 | ||
message CompatibleVersionSet { | ||
// Set IDs are used internally by matching. | ||
// The first set ID for a given set is the first added build ID. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it can be a hash of the first build id, perhaps. but no need to describe that here
// Updated whenever user data is directly updated due to a user action but not when applying replication events. | ||
// The clock is referenced when new timestamps are generated to ensure it produces monotonically increasing | ||
// timestamps. | ||
temporal.server.api.clock.v1.HybridLogicalClock clock = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh right. when I read reading above I was wondering where it stores the last used timestamp to avoid duplicates, but I didn't connect this to that. to be nitpicky: the comment above should say it's the last recorded cluster-local HLC timestamp for this task queue. another task queue might have a later one recorded (due to clock skew) but we never compare them across task queues so it doesn't matter.
schema/cassandra/temporal/schema.cql
Outdated
data binary, -- temporal.server.api.persistence.v1.TaskQueueUserData | ||
data_encoding text, -- Encoding type used for serialization, in practice this should always be proto3 | ||
version int64, -- Version of this row, used for optimistic concurrency | ||
-- task_queue_name is not a part of the parititioning key to allow iterating all task queues in a single namespace. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: even if it was part of the partition key, we could still iterate. it would just be more expensive
-- task_queue_name is not a part of the parititioning key to allow iterating all task queues in a single namespace. | |
-- task_queue_name is not a part of the parititioning key to allow cheaply iterating all task queues in a single namespace. |
message HybridLogicalClock { | ||
// Wall clock - A single time source MUST guarantee that 2 consecutive timestamps are monotonically non-decreasing. | ||
// e.g. by storing the last wall clock and returning max(gettimeofday(), lastWallClock). | ||
int64 wall_clock = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you want to document the units and epoch or does it not matter?
Addressed all outstanding comments and also made a few schema changes after running integration tests and finding a few flaws. |
I'm merging this. |
Note: This commit came from a feature branch and is not expected to build.
…io#4172) Note: This commit came from a feature branch and is not expected to build.
Note: This commit came from a feature branch and is not expected to build.
Note: This commit came from a feature branch and is not expected to build.
While designing replication for the worker versioning feature, we realized we need to make some schema changes and add a new table to track user provided data for a task queue.
When a new cluster comes online, we'll need to seed it with data from the
task_queue_user_data
table.This table will be scanned and replication tasks will be generated from each entry in the table.