anna-tran
Contributor

What this PR does:
Updates the DynamoDB ring client to use versioning in its KV store entries. Specifically:

  • Updates the Batch operation to use TransactWriteItems instead of BatchWriteItem, because the former allows versioned optimistic locking when there are concurrent updates to the same ring entry.
  • Updates the Batch operation to use a conditional expression on Put only if the version exists and matches what was previously queried.
  • Updates the Query operation to return a dynamodbItem struct which holds the data from the queried item along with a version. If there is no version from DynamoDB for the item, the version defaults to an empty string.

Which issue(s) this PR fixes:
Fixes #6986

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

Signed-off-by: Anna Tran <trananna@amazon.com>
Signed-off-by: Anna Tran <trananna@amazon.com>
Contributor

@danielblando danielblando left a comment

Thanks for the work.

One last comment, as discussed offline: we can improve the retry when the failure is caused by a condition error.

Signed-off-by: Anna Tran <trananna@amazon.com>
Signed-off-by: Anna Tran <trananna@amazon.com>
Contributor

@danielblando danielblando left a comment

Thanks

if item[version] != nil {
    parsedVersion, err := strconv.ParseInt(*item[version].N, 10, 0)
    if err != nil {
        kv.logger.Log("msg", "failed to parse item version", "version", *item[version].N, "err", err)
Contributor

What could go wrong if we fail to parse the version? It seems we just log the error and then ignore it.

Contributor Author

The version will be set to 0 if we cannot parse it. This prevents updates to the DDB entry, so there won't be any out-of-date overwrites, but the instance won't be able to join the ring. I think that's OK; operators can intervene if that happens.
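
A minimal sketch of that fallback, assuming the version arrives as an optional numeric string. `parseVersion` is a hypothetical helper, not the PR's function; it uses a 64-bit parse and plain `fmt` logging instead of the client's logger.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseVersion returns the parsed version, or 0 when the attribute is missing
// or cannot be parsed. The parse error is logged but not returned.
func parseVersion(raw *string) int64 {
	if raw == nil {
		return 0 // no version attribute on the item
	}
	v, err := strconv.ParseInt(*raw, 10, 64)
	if err != nil {
		fmt.Printf("failed to parse item version %q: %v\n", *raw, err)
		return 0
	}
	return v
}

func main() {
	good, bad := "7", "not-a-number"
	fmt.Println(parseVersion(&good))
	fmt.Println(parseVersion(&bad))
	fmt.Println(parseVersion(nil))
}
```

With a fallback of 0, a later write conditioned on version 0 would be rejected whenever the stored version differs, which matches the behaviour described in the comment above.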

Contributor

So that we can track this, let's add a metric for when we get a condition check failure error back. It can be a new DDB metric.

Contributor Author

Good suggestion, updated

Signed-off-by: Anna Tran <trananna@amazon.com>
Signed-off-by: Anna Tran <trananna@amazon.com>
Contributor

@yeya24 yeya24 left a comment

Thanks!

Signed-off-by: Daniel Blando <daniel@blando.com.br>
@danielblando danielblando merged commit 1a07a14 into cortexproject:master Aug 29, 2025
33 of 34 checks passed
@anna-tran anna-tran deleted the ring-cas branch August 29, 2025 20:02

Development

Successfully merging this pull request may close these issues.

Add versioning to DynamoDB KV store

3 participants