disagg: Fix unexpected object storage usage caused by pre-lock residue (#10760) #10767

Merged
ti-chi-bot[bot] merged 1 commit into pingcap:release-nextgen-20251011 from ti-chi-bot:cherry-pick-10760-to-release-nextgen-20251011
Mar 25, 2026

Conversation

@ti-chi-bot
Member

This is an automated cherry-pick of #10760

What problem does this PR solve?

Issue Number: close #10763

Problem Summary:

  • In concurrent remote write paths, PageDirectory write-group semantics could cause follower writers to miss their own applied lock-id cleanup signals.
  • As a result, S3LockLocalManager.pre_lock_keys could remain resident and be repeatedly written into manifest locks.
  • S3GC then treated many obsolete objects as still protected, leading to long-term remote storage usage inflation.
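
The residue mechanism above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyLockManager`, `ToyWriter`, `applyGroup`, and `residueAfterWrite` are hypothetical names invented for illustration and do not mirror the real `PageDirectory` or `S3LockLocalManager` code. It assumes the first writer in a group acts as owner and that a writer can only clean the keys it is told were applied.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the in-memory pre-lock set (not the real S3LockLocalManager).
struct ToyLockManager
{
    std::set<std::string> pre_lock_keys;

    void preLock(const std::string & key) { pre_lock_keys.insert(key); }

    // Remove only the keys that were reported back as applied.
    void cleanApplied(const std::vector<std::string> & applied)
    {
        for (const auto & key : applied)
            pre_lock_keys.erase(key);
    }
};

struct ToyWriter
{
    std::vector<std::string> lock_keys; // keys this writer pre-locked
    std::vector<std::string> applied; // cleanup signal handed back after apply
};

// Toy write group: writers[0] is the owner and applies the edits of the whole group.
// With per_writer_signal == false, only the owner receives a cleanup signal,
// which models followers missing their own applied lock ids.
void applyGroup(std::vector<ToyWriter> & writers, bool per_writer_signal)
{
    for (std::size_t i = 0; i < writers.size(); ++i)
    {
        if (i == 0 || per_writer_signal)
            writers[i].applied = writers[i].lock_keys;
        else
            writers[i].applied.clear();
    }
}

// Returns how many pre-lock keys stay resident after one grouped write of three writers.
std::size_t residueAfterWrite(bool per_writer_signal)
{
    ToyLockManager mgr;
    std::vector<ToyWriter> writers(3);
    for (std::size_t i = 0; i < writers.size(); ++i)
    {
        writers[i].lock_keys = {"lock_" + std::to_string(i)};
        mgr.preLock(writers[i].lock_keys[0]);
    }
    applyGroup(writers, per_writer_signal);
    for (const auto & w : writers)
        mgr.cleanApplied(w.applied);
    return mgr.pre_lock_keys.size();
}
```

In this model, `residueAfterWrite(false)` leaves the two follower keys behind, while `residueAfterWrite(true)` leaves none. Any key left in the set would keep being written into manifest locks, which is how the summary says obsolete objects stay protected from S3GC.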

What is changed and how it works?

disagg: eliminate pre-lock key residue that leads to unexpected OSS usage
  • End-to-end correctness fixes for lock lifecycle

    • PageDirectory::apply now returns writer-scoped applied_data_files for both write-group owner and followers, so each writer gets its own cleanup signal.
    • UniversalPageStorage::write uses those per-writer ids to clean pre-locks reliably after apply.
    • Added explicit failure cleanup path: cleanPreLockKeysOnWriteFailure(...) is invoked when remote write/apply fails.
    • createS3LockForWriteBatch now appends to pre_lock_keys only after the lock-creation pass, so a partial lock-creation failure leaves no partial pre-lock residue; its return value now follows "newly appended keys" semantics.
  • Test coverage and regression guards

    • Added write-group concurrency tests in PageDirectory and UniversalPageStorage paths.
    • Added focused S3LockLocalManager tests for partial cleanup, failure cleanup, lock-return semantics, and partial-failure atomicity.
    • Updated SyncPoint-based async tests to use std::launch::async to avoid deferred scheduling risk.
  • Observability and operations improvements

    • Most observability changes are split into a separate PR, "disagg: Add O11y on object store usage summary of each tiflash store" (#10764), to keep the logic changes in this PR clean.
    • Added lock-manager metrics to track pre-lock residency and cleanup outcomes (hit/miss/remaining).
    • Added owner-only periodic S3 storage summary in S3GCManagerService.
    • Added per-store S3 summary gauge:
      • tiflash_storage_s3_store_summary_bytes{store_id, type=data_file_bytes|dt_file_bytes}
    • Added setting remote_summary_interval_seconds and wired it through TMTContext; <= 0 disables periodic summary task registration.
    • Updated Grafana panels for the new S3 summary metric.
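
The "append to pre_lock_keys after lock-creation pass" change can be sketched as a stage-then-commit pattern. This is a minimal illustration, not the actual `createS3LockForWriteBatch` implementation: `ToyPreLocks`, `createLocksForBatch`, and `CreateLockFn` are hypothetical, and signalling failure with `std::nullopt` is an assumption made to keep the example small (the real code may report errors differently).

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical lock creator: returns the lock key on success, std::nullopt on failure.
using CreateLockFn = std::function<std::optional<std::string>(const std::string & data_file)>;

struct ToyPreLocks
{
    std::vector<std::string> pre_lock_keys;

    // Stage lock keys locally and commit them to pre_lock_keys only if every
    // lock in the batch was created. Returns the keys newly appended by this
    // call (empty when the batch failed part-way).
    std::vector<std::string> createLocksForBatch(
        const std::vector<std::string> & data_files,
        const CreateLockFn & create_lock)
    {
        std::vector<std::string> staged;
        for (const auto & file : data_files)
        {
            auto key = create_lock(file);
            if (!key)
                return {}; // partial failure: nothing was appended, so no residue
            staged.push_back(*key);
        }
        pre_lock_keys.insert(pre_lock_keys.end(), staged.begin(), staged.end());
        return staged;
    }
};
```

Because the shared container is only touched after the whole pass succeeds, a failure on the second of two locks leaves `pre_lock_keys` exactly as it was, and the caller can use the returned vector as the set of keys it is responsible for cleaning up later.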

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
# Run chbenchmark workload and check the metrics of `prelock_keys` and OSS usage
tiup bench ch --host 10.2.12.81 -P 8081 --warehouses 8000 run -D chbenchmark8k -T 50 -t 0 --time 30m --ignore-error --queries q1
# Before the fix (23:29 to 00:00), the number of prelock_keys in memory accumulated and grew with the write load; after the fix (02:00 to 02:30), no prelock_keys residue persisted in memory.
# You can also check the newly added Grafana panel "Remote Store Summary (Disagg arch)"
  • No code

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Fix an issue in disaggregated remote-write paths where pre-lock keys could remain resident under write-group concurrency or partial failure, causing S3GC to retain obsolete objects and inflate remote storage usage. Also add configurable periodic S3 storage summary and per-store summary metrics.

Summary by CodeRabbit

  • Bug Fixes

    • Resolved S3 pre-lock key cleanup on write failures to prevent orphaned lock keys.
    • Improved remote write error handling with enhanced exception logging.
  • New Features

    • Added S3 lock manager metrics for monitoring lock creation, cleanup, and status.
    • Extended S3 store summary metrics tracking.
  • Improvements

    • Enhanced checkpoint operation logging for better visibility.
    • Refined concurrent write batch processing with improved lock key tracking.

Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
@ti-chi-bot ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. type/cherry-pick-for-release-nextgen-20251011 labels Mar 23, 2026
@coderabbitai

coderabbitai bot commented Mar 23, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

🗂️ Base branches to auto review (3)
  • release-8.5
  • release-7.5
  • release-8.1

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@ti-chi-bot ti-chi-bot bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Mar 25, 2026
@ti-chi-bot
Contributor

ti-chi-bot bot commented Mar 25, 2026

@yinshuangfei: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Mar 25, 2026
@ti-chi-bot
Contributor

ti-chi-bot bot commented Mar 25, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-03-25 01:09:16.211741335 +0000 UTC m=+316952.247811595: ☑️ agreed by JaySon-Huang.
  • 2026-03-25 01:28:34.444321153 +0000 UTC m=+318110.480391413: ☑️ agreed by JinheLin.

@ti-chi-bot ti-chi-bot bot added cherry-pick-approved Cherry pick PR approved by release team. and removed do-not-merge/cherry-pick-not-approved labels Mar 25, 2026
@ti-chi-bot
Contributor

ti-chi-bot bot commented Mar 25, 2026

@kolafish: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot
Contributor

ti-chi-bot bot commented Mar 25, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CalvinNeo, JaySon-Huang, JinheLin, kolafish, yinshuangfei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [CalvinNeo,JaySon-Huang,JinheLin]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot merged commit f050924 into pingcap:release-nextgen-20251011 Mar 25, 2026
5 checks passed
@ti-chi-bot ti-chi-bot bot deleted the cherry-pick-10760-to-release-nextgen-20251011 branch March 25, 2026 03:40
Labels

approved cherry-pick-approved Cherry pick PR approved by release team. lgtm release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. type/cherry-pick-for-release-nextgen-20251011

6 participants