
Fixed TTL problems #17735

Merged
CRZbulabula merged 7 commits into master from ttl-fix
May 24, 2026

Conversation

@Caideyipi
Collaborator

Description

This PR fixes two correctness issues in ConfigNode TTL handling.

  1. TTL rule capacity checking incorrectly treated updates to existing TTL rules as new rules.
  2. SetTTLProcedure did not roll back TTL state when the ConfigNode metadata had already been updated but the
    DataNode TTL cache update failed.

Details

1. Fix TTL rule capacity accounting

TTLInfo now checks the number of newly introduced TTL rules instead of simply checking the current total rule count.

This also fixes the database TTL case: setting a TTL on a database writes two entries, the database path and its
wildcard path (db.**), so capacity validation now accounts for both.

As a result:

  • updating an existing TTL rule no longer fails when the capacity limit is already reached
  • database-level TTL no longer bypasses the real capacity requirement
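
The accounting described above can be sketched roughly as follows. This is a minimal illustration, not the actual TTLInfo code: the class TtlCapacitySketch, its methods, and the capacity value are hypothetical, and the rule store is assumed to be a plain map from path to TTL.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the capacity check; not the actual TTLInfo implementation.
final class TtlCapacitySketch {
  private final Map<String, Long> ttlRules = new HashMap<>();
  private final int capacity; // assumed limit on the number of TTL rules

  TtlCapacitySketch(int capacity) {
    this.capacity = capacity;
  }

  // Paths written by one request: a database TTL writes both "db" and "db.**".
  static List<String> affectedPaths(String path, boolean isDatabase) {
    List<String> paths = new ArrayList<>();
    paths.add(path);
    if (isDatabase) {
      paths.add(path + ".**");
    }
    return paths;
  }

  // Counts only the affected paths that do not have a rule yet.
  int newRuleCount(String path, boolean isDatabase) {
    int count = 0;
    for (String p : affectedPaths(path, isDatabase)) {
      if (!ttlRules.containsKey(p)) {
        count++;
      }
    }
    return count;
  }

  // Returns false, and changes nothing, if the newly introduced rules would exceed capacity.
  boolean setTtl(String path, boolean isDatabase, long ttl) {
    int added = newRuleCount(path, isDatabase);
    // A pure update (added == 0) always passes, even if the store is already full or oversize.
    if (added > 0 && ttlRules.size() + added > capacity) {
      return false;
    }
    for (String p : affectedPaths(path, isDatabase)) {
      ttlRules.put(p, ttl);
    }
    return true;
  }

  int size() {
    return ttlRules.size();
  }

  Long get(String path) {
    return ttlRules.get(path);
  }
}
```

With a capacity of 2 in this sketch, a database TTL on root.db1 fills the store, a later update to root.db1 still succeeds, and a TTL on any new path is rejected.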

2. Add rollback for SetTTLProcedure

SetTTLProcedure now captures the previous TTL state before writing the new value to ConfigNode.

If the procedure fails while updating the DataNode TTL cache, it rolls back:

  • the TTL metadata on ConfigNode
  • the TTL cache on DataNodes

For database TTL, the rollback also restores the wildcard TTL entry (db.**) separately.
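
The capture-then-restore flow can be sketched roughly like this. It is a simplified model under stated assumptions, not the real SetTTLProcedure: TtlRollbackSketch, its two maps, and the failDataNodeUpdate switch are hypothetical, and the DataNode cache is assumed to mirror the ConfigNode state before the request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the rollback flow; not the actual SetTTLProcedure code.
final class TtlRollbackSketch {
  final Map<String, Long> configNodeTtl = new HashMap<>();
  final Map<String, Long> dataNodeCache = new HashMap<>();
  boolean failDataNodeUpdate = false; // assumed hook that simulates a DataNode failure

  // Returns true on success; on failure both maps hold their previous values again.
  boolean setTtl(String path, boolean isDatabase, long ttl) {
    String wildcard = path + ".**";
    // Capture the previous state before writing; null means "no rule existed".
    Long previous = configNodeTtl.get(path);
    Long previousWildcard = isDatabase ? configNodeTtl.get(wildcard) : null;

    configNodeTtl.put(path, ttl);
    if (isDatabase) {
      configNodeTtl.put(wildcard, ttl);
    }

    try {
      updateDataNodeCache(path, ttl);
      if (isDatabase) {
        updateDataNodeCache(wildcard, ttl);
      }
      return true;
    } catch (IllegalStateException e) {
      // Restore the path and the wildcard entry separately.
      restore(path, previous);
      if (isDatabase) {
        restore(wildcard, previousWildcard);
      }
      return false;
    }
  }

  private void updateDataNodeCache(String path, long ttl) {
    dataNodeCache.put(path, ttl); // the cache may be partially updated before the failure
    if (failDataNodeUpdate) {
      throw new IllegalStateException("simulated DataNode TTL cache update failure");
    }
  }

  // If no previous TTL existed, rollback removes the rule instead of writing a value.
  private void restore(String path, Long previous) {
    if (previous == null) {
      configNodeTtl.remove(path);
      dataNodeCache.remove(path);
    } else {
      configNodeTtl.put(path, previous);
      dataNodeCache.put(path, previous);
    }
  }
}
```

Using null for "no previous rule" keeps the two rollback cases distinct: an entry that did not exist is removed, while an existing entry gets its old value back.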

The rollback state is serialized with the procedure so that recovery/replay can still restore the previous TTL state
correctly.
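
One way to persist such rollback state is to write a presence flag before each optional value, so that "no previous TTL" survives a round trip. The sketch below is an assumption for illustration only: the class TtlRollbackStateSketch, its fields, and the byte layout are hypothetical and do not describe the procedure's actual serialization format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical rollback state; field names and wire layout are assumptions.
final class TtlRollbackStateSketch {
  final Long previousTtl;         // null = the path had no TTL rule before
  final Long previousWildcardTtl; // null = the db.** entry had no TTL rule before

  TtlRollbackStateSketch(Long previousTtl, Long previousWildcardTtl) {
    this.previousTtl = previousTtl;
    this.previousWildcardTtl = previousWildcardTtl;
  }

  byte[] serialize() {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      writeNullableLong(out, previousTtl);
      writeNullableLong(out, previousWildcardTtl);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  static TtlRollbackStateSketch deserialize(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      Long previous = readNullableLong(in);
      Long previousWildcard = readNullableLong(in);
      return new TtlRollbackStateSketch(previous, previousWildcard);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Layout per value: one presence byte, followed by 8 bytes only if the value exists.
  private static void writeNullableLong(DataOutputStream out, Long value) throws IOException {
    out.writeBoolean(value != null);
    if (value != null) {
      out.writeLong(value);
    }
  }

  private static Long readNullableLong(DataInputStream in) throws IOException {
    if (in.readBoolean()) {
      return in.readLong();
    }
    return null;
  }
}
```

After a restart, a procedure restored from these bytes would know whether to remove a rule or reinstate an old value, which is the distinction the rollback depends on.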

Tests

Added/updated tests for:

  • updating existing TTL when capacity is reached
  • updating existing TTL when the current state already exceeds the capacity limit
  • database TTL capacity accounting
  • rollback when previous TTL does not exist
  • rollback of database wildcard TTL
  • procedure serialization with captured rollback state


@codecov

codecov Bot commented May 21, 2026

Codecov Report

❌ Patch coverage is 73.13433% with 36 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.62%. Comparing base (7563ac8) to head (d0a0568).
⚠️ Report is 5 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...fignode/procedure/impl/schema/SetTTLProcedure.java | 70.08% | 35 Missing ⚠️ |
| ...rg/apache/iotdb/confignode/manager/TTLManager.java | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17735      +/-   ##
============================================
+ Coverage     40.55%   40.62%   +0.07%     
  Complexity     2574     2574              
============================================
  Files          5179     5179              
  Lines        349896   350085     +189     
  Branches      44727    44765      +38     
============================================
+ Hits         141890   142217     +327     
+ Misses       208006   207868     -138     


Contributor

@CRZbulabula CRZbulabula left a comment

(Review superseded by the English version below.)

Contributor

@CRZbulabula CRZbulabula left a comment

Both the capacity-accounting and the rollback-on-DataNode-failure fixes look correct. Comments below are organized by code location and focus on the timing of state capture in SetTTLProcedure, the cost of the consensus read used to capture it, and a handful of readability/test improvements.

Contributor

@CRZbulabula CRZbulabula left a comment

LGTM

@CRZbulabula CRZbulabula merged commit cc108e7 into master May 24, 2026
44 checks passed
@CRZbulabula CRZbulabula deleted the ttl-fix branch May 24, 2026 06:00
