Speed up ITs by sharing the cluster across test methods #17682

Merged
JackieTien97 merged 2 commits into master from ty/IT-opt on May 16, 2026

@JackieTien97

Motivation

Most ITs under integration-test/src/test/java/org/apache/iotdb/db/it/ use @Before/@After to start and stop the cluster on every test method. For a class with N tests this means N full cluster start/stop cycles (~5s each), which dominates wall time on management/syntax/auth-style suites where the actual test body finishes in tens of milliseconds.

For example, IoTDBSimpleQueryIT has 36 tests; previously it spent roughly 36 × 5s ≈ 3 min just on cluster lifecycle. After this change the same suite finishes in ~14s.
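The estimate above is simple multiplication, and can be sketched as follows. This is only a back-of-the-envelope illustration: `LifecycleCost` and `lifecycleOverhead` are hypothetical names, and the 5s figure is the approximate per-cycle cost quoted in the description, not a measurement taken by this code.

```java
import java.time.Duration;

/** Hypothetical helper for estimating cluster lifecycle overhead. */
final class LifecycleCost {
  private LifecycleCost() {}

  /** Wall time spent on start/stop when the cluster is cycled {@code cycles} times. */
  static Duration lifecycleOverhead(int cycles, Duration perCycle) {
    return perCycle.multipliedBy(cycles);
  }

  public static void main(String[] args) {
    Duration perCycle = Duration.ofSeconds(5); // "~5s each" from the description
    int tests = 36;                            // test count of IoTDBSimpleQueryIT

    // @Before/@After: one cycle per test method.
    Duration perMethod = lifecycleOverhead(tests, perCycle);
    // @BeforeClass/@AfterClass: one cycle for the whole class.
    Duration perClass = lifecycleOverhead(1, perCycle);

    System.out.println("per-method: " + perMethod.getSeconds() + "s"); // 180s, i.e. 3 min
    System.out.println("per-class:  " + perClass.getSeconds() + "s");  // 5s
  }
}
```

The remaining wall time after the change is the single lifecycle cycle plus the test bodies and their cleanup.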

Changes

Converted 37 ITs to start the cluster once per class:

  • @Before → @BeforeClass public static void setUp()
  • @After → @AfterClass public static void tearDown()
  • Each test wraps its body in try { ... } finally { ... } and cleans up the state it created (drop database / drop function / drop user / drop role / revoke privileges / drop trigger / drop template / etc.) via small static helpers like dropFunctionQuietly, executeQuietly, or cleanupAfterTest that swallow expected SQLExceptions.
  • Where setup populated shared seed data (e.g. trigger timeseries, schema fixtures), the seed runs once in @BeforeClass.
  • For IoTDBMQTTServiceJsonIT the cluster moves to @BeforeClass/@AfterClass while per-test MQTT connection setup stays in @Before/@After (each test still gets its own connection).
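The per-test cleanup pattern described in the list above can be sketched with plain JDK classes. This is a minimal illustration, not the code from this PR: `FakeCluster`, `QuietCleanup.runQuietly`, and `SharedClusterDemo` are hypothetical stand-ins (the real helpers such as `dropFunctionQuietly` or `executeQuietly` may have different signatures), and the in-memory set plays the role of the cluster that `@BeforeClass` would start once.

```java
import java.sql.SQLException;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the shared cluster: just a set of database names. */
final class FakeCluster {
  final Set<String> databases = new HashSet<>();

  void createDatabase(String name) throws SQLException {
    if (!databases.add(name)) {
      throw new SQLException("database already exists: " + name);
    }
  }

  void dropDatabase(String name) throws SQLException {
    if (!databases.remove(name)) {
      throw new SQLException("database does not exist: " + name);
    }
  }
}

/** Hypothetical "quiet" helper in the spirit of executeQuietly. */
final class QuietCleanup {
  @FunctionalInterface
  interface SqlAction {
    void run() throws SQLException;
  }

  private QuietCleanup() {}

  /** Runs a cleanup action and swallows SQLException; returns whether it succeeded. */
  static boolean runQuietly(SqlAction action) {
    try {
      action.run();
      return true;
    } catch (SQLException expected) {
      return false; // e.g. the object was never created because the test failed early
    }
  }
}

final class SharedClusterDemo {
  // Plays the role of the cluster started once per class.
  static final FakeCluster CLUSTER = new FakeCluster();

  /** Shape of one converted test method: body in try, cleanup in finally. */
  static void testCreateDatabase() throws SQLException {
    try {
      CLUSTER.createDatabase("root.sg1");
      // ... assertions on the created state would go here ...
    } finally {
      QuietCleanup.runQuietly(() -> CLUSTER.dropDatabase("root.sg1"));
    }
  }

  public static void main(String[] args) throws SQLException {
    testCreateDatabase();
    testCreateDatabase(); // succeeds only because the first run cleaned up
    System.out.println(CLUSTER.databases.isEmpty()); // prints "true"
  }
}
```

Swallowing the exception in the helper keeps a failed cleanup from masking the original test failure, at the cost of hiding unexpected cleanup errors, so the helper should only be used for statements whose failure is genuinely harmless.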

Tests intentionally left on @Before/@After

These have semantics that are incompatible with sharing a cluster across tests:

  • Tests that restart or kill nodes — IoTDBRecoverIT, IoTDBRecoverUnclosedIT, IoTDBRestartIT, IoTDBRestartRatisIT, IoTDBVerifyConnectionIT
  • Tests that initialize the cluster with different topologies/configs per method — IoTDBLoadLastCacheIT (parameterized), IoTDBLoadTsFileIT, metric/IoTDBMetricIT, iotconsensusv2/IoTDBIoTConsensusV23C3DBasicITBase
  • Tests that depend on a fresh global counter or accumulate global state — quotas/IoTDBSpaceQuotaIT, audit/IoTDBAuditLogBasicIT, schema/IoTDBDeactivateTemplateIT, schema/IoTDBMetadataFetchIT, schema/IoTDBSchemaTemplateIT, IoTDBInsertWithQueryIT, aligned/IoTDBAlignedDataDeletionIT
  • Tests that mutate root credentials and could brick the cluster on failure — auth/IoTDBUserRenameIT
  • Parameterized base class where @BeforeClass would run before @Parameterized.BeforeParam (cluster not yet up) — schema/quota/IoTDBClusterQuotaIT
  • Two-test classes whose tests both CREATE USER user1 and assert "no data → empty" on the same path — schema/regionscan/IoTDBActiveSchemaQueryIT

Verification

Spotless and mvn test-compile pass on the integration-test module. Ran 4 batches of representative ITs locally:

| Batch | Tests | Failures | Skipped |
|---|---|---|---|
| 1 (UDF + SyntaxId + ExecuteBatch) | 31 | 0 | 0 |
| 2 (SimpleQuery + Aligned + Trigger + auth + StringLiteral + ...) | 85 | 0 | 2 (@Ignore) |
| 3 (single-test ITs + LoadTsFileWithMod + SortedShowTimeseries) | 28 | 0 | 0 |
| 4 (Grafana + REST + MQTT + auth) | 20 | 0 | 0 |
| Total | 164 | 0 | 2 |

Each @Before/@After pair restarts the cluster for every test method, which
makes management/syntax/auth-style ITs spend most of their wall time on
cluster start/stop. Convert 37 ITs under integration-test/src/test/java/
org/apache/iotdb/db/it to start the cluster once per class, with each test
cleaning up its own state via try/finally and quietly-ignoring helpers
(e.g. dropFunctionQuietly, executeQuietly, cleanupAfterTest).

Tests that mutate cluster topology, restart nodes, depend on accumulating
global state (audit log), or rotate root credentials are intentionally left
on @Before/@After.
@codecov

codecov Bot commented May 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 40.42%. Comparing base (556cd66) to head (d7149e8).
⚠️ Report is 7 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17682      +/-   ##
============================================
+ Coverage     40.38%   40.42%   +0.04%     
  Complexity     2574     2574              
============================================
  Files          5178     5179       +1     
  Lines        349134   349261     +127     
  Branches      44665    44683      +18     
============================================
+ Hits         140988   141205     +217     
+ Misses       208146   208056      -90     

User IDs are auto-incremented globally and do not reset when users
are dropped between tests. With @BeforeClass/@AfterClass sharing
one cluster, the cleanup drops users but the ID counter keeps
incrementing, causing assertions on specific IDs (10000, 10001, etc.)
to fail.

Replace user ID assertions with username-only checks in three places:
- listUserPrivileges: remove hardcoded '10000' check
- listUserPrivileges: LIST USER OF ROLE now checks username only
- testCreateUserAndRole: compare username sets instead of userId,User pairs
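The third replacement can be sketched as below. This is an illustration under assumptions, not the assertion code from the commit: `UserListAssertions` and `usernames` are hypothetical names, the rows are assumed to be shaped like "userId,username", and the ID values are made up to show the drift.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: compares user listings by name, ignoring auto-incremented IDs. */
final class UserListAssertions {
  private UserListAssertions() {}

  /** Extracts the username column from rows shaped like "userId,username". */
  static Set<String> usernames(List<String> rows) {
    Set<String> names = new TreeSet<>();
    for (String row : rows) {
      String[] cols = row.split(",", 2);
      names.add(cols[cols.length - 1].trim());
    }
    return names;
  }

  public static void main(String[] args) {
    // Same users, but the shared cluster has already handed out earlier IDs
    // (illustrative values only).
    List<String> freshCluster = Arrays.asList("10000,user1", "10001,user2");
    List<String> sharedCluster = Arrays.asList("10004,user1", "10005,user2");

    // Comparing full "userId,User" pairs breaks once the counter has advanced.
    System.out.println(freshCluster.equals(sharedCluster)); // false

    // Comparing username sets is stable across both lifecycles.
    System.out.println(usernames(freshCluster).equals(usernames(sharedCluster))); // true
  }
}
```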

@JackieTien97 JackieTien97 merged commit 3145e83 into master May 16, 2026
29 of 30 checks passed
@JackieTien97 JackieTien97 deleted the ty/IT-opt branch May 16, 2026 02:13