HDDS-15209. Stabilize TestRocksDBCheckpointDiffer#testDifferWithDB for RocksDB layout variance #10363

Open
arunsarin85 wants to merge 5 commits into apache:master from arunsarin85:HDDS-15209

Conversation

@arunsarin85
Contributor

What changes were proposed in this pull request?

  • Wait until inflight compactions are drained before assertions that depend on a complete compaction DAG.
  • Stop using hard-coded SST id lists for diff expectations: derive the baseline diff per run via getSSTDiffList with the full column-family mask, then apply the same subset filtering as before; compare sorted file names to avoid ordering noise.
  • Column-family resolution for filtering: resolve from the compaction DAG or snapshot metadata; ignore SST ids that are not present on this run when building expectations (so golden lists don’t fight varying RocksDB numbering).
  • SST backup directory check: replace fixed .sst filenames with assertions on count, .sst suffix, and membership in getCompactionNodeMap() (and fix the reversed “expected vs actual” comparison that used the directory listing as “expected”).
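The first three points can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the class `CompactionTestSupport` and its methods `waitUntil`, `sorted`, and `keepPresent` are hypothetical names, and the "inflight compactions drained" condition is passed in as a plain `BooleanSupplier` rather than read from the differ.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.function.BooleanSupplier;

// Hypothetical helper class, for illustration only (not part of Ozone).
final class CompactionTestSupport {
  private CompactionTestSupport() { }

  // Polls until the condition holds or the timeout elapses.
  // Returns true if the condition held, false on timeout.
  static boolean waitUntil(BooleanSupplier condition, long timeoutMillis,
      long pollMillis) throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      Thread.sleep(pollMillis);
    }
  }

  // Returns a sorted copy so comparisons ignore listing order.
  static List<String> sorted(Collection<String> names) {
    List<String> copy = new ArrayList<>(names);
    Collections.sort(copy);
    return copy;
  }

  // Keeps only the candidate SST ids that exist on this run,
  // preserving the order of the candidates.
  static List<String> keepPresent(Collection<String> candidates,
      Set<String> presentThisRun) {
    List<String> kept = new ArrayList<>();
    for (String id : candidates) {
      if (presentThisRun.contains(id)) {
        kept.add(id);
      }
    }
    return kept;
  }
}
```

In a test, the assertion on the DAG would run only after `waitUntil` returns true, and expected and actual diff lists would both go through `sorted` before being compared.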

Please describe your PR in detail:
TestRocksDBCheckpointDiffer#testDifferWithDB was flaky on CI because RocksDB does not always assign the same SST file numbers or produce the same compaction shape across runs and OSes. The test combined three sources of fragility: a timing dependency (assertions could run before the DAG was fully updated), hard-coded SST names for both the diff expectations and the SST backup links, and a misleading backup assertion in which the filesystem listing was passed as "expected" and compared against a static list.

This change makes the test derive what it can from the same code under test (getSSTDiffList, compaction node map, backup dir contents) instead of pinning to one successful run’s numeric ids. It keeps the intent: compaction tracking runs, DAG-based diffs behave consistently across column-family subsets, and compaction-input backups exist and correspond to tracked SSTs.
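The backup-directory check described above could look roughly like the sketch below. It is an illustration under assumptions, not the PR's code: `SstBackupDirCheck` and `findProblems` are hypothetical names, the tracked ids are passed in as a plain `Set<String>` standing in for the keys of `getCompactionNodeMap()`, those keys are assumed to be file names without the `.sst` suffix, and the expected count is a caller-supplied parameter because the description does not say what the count is compared against.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Stream;

// Hypothetical helper class, for illustration only (not part of Ozone).
final class SstBackupDirCheck {
  private SstBackupDirCheck() { }

  // Returns a list of human-readable problems; an empty list means the
  // directory passed the count, suffix, and membership checks.
  static List<String> findProblems(Path backupDir, Set<String> trackedSstIds,
      int expectedCount) throws IOException {
    List<String> names = new ArrayList<>();
    try (Stream<Path> entries = Files.list(backupDir)) {
      entries.forEach(p -> names.add(p.getFileName().toString()));
    }
    // Directory listing order is unspecified, so sort for stable output.
    Collections.sort(names);

    List<String> problems = new ArrayList<>();
    if (names.size() != expectedCount) {
      problems.add("expected " + expectedCount + " files but found "
          + names.size());
    }
    for (String name : names) {
      if (!name.endsWith(".sst")) {
        problems.add("not an SST file: " + name);
        continue;
      }
      // Assumption: tracked ids are file names without the suffix.
      String id = name.substring(0, name.length() - ".sst".length());
      if (!trackedSstIds.contains(id)) {
        problems.add("untracked SST: " + name);
      }
    }
    return problems;
  }
}
```

A test would then assert that the returned list is empty, which keeps the directory listing on the "actual" side of the comparison and prints every mismatch in the failure message.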

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-15209

How was this patch tested?

flaky-test-check workflow with submodule=rocksdb-checkpoint-differ, test-class=org.apache.ozone.rocksdiff.TestRocksDBCheckpointDiffer, test-name=testDifferWithDB

https://github.com/arunsarin85/ozone/actions/runs/26268683979

@adoroszlai added the test and snapshot (https://issues.apache.org/jira/browse/HDDS-6517) labels on May 26, 2026
@adoroszlai requested a review from smengcl on May 26, 2026 07:36