Add lastCommittedSnapshotId commit metric and document missing metrics #7589
junmuz wants to merge 3 commits into
87cc332 to 08324fc
@JingsongLi Can I get a review, please?
<td>Distributions of the time taken by the last few scans.</td>
</tr>
<tr>
<td>lastScannedSnapshotId</td>
This metric was added previously, but the docs don't reference it.
08324fc to fe5c371
@JingsongLi @XiaoHongbo-Hope I have rebased the branch and resolved the conflicts. Could I get a review of it? Thanks.
sure
if (strictModeChecker != null) {
    strictModeChecker.update(newSnapshotId);
}
lastCommittedSnapshotId = newSnapshotId;
This misses the duplicate-success retry path. In a false-success retry, the earlier duplicate-check branch returns before this assignment, so the gauge can stay at -1. Should we also set lastCommittedSnapshotId = snapshot.id() there?
Good catch, I have made changes to handle that.
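The retry scenario raised in this thread can be sketched as follows. This is a minimal illustration, not the code in this PR: `SnapshotStoreSketch`, `CommitGaugeSketch`, and `tryCommit` are hypothetical names, and the map-based duplicate check only stands in for whatever check the real commit path performs. The point is that both return paths update the gauge.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for table storage shared by all committer instances. */
class SnapshotStoreSketch {
    final Map<String, Long> snapshotIdByCommitIdentifier = new HashMap<>();
    long latestSnapshotId = 0L;
}

/** Hypothetical committer; it only models the gauge bookkeeping. */
class CommitGaugeSketch {
    private final SnapshotStoreSketch store;
    // -1 means "no commit observed yet by this committer instance".
    private long lastCommittedSnapshotId = -1L;

    CommitGaugeSketch(SnapshotStoreSketch store) {
        this.store = store;
    }

    /** Returns true if a new snapshot was written, false if the commit was a duplicate. */
    boolean tryCommit(String commitIdentifier) {
        Long existing = store.snapshotIdByCommitIdentifier.get(commitIdentifier);
        if (existing != null) {
            // Duplicate-success path: an earlier attempt already wrote this snapshot.
            // Without this assignment the gauge would keep its previous value.
            lastCommittedSnapshotId = existing;
            return false;
        }
        long newSnapshotId = ++store.latestSnapshotId;
        store.snapshotIdByCommitIdentifier.put(commitIdentifier, newSnapshotId);
        lastCommittedSnapshotId = newSnapshotId;
        return true;
    }

    long lastCommittedSnapshotId() {
        return lastCommittedSnapshotId;
    }
}
```

In this sketch, a fresh committer instance that retries an already-persisted commit reports the existing snapshot ID instead of staying at -1.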
b76be0f to c57ef2f
…ommit Read the actual latest snapshot ID from SnapshotManager after commit instead of computing it from a pre-commit baseline, avoiding race conditions with concurrent writers. Also adds additional commit and source reader metrics.
- Fix lastCommittedSnapshotId to default to -1 and read actual snapshot ID post-commit instead of querying snapshotManager
- Remove currentConsumerId metric from ConsumerProgressCalculator and ContinuousFileSplitEnumerator as lastScannedSnapshotId already exists
- Add lastScannedSnapshotId and lastCommittedSnapshotId to metrics docs
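The race that these commit messages describe can be sketched with an optimistic commit loop. This is an illustration under stated assumptions, not the project's commit code: `SnapshotIdRaceSketch` is a hypothetical class, and an `AtomicLong` stands in for the table's latest-snapshot pointer. It shows why an ID estimated from a pre-commit baseline can differ from the ID that is actually written once another writer commits in between.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical model of optimistic snapshot commits against a shared pointer. */
class SnapshotIdRaceSketch {
    // Stand-in for the latest snapshot ID held in table storage.
    final AtomicLong latestSnapshotId = new AtomicLong(0L);

    /** Commits with optimistic retry and returns the snapshot ID actually written. */
    long commit() {
        while (true) {
            long latest = latestSnapshotId.get();
            long candidate = latest + 1;
            // Succeeds only if no other writer advanced the pointer in the meantime.
            if (latestSnapshotId.compareAndSet(latest, candidate)) {
                return candidate;
            }
        }
    }
}
```

Recording the value returned by the commit itself, rather than `baseline + 1` computed beforehand, keeps the gauge correct when writers interleave.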
c57ef2f to 9f0d40a
JingsongLi left a comment
Good observability improvement. lastCommittedSnapshotId is useful for monitoring commit progress and detecting lag.
Review:
- Capturing snapshot ID directly from FileStoreCommitImpl after successful commit avoids race conditions with concurrent writers. Good.
- Default -1 before any commit is reasonable: it distinguishes "no commit yet" from snapshot 0.
- Documentation: Adding lastScannedSnapshotId and lastCommittedSnapshotId to the metrics reference page is valuable for operators.
- +54/-11 is clean. Tests cover empty, non-empty, and compaction commits.
- Minor: The Flink source reader metrics file is also touched. Is lastCommittedSnapshotId exposed at both the source reader and commit sides? Ensure the semantics are clear (source tracks what it has read vs. commit tracks what was written).
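To make the read-side versus write-side semantics concrete, here is a hedged sketch of how an operator might derive lag from the two gauges. `SnapshotLagSketch` and `snapshotLag` are hypothetical helpers, not part of this PR, and treating -1 as "no value yet" for lastScannedSnapshotId is an assumption; the PR only states that default for lastCommittedSnapshotId.

```java
import java.util.OptionalLong;

/** Hypothetical helper that compares the commit-side and source-side gauges. */
class SnapshotLagSketch {
    // Sentinel for "no snapshot observed yet" (assumed for both gauges here).
    static final long NO_SNAPSHOT = -1L;

    /**
     * Returns how many snapshots the reader is behind the writer,
     * or empty if either side has not reported a snapshot yet.
     */
    static OptionalLong snapshotLag(long lastCommittedSnapshotId, long lastScannedSnapshotId) {
        if (lastCommittedSnapshotId == NO_SNAPSHOT || lastScannedSnapshotId == NO_SNAPSHOT) {
            return OptionalLong.empty();
        }
        // Clamp at zero: the two gauges are sampled independently and may be briefly out of order.
        return OptionalLong.of(Math.max(0L, lastCommittedSnapshotId - lastScannedSnapshotId));
    }
}
```

Returning an empty value instead of a number when either gauge is still -1 avoids reporting a misleading lag before the first commit or scan.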
LGTM.
Please rebase onto master.
Purpose
Tests
The metrics have been verified to be emitted correctly from Flink jobs. New test cases have also been added.