Core: Add table-name filter for MetricsReporter #16574
Open
moomindani wants to merge 1 commit into
Add an optional filtering layer above any MetricsReporter implementation that drops ScanReports and CommitReports whose tableName() does not pass the configured include / exclude regex. Two new catalog properties control the filter: metrics-reporter.table-name.include and metrics-reporter.table-name.exclude. Both are Java regex patterns matched against the table name; when both are set, exclude wins over include. When neither property is set, CatalogUtil.loadMetricsReporter returns the underlying reporter unchanged, so the default code path incurs no runtime overhead. Empty values are treated as not set to avoid accidentally silencing all metrics on misconfiguration. Invalid regex values fail fast at catalog initialization with a clear error pointing at the offending property. The filter applies uniformly across all reporter implementations (LoggingMetricsReporter, RESTMetricsReporter, and custom user-supplied ones). Reports whose subtype does not expose a table name are forwarded without filtering. Closes apache#16573
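The wrapping behavior described in this commit message can be sketched roughly as follows. This is a hedged illustration, not the PR's code: `Report`, `Reporter`, `TableReport`, `OtherReport`, and `FilteringReporterSketch` are hypothetical stand-ins for Iceberg's `MetricsReport`, `MetricsReporter`, the table-scoped report subtypes, and `FilteringMetricsReporter`. The filter is modeled as a plain `Predicate<String>` where `null` means "nothing configured".

```java
import java.util.function.Predicate;

// Hypothetical stand-in for Iceberg's MetricsReport.
interface Report {}

// Hypothetical stand-in for Iceberg's MetricsReporter.
interface Reporter {
  void report(Report report);
}

// Stand-in for report subtypes that expose a table name (ScanReport, CommitReport).
final class TableReport implements Report {
  private final String tableName;

  TableReport(String tableName) {
    this.tableName = tableName;
  }

  String tableName() {
    return tableName;
  }
}

// Stand-in for any report subtype that does not expose a table name.
final class OtherReport implements Report {}

// Decorator that drops table-scoped reports rejected by the filter.
final class FilteringReporterSketch implements Reporter {
  private final Reporter delegate;
  private final Predicate<String> tableNameFilter;

  private FilteringReporterSketch(Reporter delegate, Predicate<String> tableNameFilter) {
    this.delegate = delegate;
    this.tableNameFilter = tableNameFilter;
  }

  // Returns the delegate itself when no filter is configured, so the
  // default path pays for no extra object and no extra call.
  static Reporter wrap(Reporter delegate, Predicate<String> tableNameFilter) {
    if (tableNameFilter == null) {
      return delegate;
    }
    return new FilteringReporterSketch(delegate, tableNameFilter);
  }

  @Override
  public void report(Report report) {
    if (report instanceof TableReport) {
      String tableName = ((TableReport) report).tableName();
      if (!tableNameFilter.test(tableName)) {
        return; // table name rejected by the filter: drop the report
      }
    }
    // Reports without a table name are always forwarded.
    delegate.report(report);
  }
}
```

Because `wrap` hands back the original reporter when the filter is `null`, callers can compare references to confirm that no wrapper was created on the unconfigured path.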
Closes #16573.
Adds an optional filtering layer above any MetricsReporter implementation that drops ScanReport and CommitReport instances whose tableName() does not pass the configured include / exclude regex. The filter applies uniformly to LoggingMetricsReporter, RESTMetricsReporter, and custom user-supplied reporters. The proposal surfaced in the dev@ DISCUSS thread for #16250 (per-table cardinality of the OTel reporter) and is intentionally scoped as cross-reporter, not OTel-specific.
Design
CatalogUtil.loadMetricsReporter wraps the resolved reporter in a FilteringMetricsReporter when either of the new properties is set. When neither is set, the resolved reporter is returned unchanged: no wrapper is instantiated, and the default path has no runtime overhead. MetricsReport subtypes that do not expose a table name (anything other than ScanReport / CommitReport) are forwarded without filtering.
Configuration
Two new catalog properties:
- metrics-reporter.table-name.include
- metrics-reporter.table-name.exclude
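As a rough sketch of how the two properties, metrics-reporter.table-name.include and metrics-reporter.table-name.exclude, might be read and applied: the class below parses them from a property map, treats empty values as unset, fails fast on an invalid regex, and lets exclude win over include. The class name `TableNameFilterSketch` and its methods are hypothetical, and the use of `Matcher.matches()` (a full match against the whole table name) is an assumption; the PR text only says the patterns are matched against the table name.

```java
import java.util.Map;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper; not the PR's actual implementation.
final class TableNameFilterSketch {
  static final String INCLUDE = "metrics-reporter.table-name.include";
  static final String EXCLUDE = "metrics-reporter.table-name.exclude";

  private final Pattern include; // null when not configured
  private final Pattern exclude; // null when not configured

  private TableNameFilterSketch(Pattern include, Pattern exclude) {
    this.include = include;
    this.exclude = exclude;
  }

  // Returns null when neither property is set, so the caller can skip wrapping.
  static TableNameFilterSketch fromProperties(Map<String, String> properties) {
    Pattern include = compile(properties, INCLUDE);
    Pattern exclude = compile(properties, EXCLUDE);
    if (include == null && exclude == null) {
      return null;
    }
    return new TableNameFilterSketch(include, exclude);
  }

  private static Pattern compile(Map<String, String> properties, String key) {
    String value = properties.get(key);
    if (value == null || value.isEmpty()) {
      return null; // empty is treated as not set
    }
    try {
      return Pattern.compile(value);
    } catch (PatternSyntaxException e) {
      // Fail fast and name the offending property.
      throw new IllegalArgumentException(
          "Invalid regex for property " + key + ": " + value, e);
    }
  }

  // Assumption: full-match semantics via Matcher.matches().
  boolean shouldForward(String tableName) {
    if (exclude != null && exclude.matcher(tableName).matches()) {
      return false; // exclude wins over include
    }
    return include == null || include.matcher(tableName).matches();
  }
}
```

For example, with include set to `db\..*` and exclude set to `.*_tmp`, this sketch forwards reports for `db.orders` and drops those for `db.orders_tmp` and `other.orders`.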
Values are Java regex patterns matched against the table name. When both are set, exclude wins over include (an explicit deny overrides an include). Empty values are treated as not set to avoid accidentally silencing all metrics on misconfiguration. Invalid regex values fail fast at catalog initialization with a clear error pointing at the offending property.
Behavior:
- include only: forward reports whose table name matches; drop others.
- exclude only: drop reports whose table name matches; forward others.
- both set: drop if exclude matches; otherwise forward only if include matches.
This mirrors the existing route-regex pattern used in iceberg-kafka-connect (IcebergSinkConfig), where a user-supplied regex from configuration is compiled via Pattern.compile() and matched against incoming data. The trust model is the same: catalog properties are admin-controlled.
Disclosure
Per the project's AI-assisted contribution guidelines, I used Claude Code to help draft this work. I reviewed every change by hand and ran the full test/lint loop locally before opening this PR. The design and motivation discussion is in #16573.
cc @ebyhr @jbonofre — happy to address any feedback.