Core: Add table-level filtering for MetricsReporter implementations #16573

@moomindani

Feature Request / Improvement

Add a built-in mechanism that lets users restrict which tables' ScanReports and CommitReports are forwarded to a configured MetricsReporter, applied uniformly across all reporter implementations (LoggingMetricsReporter, RESTMetricsReporter, OtelMetricsReporter, and custom user-supplied ones).

Motivation

In deployments with many tables, users frequently want to emit metrics for only a subset:

  • Only tables in production databases (e.g., prod.*), not staging or sandbox
  • Only specific business-critical tables, excluding intermediate ones
  • Exclude noisy test or scratch tables (tmp.*, *.bench_*)

Existing per-reporter knobs only partially address this. The iceberg.otel.metrics.attributes allowlist added in #16250 controls which attributes an OTel metric carries. That helps with cardinality, but it does not stop metrics from being emitted for tables the user doesn't care about. Cardinality-control mechanisms in time-series backends (OTel Views, Prometheus relabel rules, etc.) are reporter-specific and require host-side knowledge.

Table-level filtering is a cross-cutting concern that belongs above any single reporter. Implementing it inside each reporter would lead to repeated, slightly inconsistent sets of flags. Implementing it once in the framework layer means every existing and future MetricsReporter benefits without re-implementation.

Proposal

Introduce two catalog properties recognized by the catalog when constructing the reporter pipeline:

metrics-reporter-impl=org.apache.iceberg.metrics.OtelMetricsReporter
metrics-reporter.table-name.include=prod\..*
metrics-reporter.table-name.exclude=.*\.tmp_.*

Values are Java regex patterns matched against ScanReport.tableName() / CommitReport.tableName(). The catalog wraps the user's reporter in a filtering layer when either property is present. When both are present, exclude wins over include (an explicit deny overrides an include). When neither is set, behavior is identical to today (pass-through, with no runtime overhead).
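
To illustrate how the two example patterns would behave, here is a minimal sketch using java.util.regex. The proposal does not say whether a pattern must cover the whole table name or only part of it. The sketch assumes whole-name matching (Matcher.matches()) and shows where substring matching (Matcher.find()) would give a different answer. The class name and the sample table names are made up for illustration.

```java
import java.util.regex.Pattern;

public class PatternSemanticsSketch {
  // Whole-name matching: the pattern must cover the entire table name (assumed semantics).
  static boolean fullMatch(String regex, String tableName) {
    return Pattern.compile(regex).matcher(tableName).matches();
  }

  // Substring matching: the pattern may match anywhere inside the table name.
  static boolean partialMatch(String regex, String tableName) {
    return Pattern.compile(regex).matcher(tableName).find();
  }

  public static void main(String[] args) {
    String include = "prod\\..*";   // property value prod\..* written as a Java string literal
    String exclude = ".*\\.tmp_.*"; // property value .*\.tmp_.* written as a Java string literal

    System.out.println(fullMatch(include, "prod.sales.orders"));       // true
    System.out.println(fullMatch(include, "nonprod.sales.orders"));    // false
    System.out.println(partialMatch(include, "nonprod.sales.orders")); // true
    System.out.println(fullMatch(exclude, "prod.sales.tmp_orders"));   // true
  }
}
```

Under whole-name matching, prod\..* selects only names that start with "prod.", which seems to be the intent of the example. Substring matching would also pick up names such as nonprod.sales.orders.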

Behavior

  • include only set: forward reports whose table name matches; drop others.
  • exclude only set: drop reports whose table name matches; forward others.
  • Both set: drop if exclude matches; otherwise forward only if include matches.
  • Neither set: forward everything (current behavior).
  • Empty value (metrics-reporter.table-name.include=) is treated as "not set" rather than "match nothing" to avoid accidentally silencing all metrics on misconfiguration.
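
The rules above could be sketched as a small wrapper. This is an illustration under stated assumptions, not the proposed implementation. Reporter is a hypothetical stand-in for MetricsReporter, reduced to a table name so the example runs without Iceberg on the classpath. The class and method names are invented. Whole-name matching via Matcher.matches() is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class TableNameFilterSketch {
  /** Hypothetical stand-in for MetricsReporter; the real interface receives a MetricsReport. */
  interface Reporter {
    void report(String tableName);
  }

  static final String INCLUDE = "metrics-reporter.table-name.include";
  static final String EXCLUDE = "metrics-reporter.table-name.exclude";

  // A null or empty value counts as "not set" and yields no pattern.
  static Pattern compileOrNull(String value) {
    return (value == null || value.isEmpty()) ? null : Pattern.compile(value);
  }

  // Exclude is checked first so that it wins over include.
  static boolean shouldForward(Pattern include, Pattern exclude, String tableName) {
    if (exclude != null && exclude.matcher(tableName).matches()) {
      return false;
    }
    return include == null || include.matcher(tableName).matches();
  }

  // Returns the delegate unchanged when no filter is configured (pass-through).
  static Reporter wrap(Reporter delegate, Map<String, String> props) {
    Pattern include = compileOrNull(props.get(INCLUDE));
    Pattern exclude = compileOrNull(props.get(EXCLUDE));
    if (include == null && exclude == null) {
      return delegate;
    }
    return tableName -> {
      if (shouldForward(include, exclude, tableName)) {
        delegate.report(tableName);
      }
    };
  }

  public static void main(String[] args) {
    List<String> forwarded = new ArrayList<>();
    Reporter reporter =
        wrap(forwarded::add, Map.of(INCLUDE, "prod\\..*", EXCLUDE, ".*\\.tmp_.*"));
    reporter.report("prod.sales.orders");     // include matches, exclude does not: forwarded
    reporter.report("prod.sales.tmp_orders"); // exclude matches: dropped
    reporter.report("staging.sales.orders");  // include does not match: dropped
    System.out.println(forwarded);            // [prod.sales.orders]
  }
}
```

Returning the original reporter when neither property is set keeps the unfiltered path free of any extra indirection. Compiling the patterns once in wrap avoids recompiling them for every report.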

Relationship to existing work

Query engine

None — applies to all engines that consume MetricsReporter.

Willingness to contribute

  • I can contribute this improvement/feature independently
  • I would be willing to contribute this improvement/feature with guidance from the Iceberg community
  • I cannot contribute this improvement/feature at this time
