[WIP][SPARK-56551][SQL] Add operation metrics for metadata-only DELETE queries in DSv2 #55430

Draft

ZiyaZa wants to merge 23 commits into apache:master from ZiyaZa:dsv2-metadata-delete-metrics

Conversation

@ZiyaZa (Contributor) commented Apr 20, 2026

What changes were proposed in this pull request?

Added a numDeletedRows metric for metadata-only DELETE queries in DSv2.
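
For illustration, a minimal sketch of the kind of statement this affects (the catalog, namespace, and predicate are made up; any DSv2 table whose connector can satisfy the predicate from metadata alone would qualify):

// Hypothetical usage; "spark" is a SparkSession. After this PR, the DELETE
// plan node for a metadata-only delete like this reports numDeletedRows.
spark.sql("DELETE FROM my_catalog.db.events WHERE day = '2026-04-01'");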

Why are the changes needed?

For better visibility into what happened as a result of a DELETE query.

Does this PR introduce any user-facing change?

Yes. Metadata-only DELETE queries now report a numDeletedRows metric on the DELETE plan node.

How was this patch tested?

Added metric value validation to most DELETE unit tests.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Opus 4.7

ZiyaZa added 23 commits April 1, 2026 17:29
# Conflicts:
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteRowLevelCommand.scala
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/RewriteUpdateTable.scala
#	sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/RowDeltaUtils.scala
#	sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/WriteToDataSourceV2Exec.scala
#	sql/core/src/test/scala/org/apache/spark/sql/connector/DeltaBasedNoMetadataUpdateTableSuite.scala
/**
 * Returns an array of supported custom metrics with name and description.
 * By default it returns an empty array.
 */
default CustomMetric[] supportedCustomMetrics() {
  return new CustomMetric[]{};
}
Contributor

Are these the metrics that will be routed to the output of the commands?

Contributor Author

Not to the output, just to plan node metrics. I took a similar approach to what we already had in a couple of places (hence the refactored code in this PR). This allows connectors to expose any metrics they like, and those will be visible in the plan node.

As for the output of the command, Spark will only use the metrics it understands and expects. In this case, only the numDeletedRows metric is to be exposed as command output, and for this, connectors need to use NumDeletedRowsMetric.
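
A minimal connector-side sketch of what this enables (the wrapper class and the package of NumDeletedRowsMetric are assumptions for illustration; only NumDeletedRowsMetric itself is part of this PR):

import org.apache.spark.sql.connector.metric.CustomMetric;
// NumDeletedRowsMetric is the class added in this PR; its package is
// assumed here to be org.apache.spark.sql.connector.metric.
import org.apache.spark.sql.connector.metric.NumDeletedRowsMetric;

// Hypothetical connector code: declaring NumDeletedRowsMetric among the
// supported metrics lets Spark match the value the connector reports by
// name and surface it on the DELETE plan node.
class MetadataDeleteMetrics {
  static CustomMetric[] supportedCustomMetrics() {
    return new CustomMetric[] { new NumDeletedRowsMetric() };
  }
}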

 * @since 4.2.0
 */
@Evolving
public class NumDeletedRowsMetric extends CustomSumMetric {
  @Override
  public String name() { return "numDeletedRows"; }
  // description() elided in the quoted hunk
}
@aokolnychyi (Contributor) commented Apr 20, 2026

I am not sure I fully understand this.

I thought the goal was to allow connectors to expose a set of metrics (whatever those may be) as output without defining any specific custom metrics on the Spark side?

Contributor Author

Connectors can still expose a set of metrics; this is just one that Spark understands and will expose as command output. Spark needs to specify the name of the metric somehow for all connectors to use; otherwise each connector could come up with its own naming scheme and Spark wouldn't know what to look for.

Here we could just have a String constant somewhere to store the expected metric name, numDeletedRows. Instead, I went with this class definition because it seems all connectors will want to use some class like this. But if I misunderstood, I can replace it with just a String constant.
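
Sketched out, the two options discussed here would look roughly as follows (hypothetical names; only the NumDeletedRowsMetric class quoted above is actually proposed in this PR):

// Option A: a bare String constant that every connector's metric name
// must match exactly. (Hypothetical, not in this PR.)
public final class RowLevelOperationMetrics {
  private RowLevelOperationMetrics() {}
  public static final String NUM_DELETED_ROWS = "numDeletedRows";
}

// Option B (this PR): a shared CustomSumMetric subclass, so connectors
// pick up the agreed-on name, description, and SUM aggregation together
// by instantiating one class.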

Member

Hm, yeah, I am also wondering: does this particular metric need to flow to Spark?

The connector knows how many rows were removed by a metadata-only delete and can just persist that in its summary. Is it needed in Spark somewhere?

Contributor Author

Spark will need to send this value to the command output when we implement that functionality.

Member

got it, thanks
