@jiwen624
Contributor

@jiwen624 jiwen624 commented May 23, 2024

What changes were proposed in this pull request?

For FileFormatDataWriter, we currently record the "task commit time" and "job commit time" metrics in org.apache.spark.sql.execution.datasources.BasicWriteJobStatsTracker#metrics:

      TASK_COMMIT_TIME -> SQLMetrics.createTimingMetric(sparkContext, "task commit time"),
      JOB_COMMIT_TIME -> SQLMetrics.createTimingMetric(sparkContext, "job commit time"),

This PR additionally records the time spent on "data write" (including the time spent producing records from the iterator), which is usually one of the largest parts of the total duration of a write operation.
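
As a rough illustration of what "data write time, including record production" means, here is a minimal plain-Scala sketch. It is not the code in this PR: `TimingMetric` and `writeWithTiming` are hypothetical names, and a real implementation would report into a Spark `SQLMetric` rather than a local counter.

```scala
// Hypothetical sketch, not the PR's implementation. All names are illustrative.
object DataWriteTimingSketch {

  /** Stand-in for a task-level timing metric (Spark would use an SQLMetric). */
  final class TimingMetric {
    private var totalNanos = 0L
    def add(nanos: Long): Unit = totalNanos += nanos
    def nanos: Long = totalNanos
    def millis: Long = java.util.concurrent.TimeUnit.NANOSECONDS.toMillis(totalNanos)
  }

  /**
   * Drains `records` into `write` and charges the whole loop to `metric`.
   * Because hasNext/next are called inside the timed region, the time spent
   * producing records from the iterator is included, as the description says.
   * Returns the number of records written.
   */
  def writeWithTiming[T](records: Iterator[T], metric: TimingMetric)(write: T => Unit): Long = {
    val start = System.nanoTime()
    var count = 0L
    try {
      while (records.hasNext) {
        write(records.next())
        count += 1
      }
    } finally {
      // Record the elapsed time even if the write fails midway.
      metric.add(System.nanoTime() - start)
    }
    count
  }
}
```

Timing the whole loop once keeps the overhead to two `System.nanoTime()` calls per task, at the cost of not separating record production from the write itself.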

Why are the changes needed?

We find the write duration very helpful for identifying bottlenecks and time skew during data writes, and it also helps with general performance tuning.
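
To make the "time skew" point concrete, the sketch below summarizes per-task write durations and compares the slowest task with the median. It is only an illustration: `WriteTimeSkewSketch`, `Summary`, and `skewRatio` are hypothetical names, and the input durations would in practice come from the per-task metric values.

```scala
// Hypothetical sketch for reading skew out of per-task write times.
object WriteTimeSkewSketch {

  /** Summary of per-task durations, all in milliseconds. */
  final case class Summary(total: Long, min: Long, median: Long, max: Long) {
    /** Slowest task relative to the median; values well above 1 suggest skew. */
    def skewRatio: Double =
      if (median == 0L) Double.NaN else max.toDouble / median
  }

  /** Computes total/min/median/max; for an even count the upper middle value is used. */
  def summarize(taskMillis: Seq[Long]): Summary = {
    require(taskMillis.nonEmpty, "need at least one task duration")
    val sorted = taskMillis.sorted
    Summary(sorted.sum, sorted.head, sorted(sorted.length / 2), sorted.last)
  }
}
```

For example, `WriteTimeSkewSketch.summarize(Seq(100L, 120L, 110L, 900L))` has a median of 120 ms and a maximum of 900 ms, so one task takes several times longer to write than a typical task.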

Does this PR introduce any user-facing change?

Yes. On the SQL page of the Spark History Server (and the live UI), a new "data write time" metric is shown on the data write command/operation nodes. For example, an InsertIntoHadoopFsRelationCommand node with the newly added data write time metric:
[image: InsertIntoHadoopFsRelationCommand node showing the data write time metric]

How was this patch tested?

Unit test case and manual tests.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label May 23, 2024

-  test("SPARK-34399: Add job commit duration metrics for DataWritingCommand") {
+  test("SPARK-34399: Add job commit duration metrics for DataWritingCommand" +
+    " and SPARK-48397: Data write time metric for DataWritingCommand") {
Contributor Author

This piggybacks on an existing test case to check the newly added metric in the metrics map from org.apache.spark.sql.execution.datasources.BasicWriteJobStatsTracker#metrics.

@jiwen624 jiwen624 changed the title from "[WIP][SPARK-48397][SQL] Add data write time metric to FileFormatDataWriter" to "[SPARK-48397][SQL] Add data write time metric to FileFormatDataWriter" May 23, 2024
@jiwen624
Contributor Author

Hi @cloud-fan @dongjoon-hyun @HyukjinKwon @gengliangwang could you take a look when you get a chance and let me know your thoughts? Thank you very much.

@jiwen624
Contributor Author

jiwen624 commented Jun 2, 2024

Soft ping @cloud-fan @dongjoon-hyun @HyukjinKwon @gengliangwang any thoughts on this? Thanks.

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Sep 11, 2024
@github-actions github-actions bot closed this Sep 12, 2024