
[SPARK-44929][TESTS] Standardize log output for console appender in tests#42627

Closed
yaooqinn wants to merge 1 commit into apache:master from yaooqinn:SPARK-44929

Conversation


@yaooqinn yaooqinn commented Aug 23, 2023

What changes were proposed in this pull request?

This PR sets a character-length limit on error messages and a stack-depth limit on stack traces for the console appender used in tests.

The original patterns are:

  • `%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n%ex`
  • `%t: %m%n%ex`

They are replaced with a single consistent pattern:

  • `%d{HH:mm:ss.SSS} %p %c: %maxLen{%m}{512}%n%ex{8}%n`
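For context, the corresponding console-appender configuration would look roughly like the following sketch (key names follow the `log4j2.properties` style shown in the review diff below; the appender name is illustrative, and individual test modules keep their own configuration files):

```properties
# Console appender for tests, writing to stderr
appender.console.type = Console
appender.console.name = STDERR
appender.console.target = SYSTEM_ERR
appender.console.layout.type = PatternLayout
# %maxLen{%m}{512} caps each message at 512 characters;
# %ex{8} prints at most 8 stack-trace frames
appender.console.layout.pattern = %d{HH:mm:ss.SSS} %p %c: %maxLen{%m}{512}%n%ex{8}%n
```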

Why are the changes needed?

In testing, failures are triggered both intentionally and inadvertently, producing extensive log volumes. For instance, a single FileNotFound error may be logged repeatedly by the writer, the task runner, the task set manager, and other components, resulting in thousands of lines per failure.

For example, tests in ParquetRebaseDatetimeSuite run against both the V1 and V2 data sources, with two or more specific values and multiple configuration pairs; I have seen the resulting SparkUpgradeException all over the CI logs.
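The effect of the two limits can be illustrated with a small sketch (plain Python, not log4j itself; the 512-character and 8-frame caps mirror `%maxLen{%m}{512}` and `%ex{8}` in the new pattern):

```python
def format_event(message: str, stack_frames: list[str],
                 max_len: int = 512, max_frames: int = 8) -> str:
    """Mimic the truncation behavior of the new console pattern:
    cap the message at max_len characters and keep at most
    max_frames stack-trace frames, discarding the rest."""
    truncated_msg = message[:max_len]
    truncated_stack = stack_frames[:max_frames]
    return "\n".join([truncated_msg] + truncated_stack)

# A failure logged with a 10,000-character message and a 40-frame
# stack trace shrinks to one 512-char message line plus 8 frames.
event = format_event("x" * 10_000, [f"\tat Frame{i}" for i in range(40)])
print(len(event.splitlines()))  # 9 lines: truncated message + 8 frames
```

With thousands of such events per test run, capping each one keeps CI logs readable without hiding the failure itself.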

Does this PR introduce any user-facing change?

no

How was this patch tested?

build/sbt "sql/testOnly *ParquetRebaseDatetimeV1Suite"
15:59:55.446 ERROR org.apache.spark.sql.execution.datasources.FileFormatWriter: Job job_202308230059551630377040190578321_1301 aborted.
15:59:55.446 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 1301.0 (TID 1595)
org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows to file:/Users/hzyaoqin/spark/target/tmp/spark-67cce58e-dfb2-4811-a9c0-50ec4c90d1f1.
	at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:765)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:420)
	at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
15:59:55.446 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 1301.0 (TID 1595) (10.221.97.38 executor driver): org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows to file:/Users/hzyaoqin/spark/target/tmp/spark-67cce58e-dfb2-4811-a9c0-50ec4c90d1f1.
	at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:765)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:420)
	at org.apache.spark.sql.execution.datasources....
15:59:55.446 ERROR org.apache.spark.scheduler.TaskSetManager: Task 0 in stage 1301.0 failed 1 times; aborting job
15:59:55.447 ERROR org.apache.spark.sql.execution.datasources.FileFormatWriter: Aborting job 0ead031e-c9dd-446b-b20b-c76ec54978b1.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1301.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1301.0 (TID 1595) (10.221.97.38 executor driver): org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows to file:/Users/hzyaoqin/spark/target/tmp/spark-67cce58e-dfb2-4811-a9c0-50ec4c90d1f1.
	at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:765)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:420)
	at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
15:59:55.579 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 1303.0 (TID 1597)

Was this patch authored or co-authored using generative AI tooling?

no

@yaooqinn
Member Author

cc @dongjoon-hyun @JoshRosen @HyukjinKwon, thanks

  appender.console.target = SYSTEM_ERR
  appender.console.layout.type = PatternLayout
- appender.console.layout.pattern = %t: %m%n%ex
+ appender.console.layout.pattern = %d{HH:mm:ss.SSS} %p %c: %maxLen{%m}{512}%n%ex{8}%n
Member

Did we mention this log style change in the PR description clearly? It's more than a simple truncation.

Member Author

Updated in the PR description, thanks @dongjoon-hyun

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-44929][TESTS] Truncate log output for console appender in tests [SPARK-44929][TESTS] Standardize log output for console appender in tests Aug 24, 2023
Member

@dongjoon-hyun dongjoon-hyun left a comment


I revised the PR title a little because this PR's contribution is bigger than a simple truncation. Thank you, @yaooqinn !

@yaooqinn
Member Author

Thank you @dongjoon-hyun, the new title looks better.

@yaooqinn yaooqinn closed this in 8305001 Aug 24, 2023
yaooqinn added a commit that referenced this pull request Aug 24, 2023
[SPARK-44929][TESTS] Standardize log output for console appender in tests


Closes #42627 from yaooqinn/SPARK-44929.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit 8305001)
Signed-off-by: Kent Yao <yao@apache.org>
yaooqinn added a commit that referenced this pull request Aug 24, 2023
[SPARK-44929][TESTS] Standardize log output for console appender in tests


Closes #42627 from yaooqinn/SPARK-44929.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit 8305001)
Signed-off-by: Kent Yao <yao@apache.org>
@yaooqinn
Member Author

Thank you, @dongjoon-hyun, for the suggestions and revisions.

Merged to master/3.5/3.4

@yaooqinn yaooqinn deleted the SPARK-44929 branch August 24, 2023 05:53
viirya pushed a commit to viirya/spark-1 that referenced this pull request Oct 19, 2023
[SPARK-44929][TESTS] Standardize log output for console appender in tests


Closes apache#42627 from yaooqinn/SPARK-44929.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit 8305001)
Signed-off-by: Kent Yao <yao@apache.org>
