[SPARK-44929][TESTS] Standardize log output for console appender in tests #42627
Closed
yaooqinn wants to merge 1 commit into apache:master from yaooqinn:SPARK-44929
Conversation
yaooqinn (Member, Author):
cc @dongjoon-hyun @JoshRosen @HyukjinKwon, thanks
  appender.console.target = SYSTEM_ERR
  appender.console.layout.type = PatternLayout
- appender.console.layout.pattern = %t: %m%n%ex
+ appender.console.layout.pattern = %d{HH:mm:ss.SSS} %p %c: %maxLen{%m}{512}%n%ex{8}%n
dongjoon-hyun (Member):
Did we mention this log style change in the PR description clearly? It's more than a simple truncation.
yaooqinn (Member, Author):
Updated in the PR description, thanks @dongjoon-hyun
...ce-managers/kubernetes/integration-tests/src/test/resources/log-config-test-log4j.properties
dongjoon-hyun approved these changes on Aug 24, 2023.
dongjoon-hyun (Member):
I revised the PR title a little because this PR's contribution is bigger than a simple truncation. Thank you, @yaooqinn!
yaooqinn (Member, Author):
Thank you @dongjoon-hyun, the new title looks better.
yaooqinn added a commit that referenced this pull request on Aug 24, 2023:
### What changes were proposed in this pull request?
This PR sets a character length limit for error messages and a stack depth limit for stack traces in the console appender used in tests.
The original patterns are:
- `%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n%ex`
- `%t: %m%n%ex`
They are adjusted to the new consistent pattern:
- `%d{HH:mm:ss.SSS} %p %c: %maxLen{%m}{512}%n%ex{8}%n`
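As a sketch of where this pattern lands, a Log4j 2 test console appender using it would look like the following (property names other than the pattern itself are illustrative, not taken from this PR):

```properties
# Console appender for tests, writing to stderr (appender name is illustrative)
appender.console.type = Console
appender.console.name = console
appender.console.target = SYSTEM_ERR
appender.console.layout.type = PatternLayout
# %maxLen{%m}{512} truncates the log message to 512 characters;
# %ex{8} limits the printed stack trace to 8 lines
appender.console.layout.pattern = %d{HH:mm:ss.SSS} %p %c: %maxLen{%m}{512}%n%ex{8}%n
```

The `%maxLen` and `%ex{depth}` converters are what keep a single repeated failure from flooding the CI log with full messages and full stack traces.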
### Why are the changes needed?
In testing, intentional and unintentional failures generate extensive log volumes. For instance, a single FileNotFound error may be logged multiple times in the writer, the task runner, the task set manager, and other areas, resulting in thousands of lines per failure.
For example, tests in ParquetRebaseDatetimeSuite are run with the V1 and V2 datasources, two or more specific values, and multiple configuration pairs. I have seen the SparkUpgradeException all over the CI logs.
### Does this PR introduce _any_ user-facing change?
no
### How was this patch tested?
```shell
build/sbt "sql/testOnly *ParquetRebaseDatetimeV1Suite"
```
```
15:59:55.446 ERROR org.apache.spark.sql.execution.datasources.FileFormatWriter: Job job_202308230059551630377040190578321_1301 aborted.
15:59:55.446 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 1301.0 (TID 1595)
org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows to file:/Users/hzyaoqin/spark/target/tmp/spark-67cce58e-dfb2-4811-a9c0-50ec4c90d1f1.
at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:765)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:420)
at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
15:59:55.446 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 0.0 in stage 1301.0 (TID 1595) (10.221.97.38 executor driver): org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows to file:/Users/hzyaoqin/spark/target/tmp/spark-67cce58e-dfb2-4811-a9c0-50ec4c90d1f1.
at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:765)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:420)
at org.apache.spark.sql.execution.datasources....
15:59:55.446 ERROR org.apache.spark.scheduler.TaskSetManager: Task 0 in stage 1301.0 failed 1 times; aborting job
15:59:55.447 ERROR org.apache.spark.sql.execution.datasources.FileFormatWriter: Aborting job 0ead031e-c9dd-446b-b20b-c76ec54978b1.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1301.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1301.0 (TID 1595) (10.221.97.38 executor driver): org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows to file:/Users/hzyaoqin/spark/target/tmp/spark-67cce58e-dfb2-4811-a9c0-50ec4c90d1f1.
at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:765)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:420)
at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:890)
at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:890)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
15:59:55.579 ERROR org.apache.spark.executor.Executor: Exception in task 0.0 in stage 1303.0 (TID 1597)
```
### Was this patch authored or co-authored using generative AI tooling?
no
Closes #42627 from yaooqinn/SPARK-44929.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
(cherry picked from commit 8305001)
Signed-off-by: Kent Yao <yao@apache.org>
yaooqinn added a commit that referenced this pull request on Aug 24, 2023 (commit message identical to the above).
yaooqinn (Member, Author):
Thank you, @dongjoon-hyun, for the suggestions and revisions. Merged to master/3.5/3.4
viirya pushed a commit to viirya/spark-1 that referenced this pull request on Oct 19, 2023 (commit message identical to the above).