[SPARK-31295][DOC] Supplement version for configurations appearing in docs #28064
beliefer wants to merge 1 commit into apache:master from beliefer/supplement-doc-for-data-sources
Conversation
| <td> | ||
| The maximum number of completed applications to display. Older applications will be dropped from the UI to maintain this limit.<br/> | ||
| </td> | ||
| <td>0.8.0</td> |
No JIRA ID, commit ID: 46eecd1#diff-29dffdccd5a7f4c8b496c293e87c8668
| <td> | ||
| The maximum number of completed drivers to display. Older drivers will be dropped from the UI to maintain this limit.<br/> | ||
| </td> | ||
| <td>1.1.0</td> |
No JIRA ID, commit ID: 7446f5f#diff-29dffdccd5a7f4c8b496c293e87c8668
| to consolidate them onto as few nodes as possible. Spreading out is usually better for | ||
| data locality in HDFS, but consolidating is more efficient for compute-intensive workloads. <br/> | ||
| </td> | ||
| <td>0.6.1</td> |
No JIRA ID, commit ID: bb2b9ff#diff-0e7ae91819fc8f7b47b0f97be7116325
| Set this lower on a shared cluster to prevent users from grabbing | ||
| the whole cluster by default. <br/> | ||
| </td> | ||
| <td>0.9.0</td> |
No JIRA ID, commit ID: d8bcc8e#diff-29dffdccd5a7f4c8b496c293e87c8668
| <code>-1</code>. | ||
| <br/> | ||
| </td> | ||
| <td>1.6.3</td> |
SPARK-16956, commit ID: ace458f#diff-29dffdccd5a7f4c8b496c293e87c8668
| <td> | ||
| Amount of a particular resource to use on the worker. | ||
| </td> | ||
| <td>3.0.0</td> |
SPARK-27371, commit ID: cbad616#diff-d25032e4a3ae1b85a59e4ca9ccf189a8
| Path to resource discovery script, which is used to find a particular resource while the worker is starting up. | ||
| The output of the script should be formatted like the <code>ResourceInformation</code> class. | ||
| </td> | ||
| <td>3.0.0</td> |
SPARK-27371, commit ID: cbad616#diff-d25032e4a3ae1b85a59e4ca9ccf189a8
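To make the expected output format concrete, here is a minimal sketch of a discovery script. The resource name, the device addresses, and the hardware-probing step are all illustrative assumptions; the only documented requirement is that the script prints a single JSON object shaped like the <code>ResourceInformation</code> class (a resource <code>name</code> plus a list of <code>addresses</code>):

```python
#!/usr/bin/env python3
# Hypothetical discovery script for spark.worker.resource.gpu.discoveryScript.
# It must print one JSON object matching the ResourceInformation class:
# the resource "name" and the "addresses" the worker can hand out.
import json

def discover_gpus():
    # Assumption: a real script would probe the hardware here
    # (e.g. parse accelerator tooling output); we hardcode two devices.
    return {"name": "gpu", "addresses": ["0", "1"]}

if __name__ == "__main__":
    print(json.dumps(discover_gpus()))
```

The worker runs the script once at startup and parses its stdout, so anything other than the single JSON object should go to stderr.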
| enabled). You should also enable <code>spark.worker.cleanup.enabled</code>, to ensure that the state | ||
| eventually gets cleaned up. This config may be removed in the future. | ||
| </td> | ||
| <td>3.0.0</td> |
SPARK-26288, commit ID: 8b0aa59#diff-6bdad48cfc34314e89599655442ff210
| all files/subdirectories of a stopped and timed-out application. | ||
| This only affects Standalone mode; support for other cluster managers can be added in the future. | ||
| </td> | ||
| <td>2.4.0</td> |
SPARK-24340, commit ID: 8ef167a#diff-916ca56b663f178f302c265b7ef38499
| <tr> | ||
| <td><code>spark.deploy.recoveryMode</code></td> | ||
| <td>Set to FILESYSTEM to enable single-node recovery mode (default: NONE).</td> | ||
| <td>0.8.1</td> |
No JIRA ID, commit ID: d66c01f#diff-29dffdccd5a7f4c8b496c293e87c8668
| <tr> | ||
| <td><code>spark.deploy.recoveryDirectory</code></td> | ||
| <td>The directory in which Spark will store recovery state, accessible from the Master's perspective.</td> | ||
| <td>0.8.1</td> |
No JIRA ID, commit ID: d66c01f#diff-29dffdccd5a7f4c8b496c293e87c8668
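The two recovery settings above are typically set together on the standalone Master. A minimal sketch, assuming they are passed through <code>SPARK_DAEMON_JAVA_OPTS</code> in <code>conf/spark-env.sh</code> (the directory path is a placeholder):

```shell
# conf/spark-env.sh (sketch; /var/spark/recovery is a placeholder path)
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM -Dspark.deploy.recoveryDirectory=/var/spark/recovery"
```

With FILESYSTEM recovery, the Master persists registered apps and workers to that directory and reloads them on restart.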
| If it is set to true, the data source provider <code>com.databricks.spark.avro</code> is mapped | ||
| to the built-in but external Avro data source module for backward compatibility. | ||
| </td> | ||
| <td>2.4.0</td> |
SPARK-25129, commit ID: ac0174e#diff-9a6b543db706f1a90f790783d6930a13
| Compression codec used when writing AVRO files. Supported codecs: uncompressed, deflate, | ||
| snappy, bzip2 and xz. The default codec is snappy. | ||
| </td> | ||
| <td>2.4.0</td> |
SPARK-24881, commit ID: 0a0f68b#diff-9a6b543db706f1a90f790783d6930a13
| the range from 1 to 9 inclusive, or -1. The default value is -1, which corresponds to level 6 | ||
| in the current implementation. | ||
| </td> | ||
| <td>2.4.0</td> |
SPARK-24881, commit ID: 0a0f68b#diff-9a6b543db706f1a90f790783d6930a13
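Both Avro options above are session configurations; as a minimal sketch, they could be toggled at runtime with Spark SQL SET commands (the level shown is illustrative, not a recommendation):

```sql
-- Sketch: pick the deflate codec and an explicit compression level.
SET spark.sql.avro.compression.codec=deflate;
SET spark.sql.avro.deflate.level=5;
```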
| <code>native</code> means the native ORC support. <code>hive</code> means the ORC library | ||
| in Hive. | ||
| </td> | ||
| <td>2.3.0</td> |
SPARK-20728, commit ID: 326f1d6#diff-9a6b543db706f1a90f790783d6930a13
| a new non-vectorized ORC reader is used in <code>native</code> implementation. | ||
| For <code>hive</code> implementation, this is ignored. | ||
| </td> | ||
| <td>2.3.0</td> |
SPARK-16060, commit ID: 60f6b99#diff-9a6b543db706f1a90f790783d6930a13
| not differentiate between binary data and strings when writing out the Parquet schema. This | ||
| flag tells Spark SQL to interpret binary data as a string to provide compatibility with these systems. | ||
| </td> | ||
| <td>1.1.1</td> |
SPARK-2927, commit ID: de501e1#diff-41ef65b9ef5b518f77e2a03559893f4d
| Some Parquet-producing systems, in particular Impala and Hive, store Timestamp into INT96. This | ||
| flag tells Spark SQL to interpret INT96 data as a timestamp to provide compatibility with these systems. | ||
| </td> | ||
| <td>1.3.0</td> |
SPARK-4987, commit ID: 67d5220#diff-41ef65b9ef5b518f77e2a03559893f4d
| Note that <code>zstd</code> requires <code>ZStandardCodec</code> to be installed before Hadoop 2.9.0, <code>brotli</code> requires | ||
| <code>BrotliCodec</code> to be installed. | ||
| </td> | ||
| <td>1.1.1</td> |
SPARK-3131, commit ID: 3a9d874#diff-41ef65b9ef5b518f77e2a03559893f4d
| <td><code>spark.sql.parquet.filterPushdown</code></td> | ||
| <td>true</td> | ||
| <td>Enables Parquet filter push-down optimization when set to true.</td> | ||
| <td>1.2.0</td> |
SPARK-4391, commit ID: 576688a#diff-41ef65b9ef5b518f77e2a03559893f4d
| When set to false, Spark SQL will use the Hive SerDe for parquet tables instead of the built-in | ||
| support. | ||
| </td> | ||
| <td>1.1.1</td> |
SPARK-2406, commit ID: cc4015d#diff-ff50aea397a607b79df9bec6f2a841db
| schema is picked from the summary file or a random data file if no summary file is available. | ||
| </p> | ||
| </td> | ||
| <td>1.5.0</td> |
SPARK-8690, commit ID: 246265f#diff-41ef65b9ef5b518f77e2a03559893f4d
| example, decimals will be written in int-based format. If Parquet output is intended for use | ||
| with systems that do not support this newer format, set to true. | ||
| </td> | ||
| <td>1.6.0</td> |
SPARK-10400, commit ID: 01cd688#diff-41ef65b9ef5b518f77e2a03559893f4d
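The Parquet options covered above are likewise session configurations; a minimal sketch of setting a few of them at runtime (the values shown simply restate the documented defaults):

```sql
-- Sketch: toggling the Parquet options documented above in a Spark SQL session.
SET spark.sql.parquet.filterPushdown=true;
SET spark.sql.parquet.mergeSchema=false;
SET spark.sql.parquet.writeLegacyFormat=false;
```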
Test build #120540 has finished for PR 28064 at commit
HyukjinKwon left a comment

Looks good. I will merge in a few days if there are no comments.
Merged to master.

@HyukjinKwon Thanks for all your help.
### What changes were proposed in this pull request?
This PR supplements the version for configurations appearing in the docs. I sorted out the information shown below.

**docs/spark-standalone.md**

Item name | Since version | JIRA ID | Commit ID | Note
-- | -- | -- | -- | --
spark.deploy.retainedApplications | 0.8.0 | None | 46eecd1#diff-29dffdccd5a7f4c8b496c293e87c8668 |
spark.deploy.retainedDrivers | 1.1.0 | None | 7446f5f#diff-29dffdccd5a7f4c8b496c293e87c8668 |
spark.deploy.spreadOut | 0.6.1 | None | bb2b9ff#diff-0e7ae91819fc8f7b47b0f97be7116325 |
spark.deploy.defaultCores | 0.9.0 | None | d8bcc8e#diff-29dffdccd5a7f4c8b496c293e87c8668 |
spark.deploy.maxExecutorRetries | 1.6.3 | SPARK-16956 | ace458f#diff-29dffdccd5a7f4c8b496c293e87c8668 |
spark.worker.resource.{resourceName}.amount | 3.0.0 | SPARK-27371 | cbad616#diff-d25032e4a3ae1b85a59e4ca9ccf189a8 |
spark.worker.resource.{resourceName}.discoveryScript | 3.0.0 | SPARK-27371 | cbad616#diff-d25032e4a3ae1b85a59e4ca9ccf189a8 |
spark.worker.resourcesFile | 3.0.0 | SPARK-27369 | 7cbe01e#diff-b2fc8d6ab7ac5735085e2d6cfacb95da |
spark.shuffle.service.db.enabled | 3.0.0 | SPARK-26288 | 8b0aa59#diff-6bdad48cfc34314e89599655442ff210 |
spark.storage.cleanupFilesAfterExecutorExit | 2.4.0 | SPARK-24340 | 8ef167a#diff-916ca56b663f178f302c265b7ef38499 |
spark.deploy.recoveryMode | 0.8.1 | None | d66c01f#diff-29dffdccd5a7f4c8b496c293e87c8668 |
spark.deploy.recoveryDirectory | 0.8.1 | None | d66c01f#diff-29dffdccd5a7f4c8b496c293e87c8668 |

**docs/sql-data-sources-avro.md**

Item name | Since version | JIRA ID | Commit ID | Note
-- | -- | -- | -- | --
spark.sql.legacy.replaceDatabricksSparkAvro.enabled | 2.4.0 | SPARK-25129 | ac0174e#diff-9a6b543db706f1a90f790783d6930a13 |
spark.sql.avro.compression.codec | 2.4.0 | SPARK-24881 | 0a0f68b#diff-9a6b543db706f1a90f790783d6930a13 |
spark.sql.avro.deflate.level | 2.4.0 | SPARK-24881 | 0a0f68b#diff-9a6b543db706f1a90f790783d6930a13 |

**docs/sql-data-sources-orc.md**

Item name | Since version | JIRA ID | Commit ID | Note
-- | -- | -- | -- | --
spark.sql.orc.impl | 2.3.0 | SPARK-20728 | 326f1d6#diff-9a6b543db706f1a90f790783d6930a13 |
spark.sql.orc.enableVectorizedReader | 2.3.0 | SPARK-16060 | 60f6b99#diff-9a6b543db706f1a90f790783d6930a13 |

**docs/sql-data-sources-parquet.md**

Item name | Since version | JIRA ID | Commit ID | Note
-- | -- | -- | -- | --
spark.sql.parquet.binaryAsString | 1.1.1 | SPARK-2927 | de501e1#diff-41ef65b9ef5b518f77e2a03559893f4d |
spark.sql.parquet.int96AsTimestamp | 1.3.0 | SPARK-4987 | 67d5220#diff-41ef65b9ef5b518f77e2a03559893f4d |
spark.sql.parquet.compression.codec | 1.1.1 | SPARK-3131 | 3a9d874#diff-41ef65b9ef5b518f77e2a03559893f4d |
spark.sql.parquet.filterPushdown | 1.2.0 | SPARK-4391 | 576688a#diff-41ef65b9ef5b518f77e2a03559893f4d |
spark.sql.hive.convertMetastoreParquet | 1.1.1 | SPARK-2406 | cc4015d#diff-ff50aea397a607b79df9bec6f2a841db |
spark.sql.parquet.mergeSchema | 1.5.0 | SPARK-8690 | 246265f#diff-41ef65b9ef5b518f77e2a03559893f4d |
spark.sql.parquet.writeLegacyFormat | 1.6.0 | SPARK-10400 | 01cd688#diff-41ef65b9ef5b518f77e2a03559893f4d |

### Why are the changes needed?
Supplemental configuration version information.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Jenkins test

Closes #28064 from beliefer/supplement-doc-for-data-sources.

Authored-by: beliefer <beliefer@163.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
Merged to branch-3.0 too.