[SPARK-31295][DOC] Supplement version for configuration appear in doc#28064

Closed
beliefer wants to merge 1 commit into apache:master from beliefer:supplement-doc-for-data-sources

Conversation


@beliefer beliefer commented Mar 29, 2020

What changes were proposed in this pull request?

This PR supplements the version for configurations that appear in the docs.
I sorted out the information, shown below.

docs/spark-standalone.md

Item name | Since version | JIRA ID | Commit ID | Note
-- | -- | -- | -- | --
spark.deploy.retainedApplications | 0.8.0 | None | 46eecd1#diff-29dffdccd5a7f4c8b496c293e87c8668 |
spark.deploy.retainedDrivers | 1.1.0 | None | 7446f5f#diff-29dffdccd5a7f4c8b496c293e87c8668 |
spark.deploy.spreadOut | 0.6.1 | None | bb2b9ff#diff-0e7ae91819fc8f7b47b0f97be7116325 |
spark.deploy.defaultCores | 0.9.0 | None | d8bcc8e#diff-29dffdccd5a7f4c8b496c293e87c8668 |
spark.deploy.maxExecutorRetries | 1.6.3 | SPARK-16956 | ace458f#diff-29dffdccd5a7f4c8b496c293e87c8668 |
spark.worker.resource.{resourceName}.amount | 3.0.0 | SPARK-27371 | cbad616#diff-d25032e4a3ae1b85a59e4ca9ccf189a8 |
spark.worker.resource.{resourceName}.discoveryScript | 3.0.0 | SPARK-27371 | cbad616#diff-d25032e4a3ae1b85a59e4ca9ccf189a8 |
spark.worker.resourcesFile | 3.0.0 | SPARK-27369 | 7cbe01e#diff-b2fc8d6ab7ac5735085e2d6cfacb95da |
spark.shuffle.service.db.enabled | 3.0.0 | SPARK-26288 | 8b0aa59#diff-6bdad48cfc34314e89599655442ff210 |
spark.storage.cleanupFilesAfterExecutorExit | 2.4.0 | SPARK-24340 | 8ef167a#diff-916ca56b663f178f302c265b7ef38499 |
spark.deploy.recoveryMode | 0.8.1 | None | d66c01f#diff-29dffdccd5a7f4c8b496c293e87c8668 |
spark.deploy.recoveryDirectory | 0.8.1 | None | d66c01f#diff-29dffdccd5a7f4c8b496c293e87c8668 |
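For context, these standalone-mode properties are normally set in `conf/spark-defaults.conf` (or via `SPARK_MASTER_OPTS`). A minimal sketch; the values below are purely illustrative, not recommendations:

```properties
# conf/spark-defaults.conf -- illustrative values only
spark.deploy.retainedApplications   200
spark.deploy.spreadOut              true
spark.deploy.defaultCores           4
spark.deploy.maxExecutorRetries     10
spark.deploy.recoveryMode           FILESYSTEM
spark.deploy.recoveryDirectory      /var/spark/recovery
```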

docs/sql-data-sources-avro.md

Item name | Since version | JIRA ID | Commit ID | Note
-- | -- | -- | -- | --
spark.sql.legacy.replaceDatabricksSparkAvro.enabled | 2.4.0 | SPARK-25129 | ac0174e#diff-9a6b543db706f1a90f790783d6930a13 |
spark.sql.avro.compression.codec | 2.4.0 | SPARK-24881 | 0a0f68b#diff-9a6b543db706f1a90f790783d6930a13 |
spark.sql.avro.deflate.level | 2.4.0 | SPARK-24881 | 0a0f68b#diff-9a6b543db706f1a90f790783d6930a13 |
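As a sketch, the Avro options above could be set in `conf/spark-defaults.conf`; the values below are examples, not defaults:

```properties
# Illustrative values only
spark.sql.legacy.replaceDatabricksSparkAvro.enabled  true
spark.sql.avro.compression.codec                     deflate
spark.sql.avro.deflate.level                         5
```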

docs/sql-data-sources-orc.md

Item name | Since version | JIRA ID | Commit ID | Note
-- | -- | -- | -- | --
spark.sql.orc.impl | 2.3.0 | SPARK-20728 | 326f1d6#diff-9a6b543db706f1a90f790783d6930a13 |
spark.sql.orc.enableVectorizedReader | 2.3.0 | SPARK-16060 | 60f6b99#diff-9a6b543db706f1a90f790783d6930a13 |
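A corresponding `conf/spark-defaults.conf` sketch (illustrative values):

```properties
spark.sql.orc.impl                    native
spark.sql.orc.enableVectorizedReader  true
```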

docs/sql-data-sources-parquet.md

Item name | Since version | JIRA ID | Commit ID | Note
-- | -- | -- | -- | --
spark.sql.parquet.binaryAsString | 1.1.1 | SPARK-2927 | de501e1#diff-41ef65b9ef5b518f77e2a03559893f4d |
spark.sql.parquet.int96AsTimestamp | 1.3.0 | SPARK-4987 | 67d5220#diff-41ef65b9ef5b518f77e2a03559893f4d |
spark.sql.parquet.compression.codec | 1.1.1 | SPARK-3131 | 3a9d874#diff-41ef65b9ef5b518f77e2a03559893f4d |
spark.sql.parquet.filterPushdown | 1.2.0 | SPARK-4391 | 576688a#diff-41ef65b9ef5b518f77e2a03559893f4d |
spark.sql.hive.convertMetastoreParquet | 1.1.1 | SPARK-2406 | cc4015d#diff-ff50aea397a607b79df9bec6f2a841db |
spark.sql.parquet.mergeSchema | 1.5.0 | SPARK-8690 | 246265f#diff-41ef65b9ef5b518f77e2a03559893f4d |
spark.sql.parquet.writeLegacyFormat | 1.6.0 | SPARK-10400 | 01cd688#diff-41ef65b9ef5b518f77e2a03559893f4d |
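Similarly, a hedged `conf/spark-defaults.conf` sketch for the Parquet options; the values shown are illustrative:

```properties
spark.sql.parquet.binaryAsString     false
spark.sql.parquet.int96AsTimestamp   true
spark.sql.parquet.compression.codec  snappy
spark.sql.parquet.filterPushdown     true
spark.sql.parquet.writeLegacyFormat  false
```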

Why are the changes needed?

This supplements configuration version information in the docs.

Does this PR introduce any user-facing change?

'No'.

How was this patch tested?

Jenkins test

<td>
The maximum number of completed applications to display. Older applications will be dropped from the UI to maintain this limit.<br/>
</td>
<td>0.8.0</td>

No JIRA ID, commit ID: 46eecd1#diff-29dffdccd5a7f4c8b496c293e87c8668

<td>
The maximum number of completed drivers to display. Older drivers will be dropped from the UI to maintain this limit.<br/>
</td>
<td>1.1.0</td>

No JIRA ID, commit ID: 7446f5f#diff-29dffdccd5a7f4c8b496c293e87c8668

to consolidate them onto as few nodes as possible. Spreading out is usually better for
data locality in HDFS, but consolidating is more efficient for compute-intensive workloads. <br/>
</td>
<td>0.6.1</td>

No JIRA ID, commit ID: bb2b9ff#diff-0e7ae91819fc8f7b47b0f97be7116325

Set this lower on a shared cluster to prevent users from grabbing
the whole cluster by default. <br/>
</td>
<td>0.9.0</td>

No JIRA ID, commit ID: d8bcc8e#diff-29dffdccd5a7f4c8b496c293e87c8668

<code>-1</code>.
<br/>
</td>
<td>1.6.3</td>

SPARK-16956, commit ID: ace458f#diff-29dffdccd5a7f4c8b496c293e87c8668

<td>
Amount of a particular resource to use on the worker.
</td>
<td>3.0.0</td>

SPARK-27371, commit ID: cbad616#diff-d25032e4a3ae1b85a59e4ca9ccf189a8

Path to the resource discovery script, which is used to find a particular resource while the worker is starting up.
The output of the script should be formatted like the <code>ResourceInformation</code> class.
</td>
<td>3.0.0</td>

SPARK-27371, commit ID: cbad616#diff-d25032e4a3ae1b85a59e4ca9ccf189a8
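The discovery-script contract described in the quoted doc text can be sketched as a small shell script. This is a hypothetical example (the `nvidia-smi` invocation and the `gpu` resource name are assumptions, not part of this PR); Spark expects the script to print a single JSON object shaped like the `ResourceInformation` class:

```shell
#!/usr/bin/env bash
# Hypothetical discovery script for spark.worker.resource.gpu.discoveryScript.
# Prints one JSON object with the resource name and the addresses the worker
# may allocate, e.g. {"name": "gpu", "addresses": ["0", "1"]}.
ADDRESSES=$(nvidia-smi --query-gpu=index --format=csv,noheader 2>/dev/null \
  | sed 's/[0-9][0-9]*/"&"/' \
  | paste -s -d, -)
echo "{\"name\": \"gpu\", \"addresses\": [${ADDRESSES:-}]}"
```

If no GPUs (or no `nvidia-smi`) are present, the script still emits valid JSON with an empty address list.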

enabled). You should also enable <code>spark.worker.cleanup.enabled</code>, to ensure that the state
eventually gets cleaned up. This config may be removed in the future.
</td>
<td>3.0.0</td>

SPARK-26288, commit ID: 8b0aa59#diff-6bdad48cfc34314e89599655442ff210

all files/subdirectories of a stopped and timeout application.
This only affects Standalone mode; support for other cluster managers can be added in the future.
</td>
<td>2.4.0</td>

SPARK-24340, commit ID: 8ef167a#diff-916ca56b663f178f302c265b7ef38499

<tr>
<td><code>spark.deploy.recoveryMode</code></td>
<td>Set to FILESYSTEM to enable single-node recovery mode (default: NONE).</td>
<td>0.8.1</td>

No JIRA ID, commit ID: d66c01f#diff-29dffdccd5a7f4c8b496c293e87c8668

<tr>
<td><code>spark.deploy.recoveryDirectory</code></td>
<td>The directory in which Spark will store recovery state, accessible from the Master's perspective.</td>
<td>0.8.1</td>

No JIRA ID, commit ID: d66c01f#diff-29dffdccd5a7f4c8b496c293e87c8668

If it is set to true, the data source provider <code>com.databricks.spark.avro</code> is mapped
to the built-in but external Avro data source module for backward compatibility.
</td>
<td>2.4.0</td>

SPARK-25129, commit ID: ac0174e#diff-9a6b543db706f1a90f790783d6930a13

Compression codec used in writing of AVRO files. Supported codecs: uncompressed, deflate,
snappy, bzip2 and xz. Default codec is snappy.
</td>
<td>2.4.0</td>
@beliefer beliefer Mar 29, 2020


SPARK-24881, commit ID: 0a0f68b#diff-9a6b543db706f1a90f790783d6930a13

the range from 1 to 9 inclusive, or -1. The default value is -1, which corresponds to level 6
in the current implementation.
</td>
<td>2.4.0</td>

SPARK-24881, commit ID: 0a0f68b#diff-9a6b543db706f1a90f790783d6930a13

<code>native</code> means the native ORC support. <code>hive</code> means the ORC library
in Hive.
</td>
<td>2.3.0</td>

SPARK-20728, commit ID: 326f1d6#diff-9a6b543db706f1a90f790783d6930a13

a new non-vectorized ORC reader is used in <code>native</code> implementation.
For <code>hive</code> implementation, this is ignored.
</td>
<td>2.3.0</td>

SPARK-16060, commit ID: 60f6b99#diff-9a6b543db706f1a90f790783d6930a13

not differentiate between binary data and strings when writing out the Parquet schema. This
flag tells Spark SQL to interpret binary data as a string to provide compatibility with these systems.
</td>
<td>1.1.1</td>

SPARK-2927, commit ID: de501e1#diff-41ef65b9ef5b518f77e2a03559893f4d

Some Parquet-producing systems, in particular Impala and Hive, store Timestamp into INT96. This
flag tells Spark SQL to interpret INT96 data as a timestamp to provide compatibility with these systems.
</td>
<td>1.3.0</td>

SPARK-4987, commit ID: 67d5220#diff-41ef65b9ef5b518f77e2a03559893f4d

Note that <code>zstd</code> requires <code>ZStandardCodec</code> to be installed before Hadoop 2.9.0, <code>brotli</code> requires
<code>BrotliCodec</code> to be installed.
</td>
<td>1.1.1</td>

SPARK-3131, commit ID: 3a9d874#diff-41ef65b9ef5b518f77e2a03559893f4d

<td><code>spark.sql.parquet.filterPushdown</code></td>
<td>true</td>
<td>Enables Parquet filter push-down optimization when set to true.</td>
<td>1.2.0</td>

SPARK-4391, commit ID: 576688a#diff-41ef65b9ef5b518f77e2a03559893f4d

When set to false, Spark SQL will use the Hive SerDe for Parquet tables instead of the built-in
support.
</td>
<td>1.1.1</td>

SPARK-2406, commit ID: cc4015d#diff-ff50aea397a607b79df9bec6f2a841db

schema is picked from the summary file or a random data file if no summary file is available.
</p>
</td>
<td>1.5.0</td>

SPARK-8690, commit ID: 246265f#diff-41ef65b9ef5b518f77e2a03559893f4d

example, decimals will be written in int-based format. If Parquet output is intended for use
with systems that do not support this newer format, set to true.
</td>
<td>1.6.0</td>

SPARK-10400, commit ID: 01cd688#diff-41ef65b9ef5b518f77e2a03559893f4d


SparkQA commented Mar 29, 2020

Test build #120540 has finished for PR 28064 at commit 6787d16.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


@HyukjinKwon HyukjinKwon left a comment


Looks good. I will merge in a few days if there are no comments.

@HyukjinKwon

Merged to master.

@beliefer

@HyukjinKwon Thanks for all your help.

HyukjinKwon pushed a commit that referenced this pull request Apr 7, 2020

Closes #28064 from beliefer/supplement-doc-for-data-sources.

Authored-by: beliefer <beliefer@163.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@HyukjinKwon

Merged to branch-3.0 too.

sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020

Closes apache#28064 from beliefer/supplement-doc-for-data-sources.

Authored-by: beliefer <beliefer@163.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
@beliefer beliefer deleted the supplement-doc-for-data-sources branch April 23, 2024 07:04
