---
title: SQL Server Big Data Clusters post-deployment configuration overview
titleSuffix: SQL Server Big Data Clusters
description: Big data clusters post-deployment configuration overview
author: HugoMSFT
ms.author: hudequei
ms.reviewer: wiassaf
ms.date: 10/05/2021
ms.service: sql
ms.subservice: big-data-cluster
ms.topic: reference
---

# How to configure big data clusters settings post deployment

[!INCLUDESQL Server 2019]

[!INCLUDEbig-data-clusters-banner-retirement]

Cluster, service, and resource scoped settings for [!INCLUDEbig-data-clusters-nover] can be configured post-deployment through the `azdata` CLI. This functionality allows [!INCLUDEbig-data-clusters-nover] administrators to adjust configurations so that they continue to meet workload requirements. This article walks through example scenarios that configure the cluster timezone and Spark workload settings. The post-deployment configuration functionality follows a set, diff, apply flow.
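
At a glance, the flow uses the `azdata bdc settings` commands that appear throughout this article. The following is a minimal sketch; the Spark driver setting is only an illustrative value taken from the examples below, so substitute the settings you actually need.

```bash
# 1. Set: stage a configuration change (illustrative value from this article)
azdata bdc spark settings set --settings spark-defaults-conf.spark.driver.cores=2

# 2. Diff: review what is staged but not yet running
azdata bdc settings show --filter-option=pending --include-details --recursive

# 3. Apply: push the staged settings to the cluster and restart the affected services
azdata bdc settings apply
```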

> [!NOTE]
> Post-deployment settings configuration is only available in [!INCLUDEbig-data-clusters-nover] CU9 and later deployments. Settings configuration does not include scale, storage, or endpoint configuration. Options and instructions to configure [!INCLUDEbig-data-clusters-nover] prior to CU9 can be found here.

## Step by Step Scenario: Configure timezone on [!INCLUDEbig-data-clusters-nover]

Starting with [!INCLUDEbig-data-clusters-nover] CU13, you can customize the cluster timezone configuration so that service timestamps align with the selected timezone. The setting does not apply to the big data cluster control plane; it sets the new timezone configuration for all SQL Server pools (master, compute, and data), Hadoop components, and Spark.

> [!NOTE]
> By default, [!INCLUDEbig-data-clusters-nover] sets UTC as the timezone.

Use the following command to set the timezone configuration:

```bash
azdata bdc settings set --settings bdc.timezone=America/Los_Angeles
```
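
Before applying, you can optionally confirm that the timezone change is staged. This is a sketch that reuses the pending-settings options shown later in this article; output columns may vary by azdata version.

```bash
# Show cluster-scope settings that are staged (pending) but not yet applied
azdata bdc settings show --filter-option=pending --include-details
```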

### Apply the pending settings to the cluster

The following command applies the configuration and restarts all services. Review the last sections of this article to learn how to track changes and control the configuration process.

```bash
azdata bdc settings apply
```

## Step by Step Scenario: Configure the cluster to meet your Spark workload requirements

### View the current configurations of the big data cluster Spark service

The following example shows how to view the user-configured settings of the Spark service. Optional parameters let you view all possible configurable settings, system-managed settings, and pending settings. See the `azdata bdc spark settings` reference for more information.

```bash
azdata bdc spark settings show
```
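
As a sketch of the optional parameters mentioned above, the `--include-details` flag (used later in this article) adds columns such as Configurable, Configured, and Last Updated Time; the command help lists the remaining filter and display options.

```bash
# Show Spark service settings with additional detail columns
azdata bdc spark settings show --include-details

# List the available filter and display options
azdata bdc spark settings show --help
```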

**Sample output**

Spark Service

| Setting | Running Value |
|---|---|
| spark-defaults-conf.spark.driver.cores | 1 |
| spark-defaults-conf.spark.driver.memory | 1664m |

### Change the default number of cores and memory for the Spark driver

Update the default number of driver cores to 2 and the default driver memory to 7424 MB for the Spark service. Because this change is scoped to the Spark service, it affects all resources that run Spark.

```bash
azdata bdc spark settings set --settings spark-defaults-conf.spark.driver.cores=2,spark-defaults-conf.spark.driver.memory=7424m
```

### Change the default number of cores and memory for the Spark executors in the Storage Pool

Update the default number of executor cores to 4 for the Storage Pool.

```bash
azdata bdc spark settings set --settings spark-defaults-conf.spark.executor.cores=4 --resource=storage-0
```

### Configure additional paths to the default classpath of Spark applications

The `/opt/hadoop/share/hadoop/tools/lib/` path contains several libraries that your Spark applications can use, but it is not loaded into the classpath of Spark applications by default. To enable it, apply the following configuration pattern.

```bash
azdata bdc hdfs settings set --settings hadoop-env.HADOOP_CLASSPATH="/opt/hadoop/share/hadoop/tools/lib/*"
```
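
To verify the staged value before applying, here is a minimal sketch, assuming the hdfs settings group supports the same show options as the Spark settings group used elsewhere in this article.

```bash
# Show HDFS-scoped settings that are staged but not yet applied
azdata bdc hdfs settings show --filter-option=pending --include-details
```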

### View the pending settings changes staged in the big data cluster

The following commands view the pending settings changes for the Spark service only, and then across the entire big data cluster.

#### Pending Spark Service Settings

```bash
azdata bdc spark settings show --filter-option=pending --include-details
```

Spark Service

| Setting | Running Value | Configured Value | Configurable | Configured | Last Updated Time |
|---|---|---|---|---|---|
| spark-defaults-conf.spark.driver.cores | 1 | 2 | true | true | |
| spark-defaults-conf.spark.driver.memory | 1664m | 7424m | true | true | |

#### All Pending Settings

```bash
azdata bdc settings show --filter-option=pending --include-details --recursive
```

Spark Service Settings - Pending

| Setting | Running Value | Configured Value | Configurable | Configured | Last Updated Time |
|---|---|---|---|---|---|
| spark-defaults-conf.spark.driver.cores | 1 | 2 | true | true | |
| spark-defaults-conf.spark.driver.memory | 1664m | 7424m | true | true | |

Storage-0 Resource Spark Settings - Pending

| Setting | Running Value | Configured Value | Configurable | Configured | Last Updated Time |
|---|---|---|---|---|---|
| spark-defaults-conf.spark.executor.cores | 1 | 4 | true | true | |

### Apply the pending settings to the big data cluster

```bash
azdata bdc settings apply
```

### Monitor the configuration update status

```bash
azdata bdc status show
```
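
If you prefer to watch the update progress continuously, a minimal shell sketch is shown below; the 30-second interval is an arbitrary choice.

```bash
# Poll the big data cluster status until the configuration update completes.
# Press Ctrl+C to stop; adjust the sleep interval as needed.
while true; do
  azdata bdc status show
  sleep 30
done
```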

## Optional steps

### Revert pending configuration settings

If you decide that you no longer want to apply the pending configuration settings, you can unstage them. The following command reverts the pending settings at all scopes.

```bash
azdata bdc settings revert
```

### Abort the configuration upgrade

If the configuration upgrade fails for any of the components, you can cancel the upgrade process and return the cluster to its prior configuration. Settings that were staged for change during the upgrade are again listed as pending settings.

```bash
azdata bdc settings cancel-apply
```

## Next steps

- Configure a SQL Server Big Data Cluster