---
title: SQL Server Big Data Clusters post-deployment configuration overview
titleSuffix: SQL Server Big Data Clusters
description: Big data clusters post-deployment configuration overview
author: HugoMSFT
ms.author: hudequei
ms.reviewer: wiassaf
ms.date: 10/05/2021
ms.service: sql
ms.subservice: big-data-cluster
ms.topic: reference
---

# How to configure big data clusters settings post deployment

[!INCLUDESQL Server 2019]

[!INCLUDEbig-data-clusters-banner-retirement]

Cluster, service, and resource scoped settings for [!INCLUDEbig-data-clusters-nover] can be configured post-deployment through the `azdata` CLI. This functionality allows [!INCLUDEbig-data-clusters-nover] administrators to adjust configurations so that they continue to meet workload requirements. This article walks through example scenarios that configure the cluster timezone and Spark workload settings. The post-deployment configuration functionality follows a set, diff, apply flow.
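
At a glance, the flow uses the `azdata bdc settings` commands that appear throughout this article. The following is a minimal sketch; the Spark driver setting is only an illustrative value taken from the examples below, so substitute the settings you actually need.

```bash
# 1. Set: stage a configuration change (illustrative value from this article)
azdata bdc spark settings set --settings spark-defaults-conf.spark.driver.cores=2

# 2. Diff: review what is staged but not yet running
azdata bdc settings show --filter-option=pending --include-details --recursive

# 3. Apply: push the staged settings to the cluster and restart the affected services
azdata bdc settings apply
```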

> [!NOTE]
> Post-deployment settings configuration is only available in [!INCLUDEbig-data-clusters-nover] CU9 and later deployments. Settings configuration does not include scale, storage, or endpoint configuration. Options and instructions to configure [!INCLUDEbig-data-clusters-nover] prior to CU9 can be found here.

## Step by Step Scenario: Configure timezone on [!INCLUDEbig-data-clusters-nover]

Starting with [!INCLUDEbig-data-clusters-nover] CU13, you can customize the cluster timezone configuration so that service timestamps align with the selected timezone. The setting does not apply to the big data cluster control plane; it sets the new timezone configuration for all SQL Server pools (master, compute, and data), Hadoop components, and Spark.

> [!NOTE]
> By default, [!INCLUDEbig-data-clusters-nover] sets UTC as the timezone.

Use the following command to set the timezone configuration:

```bash
azdata bdc settings set --settings bdc.timezone=America/Los_Angeles
```
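
Before applying, you can optionally confirm that the timezone change is staged. This is a sketch that reuses the pending-settings options shown later in this article; output columns may vary by azdata version.

```bash
# Show cluster-scope settings that are staged (pending) but not yet applied
azdata bdc settings show --filter-option=pending --include-details
```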

### Apply the pending settings to the cluster

The following command applies the configuration and restarts all services. Review the last sections of this article to learn how to track changes and control the configuration process.

```bash
azdata bdc settings apply
```

## Step by Step Scenario: Configure the cluster to meet your Spark workload requirements

### View the current configurations of the big data cluster Spark service

The following example shows how to view the user-configured settings of the Spark service. Optional parameters let you view all possible configurable settings, system-managed settings, and pending settings. See the `azdata bdc spark settings` reference for more information.

```bash
azdata bdc spark settings show
```
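
As a sketch of the optional parameters mentioned above, the `--include-details` flag (used later in this article) adds columns such as Configurable, Configured, and Last Updated Time; the command help lists the remaining filter and display options.

```bash
# Show Spark service settings with additional detail columns
azdata bdc spark settings show --include-details

# List the available filter and display options
azdata bdc spark settings show --help
```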

**Sample output**

Spark Service

| Setting | Running Value |
|---|---|
| spark-defaults-conf.spark.driver.cores | 1 |
| spark-defaults-conf.spark.driver.memory | 1664m |

### Change the default number of cores and memory for the Spark driver

Update the default number of driver cores to 2 and the default driver memory to 7424 MB for the Spark service. Because this change is scoped to the Spark service, it affects all resources that run Spark.

```bash
azdata bdc spark settings set --settings spark-defaults-conf.spark.driver.cores=2,spark-defaults-conf.spark.driver.memory=7424m
```

### Change the default number of cores and memory for the Spark executors in the Storage Pool

Update the default number of executor cores to 4 for the Storage Pool.

```bash
azdata bdc spark settings set --settings spark-defaults-conf.spark.executor.cores=4 --resource=storage-0
```

### Configure additional paths to the default classpath of Spark applications

The `/opt/hadoop/share/hadoop/tools/lib/` path contains several libraries that your Spark applications can use, but it is not loaded into the classpath of Spark applications by default. To enable it, apply the following configuration pattern.

```bash
azdata bdc hdfs settings set --settings hadoop-env.HADOOP_CLASSPATH="/opt/hadoop/share/hadoop/tools/lib/*"
```
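
To verify the staged value before applying, here is a minimal sketch, assuming the hdfs settings group supports the same show options as the Spark settings group used elsewhere in this article.

```bash
# Show HDFS-scoped settings that are staged but not yet applied
azdata bdc hdfs settings show --filter-option=pending --include-details
```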

### View the pending settings changes staged in the big data cluster

The following commands view the pending settings changes for the Spark service only, and then across the entire big data cluster.

#### Pending Spark Service Settings

```bash
azdata bdc spark settings show --filter-option=pending --include-details
```

Spark Service

| Setting | Running Value | Configured Value | Configurable | Configured | Last Updated Time |
|---|---|---|---|---|---|
| spark-defaults-conf.spark.driver.cores | 1 | 2 | true | true | |
| spark-defaults-conf.spark.driver.memory | 1664m | 7424m | true | true | |

#### All Pending Settings

```bash
azdata bdc settings show --filter-option=pending --include-details --recursive
```

Spark Service Settings - Pending

| Setting | Running Value | Configured Value | Configurable | Configured | Last Updated Time |
|---|---|---|---|---|---|
| spark-defaults-conf.spark.driver.cores | 1 | 2 | true | true | |
| spark-defaults-conf.spark.driver.memory | 1664m | 7424m | true | true | |

Storage-0 Resource Spark Settings - Pending

| Setting | Running Value | Configured Value | Configurable | Configured | Last Updated Time |
|---|---|---|---|---|---|
| spark-defaults-conf.spark.executor.cores | 1 | 4 | true | true | |

### Apply the pending settings to the big data cluster

```bash
azdata bdc settings apply
```

### Monitor the configuration update status

```bash
azdata bdc status show
```
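
If you prefer to watch the update progress continuously, a minimal shell sketch is shown below; the 30-second interval is an arbitrary choice.

```bash
# Poll the big data cluster status until the configuration update completes.
# Press Ctrl+C to stop; adjust the sleep interval as needed.
while true; do
  azdata bdc status show
  sleep 30
done
```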

## Optional steps

### Revert pending configuration settings

If you decide that you no longer want to apply the pending configuration settings, you can unstage them. The following command reverts the pending settings at all scopes.

```bash
azdata bdc settings revert
```

### Abort the configuration upgrade

If the configuration upgrade fails for any of the components, you can cancel the upgrade process and return the cluster to its prior configuration. Settings that were staged for change during the upgrade are again listed as pending settings.

```bash
azdata bdc settings cancel-apply
```

## Next steps

- Configure a SQL Server Big Data Cluster