title | titleSuffix | description | author | ms.author | ms.reviewer | ms.date | ms.service | ms.subservice | ms.topic |
---|---|---|---|---|---|---|---|---|---|
SQL Server Big Data Clusters Configuration Pre CU9 | SQL Server Big Data Clusters | Big Data Clusters Configuration Pre CU9 | WilliamDAssafMSFT | wiassaf | rahul.ajmera | 02/11/2021 | sql | big-data-cluster | reference |
[!INCLUDESQL Server 2019]
[!INCLUDEbig-data-clusters-banner-retirement]
In SQL Server 2019 Big Data Clusters releases CU8 and earlier, you can configure big data cluster settings at deployment time through the deployment bdc.json file. The SQL Server master instance can be configured only post-deployment, using mssql-conf.
Note
Prior to the CU9 release and its support for configuration-enabled clusters, Big Data Clusters could be configured only at deployment time, with the exception of the SQL Server master instance, which could be configured only post-deployment, using mssql-conf. For instructions to configure a CU9 or later release, see Configure a SQL Server Big Data Cluster.
Big Data Clusters configuration pre-CU9 has two scoping levels: service and resource. The hierarchy of the settings follows this order as well, from highest to lowest. BDC components take the value of the setting defined at the lowest scope. If the setting is not defined at a given scope, it inherits the value from its higher parent scope.
For example, you may want to define the default number of cores the Spark driver will use in the storage pool and Sparkhead resources. You can do this in two ways:
- Specify a default cores value at the Spark service scope
- Specify a default cores value at the storage-0 and sparkhead resource scope
In the first scenario, all lower-scoped resources of the Spark service (storage pool and Sparkhead) will inherit the default number of cores from the Spark service default value.
In the second scenario, each resource will use the value defined at its respective scope.
If the default number of cores is configured at both the service and resource scopes, the resource-scoped value overrides the service-scoped value, because the resource scope is the lowest user-configured scope for the setting.
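The override rule can be sketched as a small shell function. This is only an illustration of the inheritance logic described above, not part of any Big Data Clusters tooling: `resolve_setting` is a hypothetical helper, and the core counts 2 and 4 are made-up values.

```shell
#!/usr/bin/env bash
# Sketch of pre-CU9 scope resolution (hypothetical helper, illustrative values).
# The resource-scoped value wins; otherwise the service-scoped value is inherited.
resolve_setting() {
  local service_value="$1"    # value defined at service scope (may be empty)
  local resource_value="$2"   # value defined at resource scope (may be empty)
  if [ -n "$resource_value" ]; then
    printf '%s\n' "$resource_value"
  else
    printf '%s\n' "$service_value"
  fi
}

# Only the Spark service defines driver cores: the resource inherits it.
resolve_setting "2" ""     # prints 2
# Both scopes define a value: the resource scope overrides the service scope.
resolve_setting "2" "4"    # prints 4
```

An empty string stands in for "not defined at this scope", which keeps the sketch free of any assumption about how the deployment files store the values.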
For specific information about configuration, see the appropriate articles:
Configure master instance of SQL Server 2019 Big Data Clusters
Server configuration settings cannot be configured for the SQL Server master instance at deployment time. This article describes a temporary workaround for configuring settings such as the SQL Server edition, enabling or disabling SQL Server Agent, enabling specific trace flags, or enabling or disabling customer feedback.
To change any of these settings, follow these steps:
- Create a custom mssql-custom.conf file that includes the targeted settings. The following example enables SQL Agent and telemetry, sets a PID for Enterprise Edition, and enables trace flag 1204:

      [sqlagent]
      enabled=true

      [telemetry]
      customerfeedback=true
      userRequestedLocalAuditDirectory = /tmp/audit

      [DEFAULT]
      pid = Enterprise

      [traceflag]
      traceflag0 = 1204

- Copy mssql-custom.conf file to /var/opt/mssql in the mssql-server container in the master-0 pod. Replace <namespaceName> with the big data cluster name.

      kubectl cp mssql-custom.conf master-0:/var/opt/mssql/mssql-custom.conf -c mssql-server -n <namespaceName>

- Restart SQL Server instance. Replace <namespaceName> with the big data cluster name.

      kubectl exec -it master-0 -c mssql-server -n <namespaceName> -- /bin/bash
      supervisorctl restart mssql-server
      exit

Important
If the SQL Server master instance is in an availability group configuration, copy the mssql-custom.conf file to all the master pods. Each restart causes a failover, so schedule this activity during a downtime period.
- The steps above require Kubernetes cluster admin permissions.
- You cannot change the server collation for the SQL Server master instance of the big data cluster after deployment.
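For the availability group case, the copy step has to be repeated for each master pod. A minimal sketch, assuming you pass the pod names yourself: `copy_conf_to_masters` is a hypothetical helper, and names such as master-1 and master-2 are assumptions that should be checked against your own cluster.

```shell
#!/usr/bin/env bash
# Sketch: copy mssql-custom.conf to every master pod named on the command line.
# copy_conf_to_masters is a hypothetical helper. Pod names other than master-0
# are assumptions; list the real ones with:
#   kubectl get pods -n <namespaceName>
copy_conf_to_masters() {
  local namespace="$1"
  shift
  local pod
  for pod in "$@"; do
    # Same kubectl cp invocation as in the steps above, once per pod.
    kubectl cp mssql-custom.conf \
      "${pod}:/var/opt/mssql/mssql-custom.conf" \
      -c mssql-server -n "$namespace" || return 1
  done
}

# Example call (not executed here):
#   copy_conf_to_masters <namespaceName> master-0 master-1 master-2
```

The function stops at the first failed copy, so a partially updated set of pods is easier to notice. Each pod still needs its SQL Server instance restarted afterward, as in the restart step above.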
To configure Apache Spark and Apache Hadoop in Big Data Clusters, you need to modify the cluster profile at deployment time.
A Big Data Cluster has four configuration categories:
- sql
- hdfs
- spark
- gateway
sql, hdfs, and spark are services. Each service maps to the configuration category of the same name. All gateway configurations go to category gateway.
For example, all configurations in service hdfs belong to category hdfs. Note that all Hadoop (core-site), HDFS and Zookeeper configurations belong to category hdfs; all Livy, Spark, Yarn, Hive, Metastore configurations belong to category spark.
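The component-to-category mapping above can be written down as a lookup. This is a sketch for quick reference only: `category_for` is a hypothetical helper, and the lowercase component labels are illustrative names, not keys from the deployment files.

```shell
#!/usr/bin/env bash
# Sketch: map a component to its configuration category, following the text above.
# category_for and the lowercase component labels are hypothetical.
category_for() {
  case "$1" in
    core-site|hdfs|zookeeper)       echo hdfs ;;
    livy|spark|yarn|hive|metastore) echo spark ;;
    sql)                            echo sql ;;
    gateway)                        echo gateway ;;
    *) echo "unknown component: $1" >&2; return 1 ;;
  esac
}

category_for yarn        # prints spark
category_for zookeeper   # prints hdfs
```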
Supported configurations lists Apache Spark & Hadoop properties that you can configure when you deploy a SQL Server Big Data Cluster.
The following sections list properties that you can't modify in a cluster: