title | titleSuffix | description | author | ms.author | ms.reviewer | ms.date | ms.service | ms.subservice | ms.topic |
---|---|---|---|---|---|---|---|---|---|
Machine Learning Services (Python, R) | SQL Server Big Data Clusters | Learn how you can run Python and R scripts on the master instance of SQL Server 2019 Big Data Clusters with Machine Learning Services. | WilliamDAssafMSFT | wiassaf | wiassaf | 09/09/2022 | sql | machine-learning-bdc | conceptual |
[!INCLUDESQL Server 2019]
[!INCLUDEbig-data-clusters-banner-retirement]
You can run Python and R scripts on the master instance of SQL Server Big Data Clusters with Machine Learning Services.
[!NOTE]
You can also run Java code on the master instance of SQL Server Big Data Clusters with the Java Language Extension. Following the steps below will also enable SQL Server Language Extensions.
Machine Learning Services is installed by default on SQL Server 2019 Big Data Clusters and does not require separate installation.
To enable Machine Learning Services, run this statement on the master instance:
EXEC sp_configure 'external scripts enabled', 1
RECONFIGURE WITH OVERRIDE
GO
You are now ready to run Python and R scripts on the master instance of Big Data Clusters. See the quickstarts under Next steps to run your first script.
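One way to confirm that the setting took effect is to call sp_configure with only the option name, which returns the configured value and the running value for that option. The following is a minimal sketch, not an official procedure: the helper name check_external_scripts is hypothetical, and it assumes sqlcmd is installed and that you connect with SQL authentication.

```shell
#!/usr/bin/env bash
# Sketch: report the current state of 'external scripts enabled' on one instance.
# check_external_scripts is a hypothetical helper name, not part of any product.
# Assumes sqlcmd is on the PATH and SQL authentication is used.

# Usage: check_external_scripts <server[,port]> <user name> <password>
check_external_scripts() {
  local server="$1" user="$2" password="$3"
  # With only the option name, sp_configure returns one row that includes
  # the config_value and run_value columns for that option.
  sqlcmd -S "$server" -U "$user" -P "$password" \
    -Q "EXEC sp_configure 'external scripts enabled';"
}
```

A run_value of 1 in the output indicates that script execution is enabled on that instance.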
[!NOTE]
The configuration setting cannot be set on an availability group listener connection. If Big Data Clusters is deployed with high availability, set the external scripts enabled option on each replica. See Enable on cluster with high availability.
When you deploy SQL Server Big Data Clusters with high availability, the deployment creates an availability group for the master instance. To enable Machine Learning Services, set the external scripts enabled option on each instance of the availability group. For a Big Data Cluster, you need to run sp_configure on each replica of the SQL Server master instance.
The following section describes how to enable external scripts on each instance.
For each replica on the availability group, create a load balancer to allow you to connect to the instance.
kubectl expose pod <pod-name> --port=<connection port number> --name=<load-balancer-name> --type=LoadBalancer -n <kubernetes namespace>
The examples in this article use the following values:
- <pod-name>: master-#
- <connection port number>: 1533
- <load-balancer-name>: mymaster-#
- <kubernetes namespace>: mssql-cluster
Update the following script for your environment, and run the commands:
kubectl expose pod master-0 --port=1533 --name=mymaster-0 --type=LoadBalancer -n mssql-cluster
kubectl expose pod master-1 --port=1533 --name=mymaster-1 --type=LoadBalancer -n mssql-cluster
kubectl expose pod master-2 --port=1533 --name=mymaster-2 --type=LoadBalancer -n mssql-cluster
kubectl returns the following output.
service/mymaster-0 exposed
service/mymaster-1 exposed
service/mymaster-2 exposed
Each load balancer is a master replica endpoint.
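If the availability group has many replicas, the per-pod commands can be generated in a loop. The following is a sketch only: the function name expose_master_replicas is hypothetical, and the defaults (pods named master-#, services named mymaster-#, port 1533, namespace mssql-cluster) simply mirror the example values used in this article.

```shell
#!/usr/bin/env bash
# Sketch: expose each master replica pod through its own load balancer.
# expose_master_replicas is a hypothetical helper; the defaults mirror the
# example values in this article and should be adjusted for your cluster.

# Usage: expose_master_replicas <replica count> [namespace] [port]
expose_master_replicas() {
  local count="$1" namespace="${2:-mssql-cluster}" port="${3:-1533}"
  local i=0
  while [ "$i" -lt "$count" ]; do
    # Same command as the manual step, with the replica index substituted.
    kubectl expose pod "master-$i" \
      --port="$port" \
      --name="mymaster-$i" \
      --type=LoadBalancer \
      -n "$namespace" || return 1
    i=$((i + 1))
  done
}
```

For the three-replica scenario in this article, the call would be `expose_master_replicas 3`.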
Get the IP address for the master replica endpoint.
The following command returns the external IP address for the replica endpoint.
kubectl get services <load-balancer-name> -n <kubernetes namespace>
To get the external IP address for each replica in this scenario, run the following commands:
kubectl get services mymaster-0 -n mssql-cluster
kubectl get services mymaster-1 -n mssql-cluster
kubectl get services mymaster-2 -n mssql-cluster
[!NOTE]
It may take a little time before the external IP address is available. Run the preceding commands periodically until each endpoint returns an external IP address.
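Rather than rerunning the commands by hand, you could poll the service until an address appears. The following is a sketch under stated assumptions: wait_for_external_ip is a hypothetical helper, the attempt count and delay are arbitrary defaults, and it assumes your Kubernetes provider reports an IP address (some providers report a host name instead) in the service status.

```shell
#!/usr/bin/env bash
# Sketch: poll a load-balancer service until Kubernetes reports an external IP.
# wait_for_external_ip is a hypothetical helper. It assumes the provider
# publishes an IP address (not a host name) in the service status.

# Usage: wait_for_external_ip <service> [namespace] [max attempts] [delay seconds]
wait_for_external_ip() {
  local service="$1" namespace="${2:-mssql-cluster}"
  local attempts="${3:-30}" delay="${4:-10}"
  local ip="" n=0
  while [ "$n" -lt "$attempts" ]; do
    # The jsonpath query yields an empty string until the address is assigned.
    ip="$(kubectl get service "$service" -n "$namespace" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}' 2>/dev/null)"
    if [ -n "$ip" ]; then
      printf '%s\n' "$ip"
      return 0
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  echo "Timed out waiting for an external IP on $service" >&2
  return 1
}
```

For example, `wait_for_external_ip mymaster-0` prints the address once it is assigned, or returns a nonzero status if the attempts run out.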
Connect to the master replica endpoint and enable script execution.
Run this statement:
EXEC sp_configure 'external scripts enabled', 1
RECONFIGURE WITH OVERRIDE
GO
For example, you can run the preceding command with sqlcmd. The following example connects to the master replica endpoint and enables script execution. Update the values in the script for your environment.
sqlcmd -S <IP address>,1533 -U <user name> -P <password> -Q "EXEC sp_configure 'external scripts enabled', 1; RECONFIGURE WITH OVERRIDE;"
Repeat the step for each replica.
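Repeating the step can be scripted by looping over the replica endpoint addresses. The following is a sketch, not a supported tool: enable_external_scripts is a hypothetical helper, it assumes sqlcmd is installed, that every endpoint listens on port 1533 as in this article's examples, and that the same SQL login works on each replica.

```shell
#!/usr/bin/env bash
# Sketch: run the enable statement against every replica endpoint in turn.
# enable_external_scripts is a hypothetical helper. It assumes sqlcmd is on
# the PATH, port 1533 on each endpoint, and one SQL login valid on all replicas.

# Usage: enable_external_scripts <user name> <password> <IP address>...
enable_external_scripts() {
  local user="$1" password="$2" ip
  shift 2
  for ip in "$@"; do
    echo "Enabling external scripts on $ip" >&2
    # Stop at the first failure so a missed replica is not overlooked.
    sqlcmd -S "$ip,1533" -U "$user" -P "$password" \
      -Q "EXEC sp_configure 'external scripts enabled', 1; RECONFIGURE WITH OVERRIDE;" \
      || return 1
  done
}
```

Pass the external IP address of each replica endpoint after the credentials, one argument per replica.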
The following image demonstrates this process.
:::image type="content" source="media/machine-learning-services/example-kube-enable-scripts.png" alt-text="A screenshot of the command prompt providing a demo of the steps necessary to enable external scripts." lightbox="media/machine-learning-services/example-kube-enable-scripts.png" :::
You are now ready to run Python and R scripts on the master instance of Big Data Clusters. See the quickstarts under Next steps to run your first script.
On the Kubernetes cluster, delete the endpoint for each replica. The endpoint is exposed in Kubernetes as a load-balancing service.
The following command deletes the load-balancing service.
kubectl delete svc <load-balancer-name> -n <kubernetes namespace>
For the examples in this article, run the following commands.
kubectl delete svc mymaster-0 -n mssql-cluster
kubectl delete svc mymaster-1 -n mssql-cluster
kubectl delete svc mymaster-2 -n mssql-cluster
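The cleanup can be looped in the same way as the setup. The following is a sketch: delete_master_endpoints is a hypothetical helper, and the service names and namespace default to this article's example values. The --ignore-not-found flag is added here, as an assumption about what you want, so that rerunning the cleanup does not fail on services that are already gone.

```shell
#!/usr/bin/env bash
# Sketch: delete the temporary load-balancer service for each master replica.
# delete_master_endpoints is a hypothetical helper; names and namespace
# default to the example values used in this article.

# Usage: delete_master_endpoints <replica count> [namespace]
delete_master_endpoints() {
  local count="$1" namespace="${2:-mssql-cluster}"
  local i=0
  while [ "$i" -lt "$count" ]; do
    # --ignore-not-found makes the cleanup safe to run more than once.
    kubectl delete svc "mymaster-$i" -n "$namespace" --ignore-not-found
    i=$((i + 1))
  done
}
```

For the examples in this article, the call would be `delete_master_endpoints 3`.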