![Microsoft](https://raw.githubusercontent.com/microsoft/azuredatastudio/master/src/sql/media/microsoft-small-logo.png)
 
## Create Azure Kubernetes Service cluster and deploy SQL Server 2019 big data cluster
 
This notebook walks through creating a new Azure Kubernetes Service (AKS) cluster and then deploying a <a href="https://docs.microsoft.com/sql/big-data-cluster/big-data-cluster-overview?view=sqlallproducts-allversions">SQL Server 2019 big data cluster</a> on the newly created AKS cluster.
 
* Follow the instructions in the **Prerequisites** cell to install the tools if not already installed.
* The **Required information** cell will prompt you for a password that will be used to access the cluster controller, SQL Server, and Knox.
* The values in the **Azure settings** and **Default settings** cells can be changed as appropriate.

<span style="color:red"><font size="3">Please press the "Run Cells" button to run the notebook</font></span>

### **Prerequisites**
Ensure the following tools are installed and added to PATH before proceeding.

|Tools|Description|Installation|
|---|---|---|
|Azure CLI |Command-line tool for managing Azure services. Used to create the AKS cluster | [Installation](https://docs.microsoft.com/cli/azure/install-azure-cli?view=azure-cli-latest) |
|kubectl | Command-line tool for monitoring the underlying Kubernetes cluster | [Installation](https://kubernetes.io/docs/tasks/tools/install-kubectl/#install-kubectl-binary-using-native-package-management) |
|azdata | Command-line tool for installing and managing a big data cluster |[Installation](https://docs.microsoft.com/en-us/sql/big-data-cluster/deploy-install-azdata?view=sqlallproducts-allversions) |

### **Check dependencies**

In [1]:
import pandas,sys,os,getpass,time,json,html
pandas_version = pandas.__version__.split('.')
pandas_major = int(pandas_version[0])
pandas_minor = int(pandas_version[1])
pandas_patch = int(pandas_version[2])
if not (pandas_major > 0 or (pandas_major == 0 and pandas_minor > 24) or (pandas_major == 0 and pandas_minor == 24 and pandas_patch >= 2)):
    sys.exit('Please upgrade the Notebook dependency before you can proceed, you can do it by running the "Reinstall Notebook dependencies" command in command palette (View menu -> Command Palette…).')

def run_command():
    print("Executing: " + cmd)
    !{cmd}
    if _exit_code != 0:
        sys.exit(f'Command execution failed with exit code: {str(_exit_code)}.\n\t{cmd}\n')
    print(f'Successfully executed: {cmd}')

cmd = 'az --version'
run_command()
cmd = 'kubectl version --client=true'
run_command()
cmd = 'azdata --version'
run_command()

Executing: az --version


azure-cli                         2.0.70 *

command-modules-nspkg               2.0.3
core                              2.0.70 *
nspkg                              3.0.4
telemetry                          1.0.3

Python location 'C:\Program Files (x86)\Microsoft SDKs\Azure\CLI2\python.exe'
Extensions directory 'C:\Users\niels\.azure\cliextensions'

Python (Windows) 3.6.6 (v3.6.6:4cf1f54eb7, Jun 27 2018, 02:47:15) [MSC v.1900 32 bit (Intel)]

Legal docs and information: aka.ms/AzureCliLegal


Successfully executed: az --version
Executing: kubectl version --client=true




Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.3", GitCommit:"5e53fd6bc17c0dec8434817e69b04a25d8ae0ff0", GitTreeState:"clean", BuildDate:"2019-06-06T01:44:30Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"windows/amd64"}
Successfully executed: kubectl version --client=true
Executing: azdata --version


15.0.1900

Python (Windows) 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)]

Python location 'c:\python37\python.exe'

Successfully executed: azdata --version
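
Outside a notebook, where the `!{cmd}` shell magic and `_exit_code` are unavailable, the two helpers in the dependency-check cell could be approximated in plain Python. The following is a minimal sketch under stated assumptions, not the notebook's own implementation: `version_at_least` and its regular expression are hypothetical, pre-release suffixes such as `rc0` are simply ignored, and this `run_command` takes the command as an argument and raises an exception instead of calling `sys.exit`.

```python
import re
import subprocess


def version_at_least(version: str, minimum: tuple) -> bool:
    """Compare the leading numeric parts of a version string with a minimum tuple."""
    parts = []
    for piece in version.split(".")[: len(minimum)]:
        match = re.match(r"\d+", piece)  # tolerate suffixes such as "0rc0"
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as "0.25" so both tuples have the same length.
    parts += [0] * (len(minimum) - len(parts))
    return tuple(parts) >= tuple(minimum)


def run_command(cmd: str) -> None:
    """Echo a shell command, run it, and raise if it exits with a non-zero code."""
    print("Executing: " + cmd)
    # shell=True mirrors the notebook's !{cmd}, which also goes through a shell.
    completed = subprocess.run(cmd, shell=True)
    if completed.returncode != 0:
        raise RuntimeError(
            f"Command execution failed with exit code: {completed.returncode}.\n\t{cmd}\n"
        )
    print(f"Successfully executed: {cmd}")
```

With these helpers, `version_at_least(pandas.__version__, (0, 24, 2))` expresses the same threshold as the check in the cell, and `run_command('az --version')` replaces the pattern of assigning the global `cmd` before each call.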


### **Required information**

In [2]:
env_var_flag = "AZDATA_NB_VAR_BDC_CONTROLLER_PASSWORD" in os.environ
if env_var_flag:
    mssql_password = os.environ["AZDATA_NB_VAR_BDC_CONTROLLER_PASSWORD"]
else: 
    mssql_password = getpass.getpass(prompt = 'SQL Server 2019 big data cluster controller password')
    if mssql_password == "":
        sys.exit('Password is required.')
    confirm_password = getpass.getpass(prompt = 'Confirm password')
    if mssql_password != confirm_password:
        sys.exit('Passwords do not match.')
print('You can also use the same password to access Knox and SQL Server.')

You can also use the same password to access Knox and SQL Server.
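
The password logic in this cell has two paths: read the value from `AZDATA_NB_VAR_BDC_CONTROLLER_PASSWORD` if it is set, otherwise prompt twice and compare. A minimal sketch of the same flow as a function, so it can be exercised without an interactive prompt. The `resolve_password` name and the injectable `prompt` parameter are assumptions made for illustration, and the sketch raises `ValueError` where the notebook calls `sys.exit`.

```python
import getpass


def resolve_password(environ, prompt=getpass.getpass):
    """Return the controller password from the environment or from two matching prompts."""
    name = "AZDATA_NB_VAR_BDC_CONTROLLER_PASSWORD"
    if name in environ:
        return environ[name]
    password = prompt("SQL Server 2019 big data cluster controller password")
    if password == "":
        raise ValueError("Password is required.")
    if password != prompt("Confirm password"):
        raise ValueError("Passwords do not match.")
    return password
```

In the notebook this would be called as `resolve_password(os.environ)`; passing a different `prompt` callable makes the confirmation path testable.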


### **Azure settings**
*Subscription ID*: visit <a href="https://portal.azure.com/#blade/Microsoft_Azure_Billing/SubscriptionsBlade">here</a> to see the subscriptions you can use. If you leave it unspecified, the default subscription is used.

*VM Size*: visit <a href="https://docs.microsoft.com/en-us/azure/virtual-machines/linux/sizes">here</a> to see the available VM sizes.

*Region*: visit <a href="https://azure.microsoft.com/en-us/global-infrastructure/services/?products=kubernetes-service">here</a> to see the Azure regions where Azure Kubernetes Service is available.

In [3]:
if env_var_flag:
    azure_subscription_id = os.environ["AZDATA_NB_VAR_BDC_AZURE_SUBSCRIPTION"]
    azure_vm_size = os.environ["AZDATA_NB_VAR_BDC_AZURE_VM_SIZE"]
    azure_region = os.environ["AZDATA_NB_VAR_BDC_AZURE_REGION"]
    azure_vm_count = int(os.environ["AZDATA_NB_VAR_BDC_VM_COUNT"])
else:
    azure_subscription_id = ""
    azure_vm_size = "Standard_E4s_v3"
    azure_region = "eastus"
    azure_vm_count = int(5)
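
The cell above switches between environment variables and built-in defaults as a group, keyed on whether the password variable is set. A per-variable fallback is one alternative. The sketch below is an assumption about how that could look, not how Azure Data Studio populates these values; `AzureSettings` and `from_env` are hypothetical names, while the variable names and defaults are the ones used in the cell.

```python
from dataclasses import dataclass


@dataclass
class AzureSettings:
    subscription_id: str = ""
    vm_size: str = "Standard_E4s_v3"
    region: str = "eastus"
    vm_count: int = 5


def from_env(environ) -> AzureSettings:
    """Read each setting from the environment, falling back to its default."""
    defaults = AzureSettings()
    return AzureSettings(
        subscription_id=environ.get("AZDATA_NB_VAR_BDC_AZURE_SUBSCRIPTION", defaults.subscription_id),
        vm_size=environ.get("AZDATA_NB_VAR_BDC_AZURE_VM_SIZE", defaults.vm_size),
        region=environ.get("AZDATA_NB_VAR_BDC_AZURE_REGION", defaults.region),
        # Environment values are strings, so the count is converted explicitly.
        vm_count=int(environ.get("AZDATA_NB_VAR_BDC_VM_COUNT", defaults.vm_count)),
    )
```

Calling `from_env(os.environ)` then yields one object holding all four values, and a partially configured environment no longer raises `KeyError`.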

### **Default settings**

In [4]:
if env_var_flag:
    mssql_cluster_name = os.environ["AZDATA_NB_VAR_BDC_NAME"]
    mssql_controller_username = os.environ["AZDATA_NB_VAR_BDC_CONTROLLER_USERNAME"]
    azure_resource_group = os.environ["AZDATA_NB_VAR_BDC_RESOURCEGROUP_NAME"]
    aks_cluster_name = os.environ["AZDATA_NB_VAR_BDC_AKS_NAME"]
else:
    mssql_cluster_name = 'mssql-cluster'
    mssql_controller_username = 'admin'
    azure_resource_group = mssql_cluster_name + '-' + time.strftime("%Y%m%d%H%M%S", time.localtime())
    aks_cluster_name = azure_resource_group
configuration_profile = 'aks-dev-test'
configuration_folder = 'mssql-bdc-configuration'
print(f'Azure subscription: {azure_subscription_id}')
print(f'Azure VM size: {azure_vm_size}')
print(f'Azure VM count: {str(azure_vm_count)}')
print(f'Azure region: {azure_region}')
print(f'Azure resource group: {azure_resource_group}')
print(f'AKS cluster name: {aks_cluster_name}')
print(f'SQL Server big data cluster name: {mssql_cluster_name}')
print(f'SQL Server big data cluster controller user name: {mssql_controller_username}')
print(f'Deployment configuration profile: {configuration_profile}')
print(f'Deployment configuration: {configuration_folder}')

Azure subscription: b7e65cb3-9829-44ea-9a90-6250cc442b3b
Azure VM size: Standard_B8ms
Azure VM count: 3
Azure region: southafricanorth
Azure resource group: rg-sqlbdc
AKS cluster name: kubesqlbdc-cluster
SQL Server big data cluster name: sqlbdc-cluster
SQL Server big data cluster controller user name: admin
Deployment configuration profile: aks-dev-test
Deployment configuration: mssql-bdc-configuration
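
When no environment variables are supplied, the resource group name is the cluster name followed by a local timestamp, and the AKS cluster reuses that name. A small sketch of that derivation with the clock passed in, so the result is reproducible; `default_resource_group` is a hypothetical helper, and the format string is the one from the cell.

```python
import time


def default_resource_group(cluster_name: str, now: time.struct_time) -> str:
    """Append a YYYYMMDDHHMMSS timestamp to the cluster name."""
    return cluster_name + "-" + time.strftime("%Y%m%d%H%M%S", now)
```

In the notebook, `default_resource_group('mssql-cluster', time.localtime())` would produce the same kind of name as the `else` branch. The values printed above differ from these defaults because that run took them from the environment variables.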


### **Login to Azure**

This opens a web browser window where you enter your credentials. If this cell appears to hang, check whether the browser window is waiting for you to enter your Azure credentials.


In [5]:
cmd = 'az login'
run_command()

Executing: az login


[
  {
    "cloudName": "AzureCloud",
    "id": "a16fbc6b-dc1f-43f7-a30d-3ad9c3760594",
    "isDefault": false,
    "name": "Visual Studio Premium with MSDN",
    "state": "Enabled",
    "tenantId": "6778d36c-50b7-4320-8d62-6a5091d6e2e7",
    "user": {
      "name": "niels.it.berglund@gmail.com",
      "type": "user"
    }
  },
  {
    "cloudName": "AzureCloud",
    "id": "b7e65cb3-9829-44ea-9a90-6250cc442b3b",
    "isDefault": true,
    "name": "Microsoft Azure Sponsorship",
    "state": "Enabled",
    "tenantId": "6778d36c-50b7-4320-8d62-6a5091d6e2e7",
    "user": {
      "name": "niels.it.berglund@gmail.com",
      "type": "user"
    }
  },
  {
    "cloudName": "AzureCloud",
    "id": "4744bfbf-f324-4142-ac99-65acc840ad8d",
    "isDefault": false,
    "name": "Enterprise Dev/Test",
    "state": "Enabled",
    "tenantId": "e64b32d9-f28f-4a39-958e-b3938a47f0a9",
    "user": {
      "name": "niels.it.berglund@gmail.com",
      "type": "user"
    }
  }
]
Successfully executed: az login
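
The `az login` output lists every subscription the account can see, with `isDefault` marking the one that is used when no subscription ID is specified. A hedged sketch of picking that entry out of the JSON, assuming the output has been captured as a string (the notebook's `run_command` only prints it). `default_subscription_id` is a hypothetical helper, and `sample` is a trimmed copy of the output above.

```python
import json


def default_subscription_id(az_login_output: str) -> str:
    """Return the id of the single subscription flagged as default."""
    accounts = json.loads(az_login_output)
    defaults = [account["id"] for account in accounts if account.get("isDefault")]
    if len(defaults) != 1:
        raise ValueError(f"Expected exactly one default subscription, found {len(defaults)}.")
    return defaults[0]


# Trimmed to the fields the helper reads.
sample = """[
  {"id": "a16fbc6b-dc1f-43f7-a30d-3ad9c3760594", "isDefault": false, "name": "Visual Studio Premium with MSDN"},
  {"id": "b7e65cb3-9829-44ea-9a90-6250cc442b3b", "isDefault": true, "name": "Microsoft Azure Sponsorship"},
  {"id": "4744bfbf-f324-4142-ac99-65acc840ad8d", "isDefault": false, "name": "Enterprise Dev/Test"}
]"""

print(default_subscription_id(sample))
```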





### **Set active Azure subscription**

In [6]:
if azure_subscription_id != "":
    cmd = f'az account set --subscription {azure_subscription_id}'
    run_command()
else:
    print('Using the default Azure subscription')
cmd = 'az account show'
run_command()

Executing: az account set --subscription b7e65cb3-9829-44ea-9a90-6250cc442b3b


Successfully executed: az account set --subscription b7e65cb3-9829-44ea-9a90-6250cc442b3b
Executing: az account show


{
  "environmentName": "AzureCloud",
  "id": "b7e65cb3-9829-44ea-9a90-6250cc442b3b",
  "isDefault": true,
  "name": "Microsoft Azure Sponsorship",
  "state": "Enabled",
  "tenantId": "6778d36c-50b7-4320-8d62-6a5091d6e2e7",
  "user": {
    "name": "niels.it.berglund@gmail.com",
    "type": "user"
  }
}
Successfully executed: az account show


### **Create Azure resource group**

In [7]:
cmd = f'az group create --name {azure_resource_group} --location {azure_region}'
run_command()

Executing: az group create --name rg-sqlbdc --location southafricanorth


{
  "id": "/subscriptions/b7e65cb3-9829-44ea-9a90-6250cc442b3b/resourceGroups/rg-sqlbdc",
  "location": "southafricanorth",
  "managedBy": null,
  "name": "rg-sqlbdc",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": null,
  "type": null
}
Successfully executed: az group create --name rg-sqlbdc --location southafricanorth
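
The exit code tells you the command ran; the `provisioningState` field in the returned JSON tells you the resource group is ready. A minimal sketch of checking that field, assuming the command output has been captured as a string. `provisioning_succeeded` is a hypothetical helper, and `sample` reproduces the relevant part of the output above.

```python
import json


def provisioning_succeeded(az_output: str) -> bool:
    """Return True when properties.provisioningState equals "Succeeded"."""
    resource = json.loads(az_output)
    properties = resource.get("properties") or {}  # tolerate a null properties value
    return properties.get("provisioningState") == "Succeeded"


sample = """{
  "location": "southafricanorth",
  "name": "rg-sqlbdc",
  "properties": {"provisioningState": "Succeeded"}
}"""

print(provisioning_succeeded(sample))
```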


### **Create AKS cluster**

In [8]:
cmd = f'az aks create --name {aks_cluster_name} --resource-group {azure_resource_group} --generate-ssh-keys --node-vm-size {azure_vm_size} --node-count {azure_vm_count}' 
run_command()

Executing: az aks create --name kubesqlbdc-cluster --resource-group rg-sqlbdc --generate-ssh-keys --node-vm-size Standard_B8ms --node-count 3


{
  "aadProfile": null,
  "addonProfiles": null,
  "agentPoolProfiles": [
    {
      "availabilityZones": null,
      "count": 3,
      "enableAutoScaling": null,
      "maxCount": null,
      "maxPods": 110,
      "minCount": null,
      "name": "nodepool1",
      "orchestratorVersion": "1.13.10",
      "osDiskSizeGb": 100,
      "osType": "Linux",
      "provisioningState": "Succeeded",
      "type": "AvailabilitySet",
      "vmSize": "Standard_B8ms",
      "vnetSubnetId": null
    }
  ],
  "apiServerAuthorizedIpRanges": null,
  "dnsPrefix": "kubesqlbdc-rg-sqlbdc-b7e65c",
  "enablePodSecurityPolicy": null,
  "enableRbac": true,
  "fqdn": "kubesqlbdc-rg-sqlbdc-b7e65c-1c563f65.hcp.southafricanorth.azmk8s.io",
  "id": "/subscriptions/b7e65cb3-9829-44ea-9a90-6250cc442b3b/resourcegroups/rg-sqlbdc/providers/Microsoft.ContainerService/managedClusters/kubesqlbdc-cluster",
  "identity": null,
  "kubernetesVersion": "1.13.10",
  "linuxProfile": {
    "adminUsername": "azureuser",
    "ssh": {

### **Set the new AKS cluster as current context**

In [9]:
cmd = f'az aks get-credentials --resource-group {azure_resource_group} --name {aks_cluster_name} --admin --overwrite-existing'
run_command()

Executing: az aks get-credentials --resource-group rg-sqlbdc --name kubesqlbdc-cluster --admin --overwrite-existing


Merged "kubesqlbdc-cluster-admin" as current context in C:\Users\niels\.kube\config
Successfully executed: az aks get-credentials --resource-group rg-sqlbdc --name kubesqlbdc-cluster --admin --overwrite-existing


### **Create a deployment configuration file**

In [10]:
os.environ["ACCEPT_EULA"] = 'yes'
cmd = f'azdata bdc config init --source {configuration_profile} --target {configuration_folder} --force'
run_command()
cmd = f'azdata bdc config replace -c {configuration_folder}/bdc.json -j metadata.name={mssql_cluster_name}'
run_command()

Executing: azdata bdc config init --source aks-dev-test --target mssql-bdc-configuration --force


mssql-bdc-configuration\bdc.json created
mssql-bdc-configuration\control.json created
Successfully executed: azdata bdc config init --source aks-dev-test --target mssql-bdc-configuration --force
Executing: azdata bdc config replace -c mssql-bdc-configuration/bdc.json -j metadata.name=sqlbdc-cluster


Successfully executed: azdata bdc config replace -c mssql-bdc-configuration/bdc.json -j metadata.name=sqlbdc-cluster
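
The second command in this cell changes a single value in `bdc.json`, addressed by the path `metadata.name`. The sketch below mimics that effect on an in-memory dictionary for a simple dotted path only. It is an illustration, not how `azdata` is implemented; `set_by_path` and the sample document are hypothetical, and a real `bdc.json` contains many more fields.

```python
def set_by_path(document: dict, path: str, value) -> None:
    """Set a nested value addressed by a dotted path such as "metadata.name"."""
    *parents, leaf = path.split(".")
    node = document
    for key in parents:
        node = node[key]  # raises KeyError if an intermediate key is missing
    node[leaf] = value


# Hypothetical, heavily reduced configuration document.
config = {"metadata": {"name": "mssql-cluster"}}
set_by_path(config, "metadata.name", "sqlbdc-cluster")
print(config)
```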


### **Create SQL Server 2019 big data cluster**

In [11]:
print (f'Creating SQL Server 2019 big data cluster: {mssql_cluster_name} using configuration {configuration_folder}')
os.environ["CONTROLLER_USERNAME"] = mssql_controller_username
os.environ["CONTROLLER_PASSWORD"] = mssql_password
os.environ["MSSQL_SA_PASSWORD"] = mssql_password
os.environ["KNOX_PASSWORD"] = mssql_password
cmd = f'azdata bdc create -c {configuration_folder}'
run_command()

Creating SQL Server 2019 big data cluster: sqlbdc-cluster using configuration mssql-bdc-configuration
Executing: azdata bdc create -c mssql-bdc-configuration


The privacy statement can be viewed at:
https://go.microsoft.com/fwlink/?LinkId=853010

The license terms for SQL Server Big Data Cluster can be viewed at:
https://go.microsoft.com/fwlink/?LinkId=2002534


Cluster deployment documentation can be viewed at:
https://aka.ms/bdc-deploy

NOTE: Cluster creation can take a significant amount of time depending on
configuration, network speed, and the number of nodes in the cluster.

Starting cluster deployment.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Waiting for cluster controller to start.
Cluster controller endpoint is available at 102.13
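
The cell passes credentials to `azdata` through four environment variables, all set to the same password. Writing them into `os.environ` keeps the password in the notebook process for the rest of the session. One alternative, sketched here as an assumption rather than the notebook's approach, is to build a copy of the environment and hand it only to the child process. `deployment_env` is a hypothetical helper, and the variable names are the ones used in the cell.

```python
def deployment_env(base_env, username: str, password: str) -> dict:
    """Return a copy of base_env with the variables the deployment cell sets."""
    env = dict(base_env)  # copy, so the caller's environment is left untouched
    env.update(
        {
            "CONTROLLER_USERNAME": username,
            "CONTROLLER_PASSWORD": password,
            "MSSQL_SA_PASSWORD": password,
            "KNOX_PASSWORD": password,
        }
    )
    return env
```

The result can be passed as `subprocess.run(cmd, shell=True, env=deployment_env(os.environ, mssql_controller_username, mssql_password))`.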

### **Login to SQL Server 2019 big data cluster**

In [12]:
cmd = f'azdata login --cluster-name {mssql_cluster_name}'
run_command()

Executing: azdata login --cluster-name sqlbdc-cluster


Logged in successfully to `https://102.133.233.185:30080`
Successfully executed: azdata login --cluster-name sqlbdc-cluster


### **Show SQL Server 2019 big data cluster endpoints**

In [13]:
from IPython.display import display, HTML
pandas.set_option('display.max_colwidth', -1)
cmd = f'azdata bdc endpoint list'
cmdOutput = !{cmd}
endpoints = json.loads(''.join(cmdOutput))
endpointsDataFrame = pandas.DataFrame(endpoints)
endpointsDataFrame.columns = [' '.join(word[0].upper() + word[1:] for word in columnName.split()) for columnName in endpoints[0].keys()]
display(HTML(endpointsDataFrame.to_html(index=False, render_links=True)))

Description,Endpoint,Name,Protocol
"Gateway to access HDFS files, Spark",https://102.133.228.36:30443,gateway,https
Spark Jobs Management and Monitoring Dashboard,https://102.133.228.36:30443/gateway/default/sparkhistory,spark-history,https
Spark Diagnostics and Monitoring Dashboard,https://102.133.228.36:30443/gateway/default/yarn,yarn-ui,https
Application Proxy,https://102.133.239.225:30778,app-proxy,https
Management Proxy,https://102.133.228.59:30777,mgmtproxy,https
Log Search Dashboard,https://102.133.228.59:30777/kibana,logsui,https
Metrics Dashboard,https://102.133.228.59:30777/grafana,metricsui,https
Cluster Management Service,https://102.133.233.185:30080,controller,https
SQL Server Master Instance Front-End,"102.133.235.103,31433",sql-server-master,tds
HDFS File System Proxy,https://102.133.228.36:30443/gateway/default/webhdfs/v1,webhdfs,https
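
Unlike the HTTPS endpoints, the `sql-server-master` entry uses the SQL Server `host,port` form. A short sketch of looking up an endpoint by name and splitting that form into its parts, for client libraries that take host and port separately. `find_endpoint` and `split_sql_endpoint` are hypothetical helpers; the `name` and `endpoint` keys are the ones the notebook reads, and 1433 is the SQL Server default port used when none is given.

```python
def find_endpoint(endpoints, name: str) -> str:
    """Return the endpoint string of the single entry with the given name."""
    matches = [entry["endpoint"] for entry in endpoints if entry["name"] == name]
    if len(matches) != 1:
        raise LookupError(f"Expected one endpoint named {name!r}, found {len(matches)}.")
    return matches[0]


def split_sql_endpoint(endpoint: str):
    """Split "host,port" into (host, port), defaulting to port 1433."""
    host, _, port = endpoint.partition(",")
    return host, (int(port) if port else 1433)


# Two rows taken from the table above.
endpoints = [
    {"name": "controller", "endpoint": "https://102.133.233.185:30080"},
    {"name": "sql-server-master", "endpoint": "102.133.235.103,31433"},
]
print(split_sql_endpoint(find_endpoint(endpoints, "sql-server-master")))
```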


### **Connect to master SQL Server instance in Azure Data Studio**
Click the link below to connect to the master SQL Server instance of the SQL Server 2019 big data cluster.

In [1]:
sqlEndpoints = [x for x in endpoints if x['name'] == 'sql-server-master']
if sqlEndpoints and len(sqlEndpoints) == 1:
    connectionParameter = '{"serverName":"' + sqlEndpoints[0]['endpoint'] + '","providerName":"MSSQL","authenticationType":"SqlLogin","userName":"sa","password":' + json.dumps(mssql_password) + '}'
    display(HTML('<br/><a href="command:azdata.connect?' + html.escape(connectionParameter)+'"><font size="3">Click here to connect to master SQL Server instance</font></a><br/>'))
else:
    sys.exit('Could not find the master SQL Server instance endpoint.')

NameError: name 'endpoints' is not defined
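
The `NameError` above means `endpoints` did not exist in the kernel when this cell ran, which is what happens if the endpoint-listing cell has not been executed in the same session (the `In [1]` counter suggests a fresh kernel). Passing the list in explicitly makes that dependency visible. The sketch below is one possible restructuring, not the notebook's code: `build_connect_link` is a hypothetical helper, and it serializes the whole parameter with `json.dumps` so that the server name is escaped the same way as the password.

```python
import html
import json


def build_connect_link(endpoints, password: str) -> str:
    """Build the azdata.connect command link for the master SQL Server instance."""
    matches = [entry for entry in endpoints if entry["name"] == "sql-server-master"]
    if len(matches) != 1:
        raise LookupError("Could not find the master SQL Server instance endpoint.")
    parameter = json.dumps(
        {
            "serverName": matches[0]["endpoint"],
            "providerName": "MSSQL",
            "authenticationType": "SqlLogin",
            "userName": "sa",
            "password": password,
        }
    )
    # html.escape also escapes double quotes, so the JSON is safe inside href="...".
    return (
        '<a href="command:azdata.connect?' + html.escape(parameter) + '">'
        "Click here to connect to master SQL Server instance</a>"
    )
```

In the notebook the result would be shown with `display(HTML(build_connect_link(endpoints, mssql_password)))` after the endpoint cell has run.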