![Microsoft logo](https://raw.githubusercontent.com/Microsoft/sqlworkshops/master/graphics/solutions-microsoft-logo-small.png)

# View the status of your big data cluster
This notebook allows you to see the status of the controller, master instance, and pools in your SQL Server big data cluster.

## <span style="color:red">Important Instructions</span>
### **Before you begin, you will need:**
* Big data cluster name
* Controller username
* Controller password
* Controller endpoint 

You can find the controller endpoint in the Service Endpoints table of the big data cluster dashboard. The endpoint is listed as **Cluster Management Service**.

If you do not know the credentials, ask the admin who deployed your cluster.

### **Instructions**
* For the best experience, click **Run Cells** on the toolbar above. This will automatically execute all code cells below and show the cluster status in each table.
* When you click **Run Cells** for this Notebook, you will be prompted at the *Log in to your big data cluster* code cell to provide your login credentials. Follow the prompts and press Enter to proceed.
* **You won't need to modify any of the code cell contents** in this Notebook. If you accidentally make a change, you can reopen this Notebook from the big data cluster dashboard.




## **Dependencies**

> This Notebook will try to install these dependencies for you.

----------------------------------------------------------------
<table>
<colgroup>
<col style="width: 10%" />
<col style="width: 10%" />
<col style="width: 10%" />
</colgroup>
<thead>
<tr class="header">
<th>Tool</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>mssqlctl</strong></td>
<td>Yes</td>
<td>Command-line tool for installing and managing a big data cluster.</td>
</tr>
<tr class="even">
<td><strong>pandas</strong></td>
<td>Yes</td>
<td>Python Library for formatting data (<a href="https://www.learnpython.org/en/Pandas_Basics">More info</a>).</td>
</tr>
</tbody>
</table>
<p>

### **Install latest version of mssqlctl**

In [0]:
import sys, platform

if platform.system()=="Windows":
    user = ' --user'
else:
    user = ''

def executeCommand(cmd, successMsgs, printMsg):
    # Run a shell command and stop the notebook unless its output contains a known success message
    print(printMsg)
    cmdOutput = !{cmd}
    cmdOutput = ''.join(cmdOutput)
    if any(msg in cmdOutput for msg in successMsgs):
        print("\nSuccess >> " + cmd)
    else:
        raise SystemExit(f'\nFailed during:\n\n\t{cmd}\n\nreturned: \n{cmdOutput}.\n')

installPath = 'https://private-repo.microsoft.com/python/ctp3.1/mssqlctl/requirements.txt'
cmd = f'{sys.executable} -m pip uninstall --yes mssqlctl-cli-storage'
cmdOutput = !{cmd}

cmd = f'{sys.executable} -m pip uninstall -r {installPath} --yes'
executeCommand(cmd, ['is not installed', 'Successfully uninstalled mssqlctl'], 'Uninstalling mssqlctl:')

cmd = f'{sys.executable} -m pip install -r {installPath}{user} --trusted-host helsinki'
# executeCommand runs the command itself, so it is not executed separately first
executeCommand(cmd, ['Requirement already satisfied', 'Successfully installed mssqlctl'], 'Installing the latest version of mssqlctl:')
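
The `executeCommand` helper in this cell relies on IPython's `!` syntax, which only works inside a notebook. As a rough sketch, the same "run the command, then look for a success message" check could be written with the standard library's `subprocess` module. The name `run_and_check` and the sample command are illustrative assumptions; they are not part of mssqlctl or of this notebook.

```python
import subprocess
import sys


def run_and_check(args, success_msgs):
    """Run a command and report whether its output contains a success message.

    Hypothetical helper: it mirrors the substring check used by executeCommand,
    but returns a (succeeded, output) tuple instead of raising SystemExit.
    """
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so error text is searchable too
        universal_newlines=True,
    )
    output = completed.stdout
    succeeded = any(msg in output for msg in success_msgs)
    return succeeded, output


# Illustrative command only: ask the current interpreter to print a fixed string.
ok, out = run_and_check(
    [sys.executable, "-c", "print('Requirement already satisfied')"],
    ["Requirement already satisfied", "Successfully installed"],
)
print(ok)
```

Returning a tuple rather than raising leaves it to the caller to decide whether a failed command should stop the run.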

### **Install the required version of pandas**

In [0]:
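# Install pandas 0.24.2 unless a 0.24.x release is already present
cmd = f'{sys.executable} -m pip show pandas'
cmdOutput = !{cmd}
# `pip show` prints the package name on the first line and the version on the second
if len(cmdOutput) > 1 and '0.24' in cmdOutput[1]:
    print('Pandas required version is already installed!')
else:
    pandasVersion = 'pandas==0.24.2'
    cmd = f'{sys.executable} -m pip install {pandasVersion}'
    # Report success only when pip's output confirms it
    executeCommand(cmd, ['Requirement already satisfied', 'Successfully installed'], 'Installing pandas:')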
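
The version check in this cell reads the second line of the `pip show` output by position. A slightly more defensive variant, sketched below, looks for the `Version:` field by name and compares it against a release prefix. The function names and the sample lines are assumptions made for illustration.

```python
def installed_version(pip_show_lines):
    """Return the value of the 'Version:' field from `pip show` output lines.

    Returns None when no such field is present, for example when the
    package is not installed and pip printed only a warning.
    """
    for line in pip_show_lines:
        if line.startswith("Version:"):
            return line.split(":", 1)[1].strip()
    return None


def is_required_release(version, required_prefix="0.24."):
    """True when the version string starts with the required release prefix."""
    return version is not None and version.startswith(required_prefix)


# Sample lines in the "Key: value" layout that `pip show` prints (illustrative data).
sample = ["Name: pandas", "Version: 0.24.2", "Summary: Powerful data structures"]
print(installed_version(sample))
print(is_required_release(installed_version(sample)))
```

Matching on the prefix `0.24.` instead of the substring `0.24` avoids accepting an unrelated version such as `1.0.24`.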

## **Log in to your big data cluster**
To view cluster status, you will need to connect to your big data cluster through mssqlctl. 

When you run this code cell, you will be prompted for:
- Cluster name
- Controller username
- Controller password

To proceed:
- **Click** on the input box
- **Type** the login info
- **Press** enter.

If your cluster is missing a configuration file, you will be asked to provide your controller endpoint. (Format: **https://00.00.00.000:00000**)

In [0]:
import os, getpass, json
import pandas as pd
# Import only the display helpers this notebook uses
from IPython.display import HTML, display

def PromptForInfo(promptMsg, isPassword, errorMsg):
    if isPassword:
        promptResponse = getpass.getpass(prompt=promptMsg)
    else:
        promptResponse = input(promptMsg)
    if promptResponse == "":
        raise SystemExit(errorMsg + '\n')
    return promptResponse

# Prompt user inputs:
cluster_name = PromptForInfo('Please provide your Cluster Name: ', False, 'Cluster Name is required!')

controller_username = PromptForInfo('Please provide your Controller Username for login: ', False, 'Controller Username is required!')

controller_password = PromptForInfo('Controller Password: ', True, 'Password is required!')
print('***********')

!mssqlctl logout
# Log in to your big data cluster
cmd = f'mssqlctl login -n {cluster_name} -u {controller_username} -a yes'
print("Start " + cmd)
os.environ['CONTROLLER_USERNAME'] = controller_username
os.environ['CONTROLLER_PASSWORD'] = controller_password
os.environ['ACCEPT_EULA'] = 'yes'

loginResult = !{cmd}
# An error on the first output line (for example, a missing kube config) means the endpoint must be given explicitly
if loginResult and 'ERROR' in loginResult[0]:
    controller_ip = input('Please provide your Controller endpoint: ')
    if controller_ip == "":
        raise SystemExit('Controller endpoint is required!\n')
    else:
        cmd = f'mssqlctl login -n {cluster_name} -e {controller_ip} -u {controller_username} -a yes'
        loginResult = !{cmd}
print(loginResult)
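
The controller endpoint is expected in the form `https://<ip>:<port>`. As a minimal sketch, the value typed at the prompt could be checked for that shape before it is passed to `mssqlctl login`. The function `looks_like_endpoint` is hypothetical and the address below is a placeholder; the check only inspects the string and does not contact the controller.

```python
from urllib.parse import urlsplit


def looks_like_endpoint(value):
    """Check that a string has the shape https://<host>:<port>.

    Hypothetical pre-check; it does not contact the controller.
    """
    parts = urlsplit(value.strip())
    try:
        port = parts.port  # raises ValueError for a non-numeric or out-of-range port
    except ValueError:
        return False
    return parts.scheme == "https" and bool(parts.hostname) and port is not None


# Placeholder address, used only to show the expected shape.
print(looks_like_endpoint("https://10.0.0.4:30080"))
print(looks_like_endpoint("10.0.0.4:30080"))
```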


## **Status of big data cluster**
After you successfully log in to your big data cluster, you can view the overall status of each container before drilling down into each component.

In [0]:
# Display status of big data cluster
def formatColumnNames(column):
    return ' '.join(word[0].upper() + word[1:] for word in column.split())

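pd.set_option('display.max_colwidth', -1)
def show_results(output):
    # Parse the JSON printed by mssqlctl and render it as an HTML table with clickable links
    results = json.loads(''.join(output))
    df = pd.DataFrame(results)
    # Rename from the DataFrame's own columns so each label stays attached to its data
    df.columns = [formatColumnNames(n) for n in df.columns]
    display(HTML(df.to_html(render_links=True)))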

results = !mssqlctl bdc status show
jsonRes = json.loads(''.join(results))
# Remember whether a Spark pool exists so the Spark cell can skip its query when there is none
spark_exists = any(x['kind'] == 'Spark' for x in jsonRes)
show_results(results)
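
The cell above joins the lines returned by the `!` syntax into one string, parses it as JSON, and looks for an entry whose `kind` is `Spark`. The sketch below reproduces that check with the standard library only. The sample data is made up: apart from `kind`, the field names and values are assumptions, not real `mssqlctl` output.

```python
import json


def has_kind(status_lines, kind):
    """Return True when any entry in the JSON status output has the given kind."""
    entries = json.loads("".join(status_lines))
    return any(entry.get("kind") == kind for entry in entries)


# Made-up sample, split across lines the way the `!` syntax returns output.
sample_lines = [
    '[{"kind": "Master", "name": "master", "state": "Ready"},',
    ' {"kind": "Spark", "name": "spark-0", "state": "Ready"}]',
]
print(has_kind(sample_lines, "Spark"))
print(has_kind(sample_lines, "Data"))
```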

## **Cluster Status**
For each cluster component below, running each code cell will generate a table. This table will include:

----------------------------------------------------------------
<table>
<colgroup>
<col style="width: 10%" />
<col style="width: 10%" />
</colgroup>
<thead>
<tr class="header">
<th>Column Name</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><strong>Kind</strong></td>
<td>Identifies whether the component is a pod or a set.</td>
</tr>
<tr class="even">
<td><strong>LogsURL</strong></td>
<td>Link to <a href="https://www.elastic.co/guide/en/kibana/current/introduction.html">Kibana</a> logs, which are used for troubleshooting.</td>
</tr>
<tr class="odd">
<td><strong>Name</strong></td>
<td>Provides the specific name of the pod or set.</td>
</tr>
<tr class="even">
<td><strong>NodeMetricsURL</strong></td>
<td>Link to <a href="https://grafana.com/docs/guides/basic_concepts/">Grafana</a> dashboard to view key metrics of the node.</td>
</tr>
<tr class="odd">
<td><strong>SQLMetricsURL</strong></td>
<td>Link to <a href="https://grafana.com/docs/guides/basic_concepts/">Grafana</a> dashboard to view key metrics of the SQL instance.</td>
</tr>
<tr class="even">
<td><strong>State</strong></td>
<td>Indicates the state of the pod or set.</td>
</tr>
</tbody>
</table>
<p>

### **Controller status**
To learn more about the controller, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-controller?view=sql-server-ver15)

In [0]:
# Display status of controller
results = !mssqlctl bdc control status show
show_results(results)

### **Master Instance status**
To learn more about the master instance, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-master-instance?view=sqlallproducts-allversions)

In [0]:
# Display status of master instance
results = !mssqlctl bdc pool status show -k master -n default
show_results(results)

### **Compute Pool status**
To learn more about the compute pool, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-compute-pool?view=sqlallproducts-allversions)

In [0]:
# Display status of compute pool
results = !mssqlctl bdc pool status show -k compute -n default
show_results(results)

### **Storage Pool status**
To learn more about the storage pool, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-storage-pool?view=sqlallproducts-allversions)

In [0]:
# Display status of storage pools
results = !mssqlctl bdc pool status show -k storage -n default
show_results(results)

### **Data Pool status**
To learn more about the data pool, [read here.](https://docs.microsoft.com/sql/big-data-cluster/concept-data-pool?view=sqlallproducts-allversions)

In [0]:
# Display status of data pools
results = !mssqlctl bdc pool status show -k data -n default
show_results(results)

### **Spark Pool status**
Displays the status of the Spark pool if one exists. Otherwise, the cell prints "No spark pool."

In [0]:
# Display status of spark pool
if spark_exists:
    results = !mssqlctl bdc pool status show -k spark -n default
    show_results(results)
else:
    print('No spark pool.')
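
The pool cells in this notebook all run the same command and differ only in the value passed to `-k`. As a sketch, the argument lists could be generated from one list of pool kinds. `POOL_KINDS` and `pool_status_command` are illustrative names, and the commands are only printed here, not executed.

```python
# Pool kinds queried by the cells in this notebook.
POOL_KINDS = ["master", "compute", "storage", "data", "spark"]


def pool_status_command(kind, name="default"):
    """Build the argument list for one pool status query."""
    return ["mssqlctl", "bdc", "pool", "status", "show", "-k", kind, "-n", name]


# Print each command as it would be typed in a shell.
for kind in POOL_KINDS:
    print(" ".join(pool_status_command(kind)))
```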