RUN001 - Run a notebook
=======================

Description
-----------

This notebook abstracts the mechanics of running and saving the results
of a single notebook of any kernel. It:

1.  Uses a valid notebook executor based on the kernel type
    (`azdata notebook run` or `Invoke-SqlNotebook`).
2.  Validates whether a kernel threw an error (not all kernels bubble
    errors up to the executor).
3.  Saves results (metrics and the .ipynb/.html output) to the Big Data
    Cluster.
    -   [RUN002]()
4.  Recursively runs (using this notebook) follow-on notebooks when
    output cell content matches an ‘expert rule’ expression.
    -   [RUN003]()

If `save_results_in_storage_pool` = True, ensure the T-SQL objects have
been set up, using:

-   [RUN000]()

### Parameters

Description of parameters:

-   `notebook_path`: The notebook to run

-   `use_ad_auth`: Set to “True” for Kerberos, set to “False” for Basic
    Auth

-   `namespace`: If using the SQL / PySpark/ Scala kernels specify the
    Big Data Cluster namespace

-   `sql_master_pool_username`: If running the SQL kernel, specify the
    username

-   `sql_master_pool_password`: If running the SQL kernel, specify the
    password

-   `knox_username`: If running the PySpark / Scala kernel in secure
    (Kerberos) mode, specify the Knox username, e.g. admin (it is
    hardcoded to ‘root’ in Basic auth)

-   `knox_user_domain`: The domain of the Knox user, e.g. AZDATA.LOCAL

-   `knox_password`: If running the PySpark / Scala kernel specify the
    Knox password

-   `save_results_in_storage_pool`: Set to “True” to save results in
    master pool & storage pool

-   `app_name`: If saving results, specify any app\_name
    (e.g. “my\_app”)

-   `app_version`: If saving results, specify any app\_version
    (e.g. “v1”)

NOTE: All types are strings, because `azdata notebook run --arguments`
only supports the string type.
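
Because every parameter arrives as a string, a flag such as “False” is
still truthy if passed straight to `bool()`. The sketch below shows one
way to interpret such a value. The helper name `as_bool` is hypothetical
and is not part of this notebook.

```python
def as_bool(value: str) -> bool:
    """Interpret a string-typed notebook parameter as a boolean.

    Hypothetical helper: only the text "true" (any case, surrounding
    whitespace ignored) is treated as True.
    """
    return value.strip().lower() == "true"


# Parameters arrive as strings, so bool("False") would be True.
use_ad_auth = "False"
save_results_in_storage_pool = "True"

print(as_bool(use_ad_auth))
print(as_bool(save_results_in_storage_pool))
```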

In [None]:
import os, getpass, datetime

notebook_path = os.path.join(os.getcwd(), "run505a-sample-notebook.ipynb")

use_ad_auth = "False" if "AZDATA_AD_AUTH" not in os.environ else os.environ["AZDATA_AD_AUTH"]
namespace = "" if "AZDATA_NAMESPACE" not in os.environ else os.environ["AZDATA_NAMESPACE"]
sql_master_pool_username = "" if "AZDATA_USERNAME" not in os.environ else os.environ["AZDATA_USERNAME"]
sql_master_pool_password = "" if "AZDATA_PASSWORD" not in os.environ else os.environ["AZDATA_PASSWORD"]

knox_username = "" if "DOMAIN_SERVICE_ACCOUNT_USERNAME" not in os.environ else os.environ["DOMAIN_SERVICE_ACCOUNT_USERNAME"] 
knox_user_domain = "" if "DOMAIN_SERVICE_ACCOUNT_DOMAIN_NAME" not in os.environ else os.environ["DOMAIN_SERVICE_ACCOUNT_DOMAIN_NAME"]
knox_password = "" if "DOMAIN_SERVICE_ACCOUNT_PASSWORD" not in os.environ else os.environ["DOMAIN_SERVICE_ACCOUNT_PASSWORD"]

save_results_in_storage_pool = "True"
app_name = "app-" + getpass.getuser().lower() # set default to be app-<username>
app_version = "v1"

session_start = str(datetime.datetime.utcnow())

step = "" # If this notebook is recursively called as an 'expert rule', this is the current step (for metrics grouping  purposes)
base_notebook_name = "" # For expert rules, this is original calling notebook
parent_notebook_name = "" # For expert rules, this is the calling (parent) notebook

NOTEBOOK_CELL_TIMEOUT = 600 # Per cell timeout in seconds (10 minutes)

### Common functions

Define helper functions used in this notebook.
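
As a rough illustration of how the hint dictionaries used by `run` are
consulted, the sketch below matches one stderr line against per-binary
retry and error hints. The function name `find_hints` is hypothetical;
the sample entries are taken from the dictionaries defined at the end of
the next cell.

```python
def find_hints(exe_name, stderr_line, retry_hints, error_hints):
    """Return (should_retry, [(title, link), ...]) for one stderr line.

    Hypothetical helper mirroring the substring matching done in `run`.
    """
    should_retry = any(
        hint in stderr_line for hint in retry_hints.get(exe_name, [])
    )
    matches = [
        (title, link)
        for text, title, link in error_hints.get(exe_name, [])
        if text in stderr_line
    ]
    return should_retry, matches


sample_retry_hints = {"azdata": ["Endpoint livy does not exist"]}
sample_error_hints = {
    "azdata": [
        ["The token is expired", "SOP028 - azdata login",
         "../common/sop028-azdata-login.ipynb"],
    ]
}

print(find_hints("azdata", "ERROR: The token is expired",
                 sample_retry_hints, sample_error_hints))
```

Keying the hints by executable name keeps a message that is transient
for one tool from triggering a retry for another.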

In [None]:
# Define `run` function for transient fault handling, suggestions on error, and scrolling updates on Windows
import sys
import os
import re
import json
import platform
import shlex
import shutil
import datetime

from subprocess import Popen, PIPE
from IPython.display import Markdown

retry_hints = {} # Output in stderr known to be transient, therefore automatically retry
error_hints = {} # Output in stderr where a known SOP/TSG exists which will be HINTed for further help
install_hint = {} # The SOP to help install the executable if it cannot be found

first_run = True
rules = None
debug_logging = False

def run(cmd, return_output=False, no_output=False, retry_count=0):
    """Run shell command, stream stdout, print stderr and optionally return output

    NOTES:

    1.  Commands that need this kind of ' quoting on Windows e.g.:

            kubectl get nodes -o jsonpath={.items[?(@.metadata.annotations.pv-candidate=='data-pool')].metadata.name}

        Need to actually pass in as '"':

            kubectl get nodes -o jsonpath={.items[?(@.metadata.annotations.pv-candidate=='"'data-pool'"')].metadata.name}

        The ' quote approach, although correct when pasting into Windows cmd, will hang at the line:
        
            `iter(p.stdout.readline, b'')`

        The shlex.split call does the right thing for each platform, just use the '"' pattern for a '
    """
    MAX_RETRIES = 5
    output = ""
    retry = False

    global first_run
    global rules

    if first_run:
        first_run = False
        rules = load_rules()

    # When running `azdata sql query` on Windows, replace any \n in """ strings, with " ", otherwise we see:
    #
    #    ('HY090', '[HY090] [Microsoft][ODBC Driver Manager] Invalid string or buffer length (0) (SQLExecDirectW)')
    #
    if platform.system() == "Windows" and cmd.startswith("azdata sql query"):
        cmd = cmd.replace("\n", " ")

    # shlex.split is required on bash and for Windows paths with spaces
    #
    cmd_actual = shlex.split(cmd)

    # Store this (i.e. kubectl, python etc.) to support binary context aware error_hints and retries
    #
    user_provided_exe_name = cmd_actual[0].lower()

    # When running python, use the python in the ADS sandbox ({sys.executable})
    #
    if cmd.startswith("python "):
        cmd_actual[0] = cmd_actual[0].replace("python", sys.executable)

        # On Mac, when ADS is not launched from terminal, LC_ALL may not be set, which causes pip installs to fail
        # with:
        #
        #    UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 4969: ordinal not in range(128)
        #
        # Setting it to a default value of "en_US.UTF-8" enables pip install to complete
        #
        if platform.system() == "Darwin" and "LC_ALL" not in os.environ:
            os.environ["LC_ALL"] = "en_US.UTF-8"

    # When running `kubectl`, if AZDATA_OPENSHIFT is set, use `oc`
    #
    if cmd.startswith("kubectl ") and "AZDATA_OPENSHIFT" in os.environ:
        cmd_actual[0] = cmd_actual[0].replace("kubectl", "oc")

    # To aid supportability, determine which binary file will actually be executed on the machine
    #
    which_binary = None

    # Special case for CURL on Windows.  The version of CURL in Windows System32 does not work to
    # get JWT tokens, it returns "(56) Failure when receiving data from the peer".  If another instance
    # of CURL exists on the machine use that one.  (Unfortunately the curl.exe in System32 is almost
    # always the first curl.exe in the path, and it can't be uninstalled from System32, so here we
    # look for the 2nd installation of CURL in the path)
    if platform.system() == "Windows" and cmd.startswith("curl "):
        path = os.getenv('PATH')
        for p in path.split(os.path.pathsep):
            p = os.path.join(p, "curl.exe")
            if os.path.exists(p) and os.access(p, os.X_OK):
                if p.lower().find("system32") == -1:
                    cmd_actual[0] = p
                    which_binary = p
                    break

    # Find the path based location (shutil.which) of the executable that will be run (and display it to aid supportability), this
    # seems to be required for .msi installs of azdata.cmd/az.cmd.  (otherwise Popen returns FileNotFound) 
    #
    # NOTE: Bash needs cmd to be the list of the space separated values hence shlex.split.
    #
    if which_binary == None:
        which_binary = shutil.which(cmd_actual[0])

    if which_binary == None:
        if user_provided_exe_name in install_hint and install_hint[user_provided_exe_name] is not None:
            display(Markdown(f'HINT: Use [{install_hint[user_provided_exe_name][0]}]({install_hint[user_provided_exe_name][1]}) to resolve this issue.'))

        raise FileNotFoundError(f"Executable '{cmd_actual[0]}' not found in path (where/which)")
    else:   
        cmd_actual[0] = which_binary

    start_time = datetime.datetime.now().replace(microsecond=0)

    print(f"START: {cmd} @ {start_time} ({datetime.datetime.utcnow().replace(microsecond=0)} UTC)")
    print(f"       using: {which_binary} ({platform.system()} {platform.release()} on {platform.machine()})")
    print(f"       cwd: {os.getcwd()}")

    # Command-line tools such as CURL and AZDATA HDFS commands output
    # scrolling progress bars, which causes Jupyter to hang forever, to
    # workaround this, use no_output=True
    #

    # Work around an infinite hang when a notebook generates a non-zero return code, break out, and do not wait
    #
    wait = True 

    try:
        if no_output:
            p = Popen(cmd_actual)
        else:
            p = Popen(cmd_actual, stdout=PIPE, stderr=PIPE, bufsize=1)
            with p.stdout:
                for line in iter(p.stdout.readline, b''):
                    line = line.decode()
                    if return_output:
                        output = output + line
                    else:
                        if cmd.startswith("azdata notebook run"): # Hyperlink the .ipynb file
                            regex = re.compile(r'  "(.*)": "(.*)"') # raw string avoids the invalid "\:" escape warning
                            match = regex.match(line)
                            if match:
                                if match.group(1).find("HTML") != -1:
                                    display(Markdown(f' - "{match.group(1)}": "{match.group(2)}"'))
                                else:
                                    display(Markdown(f' - "{match.group(1)}": "[{match.group(2)}]({match.group(2)})"'))

                                    wait = False
                                    break # otherwise infinite hang, have not worked out why yet.
                        else:
                            print(line, end='')
                            if rules is not None:
                                apply_expert_rules(line)

        if wait:
            p.wait()
    except FileNotFoundError as e:
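        # install_hint is a dict keyed by executable name, so look up the entry for this executable
        if user_provided_exe_name in install_hint and install_hint[user_provided_exe_name] is not None:
            display(Markdown(f'HINT: Use [{install_hint[user_provided_exe_name][0]}]({install_hint[user_provided_exe_name][1]}) to resolve this issue.'))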

        raise FileNotFoundError(f"Executable '{cmd_actual[0]}' not found in path (where/which)") from e

    exit_code_workaround = 0 # WORKAROUND: azdata hangs on exception from notebook on p.wait()

    if not no_output:
        for line in iter(p.stderr.readline, b''):
            try:
                line_decoded = line.decode()
            except UnicodeDecodeError:
                # NOTE: Sometimes we get characters back that cannot be decoded(), e.g.
                #
                #   \xa0
                #
                # For example see this in the response from `az group create`:
                #
                # ERROR: Get Token request returned http error: 400 and server 
                # response: {"error":"invalid_grant",# "error_description":"AADSTS700082: 
                # The refresh token has expired due to inactivity.\xa0The token was 
                # issued on 2018-10-25T23:35:11.9832872Z
                #
                # which generates the exception:
                #
                # UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 179: invalid start byte
                #
                print("WARNING: Unable to decode stderr line, printing raw bytes:")
                print(line)
                line_decoded = ""
                pass
            else:

                # azdata emits a single empty line to stderr when doing an hdfs cp, don't
                # print this empty "ERR:" as it confuses.
                #
                if line_decoded == "":
                    continue
                
                print(f"STDERR: {line_decoded}", end='')

                if line_decoded.startswith("An exception has occurred") or line_decoded.startswith("ERROR: An error occurred while executing the following cell"):
                    exit_code_workaround = 1

                # inject HINTs to next TSG/SOP based on output in stderr
                #
                if user_provided_exe_name in error_hints:
                    for error_hint in error_hints[user_provided_exe_name]:
                        if line_decoded.find(error_hint[0]) != -1:
                            display(Markdown(f'HINT: Use [{error_hint[1]}]({error_hint[2]}) to resolve this issue.'))

                # apply expert rules (to run follow-on notebooks), based on output
                #
                if rules is not None:
                    apply_expert_rules(line_decoded)

                # Verify if a transient error, if so automatically retry (recursive)
                #
                if user_provided_exe_name in retry_hints:
                    for retry_hint in retry_hints[user_provided_exe_name]:
                        if line_decoded.find(retry_hint) != -1:
                            if retry_count < MAX_RETRIES:
                                print(f"RETRY: {retry_count} (due to: {retry_hint})")
                                retry_count = retry_count + 1
                                output = run(cmd, return_output=return_output, retry_count=retry_count)

                                if return_output:
                                    return output
                                else:
                                    return

    elapsed = datetime.datetime.now().replace(microsecond=0) - start_time

    # WORKAROUND: We avoid infinite hang above in the `azdata notebook run` failure case, by inferring success (from stdout output), so
    # don't wait here, if success known above
    #
    if wait: 
        if p.returncode != 0:
            raise SystemExit(f'Shell command:\n\n\t{cmd} ({elapsed}s elapsed)\n\nreturned non-zero exit code: {str(p.returncode)}.\n')
    else:
        if exit_code_workaround !=0 :
            raise SystemExit(f'Shell command:\n\n\t{cmd} ({elapsed}s elapsed)\n\nreturned non-zero exit code: {str(exit_code_workaround)}.\n')

    print(f'\nSUCCESS: {elapsed}s elapsed.\n')

    if return_output:
        return output

def load_json(filename):
    """Load a json file from disk and return the contents"""

    with open(filename, encoding="utf8") as json_file:
        return json.load(json_file)

def load_rules():
    """Load any 'expert rules' from the metadata of this notebook (.ipynb) that should be applied to the stderr of the running executable"""

    # Load this notebook as json to get access to the expert rules in the notebook metadata.
    #
    try:
        j = load_json("run001-run-notebook.ipynb")
    except:
        pass # If the user has renamed the book, we can't load ourself.  NOTE: Is there a way in Jupyter, to know your own filename?
    else:
        if "metadata" in j and \
            "azdata" in j["metadata"] and \
            "expert" in j["metadata"]["azdata"] and \
            "expanded_rules" in j["metadata"]["azdata"]["expert"]:

            rules = j["metadata"]["azdata"]["expert"]["expanded_rules"]

            rules.sort() # Sort rules, so they run in priority order (the [0] element).  Lowest value first.

            # print (f"EXPERT: There are {len(rules)} rules to evaluate.")

            return rules

def apply_expert_rules(line):
    """Determine if the stderr line passed in, matches the regular expressions for any of the 'expert rules', if so
    inject a 'HINT' to the follow-on SOP/TSG to run"""

    global rules

    for rule in rules:
        notebook = rule[1]
        cell_type = rule[2]
        output_type = rule[3] # i.e. stream or error
        output_type_name = rule[4] # i.e. ename or name 
        output_type_value = rule[5] # i.e. SystemExit or stdout
        details_name = rule[6]  # i.e. evalue or text 
        expression = rule[7].replace("\\*", "*") # Something escaped *, and put a \ in front of it!

        if debug_logging:
            print(f"EXPERT: If rule '{expression}' satisfied, run '{notebook}'.")

        if re.match(expression, line, re.DOTALL):

            if debug_logging:
                print("EXPERT: MATCH: name = value: '{0}' = '{1}' matched expression '{2}', therefore HINT '{3}'".format(output_type_name, output_type_value, expression, notebook))

            match_found = True

            display(Markdown(f'HINT: Use [{notebook}]({notebook}) to resolve this issue.'))




print('Common functions defined successfully.')

# Hints for binary (transient fault) retry, (known) error and install guide
#
retry_hints = {'kubectl': ['A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond'], 'azdata': ['Endpoint sql-server-master does not exist', 'Endpoint livy does not exist', 'Failed to get state for cluster', 'Endpoint webhdfs does not exist', 'Adaptive Server is unavailable or does not exist', 'Error: Address already in use']}
error_hints = {'kubectl': [['no such host', 'TSG010 - Get configuration contexts', '../monitor-k8s/tsg010-get-kubernetes-contexts.ipynb'], ['No connection could be made because the target machine actively refused it', 'TSG056 - Kubectl fails with No connection could be made because the target machine actively refused it', '../repair/tsg056-kubectl-no-connection-could-be-made.ipynb']], 'azdata': [['azdata login', 'SOP028 - azdata login', '../common/sop028-azdata-login.ipynb'], ['The token is expired', 'SOP028 - azdata login', '../common/sop028-azdata-login.ipynb'], ['Reason: Unauthorized', 'SOP028 - azdata login', '../common/sop028-azdata-login.ipynb'], ['Max retries exceeded with url: /api/v1/bdc/endpoints', 'SOP028 - azdata login', '../common/sop028-azdata-login.ipynb'], ['Look at the controller logs for more details', 'TSG027 - Observe cluster deployment', '../diagnose/tsg027-observe-bdc-create.ipynb'], ['provided port is already allocated', 'TSG062 - Get tail of all previous container logs for pods in BDC namespace', '../log-files/tsg062-tail-bdc-previous-container-logs.ipynb'], ['Create cluster failed since the existing namespace', 'SOP061 - Delete a big data cluster', '../install/sop061-delete-bdc.ipynb'], ['Failed to complete kube config setup', 'TSG067 - Failed to complete kube config setup', '../repair/tsg067-failed-to-complete-kube-config-setup.ipynb'], ['Error processing command: "ApiError', 'TSG110 - Azdata returns ApiError', '../repair/tsg110-azdata-returns-apierror.ipynb'], ['Error processing command: "ControllerError', 'TSG036 - Controller logs', '../log-analyzers/tsg036-get-controller-logs.ipynb'], ['ERROR: 500', 'TSG046 - Knox gateway logs', '../log-analyzers/tsg046-get-knox-logs.ipynb'], ['Data source name not found and no default driver specified', 'SOP069 - Install ODBC for SQL Server', '../install/sop069-install-odbc-driver-for-sql-server.ipynb'], ["Can't open lib 'ODBC Driver 17 for SQL Server", 'SOP069 - Install ODBC for SQL Server', 
'../install/sop069-install-odbc-driver-for-sql-server.ipynb'], ['Control plane upgrade failed. Failed to upgrade controller.', 'TSG108 - View the controller upgrade config map', '../diagnose/tsg108-controller-failed-to-upgrade.ipynb']]}
install_hint = {'kubectl': ['SOP036 - Install kubectl command line interface', '../install/sop036-install-kubectl.ipynb'], 'azdata': ['SOP063 - Install azdata CLI (using package manager)', '../install/sop063-packman-install-azdata.ipynb']}

### Is notebook being run inside a Kubernetes cluster

When this notebook is running inside a Kubernetes cluster, such as
when running inside an App-Deploy pod, there is no KUBECONFIG present,
therefore azdata login needs to use the -e (endpoint) approach to login.
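
The check relies on the `KUBERNETES_SERVICE_HOST` and
`KUBERNETES_SERVICE_PORT` variables that Kubernetes sets inside pods. A
minimal sketch of the same test, written as a function over an arbitrary
mapping so it can be tried without a cluster (the function name
`is_inside_kubernetes` is hypothetical):

```python
def is_inside_kubernetes(env):
    """Return True when both in-cluster service variables are present.

    `env` is any mapping of environment variable names to values, for
    example `os.environ`. Hypothetical helper for illustration only.
    """
    return "KUBERNETES_SERVICE_PORT" in env and "KUBERNETES_SERVICE_HOST" in env


# Sample environments; the address values are made up.
in_pod = {"KUBERNETES_SERVICE_HOST": "10.0.0.1", "KUBERNETES_SERVICE_PORT": "443"}
on_laptop = {"PATH": "/usr/bin"}

print(is_inside_kubernetes(in_pod))
print(is_inside_kubernetes(on_laptop))
```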

In [None]:
import os

if "KUBERNETES_SERVICE_PORT" in os.environ and "KUBERNETES_SERVICE_HOST" in os.environ:
    inside_kubernetes_cluster = True
else:
    inside_kubernetes_cluster = False

### Notebook filename management

The notebook to be run has many filenames over the life of it being run.
Here a class is defined to abstract the mechanics of these filenames.
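
As a condensed illustration of the naming scheme, the sketch below
derives the temporary and final filenames from one input path. The
function `derive_names` and the `out` directory are hypothetical; the
`modified-`, `output-modified-` and `output-` prefixes are the ones the
class uses.

```python
import os


def derive_names(full_path, output_dir):
    """Return the filenames a notebook passes through while being run.

    Hypothetical helper: a simplified view of the naming scheme only.
    """
    base = os.path.basename(full_path)
    output = os.path.join(output_dir, "output-" + base)
    return {
        "temp_input": os.path.join(output_dir, "modified-" + base),
        "temp_output": os.path.join(output_dir, "output-modified-" + base),
        "output": output,
        "output_html": output.replace(".ipynb", ".html"),
    }


names = derive_names(os.path.join("notebooks", "run505a-sample-notebook.ipynb"), "out")
for key, value in names.items():
    print(f"{key}: {value}")
```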

In [None]:
import os

class NotebookPath:

    _full_path = None

    _azdata_logging_path = None

    _temp_input_full_path = None

    _output_filename = None
    _output_dir = None
    _output_full_path = None

    _step = None

    def __init__(self, full_path, step):
        self._full_path = full_path
        self._output_filename = f"output-{os.path.basename(full_path)}"
        self._output_dir = os.getcwd()
        self._output_full_path = os.path.join(self._output_dir, self._output_filename)
        self._azdata_logging_path = os.path.join(self._output_dir, "tmp", "azdata_log-" + os.path.basename(full_path)[:-6])
        self._temp_input_full_path = os.path.join(self._output_dir, "modified-" + os.path.basename(full_path))
        self._temp_output_full_path = os.path.join(self._output_dir, "output-modified-" + os.path.basename(full_path))

        # If this notebook name starts with "step", then we are on a new step, so get the number
        #
        if os.path.basename(full_path).startswith("step"):
            self._step = os.path.basename(full_path)[4:7] # i.e. "001"
        else:
            self._step = step

    @property
    def full_path(self):
        return self._full_path.replace('\\', '\\\\')

    @property
    def step(self):
        return self._step

    @property
    def azdata_logging_path(self):
        """To allow for concurrent execution of notebooks, ensure each notebook
           has its own azdata.log file
        """
        return self._azdata_logging_path.replace('\\', '\\\\')

    @property
    def temp_input_full_path(self):
        """The filename of the temporary file created to run the notebook (which may be
           a modified version of the original notebook, to enable authentication)
        """
        return self._temp_input_full_path.replace('\\', '\\\\')

    @property
    def temp_output_full_path(self):
        return self._temp_output_full_path.replace('\\', '\\\\')

    @property
    def temp_output_full_path_html(self):
        return self._temp_output_full_path.replace(".ipynb", ".html").replace('\\', '\\\\')

    @property
    def output_dir(self):
        return self._output_dir.replace("\\", "\\\\")

    @property
    def output_full_path(self):
        """The output filename, that is saved/uploaded to the Big Data Cluster
        """ 
        return self._output_full_path.replace('\\', '\\\\')

    @property
    def output_full_path_html(self):
        return self._output_full_path.replace(".ipynb", ".html").replace('\\', '\\\\')

    @property
    def output_full_path_to_pass_as_azdata_arg(self):
        r"""When passing --arguments to azdata, an extra level of \ escaping is needed
        """
        return self.output_full_path.replace('\\', '\\\\')

input_notebook = NotebookPath(notebook_path, step)

print(f"full_path: {input_notebook.full_path}")
print(f"output_dir: {input_notebook.output_dir}")
print(f"output_full_path: {input_notebook.output_full_path}")
print(f"output_full_path_to_pass_as_azdata_arg: {input_notebook.output_full_path_to_pass_as_azdata_arg}")

### Set the `azdata` logging directory

To support running multiple notebooks at the same time, each run writes
its azdata.log to a separate directory. This code is placed here so that
it runs after the ‘injected parameters’ (which may change the
app\_name/app\_version).

In [None]:
os.environ["AZDATA_LOGGING_LOG_DIR"] = input_notebook.azdata_logging_path

print("Set AZDATA_LOGGING_LOG_DIR: " + input_notebook.azdata_logging_path)

### Notebook JSON management

To run the notebook, the metadata of the notebook is inspected to make
decisions on what kernel type is required, what internal parameter
values are needed at execution time, and what follow-on expert rules
need to be evaluated.

Here a class is defined to abstract the mechanics of inspecting the
Notebook JSON.
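
For orientation, the sketch below reads the kernel name and the optional
timeout from a hand-built notebook dictionary. The function
`kernel_and_timeout` and the sample values are hypothetical; the
metadata keys are the ones the class reads.

```python
DEFAULT_TIMEOUT = 600  # seconds, same value as NOTEBOOK_CELL_TIMEOUT above


def kernel_and_timeout(nb):
    """Return (kernel name in lower case, per-cell timeout in seconds).

    Hypothetical helper: falls back to DEFAULT_TIMEOUT when the notebook
    has no `metadata.azdata.timeout` entry.
    """
    metadata = nb["metadata"]
    kernel = metadata["kernelspec"]["name"].lower()
    timeout = int(metadata.get("azdata", {}).get("timeout", DEFAULT_TIMEOUT))
    return kernel, timeout


# Hand-built samples, not real notebooks.
with_timeout = {"metadata": {"kernelspec": {"name": "SQL"}, "azdata": {"timeout": "120"}}}
without_timeout = {"metadata": {"kernelspec": {"name": "pysparkkernel"}}}

print(kernel_and_timeout(with_timeout))
print(kernel_and_timeout(without_timeout))
```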

In [None]:
import json

class NotebookJson:

    _json = None

    def __init__(self, full_path):
        self._json = NotebookJson._load_json(full_path)

    @property
    def kernel_name(self):
        return self._json["metadata"]["kernelspec"]["name"].lower()

    @property
    def timeout(self):
        if "azdata" in self._json["metadata"]:
            if "timeout" in self._json["metadata"]["azdata"]:
                return int(self._json["metadata"]["azdata"]["timeout"])
            else:
                return NOTEBOOK_CELL_TIMEOUT
        else:
            return NOTEBOOK_CELL_TIMEOUT

    @property
    def internal_parameters(self):
        cmdline_args = ""

        if "azdata" in self._json["metadata"]:
            if "internal" in self._json["metadata"]["azdata"]:
                if "parameters" in self._json["metadata"]["azdata"]["internal"]:
                    parameters = self._json["metadata"]["azdata"]["internal"]["parameters"]

                    cmdline_args = str(parameters).replace("'", '\\"') # Windows cmd line, requires ", not '

        if cmdline_args != "":
            cmdline_args = '--arguments "' + cmdline_args + '"'

        return cmdline_args

    @property
    def json(self):
        return self._json

    @property
    def has_expert_rules(self):
        if "metadata" in self._json and \
            "azdata" in self._json["metadata"] and \
            "expert" in self._json["metadata"]["azdata"] and \
            "expanded_rules" in self._json["metadata"]["azdata"]["expert"]:

            rules = self._json["metadata"]["azdata"]["expert"]["expanded_rules"]

            return len(rules) > 0
        else:
            return False

    def save_as(self, filename):
        NotebookJson._save_json(filename, self._json)

    @staticmethod
    def _load_json(filename):
        with open(filename, encoding="utf8") as json_file:
            return json.load(json_file)

    @staticmethod
    def _save_json(filename, contents):
        with open(filename, 'w', encoding="utf8") as outfile:
            json.dump(contents, outfile, indent=4)

input_json = NotebookJson(input_notebook.full_path)

print(f"Kernel type: {input_json.kernel_name}")

### Run a SQL Kernel notebook

Notebooks that use the SQL kernel cannot be run using
`azdata notebook run`, therefore the PowerShell cmdlet
`Invoke-SqlNotebook` is used.

Here the command line to execute a SQL Kernel notebook is built up:

In [None]:
def get_sql_kernel_cmd_line(input_file, output_file):
    exit_code = 0

    if inside_kubernetes_cluster:
        sql_server_master_endpoint = "master-p-svc,1433"
    else:
        endpoint = run('azdata bdc endpoint list --endpoint="sql-server-master"', return_output=True)
        endpoint = json.loads(endpoint)
        sql_server_master_endpoint = endpoint['endpoint']

    print (f"The sql-server-master endpoint: {sql_server_master_endpoint}")

    if platform.system() == "Windows":
        powershell_cmd = "powershell"
    else:
        powershell_cmd = "pwsh" # on Linux powershell is called 'pwsh'!

    return f"""{powershell_cmd} -ExecutionPolicy Bypass -Command "Invoke-SqlNotebook -InputFile {input_file} -ServerInstance \\\"{sql_server_master_endpoint}\\\" -Username {sql_master_pool_username} -Password {sql_master_pool_password} -Force -OutputFile {output_file}" """

print("Function `get_sql_kernel_cmd_line` defined")

### Check a SQL Kernel notebook for errors

The SQL Kernel does not return a non-zero exit code on cell error,
therefore the notebook output will be inspected to look for an
`output_type` of `error` and will print the `evalue` and will return a
non-zero exit\_code on error.
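
To make the inspected shape concrete, the sketch below collects the
`evalue` of every `error` output from a hand-built notebook dictionary.
The function `error_values` and the sample error text are made up for
illustration.

```python
def error_values(nb):
    """Return the `evalue` of each error output found in code cells.

    Hypothetical helper: an empty list means no cell reported an error.
    """
    values = []
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue
        for output in cell.get("outputs", []):
            if output["output_type"] == "error":
                values.append(output["evalue"])
    return values


# Hand-built sample with one failing cell; the message is invented.
sample = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "outputs": [
            {"output_type": "error", "ename": "", "evalue": "sample failure", "traceback": []},
        ]},
    ]
}

errors = error_values(sample)
exit_code = 1 if errors else 0
print(errors, exit_code)
```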

In [None]:
def check_sql_kernel_for_error(j):
    exit_code = 0

    for cell in j["cells"]:
        if cell["cell_type"] == "code":
            if "outputs" in cell:
                for output in cell["outputs"]:
                    if output["output_type"] == "error":
                        print(output["evalue"])
                        exit_code = 1
                        break

    return exit_code

print("Function `check_sql_kernel_for_error` defined")

### Inject authentication for PySpark/Scala kernel based notebooks

If the notebook to run is of the PySpark or Scala kernel type, a code
cell is added to perform the authenticated connection
(\_do\_not\_call\_change\_endpoint)

NOTE: Long term, `azdata notebook run` should do this, at which time
this code can be removed

NOTE: ADS stamps the notebook kernel name as “pyspark3kernel”, which
does not work in `azdata notebook run`, which is expecting the name
“pysparkkernel”, and will result in the error:

    jupyter_client.kernelspec.NoSuchKernel: No such kernel named pyspark3kernel

The following command installs the pysparkkernel kernel in an
app-deploy container:

    /opt/azdata/bin/python3 /opt/azdata/bin/jupyter-kernelspec install --user /opt/azdata/lib/python3.6/site-packages/sparkmagic/kernels/pysparkkernel

Here we set the kernel name back to “pysparkkernel”
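
The two notebook edits described here, renaming the kernel and placing a
new code cell ahead of the first existing code cell, can be sketched on
a hand-built notebook dictionary. The function `prepare_spark_notebook`
and the placeholder magic line are hypothetical.

```python
def prepare_spark_notebook(nb, connect_line):
    """Rename the ADS kernel and insert `connect_line` before the first code cell.

    Hypothetical helper; returns the index at which the cell was inserted.
    """
    if nb["metadata"]["kernelspec"]["name"] == "pyspark3kernel":
        nb["metadata"]["kernelspec"]["name"] = "pysparkkernel"

    cells = nb["cells"]
    # Index of the first code cell, or the end of the list if there is none.
    position = next(
        (i for i, cell in enumerate(cells) if cell["cell_type"] == "code"),
        len(cells),
    )
    cells.insert(position, {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": [connect_line],
    })
    return position


sample = {
    "metadata": {"kernelspec": {"name": "pyspark3kernel"}},
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["spark.version"], "outputs": []},
    ],
}

position = prepare_spark_notebook(sample, "%placeholder_connect_magic")
print(position, sample["metadata"]["kernelspec"]["name"])
```

Inserting before the first code cell means the connection is made before
any user code runs, while leading markdown cells stay at the top.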

In [None]:
def inject_cell_to_perform_connection(kernel_name, j):

    if kernel_name == "pyspark3kernel":
        j["metadata"]["kernelspec"]["name"] = "pysparkkernel"

    insert_position = 0
    for cell in j["cells"]:
        if cell["cell_type"] == "code":
            break
        insert_position += 1

    if inside_kubernetes_cluster:
        set_endpoint_cmd = "%_do_not_call_change_endpoint --server=https://gateway-svc:8443/gateway/default/livy/v1 "
    else:
        endpoint = run('azdata bdc endpoint list --endpoint="webhdfs"', return_output=True)
        endpoint = json.loads(endpoint)
        set_endpoint_cmd = endpoint['endpoint']

    if knox_password == "":
        raise SystemExit(f"knox_password must be provided for Basic_Access when using kernel type: {kernel_name}")

    if "AZDATA_AD_AUTH" in os.environ:
        if knox_username == "":
            raise SystemExit(f"knox_username must be provided for AD Auth when using kernel type: {kernel_name}")

        if knox_user_domain == "":
            raise SystemExit(f"knox_user_domain must be provided for AD Auth when using kernel type: {kernel_name}")

        # Use the ! command to do the kinit (the run function doesn't work with the "echo |")
        #
        !echo {knox_password} | kinit {knox_username}@{knox_user_domain}
        set_endpoint_cmd += "--auth=Kerberos"
    else:
        set_endpoint_cmd += f"--auth=Basic_Access --username=root --password={knox_password}"

    j["cells"].insert(insert_position, {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": [ set_endpoint_cmd ]
        })

print("Function `inject_cell_to_perform_connection` defined")

### Check Spark (Scala/PySpark) kernel output for error

The Spark (Scala/PySpark) kernels do not return a non-zero exit code on
cell error, therefore look for an `output_type` of `stream` with a
`name` of `stderr`, and print the `text` and return a non-zero exit\_code.
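
In the notebook format, the `text` of a stream output may be a single
string or a list of strings. The sketch below gathers the stderr text
from a hand-built notebook dictionary and handles both forms. The
function `stderr_text` and the sample message are made up for
illustration.

```python
def stderr_text(nb):
    """Return the text of every stderr stream output found in code cells.

    Hypothetical helper: "".join() accepts both a list of strings and a
    single string, so either form of `text` is handled.
    """
    chunks = []
    for cell in nb["cells"]:
        if cell["cell_type"] != "code":
            continue
        for output in cell.get("outputs", []):
            if output["output_type"] == "stream" and output.get("name") == "stderr":
                chunks.append("".join(output["text"]))
    return chunks


# Hand-built sample; the error message is invented.
sample = {
    "cells": [
        {"cell_type": "code", "outputs": [
            {"output_type": "stream", "name": "stdout", "text": ["ok\n"]},
            {"output_type": "stream", "name": "stderr", "text": ["line 1\n", "line 2\n"]},
        ]},
    ]
}

found = stderr_text(sample)
exit_code = 1 if found else 0
print(found, exit_code)
```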

In [None]:
def check_for_error(j):
    exit_code = 0

    for cell in j["cells"]:
        if cell["cell_type"] == "code":
            if "outputs" in cell:
                for output in cell["outputs"]:
                    if output["output_type"] == "stream" and output["name"] == "stderr":
                        print("".join(output["text"])) # `text` may be a string or a list of strings
                        exit_code = 1
                        break

    return exit_code

print("Function `check_for_error` defined")

### Run the notebook

Run the notebook, and raise an exception if a non-zero exit code is
returned. The caller should catch the exception (SystemExit) and
preserve this notebook output for offline inspection.

NOTE: Different ‘kernel’ types need different execution environments
(e.g. the SQL kernel runs via `Invoke-SqlNotebook` (PowerShell); most
others run via `azdata notebook run`).
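
The dispatch on kernel type can be summarised in a small sketch. The
function `choose_executor` is hypothetical; the kernel names are the
ones tested in the cell below, and `python3` is only an example of a
kernel that needs no injected connection cell.

```python
SPARK_KERNELS = {"pyspark3kernel", "pysparkkernel", "sparkkernel"}


def choose_executor(kernel_name):
    """Return (executor, needs_connection_cell) for a kernel name.

    Hypothetical helper: the SQL kernel goes to Invoke-SqlNotebook and
    everything else to azdata; only Spark kernels need the injected
    authentication cell.
    """
    if kernel_name == "sql":
        return "Invoke-SqlNotebook", False
    return "azdata notebook run", kernel_name in SPARK_KERNELS


for name in ["sql", "pysparkkernel", "python3"]:
    print(name, choose_executor(name))
```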

In [None]:
from shutil import move, rmtree

exit_code = 0

if input_json.kernel_name == "sql":
    cmd_line = get_sql_kernel_cmd_line(input_notebook.full_path, input_notebook.output_full_path)
else:
    # Server side kernels need to have a cell injected to perform auth
    #
    if input_json.kernel_name in ["pyspark3kernel", "pysparkkernel", "sparkkernel"]:
        inject_cell_to_perform_connection(input_json.kernel_name, input_json.json)

    # Save the now-modified notebook as a temporary file, to preserve the original
    #
    input_json.save_as(input_notebook.temp_input_full_path)

    cmd_line = "azdata notebook run --path {0} --output-html --output-path {1} --timeout {2} {3}".format(
            input_notebook.temp_input_full_path, 
            input_notebook.output_dir, 
            input_json.timeout, 
            input_json.internal_parameters)

start = datetime.datetime.utcnow()

try:
    run(cmd_line)
except SystemExit as ex:
    print(ex)
    exit_code = 1

end = datetime.datetime.utcnow()

# Delete the modified input file (it is no longer needed)
#
if os.path.exists(input_notebook.temp_input_full_path):
    os.remove(input_notebook.temp_input_full_path)

# Rename the output-modified-*.ipynb file to output-* (which is the expected name)
#
if os.path.exists(input_notebook.temp_output_full_path):
    move(input_notebook.temp_output_full_path, input_notebook.output_full_path)

# Rename the output-modified-*.html file to output-* (which is the expected name)
#
if os.path.exists(input_notebook.temp_output_full_path_html):
    move(input_notebook.temp_output_full_path_html, input_notebook.output_full_path_html)

# Some kernels don't check for errors, so do that here and return non-zero exit code
#
if input_json.kernel_name == "sql" and exit_code == 0:
    output_json = NotebookJson(input_notebook.output_full_path)
    exit_code = check_sql_kernel_for_error(output_json.json)

if input_json.kernel_name in ["pyspark3kernel", "pysparkkernel", "sparkkernel"] and exit_code == 0:
    output_json = NotebookJson(input_notebook.output_full_path)
    exit_code = check_for_error(output_json.json)

output_json = None
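
The pattern used in this cell, where a `SystemExit` is caught and turned into an exit code while the start and end times are recorded, can be sketched in isolation. The `run` function below is a hypothetical stand-in that always fails (the real `run` helper is defined earlier in this notebook), and `run_and_time` is a name invented for this example. The sketch uses timezone-aware timestamps rather than `datetime.datetime.utcnow()`.

```python
import datetime

def run(cmd_line):
    """Hypothetical stand-in for this notebook's `run` helper: always fails."""
    raise SystemExit(f"{cmd_line}: returned non-zero exit code: 1.")

def run_and_time(cmd_line):
    """Run a command and return (exit_code, start, end) instead of raising."""
    exit_code = 0
    start = datetime.datetime.now(datetime.timezone.utc)

    try:
        run(cmd_line)
    except SystemExit as ex:
        # Keep going so that clean-up and result recording still happen
        print(ex)
        exit_code = 1

    end = datetime.datetime.now(datetime.timezone.utc)
    return exit_code, start, end

exit_code, start, end = run_and_time("hypothetical-command")
print(exit_code)
```

Deferring the failure this way lets the later cells rename output files and save metrics before the exception is finally raised.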

### Record the results

Save the notebook .ipynb/.html output files to the Storage Pool (HDFS),
and record the metrics in the Master Pool (‘runner’ database).

In [None]:
from shutil import copyfile

# If this is the first notebook to be called in the chain (which can be caused by expert rules), then
# set the base name
#
if base_notebook_name == "":
    base_notebook_name = os.path.basename(notebook_path).replace(".ipynb", "")

if save_results_in_storage_pool == "True":
    print("save_results_in_storage_pool: True")

    args = { 
        "session_start": str(session_start), 
        "notebook_path": input_notebook.output_full_path_to_pass_as_azdata_arg, 
        "step": input_notebook.step,
        "app_name": app_name,
        "app_version": app_version,
        "exit_code": str(exit_code),
        "base_notebook_name": base_notebook_name,
        "parent_notebook_name": parent_notebook_name,
        "start": str(start), 
        "end": str(end)
    }

    args = str(args).replace("'", '\\"') # Windows cmd line, requires ", not '

    # Create a copy of run002, so the results for each notebook are separated
    #
    run002_copy_full_path = "run002-save-result-in-bdc-" + os.path.basename(notebook_path)

    # In the app-deploy app folder, the .ipynb files are flattened into one folder; in the book, they are in folders.
    #
    if os.path.exists("run002-save-result-in-bdc.ipynb"):
        copyfile("run002-save-result-in-bdc.ipynb", run002_copy_full_path)
    else:
        copyfile(os.path.join("..", "notebook-runner", "run002-save-result-in-bdc.ipynb"), run002_copy_full_path)
    
    run('azdata notebook run --path {0} --output-html --output-path {1} --timeout {2} --arguments "{3}"'.format(
        run002_copy_full_path,
        input_notebook.output_dir,
        input_json.timeout,
        args))

    os.remove(run002_copy_full_path)

else:
    print("save_results_in_storage_pool: False")
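
As a small illustration of the quoting step above, the sketch below shows what the converted string looks like for a hypothetical set of arguments, and compares it with a JSON-based alternative. The argument values are made up for this example; the `json.dumps` variant is an alternative sketch, not what this notebook does.

```python
import json

# Hypothetical argument values; all values are strings, as in this notebook
args = {"app_name": "my_app", "app_version": "v1", "exit_code": "0"}

# Same conversion as in the cell above: single quotes become escaped double quotes
escaped = str(args).replace("'", '\\"')
print(escaped)

# Alternative sketch: serialize as JSON first, then escape the double quotes
escaped_json = json.dumps(args).replace('"', '\\"')
print(escaped_json)
```

For simple string values the two forms are identical. They differ once a value contains an apostrophe, because Python then switches the `str()` representation of that value to double quotes, which the single-quote replacement does not handle.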

### Run expert rules

If this notebook contains expert rules, run any follow-on notebooks
(e.g. SOPs/TSGs) whose expert rule expressions match the output of this
notebook.

In [None]:
if input_json.has_expert_rules:
    print("Running expert rules")

    args = { 
        "session_start": str(session_start), 
        "notebook_path": input_notebook.output_full_path_to_pass_as_azdata_arg,
        "step": input_notebook.step,
        "base_notebook_name": base_notebook_name,
        "parent_notebook_name": os.path.basename(notebook_path).replace(".ipynb", ""),
        "app_name": app_name,
        "app_version": app_version,
        "save_results_in_storage_pool": save_results_in_storage_pool
    }

    args = str(args).replace("'", '\\"') # Windows cmd line, requires ", not '

    # Create a copy of run003, so the results for each notebook are separated
    #
    run003_copy_full_path = "run003-run-expert-rules-" + os.path.basename(notebook_path)

    # In the app-deploy app folder, the .ipynb files are flattened into one folder; in the book, they are in folders.
    #
    if os.path.exists("run003-run-expert-rules.ipynb"):
        copyfile("run003-run-expert-rules.ipynb", run003_copy_full_path)
    else:
        copyfile(os.path.join("..", "notebook-runner", "run003-run-expert-rules.ipynb"), run003_copy_full_path)
  
    run('azdata notebook run --path {0} --output-html --output-path {1} --timeout {2} --arguments "{3}"'.format(
        run003_copy_full_path,
        input_notebook.output_dir, 
        input_json.timeout,
        args))

    os.remove(run003_copy_full_path)

else:
    print("Notebook has no expert rules")
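
The copy-with-fallback step used for both RUN002 and RUN003 could be factored into a helper along the following lines. This is a sketch under assumptions: the function name `copy_helper_notebook`, its parameters, and the file names used in the demonstration are hypothetical, and the demonstration runs in a temporary directory rather than in the real notebook folders.

```python
import os
import tempfile
from shutil import copyfile

def copy_helper_notebook(helper_name, notebook_path, search_dirs, dest_dir):
    """Copy `helper_name` from the first directory in `search_dirs` that has it.

    The copy is named after the notebook being processed, so that each
    run of the helper writes its own output files.
    """
    stem = helper_name.replace(".ipynb", "")
    copy_path = os.path.join(dest_dir, stem + "-" + os.path.basename(notebook_path))

    for directory in search_dirs:
        candidate = os.path.join(directory, helper_name)
        if os.path.exists(candidate):
            copyfile(candidate, copy_path)
            return copy_path

    raise FileNotFoundError(helper_name)

# Demonstration in a throwaway directory with a hypothetical helper notebook
with tempfile.TemporaryDirectory() as root:
    book_dir = os.path.join(root, "notebook-runner")
    os.mkdir(book_dir)
    with open(os.path.join(book_dir, "helper.ipynb"), "w") as f:
        f.write("{}")

    # The flattened location (root) is tried first, then the book folder
    copy_path = copy_helper_notebook(
        "helper.ipynb", "/some/path/sample.ipynb", [root, book_dir], root)
    copy_name = os.path.basename(copy_path)
    copy_existed = os.path.exists(copy_path)
    os.remove(copy_path)

print(copy_name)
```

Raising `FileNotFoundError` when no location contains the helper makes a missing file explicit, whereas the cells above would fail inside `copyfile` on the fallback path.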

### Raise an exception if the notebook returned a non-zero exit code

In [None]:
if exit_code != 0:
    raise SystemExit(f'{cmd_line}: returned non-zero exit code: {str(exit_code)}.')

In [None]:
print('Notebook execution complete.')

Related
-------

-   [RUN000 - Setup Master Pool runner
    infrastructure](../notebook-runner/run000-setup-infrastructure.ipynb)

-   [RUN002 - Save result in Big Data
    Cluster](../notebook-runner/run002-save-result-in-bdc.ipynb)

-   [RUN003 - Run expert
    rules](../notebook-runner/run003-run-expert-rules.ipynb)