# Morpheus Murano App 1.0.0

This Murano App uses a [microk8s cluster](https://microk8s.io/) to provide a space to test NVIDIA Morpheus on the Nectar cloud. To separate Morpheus from any other deployments you may wish to test, these deployments use the `morpheus` namespace.

This app includes a Jupyter Lab server to make experimentation more user friendly, and NginX to make accessing it and the MLFlow UI easier. Jupyter Lab runs in the `jupyter` tmux session. If this server shuts down for whatever reason, logging into the instance via ssh should restart it when the `.bashrc` script runs.

The morpheus stack includes:

- [MLFlow](https://mlflow.org/) to manage models
- [Triton](https://developer.nvidia.com/nvidia-triton-inference-server) as an inference server
- [Kafka](https://kafka.apache.org/) as a data broker

You can check the health of the cluster using:
`microk8s kubectl -n morpheus get all`

While MLFlow does come with a UI, we have not made it available since there is no easy way to protect it.

In [2]:
!microk8s kubectl -n morpheus get all



We will need to wait for Morpheus to finish initialising, which takes approximately 10 minutes. Initialisation is complete when all pods have the `Running` status.
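
As a rough illustration of that check, the sketch below counts pods in the `Running` state from the text printed by `microk8s kubectl -n morpheus get pods`. The helper name `count_running` and the sample pod names are made up for this example; only the standard `NAME READY STATUS RESTARTS AGE` column layout is assumed.

```python
def count_running(pods_output):
    """Return (running, total) from the text printed by `kubectl get pods`."""
    lines = [ln.split() for ln in pods_output.strip().splitlines()]
    # Drop the header row and any blank lines
    rows = [cols for cols in lines if cols and cols[0] != "NAME"]
    # STATUS is the third column: NAME READY STATUS RESTARTS AGE
    running = sum(1 for cols in rows if len(cols) >= 3 and cols[2] == "Running")
    return running, len(rows)


# Hypothetical sample output, for illustration only
sample = """\
NAME                         READY   STATUS              RESTARTS   AGE
ai-engine-7c5d9f6b8d-abcde   1/1     Running             0          5m
broker-5f6c7d8e9f-fghij      1/1     Running             0          5m
mlflow-6d7e8f9a0b-klmno      0/1     ContainerCreating   0          5m
"""
print(count_running(sample))  # prints (2, 3)
```

In the notebook you could feed it real output with `out = !microk8s kubectl -n morpheus get pods` followed by `count_running("\n".join(out))`.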

In [1]:
from time import sleep

from tqdm import tqdm

prev = 0
running = 0

# Use the space used up on the disk as a proxy for measuring the download progress
with tqdm(total=63000000) as pbar:
    # Continue until all 5 containers are up
    while running < 5:
        # Get the current disk usage (in 1K blocks) and update according to download
        curr = !df | grep /dev/vda1 | awk '{print $3}'
        curr = int(curr[0])
        pbar.update(curr - prev)
        prev = curr
        sleep(1)

        # Check how many pods are up
        out = !microk8s kubectl get pods -n morpheus | grep -c 'Running'
        try:
            running = int(out[0])
        except (IndexError, ValueError):
            # No usable output yet, e.g. the cluster is still starting
            running = 0




For more information on usage, have a read of the [quick start guide](https://docs.nvidia.com/morpheus/morpheus_quickstart_guide.html) which this notebook is based on.

There are two key differences between the quick start guide and this notebook. 

1. This app uses microk8s, so we will need to prepend `microk8s` to every `kubectl` command.
2. We have reimplemented the guide as a notebook to create a more user-friendly introduction. Morpheus was not made with notebooks in mind, so we have had to make some adjustments to run commands from notebook shells rather than interactively inside the containers (note the aliases below).

NVIDIA recommends running only one pipeline at a time, which usually means uninstalling any active pipeline before replacing it during testing. In this notebook, however, we work with the SDK directly. By default this app runs the `Morpheus SDK Client` in 'sleep mode' under the `helper` release name. Rather than destroying and recreating the SDK client, we interface with it through the CLI and build pipelines with CLI commands. To change a pipeline, simply stop the running cell and edit it as desired.


# Using Morpheus

Morpheus can be run in two ways:

- Run off file
  
    NVIDIA provides several example models and datasets to test Morpheus with. You can browse these on their [GitHub](https://github.com/nv-morpheus/Morpheus/tree/branch-22.06/models). The quick start guide includes several examples which involve loading this data into the input topics and storing outputs in a file. You can also upload your own data to the container using Jupyter's interface.

- Run off Kafka topic
  
    NVIDIA Morpheus can interface with the Kafka data broker, pulling new data from an input topic and pushing inferences to an output topic. NVIDIA provides scripts to load data from a file into the input topic to simulate a data stream, but in production this data should come from the system you are monitoring.


This implementation of Morpheus is deployed using five containers:

- The *ai-engine* container, which runs NVIDIA Triton, listening for HTTP requests on port 8000, gRPC on port 8001 and metrics in the Prometheus format on port 8002.
- The *sdk-cli-helper* container, which runs the SDK and Jupyter Lab (this container). We are using NginX to manage the connections to Jupyter to make your life easier.
- The *broker* container, which runs Kafka, listening on port 9092. It is also exposed on port 30092 on the main machine so that you can feed it data, though Nectar's security rules block this port by default. To use it you will need to add a security rule that allows access; make sure you are not working with sensitive data if you experiment with this.
- The *zookeeper* container, which is a dependency of Kafka. It is used for synchronisation within distributed systems.
- The *mlflow* container, which runs MLFlow, listening on port 5000. Its UI is also exposed on port 30500 on the main machine, allowing you to view the models you have deployed.
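
To see which of the host-side ports listed above are actually reachable, a plain TCP connection test is usually enough. The following is a minimal sketch using only the standard library; the `ENDPOINTS` table and the `port_open` helper are names invented for this example, and `localhost` assumes the check runs on the main machine itself.

```python
import socket

# Ports exposed on the main machine, as listed above.
# Assumption: the check runs on the instance itself, hence `localhost`.
ENDPOINTS = {
    "kafka": ("localhost", 30092),
    "mlflow-ui": ("localhost", 30500),
}


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


if __name__ == "__main__":
    for name, (host, port) in ENDPOINTS.items():
        state = "open" if port_open(host, port) else "closed"
        print(f"{name:10s} {host}:{port} {state}")
```

A successful connection only shows that something is listening on the port; it says nothing about whether the service behind it is healthy.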

In [None]:
# Let's set up some aliases to execute commands in the containers to make our lives easier

# Execute a command on the sdk pod
x_sdk = "microk8s kubectl -n morpheus exec -it sdk-cli-helper -- "

# Execute a command with the morpheus SDK
x_morpheus = "microk8s kubectl -n morpheus exec -it sdk-cli-helper -- /opt/conda/envs/morpheus/bin/morpheus"

# Run a command on the broker pod
x_broker = "microk8s kubectl -n morpheus exec deploy/broker -c broker -- "

# Run a command using MLFlow's python instance
x_mlflow_python = "microk8s kubectl -n morpheus exec -it deploy/mlflow -- /opt/conda/envs/mlflow/bin/python"

# Run a command with the MLFlow SDK
x_mlflow = "microk8s kubectl -n morpheus exec -it deploy/mlflow -- /opt/conda/envs/mlflow/bin/mlflow"
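
The string aliases above are convenient with IPython's `!$alias` syntax. If you would rather call the containers from ordinary Python (for example with `subprocess.run`), the same commands can be built as argument lists. This is only a sketch: `kubectl_exec` is a hypothetical helper, not part of Morpheus or microk8s.

```python
import shlex


def kubectl_exec(target, *args, namespace="morpheus", container=None):
    """Build the argument list for `microk8s kubectl exec` against a pod or deployment."""
    cmd = ["microk8s", "kubectl", "-n", namespace, "exec", target]
    if container:
        cmd += ["-c", container]
    # Everything after `--` is run inside the container
    return cmd + ["--", *args]


cmd = kubectl_exec("deploy/broker", "kafka-topics.sh", "--help", container="broker")
print(shlex.join(cmd))
# prints: microk8s kubectl -n morpheus exec deploy/broker -c broker -- kafka-topics.sh --help
```

Passing a list to `subprocess.run(cmd)` avoids shell quoting problems when an argument contains spaces or special characters.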

# Sensitive Information Detection (SID)

NVIDIA has provided us with a set of example models and datasets to get started in the `/models` and `/models/datasets` directories of the SDK container. In this example we will look at the SID model.

To share these files with MLFlow, copy them to the `/common` directory, which is mapped to `/opt/morpheus/common` on the host.

In [None]:
!$x_sdk cp -RL /workspace/models /common

## MLFlow

The MLFlow service is used for managing and deploying models. We can use NVIDIA's scripts to deploy models to the Triton ai-engine.

In [None]:
!$x_mlflow_python publish_model_to_mlflow.py \
      --model_name sid-minibert-onnx \
      --model_directory /common/models/triton-model-repo/sid-minibert-onnx \
      --flavor triton 

In [None]:
!$x_mlflow deployments create -t triton \
      --flavor triton \
      --name sid-minibert-onnx \
      -m models:/sid-minibert-onnx/1 \
      -C "version=1"

## Our First Pipeline
Now that we've finished setting up, we can build some pipelines! Morpheus constructs pipelines made up of 'stages', including preprocessing and postprocessing steps accelerated with NVIDIA RAPIDS, which allow you to handle your data stream in near real time. If Morpheus's provided pipeline stages do not fit your needs, you can also extend its capabilities with a [custom stage written in Python or C++](https://docs.nvidia.com/morpheus/developer_guide/guides/1_simple_python_stage.html#background).

Morpheus provides scripts to simulate an input stream from a file or by streaming a file into the data broker, but in a production setting you would feed your data into the pipeline using the input topic we set up earlier.

For more examples of these pipelines have a read through these [example workflows](https://docs.nvidia.com/morpheus/morpheus_quickstart_guide.html#example-workflows).

In [None]:
!$x_sdk ls examples/data/

In [None]:
# Simulate a datastream using the pcap_dump dataset file
!$x_morpheus --log_level=DEBUG run \
      --num_threads=3 \
      --edge_buffer_size=4 \
      --use_cpp=True \
      --pipeline_batch_size=1024 \
      --model_max_batch_size=32 \
      pipeline-nlp \
        --model_seq_length=256 \
        from-file --filename=./examples/data/pcap_dump.jsonlines \
        monitor --description 'FromFile Rate' --smoothing=0.001 \
        deserialize \
        preprocess --vocab_hash_file=data/bert-base-uncased-hash.txt --truncation=True --do_lower_case=True --add_special_tokens=False \
        monitor --description='Preprocessing rate' \
        inf-triton --force_convert_inputs=True --model_name=sid-minibert-onnx --server_url=ai-engine:8001 \
        monitor --description='Inference rate' --smoothing=0.001 --unit inf \
        add-class \
        serialize --exclude '^ts_' \
        to-file --filename=/common/data/sid-minibert-onnx-output.jsonlines --overwrite

Let's pause here and examine what's printed out in the console. 

Morpheus needs a 'labels' file to map inferences from a class to a human readable label. If one is not provided, it will look in the default location. 

Next, Morpheus outputs its config: it reports the type of pipeline it's using (NLP) and the parameters we've chosen for inference (batch size, threads, etc.).

Then Morpheus outputs its progress in building the pipeline. If the pipeline cannot be built for whatever reason, this output will tell you which stage has failed.

Finally, the pipeline will begin processing. Since we've asked the pipeline to monitor the rate at which it processes the data, we see these outputs here.
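
Once the pipeline has finished, the results are in the JSON Lines file given to `to-file`. A quick way to summarise them is to count, per field, how many records were flagged. The sketch below assumes that `add-class` writes one boolean field per label; the helper name `count_flags` and the sample records are made up, and the real field names depend on the labels file.

```python
import json
from collections import Counter


def count_flags(lines):
    """Count, per field, how many records have that field set to True."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        counts.update(key for key, value in record.items() if value is True)
    return counts


# Two made-up records, for illustration only
sample = [
    '{"data": "example payload", "email": true, "password": false}',
    '{"data": "another payload", "email": true, "password": true}',
]
print(count_flags(sample))  # prints Counter({'email': 2, 'password': 1})
```

Since `/common` is mapped to `/opt/morpheus/common` on the host, the real file should be readable with `open("/opt/morpheus/common/data/sid-minibert-onnx-output.jsonlines")`, and the open file can be passed straight to `count_flags`.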

## A closer look
At this stage you may want to explore the containers to get a sense of the tools available. Most of the SDKs and scripts will have a help page explaining their functionality.

In [None]:
# Have a look at the files on the SDK container
!$x_sdk ls

In [None]:
# Make GET requests from the SDK container to the ai-engine
# Check ai-engine (triton) is ready for requests
!$x_sdk curl -v ai-engine:8000/v2/health/ready
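
The same readiness check can be scripted. Triton implements the KServe v2 HTTP protocol, which also offers a per-model endpoint at `/v2/models/<name>/ready`. The sketch below builds both URLs and includes a small probe; `triton_url` and `is_ready` are hypothetical helper names, and the `ai-engine` host name only resolves inside the cluster, so the probe would need to run in the SDK container.

```python
import urllib.error
import urllib.request


def triton_url(path, host="ai-engine", port=8000):
    """Build a Triton HTTP (KServe v2) URL for the given path."""
    return f"http://{host}:{port}/v2/{path}"


def is_ready(url, timeout=5.0):
    """Return True if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Non-200 responses, refused connections and timeouts all count as not ready
        return False


server_ready = triton_url("health/ready")
model_ready = triton_url("models/sid-minibert-onnx/ready")
print(server_ready)  # prints http://ai-engine:8000/v2/health/ready
print(model_ready)   # prints http://ai-engine:8000/v2/models/sid-minibert-onnx/ready
```

From the notebook, the model endpoint can be queried without any Python by reusing the alias: `!$x_sdk curl -v ai-engine:8000/v2/models/sid-minibert-onnx/ready`.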

In [None]:
# The morpheus SDK help page
!$x_morpheus

In [None]:
# Run one of Kafka's shell scripts in the broker pod
!$x_broker kafka-topics.sh

In [None]:
# The MLFlow SDK help page
!$x_mlflow

## Kafka

The other way to run a pipeline with Morpheus is to connect it to Kafka topics for input and output. To do this we first need to create these topics using the `kafka-topics.sh` script provided in the broker container.

In [None]:
# Create an input topic
!$x_broker kafka-topics.sh \
      --create \
      --bootstrap-server broker:9092 \
      --replication-factor 1 \
      --partitions 3 \
      --topic input

In [None]:
# Create an output topic
!$x_broker kafka-topics.sh \
      --create \
      --bootstrap-server broker:9092 \
      --replication-factor 1 \
      --partitions 3 \
      --topic output

In [None]:
# View the topics we've just created
!$x_broker kafka-topics.sh --list --zookeeper zookeeper:2181

At this point you could write a pipeline to ingest data from the input topic and publish inferences to the output topic. The easiest way to test this pipeline would be to use the `kafka-console-producer.sh` script in the broker container to simulate an input stream. There are several examples of how to do this in the [quick start guide](https://docs.nvidia.com/morpheus/morpheus_quickstart_guide.html#run-nlp-sensitive-information-detection-pipeline).
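
As a starting point, the file-based SID pipeline from earlier can be adapted by swapping its first and last stages. The sketch below assembles the CLI arguments in Python. It assumes the `from-kafka` and `to-kafka` stages and their `--input_topic`, `--output_topic` and `--bootstrap_servers` options behave as in the quick start guide; check the stage help pages in your container before relying on them. The function name is hypothetical, and the `monitor` stages are left out for brevity.

```python
import shlex

BROKER = "broker:9092"


def kafka_pipeline_args(input_topic="input", output_topic="output"):
    """Return Morpheus CLI arguments for a Kafka-to-Kafka SID pipeline (sketch)."""
    return [
        "--log_level=DEBUG", "run",
        "--num_threads=3", "--edge_buffer_size=4", "--use_cpp=True",
        "--pipeline_batch_size=1024", "--model_max_batch_size=32",
        "pipeline-nlp", "--model_seq_length=256",
        # Assumed stage: read from the input topic instead of a file
        "from-kafka", f"--input_topic={input_topic}", f"--bootstrap_servers={BROKER}",
        "deserialize",
        "preprocess", "--vocab_hash_file=data/bert-base-uncased-hash.txt",
        "--truncation=True", "--do_lower_case=True", "--add_special_tokens=False",
        "inf-triton", "--force_convert_inputs=True",
        "--model_name=sid-minibert-onnx", "--server_url=ai-engine:8001",
        "add-class",
        "serialize", "--exclude", "^ts_",
        # Assumed stage: publish to the output topic instead of a file
        "to-kafka", f"--output_topic={output_topic}", f"--bootstrap_servers={BROKER}",
    ]


pipeline = shlex.join(kafka_pipeline_args())
print(pipeline)
```

In a notebook cell the result can be handed to the existing alias with `!$x_morpheus $pipeline`. The cell will keep running until you stop it, since a Kafka source has no end of file.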

Since the stream will need to run at the same time as the pipeline you will likely need to open up a terminal to do this. We have left this as an exercise for you if you'd like to explore this feature. Alternatively you could connect these kafka topics to another program which generates data - note that you will likely need to adjust the preprocessing steps so that they match your deployed model's expected format. 
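
One way to approach that exercise is to replay a file into the input topic line by line, with a short pause between lines to imitate a live stream. The sketch below is an assumption-laden starting point rather than a tested recipe: `producer_cmd` and `replay` are hypothetical helpers, `--broker-list` is the option used in the quick start guide (newer Kafka releases prefer `--bootstrap-server`), and `-i` is added to `kubectl exec` so that stdin reaches the container.

```python
import subprocess
import time


def producer_cmd(topic, broker="broker:9092"):
    """Argument list that runs kafka-console-producer.sh in the broker pod."""
    return [
        "microk8s", "kubectl", "-n", "morpheus", "exec", "-i",
        "deploy/broker", "-c", "broker", "--",
        "kafka-console-producer.sh", "--broker-list", broker, "--topic", topic,
    ]


def replay(lines, cmd, delay=0.0):
    """Write each line to the command's stdin, pausing `delay` seconds between lines."""
    sent = 0
    # Leaving the `with` block closes stdin and waits for the command to exit
    with subprocess.Popen(cmd, stdin=subprocess.PIPE, text=True) as proc:
        for line in lines:
            proc.stdin.write(line.rstrip("\n") + "\n")
            proc.stdin.flush()
            sent += 1
            time.sleep(delay)
    return sent
```

Run from a separate terminal or script while the pipeline cell is active, usage might look like `replay(open("pcap_dump.jsonlines"), producer_cmd("input"), delay=0.01)`, where the file path is whichever local copy of the dataset you have.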

# Next Steps
Using NVIDIA's pretrained models and examples is a good way to get a grasp of the framework, but fine tuning is necessary for any model deployed in a production setting.

NVIDIA has provided training scripts and notebooks for each of their models, which you can use to retrain those models. If their models do not adapt well to your use case, you can also look at [Hugging Face's model repository](https://huggingface.co/models) as a place to start and train your own model.

### WARNING
Note that many of these scripts and notebooks were moved from their original repositories when being compiled into the Morpheus container. This means that any paths present within these scripts and notebooks **will not accurately reflect where they are in the container**. NVIDIA also **has not provided any environment files** to help us recreate the environments used to train these models. The datasets will likely be somewhere in the folder we copied to `/opt/morpheus/common/` earlier. It may be easier to browse the file structure on the [NVIDIA Morpheus GitHub](https://github.com/nv-morpheus/Morpheus/tree/branch-22.09/models). It will be up to you to track down the datasets and recreate the environments if you wish to use the tools that NVIDIA has provided as a starting point. Alternatively, you can look through these resources for inspiration on how to train your own models to make use of this framework.

Good luck!

In [None]:
# Copy NVIDIA's training scripts into our root directory so that we can easily access their scripts and notebooks
!cp /opt/morpheus/common/models/training-tuning-scripts/ ./ -r
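
Because the paths inside NVIDIA's scripts may not match the container layout, it helps to list what was actually copied. The sketch below walks the copied folder and prints every notebook and Python script it finds; `find_training_files` is a hypothetical helper, and the folder name assumes the `cp` command above was run from the notebook's working directory.

```python
from pathlib import Path


def find_training_files(root, suffixes=(".ipynb", ".py")):
    """Return notebooks and scripts under root, sorted by relative path."""
    root = Path(root)
    return sorted(
        p.relative_to(root)
        for p in root.rglob("*")
        if p.is_file() and p.suffix in suffixes
    )


# Only list files if the folder has been copied into the working directory
if Path("training-tuning-scripts").is_dir():
    for path in find_training_files("training-tuning-scripts"):
        print(path)
```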