# Using MLRun functions locally, as a Kubernetes Job, and in a Workflow
  --------------------------------------------------------------------

#### **notebook how-to's**
* Write and test code in a notebook.
* Convert it to a containerized image.
* Run it on a Kubernetes cluster with shared file or object storage.
* Run it in an automated workflow.

<a id='top'></a>
#### **steps**
**[install mlrun](#install)**<br>
**[define a new function and its dependencies](#define-function)**<br>
**[test the function code and pipeline locally](#test-locally)**<br>
**[define cluster jobs and build images](#build)**<br>
**[deploy (build) the function container](#deploy-build)**<br>
**[run the function on the cluster](#run-on-cluster)**<br>
**[create and run a KubeFlow Pipeline](#create-pipeline)**<br>

<a id="install" ></a>
______________________________________________
### **install mlrun**

In [1]:
# Uncomment the line below to install the mlrun package, then restart the kernel

# !pip install -U git+https://github.com/mlrun/mlrun.git@development

______________________________________________

<a id='define-function'></a>
### **define a new function and its dependencies**

In [2]:
# nuclio: ignore
# do not remove the comment above (it is a directive to nuclio, ignore that cell during build)
# if the nuclio-jupyter package is not installed run !pip install nuclio-jupyter and restart the kernel 
import nuclio 

We use `%nuclio` magic commands to set package dependencies and configuration:

In [3]:
%nuclio cmd -c pip install pandas
%nuclio config spec.build.baseImage = "python:3.6-jessie"

%nuclio: setting spec.build.baseImage to 'python:3.6-jessie'


The ```DataItem```s and the ```context``` within which they are logged are described in the following ```mlrun``` modules (they are included here only for type clarity).

In [7]:
from mlrun.execution import MLClientCtx
from mlrun.datastore import DataItem

In [8]:
def training(
    context: MLClientCtx,
    p1: int = 1,
    p2: int = 2
) -> None:
    """Train a model.

    :param context: The runtime context object.
    :param p1: A model parameter.
    :param p2: Another model parameter.
    """
    # access input metadata, values, and inputs
    print(f'Run: {context.name} (uid={context.uid})')
    print(f'Params: p1={p1}, p2={p2}')
    context.logger.info('started training')
    
    # <insert training code here>
    
    # log the run results (scalar values)
    context.log_result('accuracy', p1 * 2)
    context.log_result('loss', p1 * 3)
    
    # add a label/tag to this run
    context.set_label('category', 'tests')
    
    # log a simple artifact + label the artifact 
    # If you want to upload a local file to the artifact repo add src_path=<local-path>
    context.log_artifact('model', 
                          body=b'abc is 123', 
                          target_path='model.txt', 
                          labels={'framework': 'tfkeras'})

In [9]:
def validation(
    context: MLClientCtx,
    model: DataItem
) -> None:
    """Model validation.
    
    Dummy validation function.
    
    :param context: The runtime context object.
    :param model: The estimated model object.
    """
    # access input metadata, values, files, and secrets (passwords)
    print(f'Run: {context.name} (uid={context.uid})')
    print(f'file - {model.url}:\n{model.get()}\n')
    context.logger.info('started validation')    
    context.log_artifact('validation', 
                         body=b'<b> validated </b>', 
                         target_path='validation.html',
                         viewer='web-app')

The following end-code annotation tells ```nuclio``` to stop parsing the notebook from this cell. _**Please do not remove this cell**_:

In [10]:
# nuclio: end-code

______________________________________________

<a id='test-locally'></a>
### **test the function code and pipeline locally**
The functions above can be tested locally. Parameters, inputs, and outputs can be specified in the API or the `Task` object.

We create a ```function``` which defines the runtime environment (type, code, image, ..) and ```run()``` a job or experiments using that function.

We use the ```local``` runtime by default. Later on we will use a ```job``` runtime for running containers, and we can also use other distributed runners like MpiJob, Spark, Dask, and Nuclio.

In each run we can specify the function, inputs, parameters/hyper-parameters, etc... For more details, see the [mlrun_basics notebook](mlrun_basics.ipynb).
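Before involving MLRun at all, the handler logic can be exercised with a plain-Python stand-in for the context. The sketch below is only an illustration: `FakeContext` is a hypothetical class that is not part of `mlrun`, and it mimics just the members the `training` handler uses (`logger`, `log_result`, `set_label`, `log_artifact`). The handler is repeated in trimmed form so that the example runs on its own.

```python
import logging


class FakeContext:
    """Hypothetical stand-in for MLClientCtx that records what a handler logs."""

    def __init__(self, name="training", uid="test-uid"):
        self.name = name
        self.uid = uid
        self.logger = logging.getLogger("fake-context")
        self.results = {}
        self.labels = {}
        self.artifacts = {}

    def log_result(self, key, value):
        self.results[key] = value

    def set_label(self, key, value):
        self.labels[key] = value

    def log_artifact(self, key, body=None, target_path="", **kwargs):
        # Keep the arguments in memory instead of writing to an artifact store.
        self.artifacts[key] = {"body": body, "target_path": target_path, **kwargs}


def training(context, p1: int = 1, p2: int = 2) -> None:
    """Trimmed copy of the handler defined earlier in this notebook."""
    context.logger.info("started training")
    context.log_result("accuracy", p1 * 2)
    context.log_result("loss", p1 * 3)
    context.set_label("category", "tests")
    context.log_artifact("model", body=b"abc is 123", target_path="model.txt",
                         labels={"framework": "tfkeras"})


ctx = FakeContext()
training(ctx, p1=5)
print(ctx.results)  # {'accuracy': 10, 'loss': 15}
```

A stand-in like this only checks the handler's own logic. It says nothing about how MLRun stores results or artifacts, which is what the `run()` calls in the following cells exercise.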

In [11]:
from mlrun import new_function, code_to_function, mlconf, NewTask, mount_v3io

In [12]:
# set mlrun db/api path (can also be specified in mlrun.mlconf)
# %env MLRUN_DBPATH=http://<mlrun-api-url>:8080
        
# set the UI external URL (will generate ui hyperlinks)
# %env MLRUN_UI_URL=http://<mlrun-ui-url>:<port>

#%env MLRUN_DBPATH='/User/mlrun'
mlconf.dbpath = '/User/mlrun'
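The `%env` magic is available only inside IPython/Jupyter. When the same setup is needed in a plain Python script, the standard library can set the variable instead. This is a minimal sketch: the variable name and path are the ones used in the cell above, while how and when MLRun reads the variable is not shown in this notebook, so treat the timing as an assumption.

```python
import os

# Set the variable only if it is not already defined, so that a value exported
# by the shell or the platform takes precedence over this default.
os.environ.setdefault("MLRUN_DBPATH", "/User/mlrun")

print(os.environ["MLRUN_DBPATH"])
```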

#### _running and linking multiple tasks_
In this example we run two functions, ```training``` and ```validation``` and we pass the result from one to the other.
We will see in the ```job``` example that linking works even when the tasks are run in a workflow on different processes or containers.

```new_function()``` will create a local function object:

In [13]:
newfn = new_function()

Run the training function. Functions can have multiple handlers/methods; here we call the ```training``` handler:

In [14]:
train_run = newfn.run(handler=training, params={'p1': 5})

Run: training (uid=4a04fe2df59f45f28ff2d80fbb84d153)
Params: p1=5, p2=2
[mlrun] 2019-12-19 22:07:55,504 started training




| uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
|---|---|---|---|---|---|---|---|---|---|
| ...84d153 | 0 | Dec 19 22:07:55 | completed | training | host=jupyter-qlqrqnzi25-vogv2-79db4f79d-gn7c2<br>category=tests | | p1=5 | accuracy=10<br>loss=15 | model |


to track results use .show() or .logs() or in CLI: 
!mlrun get run 4a04fe2df59f45f28ff2d80fbb84d153  , !mlrun logs 4a04fe2df59f45f28ff2d80fbb84d153 
[mlrun] 2019-12-19 22:07:55,628 run executed, status=completed


After the function runs, it generates the result widget. You can click the `model` artifact to see its content.

In [15]:
train_run.outputs

{'accuracy': 10, 'loss': 15, 'model': 'model.txt'}

The output from the training function is passed to the validation function. Let's run it:

In [16]:
model_path = train_run.outputs['model']

validation_run = newfn.run(handler=validation, inputs={'model': model_path})

Run: validation (uid=7eae408f79b9450182280fd993fe83ce)
file - model.txt:
b'abc is 123'

[mlrun] 2019-12-19 22:07:58,002 started validation




| uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
|---|---|---|---|---|---|---|---|---|---|
| ...fe83ce | 0 | Dec 19 22:07:57 | completed | validation | host=jupyter-qlqrqnzi25-vogv2-79db4f79d-gn7c2 | model | | | validation |


to track results use .show() or .logs() or in CLI: 
!mlrun get run 7eae408f79b9450182280fd993fe83ce  , !mlrun logs 7eae408f79b9450182280fd993fe83ce 
[mlrun] 2019-12-19 22:07:58,068 run executed, status=completed


______________________________________________

<a id="build"></a>
### **define cluster jobs and build images**

In order to use our function in a cluster we need to package our code and dependencies.

The ```code_to_function``` call will automatically generate a ```function``` object from the current notebook (or a specified file) with its list of dependencies and runtime configuration.

In [17]:
# create an ML function from the notebook; it is attached to the Iguazio data fabric (v3io) in a later cell
trainer = code_to_function(name='my-trainer', runtime='job')

The functions need shared storage (file or object) media to pass and store artifacts.

You can add _**Kubernetes**_ resources like volumes, environment variables, secrets, cpu/mem/gpu, etc. to a function.

```mlrun``` uses _**KubeFlow**_ modifiers (applied with `apply()`) to configure resources. You can build your own or use predefined ones, e.g. for [AWS resources](https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/aws.py).


##### _**Option 1: Using Iguazio data fabric for artifacts**_
If you are using the [Iguazio data science platform](https://www.iguazio.com/), use the `mount_v3io()` modifier.

Applying ```mount_v3io()``` will attach the function to Iguazio's real-time data fabric (mounted by default to _**home**_ of the current user).

**Note**: if the notebook is running remotely (not on the managed platform), you need to create and use a v3io secret. Run:

`kubectl create -n <namespace> secret generic my-v3io --from-literal=accessKey=<your access key> --from-literal=username=<your user name> --type v3io/fuse`

and use: `trainer.apply(mount_v3io(user='admin', secret='my-v3io'))`.

So for our current ```training``` function, when using the Iguazio data science platform, run:

In [18]:
trainer.apply(mount_v3io())

# location of the artifacts
output_path = '/User/test'

##### _**Option 2: Using AWS S3 for artifacts**_

On AWS you can store artifacts in S3, which requires a `secret` holding your AWS credentials. You can create the secret with the following command:

`kubectl create -n <namespace> secret generic my-aws --from-literal=AWS_ACCESS_KEY_ID=<access key> --from-literal=AWS_SECRET_ACCESS_KEY=<secret key>`

To use the secret:

In [19]:
# from kfp.aws import use_aws_secret

In [20]:
# trainer.apply(use_aws_secret(secret_name='my-aws'))
# output_path = 's3://<your-bucket-name>/jobs'

______________________________________________

<a id="deploy-build"></a>
### **deploy (build) the function container**

The `deploy()` command will build a custom container image (create a cluster build job) from the outlined function dependencies.

If a pre-built container image already exists, pass the `image` name instead. _**Note that the code and params can be updated per run without building a new image**_.

The image is stored in a container repository; by default, this is the repository configured on the MLRun API service. To use your own Docker registry, first create a secret and then add the secret name to the build configuration:

`kubectl create -n <namespace> secret docker-registry my-docker --docker-server=https://index.docker.io/v1/ --docker-username=<your-user> --docker-password=<your-password> --docker-email=<your-email>`

and run this: `trainer.build_config(image='target/image:tag', secret='my-docker')`

In [21]:
trainer.deploy(watch=True)

[mlrun] 2019-12-19 22:08:08,349 building image (.mlrun/func-default-my-trainer-latest)
FROM python:3.6-jessie
WORKDIR /run
RUN pip install pandas
RUN pip install mlrun
ENV PYTHONPATH /run
[mlrun] 2019-12-19 22:08:08,352 using in-cluster config.
[mlrun] 2019-12-19 22:08:08,373 Pod mlrun-build-my-trainer-ntk4d created
..
INFO[0000] Resolved base name python:3.6-jessie to python:3.6-jessie
INFO[0000] Resolved base name python:3.6-jessie to python:3.6-jessie
INFO[0000] Downloading base image python:3.6-jessie
INFO[0000] Error while retrieving image from cache: getting file info: stat /cache/sha256:0318d80cb241983eda20b905d77fa0bfb06e29e5aabf075c7941ea687f1c125a: no such file or directory
INFO[0000] Downloading base image python:3.6-jessie
INFO[0000] Built cross stage deps: map[]
INFO[0000] Downloading base image python:3.6-jessie
INFO[0000] Error while retrieving image from cache: gett

True

______________________________________________

<a id="run-on-cluster"></a>
### **run the function on the cluster**


In case we made changes to the code, ```with_code``` will inject the latest code into the function (it doesn't require a new build).

In [22]:
trainer.with_code()

<mlrun.runtimes.kubejob.KubejobRuntime at 0x7efcbbbfd160>

In [23]:
# create the base task (common to both steps)
base_task = NewTask(out_path=output_path).set_label('stage', 'dev')

In [24]:
# run our training task on the cluster with p1=9
train_task = NewTask(name='my-training', handler='training', params={'p1': 9}, base=base_task)
train_run = trainer.run(train_task, watch=True)

[mlrun] 2019-12-19 22:09:43,064 starting run my-training uid=a77d7d831f854e5cbcce9c93f0ea4c70  -> /User/mlrun
[mlrun] 2019-12-19 22:09:43,093 using in-cluster config.
[mlrun] 2019-12-19 22:09:43,106 Pod my-training-djsjt created
............................................................................................................................................................................................................................................................................................................

| uid | iter | start | state | name | labels | inputs | parameters | results | artifacts |
|---|---|---|---|---|---|---|---|---|---|
| ...ea4c70 | 0 | Dec 19 22:09:43 | completed | my-training | stage=dev<br>kind=job<br>owner=admin | | p1=9 | | |


to track results use .show() or .logs() or in CLI: 
!mlrun get run a77d7d831f854e5cbcce9c93f0ea4c70  , !mlrun logs a77d7d831f854e5cbcce9c93f0ea4c70 
[mlrun] 2019-12-19 22:19:45,098 run executed, status=completed


In [25]:
train_run.outputs

{}

In [27]:
# run validation, using the model result from the previous step
model_path = train_run.outputs['model']
trainer.run(base_task, handler='validation', inputs={'model': model_path}, watch=True)
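The lookup `train_run.outputs['model']` assumes that the training run actually logged a `model` artifact. If `outputs` behaves like a plain dictionary, as the printed values suggest, an empty result such as the `{}` shown above makes that lookup raise a `KeyError`. The sketch below shows one way to fail with a clearer message before starting the next step. `require_output` is a hypothetical helper, not an `mlrun` function, and plain dicts stand in for `train_run.outputs`.

```python
def require_output(outputs: dict, key: str):
    """Return outputs[key], or raise a descriptive error if the key is absent."""
    if key not in outputs:
        available = ", ".join(sorted(outputs)) or "none"
        raise KeyError(f"run has no output named {key!r} (available: {available})")
    return outputs[key]


# Plain dicts standing in for `train_run.outputs`, copied from the cells above.
local_outputs = {"accuracy": 10, "loss": 15, "model": "model.txt"}
cluster_outputs = {}

print(require_output(local_outputs, "model"))  # model.txt

try:
    require_output(cluster_outputs, "model")
except KeyError as err:
    # err.args[0] holds the message without the extra quoting str(err) adds.
    print(f"cannot start validation: {err.args[0]}")
```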

______________________________________________

<a id="create-pipeline"></a>
### **create and run a KubeFlow pipeline**

KubeFlow pipelines are used for workflow automation: we compose a graph of functions and specify their parameters, inputs, and outputs.

As illustrated below, we can chain the outputs and inputs of the pipeline steps.

In [28]:
import kfp
from kfp import dsl

In [29]:
kfp_client = kfp.Client(namespace='default-tenant')

Pipeline results are stored at the following location:

In [30]:
artifacts_path = output_path

However, by adding ```/{{workflow.uid}}``` to the path ```mlrun``` will generate a unique folder per workflow.
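A minimal sketch of building such a path is shown here. It assumes the `{{workflow.uid}}` placeholder is substituted at run time, as the sentence above describes, and it reuses the `output_path` value defined earlier in this notebook. The f-string variant is included because literal braces must be doubled inside f-strings, which is easy to get wrong.

```python
output_path = "/User/test"  # same value as defined earlier in this notebook

# Plain concatenation keeps the double braces intact.
artifacts_path = output_path.rstrip("/") + "/{{workflow.uid}}"

# Inside an f-string each literal brace must be doubled, hence four per side.
same_path = f"{output_path}/{{{{workflow.uid}}}}"

print(artifacts_path)  # /User/test/{{workflow.uid}}
```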

In [31]:
@dsl.pipeline(
    name = 'job test',
    description = 'demonstrating mlrun usage'
)
def job_pipeline(
    p1: int = 9
) -> None:
    """Define our pipeline.
    
    :param p1: A model parameter.
    """
    task = NewTask(out_path=output_path, outputs=['model']).with_params(p1=p1)

    train = trainer.as_step(handler='training',
                            out_path=artifacts_path, 
                            params={'p1': p1},
                            outputs=['model'])
    
    validate = trainer.as_step(handler='validation',
                               out_path=artifacts_path, 
                               inputs={'model': train.outputs['model']},
                               outputs=['validation'])
    

The job pipeline can be compiled to a YAML file that can be used for debugging:

In [32]:
kfp.compiler.Compiler().compile(job_pipeline, 'jobpipe.yaml')



#### running the pipeline

In [33]:
arguments = {'p1': 8}
run_result = kfp_client.create_run_from_pipeline_func(job_pipeline, arguments, experiment_name='my-job')

[top](#top)