 # Part 1: MLRun Basics

 Part 1 of the getting-started tutorial introduces you to the basics of working with functions by using the MLRun open-source MLOps orchestration framework.
 
 The tutorial begins with a short [introduction to MLRun](#gs-tutorial-1-mlrun-intro), and then takes you through the following steps:

 1. [Installation and Setup](#gs-tutorial-1-step-setup)
 2. [Creating a basic function and running it locally](#gs-tutorial-1-step-create-basic-function)
 3. [Running the function on the cluster](#gs-tutorial-1-run-function-on-cluster)
 4. [Viewing jobs on the dashboard (UI)](#gs-tutorial-1-step-ui-jobs-view)
 5. [Scheduling jobs](#gs-tutorial-1-step-schedule-jobs)

<a id="gs-tutorial-1-mlrun-intro"></a>

## Introduction to MLRun

[MLRun](https://github.com/mlrun/mlrun) is an open-source MLOps framework that offers an integrative approach to managing your machine-learning pipelines from early development through model development to full pipeline deployment in production.
MLRun offers a convenient abstraction layer over a wide variety of technology stacks, while empowering data engineers and data scientists to define features and models.

MLRun provides the following key benefits:

- **Rapid deployment** of code to production pipelines
- **Elastic scaling** of batch and real-time workloads
- **Feature management** &mdash; ingestion, preparation, and monitoring
- **Works anywhere** &mdash; your local IDE, multi-cloud, or on-prem

MLRun is available as a default (pre-deployed) shared service in the Iguazio Data Science Platform ("the platform") and is integrated seamlessly with other platform services.

&#x25B6; For more information about MLRun, see the [MLRun Python package documentation](https://mlrun.readthedocs.io).

<a id="gs-tutorial-1-mlrun-basic-components"></a>

### The Basic MLRun Components

MLRun has the following main components:

- <a id="def-project"></a>**Project** &mdash; a container for organizing all of your work on a particular activity.
    Projects consist of metadata, source code, workflows, data and artifacts, models, triggers, and member management for user collaboration.

- <a id="def-function"></a>**Function** &mdash; a software package with one or more methods and runtime-specific attributes (such as image, command, arguments, and environment).

- <a id="def-run"></a>**Run** &mdash; an object that contains information about an executed function.
    The run object is created as a result of running a function, and contains the function attributes (such as arguments, inputs, and outputs), as well as the execution status and results (including links to output artifacts).

- <a id="def-artifact"></a>**Artifact** &mdash; versioned data artifacts (such as data sets, files and models) that are produced or consumed by functions, runs, and workflows.

- <a id="def-workflow"></a>**Workflow** &mdash; defines a functions pipeline or a directed acyclic graph (DAG) to execute using [Kubeflow Pipelines](https://www.kubeflow.org/docs/pipelines/pipelines-quickstart/).

- <a id="def-ui"></a>**UI** &mdash; a graphical user interface (dashboard) for displaying and managing projects and their contained experiments, artifacts, and code.

<a id="gs-tutorial-1-step-setup"></a>

## Step 1: Installation and Setup

In the Iguazio Data Science Platform ("the platform"), MLRun is available as a default (pre-deployed) shared service.
For information on how to install and configure MLRun in other environments, see the [MLRun documentation](https://mlrun.readthedocs.io/en/latest/install.html).
Once you have a running MLRun service, you need to set up your development environment to work with this service.

<a id="gs-tutorial-1-install-mlrun-pkg"></a>

### Installing the MLRun Python Package (mlrun)

To use the MLRun Python library, you need to install the `mlrun` Python package in your development environment.
This needs to be done only once, although you might occasionally need to update the package version.
When running on the Iguazio Data Science Platform you can use the provided **align_mlrun.sh** script in your **/User** directory to install the MLRun package or upgrade the version of an installed package.
By default, the script attempts to download the latest version of the MLRun package that matches the version of the running MLRun service.
To manually install the MLRun package, run `pip install mlrun==<version>`, replacing `<version>` with the MLRun version that matches your MLRun service.

> **Note:** After installing or updating the MLRun package, restart the notebook kernel in your environment.
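
For a manual installation, the version pin can be built explicitly. The following is a minimal sketch, assuming that the service version is already known. The `0.6.0rc9` value is only the example version that appears in this section's script output, and `build_pip_command` is a hypothetical helper that is not part of MLRun:

```python
import sys


def build_pip_command(version):
    """Return a pip command that installs a specific mlrun version.

    Hypothetical helper for illustration; it is not part of MLRun.
    """
    # Using the current interpreter installs the package into the same
    # environment that runs the notebook kernel.
    return [sys.executable, "-m", "pip", "install", f"mlrun=={version}"]


command = build_pip_command("0.6.0rc9")  # example version; use your service's version
print(" ".join(command))
# To actually install, pass the list to subprocess.check_call(command).
```

The sketch only prints the command, so it is safe to run before deciding whether to install anything.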

In [None]:
!/User/align_mlrun.sh

Both server & client are aligned (0.6.0rc9).


<a id="gs-tutorial-1-import-libraries"></a>

### Importing Libraries

Run the following code to import required libraries:

In [1]:
from os import path
import mlrun

<a id="gs-tutorial-1-mlrun-envr-init"></a>

### Initializing Your MLRun Environment

Use the `set_environment` MLRun method to configure the working environment and default configuration. 
Define a project name and pass it in the `project` parameter. Setting the `user_project` flag to `True` adds the current user name to the project name, which avoids conflicts when multiple users use the same project name.

You can optionally pass additional parameters to `set_environment`, such as `artifact_path` to override the default path for storing project artifacts (as explained in the next steps), or `api_path` and `access_key` to set the remote API URL and access key when using a remote MLRun/Kubernetes cluster.
`set_environment` returns the current project name and the artifacts path URL.

#### Defining the Project Name and Setting the MLRun Environment

In [2]:
# Set the project name
project_name_base = 'getting-started-tutorial'
# Initialize your MLRun environment and save the artifacts path
project_name, artifact_path = mlrun.set_environment(project=project_name_base,
                                                    user_project=True)

#### Using MLRun Projects

MLRun projects are used to package multiple runs, functions, workflows, and artifacts. A project is created when you run a task or save an object (such as a function or an artifact) to that project. You can use methods such as `new_project` to configure project-level metadata and link the project to source control (a Git repository). For more information, refer to [the MLRun documentation](https://mlrun.readthedocs.io/en/release-v0.6.x-latest/projects.html).

#### Configuring the Artifacts Path

You can configure a default MLRun artifacts path.

> Note: In Iguazio platform, the default artifacts path is a **&lt;project name&gt;/artifacts** directory in the predefined "projects" data container &mdash; **/v3io/projects/&lt;project name&gt;/artifacts** (for example, **/v3io/projects/myproject/artifacts** for a "myproject" project).

You can set a custom artifacts path and override the default configuration for your project by setting the `artifact_path` parameter of the `set_environment` method.
You can use variables in the artifacts path, such as `{{run.project}}` for the name of the running project or `{{run.uid}}` for the current run UID.
(The default artifacts path uses `{{run.project}}`.)
The following example configures the artifacts path to the **./artifacts** directory (under the current directory):
```
artifact_path = './artifacts'
mlrun.set_environment(project=project_name, artifact_path=artifact_path)
```
When you use `{{run.uid}}`, the artifacts for each job are stored in a dedicated directory for the executed job.
Otherwise, the same artifacts directory is used in all runs, so the artifacts of newer runs overwrite those of previous runs.
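
To make the difference concrete, the following sketch imitates the placeholder expansion with plain string replacement. The `expand_artifact_path` helper and the sample run UIDs are hypothetical, and MLRun's actual template handling may differ:

```python
def expand_artifact_path(template, project, uid):
    """Replace the {{run.project}} and {{run.uid}} placeholders in a path."""
    return (template
            .replace("{{run.project}}", project)
            .replace("{{run.uid}}", uid))


shared = "/v3io/projects/{{run.project}}/artifacts"
per_run = "/v3io/projects/{{run.project}}/artifacts/{{run.uid}}"

for uid in ("run-a", "run-b"):  # made-up run UIDs
    print(expand_artifact_path(shared, "myproject", uid))
    print(expand_artifact_path(per_run, "myproject", uid))
```

The shared template resolves to the same directory for both runs, while the per-run template resolves to a separate directory for each run UID.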

The returned `artifact_path` variable can be used to derive specific subdirectories per task, for example by calling
`training_artifacts = path.join(artifact_path, 'training')` (using `path` from the `os` module, as imported earlier).

> Note: The artifact path might be a remote MLRun data URI (for example, `s3://bucket/path`), in which case it can't be used with local file utilities.
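
Because the artifact path can be either a local directory or a remote URL, it can help to check which one it is before applying file utilities. The following is a minimal sketch that uses only the standard library. The `is_remote_path` and `sub_path` helpers are hypothetical, and treating any path that has a URL scheme as remote is an assumption:

```python
import posixpath
from os import path
from urllib.parse import urlparse


def is_remote_path(artifact_path):
    """Return True if the path has a URL scheme such as s3://."""
    return "://" in artifact_path and bool(urlparse(artifact_path).scheme)


def sub_path(artifact_path, name):
    """Join a subdirectory onto a local path or a remote URL."""
    if is_remote_path(artifact_path):
        # URLs always use forward slashes, regardless of the local OS.
        return posixpath.join(artifact_path, name)
    return path.join(artifact_path, name)


print(sub_path("s3://bucket/path", "training"))
print(sub_path("./artifacts", "training"))
```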

In [3]:
# Run the following code to display the current project name and artifact path
print(f'Project name: {project_name}')
print(f'Artifacts path: {artifact_path}')

Project name: getting-started-tutorial-admin
Artifacts path: /v3io/projects/{{run.project}}/artifacts


<a id="gs-tutorial-1-set-environment"></a>

<a id="gs-tutorial-1-setup-remote-env"></a>

### Using MLRun Remotely

This tutorial is aimed at running your project from a local Jupyter Notebook service in the same environment in which MLRun is installed and running.
However, as a developer you might want to develop your project from a remote location using your own IDE (such as a local Jupyter Notebook or PyCharm), and connect to the MLRun environment remotely.
To learn how to use MLRun from a remote IDE, see the [MLRun documentation](https://mlrun.readthedocs.io/en/release-v0.6.x-latest/remote.html).

<a id="gs-tutorial-1-step-create-basic-function"></a>

## Step 2: Creating a Basic Function

<a id="gs-tutorial-1-workign-with-functions"></a>

### Working with Functions

An MLRun function is a software package with one or more methods and runtime-specific attributes (such as image, command, arguments, and environment). 
The function code is stored in the MLRun database and can be used for running jobs with a single function or as part of a pipeline.
Each function is stored in the MLRun database with a unique hash code, and gets a new hash code upon changes.
The function specification is saved as a YAML file and can be viewed by using the MLRun API or from the dashboard (UI).
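
The idea of a hash code that changes whenever the function changes can be illustrated with a content hash. This is only a sketch of the concept, using SHA-1 over a JSON dump of a made-up specification dictionary; it is not MLRun's actual hashing scheme:

```python
import hashlib
import json


def spec_hash(spec):
    """Return a deterministic hash of a function specification."""
    # sort_keys makes the serialization independent of key order.
    payload = json.dumps(spec, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


# Made-up specifications for illustration only.
spec_v1 = {"kind": "job", "image": "mlrun/mlrun", "handler": "prep_data"}
spec_v2 = dict(spec_v1, image="mlrun/ml-models")  # one attribute changed

print(spec_hash(spec_v1))
print(spec_hash(spec_v2))
```

Identical specifications always produce the same hash, while any change to an attribute produces a different one, which is what makes a hash usable as a version identifier.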

In order to work with functions you need to be familiar with the following function components:

- **Context** &mdash; MLRun introduces a concept of a runtime context object (`context`).
    The code can be set up to get parameters and inputs from the context, as well as log run outputs, artifacts, tags, and time-series metrics in the context.
- **Parameters** &mdash; the parameters (arguments) that are passed to the functions.
- **Inputs** &mdash; MLRun functions have a special `inputs` parameter for passing data objects (such as data sets, models, or files) as input to a function.
    Use this parameter and not custom parameters to pass data items to a function.
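
The following sketch shows how handler code typically interacts with the context. `StubContext` is a hypothetical stand-in that only records what is logged, so the example runs without an MLRun service; a real MLRun context provides many more capabilities, and the `count_rows` handler is made up for illustration:

```python
class StubContext:
    """Hypothetical stand-in for an MLRun context; records logged results."""

    def __init__(self, parameters=None):
        self.parameters = parameters or {}
        self.results = {}

    def get_param(self, key, default=None):
        # Return a run parameter, or the default if it was not provided.
        return self.parameters.get(key, default)

    def log_result(self, key, value):
        # Record a simple result value under the given key.
        self.results[key] = value


def count_rows(context, rows):
    """Example handler: log the number of rows that pass a threshold."""
    threshold = context.get_param("min_value", 0)
    kept = [row for row in rows if row >= threshold]
    context.log_result("num_rows", len(kept))


context = StubContext(parameters={"min_value": 2})
count_rows(context, rows=[1, 2, 3, 4])
print(context.results)
```

Keeping the handler's only dependency on the context to a few method calls makes it straightforward to exercise the logic locally before running it as an MLRun function.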

#### Example &mdash; Ingesting a File

The following example code reads a CSV file into a pandas DataFrame, categorizes the label column, and returns the dataframe and its length:

In [4]:
import pandas as pd

# Ingest a data set
def prep_data(source_url, label_column):

    df = pd.read_csv(source_url)
    df[label_column] = df[label_column].astype('category').cat.codes    
    return df, df.shape[0]
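
The `astype('category').cat.codes` expression replaces each distinct label with an integer code. For string labels without missing values, pandas assigns the codes according to the sorted order of the distinct values, which the following pandas-free sketch reproduces (the `category_codes` helper and the sample labels are illustrative only):

```python
def category_codes(values):
    """Map each value to its index among the sorted distinct values."""
    categories = sorted(set(values))
    lookup = {value: code for code, value in enumerate(categories)}
    return [lookup[value] for value in values]


labels = ["setosa", "virginica", "versicolor", "setosa"]
print(category_codes(labels))  # prints [0, 2, 1, 0]
```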

Now, take this function and run it as an MLRun function, leveraging MLRun for performing the following tasks:

- Reading the data
- Logging the data to the MLRun database

<a id="gs-tutorial-1-working-with-artifacts"></a>

### Working with Artifacts

An MLRun artifact is any data that is produced or consumed by functions or jobs.

The artifacts are stored in the project and are divided into three main types:

- **Data sets** &mdash; any data, such as tables and DataFrames.
- **Plots** &mdash; images, figures, and line plots.
- **Models** &mdash; all trained models.

For detailed information about managing different artifacts, see the [MLRun documentation](https://mlrun.readthedocs.io/en/release-v0.6.x-latest/data-management-and-versioning.html).

<a id="gs-tutorial-1-create-and-run-an-mlrun-function"></a>

### Creating and Running Your First MLRun Function 

To effectively run your code in MLRun, you should first add the `context` parameter. This allows you to log information related to the execution.

Also declare the `source_url` parameter as an `mlrun.DataItem`. The data item is passed through the `inputs` parameter when you call the function.

In the function code, use the `log_result` context method to record regular values (int, float, string, list, etc.) with your task, and use the `log_dataset` context method to store and log data-set artifacts in the project.

The following example reads the data from the CSV file, cleans it, records the data length, and logs the data-set artifact. Note that you can also use other file formats, such as Parquet, without modifying the code.

Use the MLRun `code_to_function` method to convert your notebook code into an MLRun (project) function &mdash; a function object with embedded code, which can run on the cluster.
This will also come in handy later in the tutorial, after you create an automated pipeline.
To identify the sections that need to be converted, use comment annotations that begin with `# nuclio:` and magic commands that begin with `%nuclio`. (Don't confuse these with the Nuclio serverless-functions framework; the annotation name is planned to change in future versions.)

The comment annotations (for example, `# nuclio: ignore`) help provide non-intrusive hints as to how you want to convert the notebook into a full function and specification.
Because in the notebook we sometimes add extra cells that shouldn't be included in the actual function (such as prints, plots, tests, and debug code), you can use the `# nuclio: start-code` and `# nuclio: end-code` annotations before and after relevant code cells.
Alternatively, you can add `# nuclio: ignore` at the beginning of code cells that you wish to exclude from the function.
> **Note:** You can use the `nuclio: start-code` and `nuclio: end-code` annotations only once in the same notebook (only the first use is selected).

For more information about using the annotations and magic commands, see the [`nuclio-jupyter` documentation](https://github.com/nuclio/nuclio-jupyter#controlling-function-code-and-configuration).
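
The effect of the annotations can be sketched with a small cell filter. This is an illustration of the selection rules described above, not the actual `nuclio-jupyter` implementation; the `select_function_code` helper and the sample cells are hypothetical:

```python
START = "# nuclio: start-code"
END = "# nuclio: end-code"
IGNORE = "# nuclio: ignore"


def select_function_code(cells):
    """Return the cell sources that would become part of the function."""
    first_lines = [cell.strip().splitlines()[0] if cell.strip() else ""
                   for cell in cells]
    # Only the first start/end markers are honored. Without markers,
    # this sketch keeps all cells (an assumption).
    start = first_lines.index(START) + 1 if START in first_lines else 0
    end = first_lines.index(END) if END in first_lines else len(cells)
    return [cell
            for cell, first in zip(cells[start:end], first_lines[start:end])
            if first != IGNORE]


cells = [
    "print('setup, not part of the function')",
    "# nuclio: start-code",
    "def handler(context):\n    context.log_result('ok', True)",
    "# nuclio: ignore\nprint('debug output')",
    "# nuclio: end-code",
    "print('test code, not part of the function')",
]
print(select_function_code(cells))
```

In this sample, only the cell that defines `handler` is selected: the cells outside the start and end markers are dropped, and so is the cell that begins with the ignore annotation.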

In [5]:
# nuclio: start-code

In [6]:
import mlrun
def prep_data(context, source_url: mlrun.DataItem, label_column='label'):

    # Convert the DataItem to a Pandas DataFrame
    df = source_url.as_df()
    df[label_column] = df[label_column].astype('category').cat.codes    
    
    # record the df length as a result of the run
    context.log_result('num_rows', df.shape[0])

    # Store the data set in your artifacts database
    context.log_dataset('cleaned_data', df=df, index=False, format='csv')

In [7]:
# nuclio: end-code

As input, use a CSV file from the Wasabi cloud object-storage service:

In [8]:
# Set the source-data URL
source_url = 'https://s3.wasabisys.com/iguazio/data/iris/iris.data.raw.csv'

#### Converting the Notebook Code to a Function

The following code converts the code of your local `prep_data` data-ingestion function to a `data_prep_func` MLRun function object.
Note that you can use different engines to run your code, such as a job (Python process), Spark, MPIJob, Nuclio, and Dask.
The following example sets the `kind` parameter of the `code_to_function` method to `job` to run the code as a Python job.
Note that the example only creates the function object (`data_prep_func`); it doesn't call `set_function` to save the function object in the project.

In [9]:
# Convert the local prep_data function into a data_prep_func project function
data_prep_func = mlrun.code_to_function(name='prep_data', kind='job', image='mlrun/ml-models')

Next, run your function (`data_prep_func`) locally, as part of the Jupyter pod.
The execution results are stored in the MLRun database.
The example sets the following function parameters:

- `name` &mdash; the job name
- `handler` &mdash; the name of the function handler
- `inputs` &mdash; a dictionary that maps the `source_url` input to the data-set URL
- `local` &mdash; whether to run the code locally, within the Jupyter pod

By default, the artifact (the **cleaned_data** CSV file) is stored in the project's default artifacts path (**/v3io/projects/&lt;project name&gt;/artifacts** when running in the platform).
However, you can set the `artifact_path` parameter to override the default path and store the artifacts in a different location.

Now, run the function using the MLRun `run` method.
The example sets the `local` parameter to `True` to run the code locally within the Jupyter pod.
As previously explained, when running locally the function runs within the Jupyter pod, meaning that it uses the environment variables, volumes, and image that are running in this pod.
> **Note:** When running a function locally, the function code is saved only in a temporary local directory and not in your project's ML functions code repository.

In [10]:
prep_data_run = data_prep_func.run(name='prep_data',
                                   handler=prep_data,
                                   inputs={'source_url': source_url},
                                   local=True)

> 2021-01-24 19:10:16,809 [info] starting run prep_data uid=21d88ff9876f4fd192733e4324566e3d DB=http://mlrun-api:8080


project,uid,iter,start,state,name,labels,inputs,parameters,results,artifacts
getting-started-tutorial-admin,...24566e3d,0,Jan 24 19:10:16,completed,prep_data,v3io_user=admin kind= owner=admin host=jupyter-5669bcbf8f-fdfnf,source_url,,num_rows=150,cleaned_data


to track results use .show() or .logs() or in CLI: 
!mlrun get run 21d88ff9876f4fd192733e4324566e3d --project getting-started-tutorial-admin , !mlrun logs 21d88ff9876f4fd192733e4324566e3d --project getting-started-tutorial-admin
> 2021-01-24 19:10:17,229 [info] run executed, status=completed


<a id="gs-tutorial-1-get-run-object-info"></a>

### Getting Information About the Run Object

Every run object, which is returned by the MLRun `run` method, has the following methods:

- `uid` &mdash; returns the unique ID.
- `state` &mdash; returns the last known state.
- `show` &mdash; shows the latest job state and data in a visual widget (with hyperlinks and hints).
- `outputs` &mdash; returns a dictionary of the run results and artifact paths.
- `logs` &mdash; returns the latest logs.
    Use `watch=False` to disable the interactive mode in running jobs.
- `artifact` &mdash; returns full artifact details for the provided key.
- `output` &mdash; returns a specific result or an artifact path for the provided key.
- `to_dict`, `to_yaml`, `to_json` &mdash; converts the run object to a dictionary, YAML, or JSON format (respectively).

In [11]:
# example
prep_data_run.state()

'completed'

In [12]:
prep_data_run.outputs['cleaned_data']

'store://artifacts/getting-started-tutorial-admin/prep_data_cleaned_data:21d88ff9876f4fd192733e4324566e3d'
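
The returned value is a `store://` URI rather than a file path. Judging only by the example output above, it appears to combine the project name, the artifact key, and the run UID. The following sketch splits it under that assumption; the `split_store_uri` helper is hypothetical and the URI layout may differ between MLRun versions:

```python
from urllib.parse import urlparse


def split_store_uri(uri):
    """Split a store:// artifact URI into project, key, and tag parts."""
    parsed = urlparse(uri)
    if parsed.scheme != "store":
        raise ValueError(f"not a store URI: {uri}")
    # The path looks like /<project>/<key>:<tag> in the example output.
    project, _, rest = parsed.path.lstrip("/").partition("/")
    key, _, tag = rest.partition(":")
    return project, key, tag


uri = ("store://artifacts/getting-started-tutorial-admin/"
       "prep_data_cleaned_data:21d88ff9876f4fd192733e4324566e3d")
print(split_store_uri(uri))
```

To read the data itself, use the MLRun APIs shown in the next section rather than parsing the URI.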

<a id="gs-tutorial-1-read-output"></a>

### Reading the Output

The data-set location is returned in the `outputs` field.
Therefore, you can get the location by calling `prep_data_run.outputs['cleaned_data']` and using `get_dataitem` to get the data set itself.

In [13]:
dataset = mlrun.run.get_dataitem(prep_data_run.outputs['cleaned_data'])

You can also get the data as a pandas DataFrame by calling the `dataset.as_df` method:

In [14]:
dataset.as_df()

Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),label
0,5.1,3.5,1.4,0.2,0
1,4.9,3.0,1.4,0.2,0
2,4.7,3.2,1.3,0.2,0
3,4.6,3.1,1.5,0.2,0
4,5.0,3.6,1.4,0.2,0
...,...,...,...,...,...
145,6.7,3.0,5.2,2.3,2
146,6.3,2.5,5.0,1.9,2
147,6.5,3.0,5.2,2.0,2
148,6.2,3.4,5.4,2.3,2


<a id="gs-tutorial-1-save-artifcats-in-run-specific-paths"></a>

### Saving the Artifacts in Run-Specific Paths

In the previous steps, each time the function was executed its artifacts were saved to the same directory, overwriting the existing artifacts in this directory.
However, you can also choose to save the run results (the source-data file) to a different directory for each job execution.
This is done by setting the artifacts path and using the unique run-ID parameter (`{{run.uid}}`) in the path.
Now, under the artifact path you should be able to see the source-data file in a new directory whose name is derived from the unique run ID.

In [15]:
out = artifact_path 

prep_data_run = data_prep_func.run(name='prep_data',
                         handler=prep_data,
                         inputs={'source_url': source_url},
                         local=True,
                         artifact_path=path.join(out, '{{run.uid}}'))

> 2021-01-24 19:10:17,849 [info] starting run prep_data uid=2fb2eb6b79b647a1aefa06130bacbd93 DB=http://mlrun-api:8080


project,uid,iter,start,state,name,labels,inputs,parameters,results,artifacts
getting-started-tutorial-admin,...0bacbd93,0,Jan 24 19:10:17,completed,prep_data,v3io_user=admin kind= owner=admin host=jupyter-5669bcbf8f-fdfnf,source_url,,num_rows=150,cleaned_data


to track results use .show() or .logs() or in CLI: 
!mlrun get run 2fb2eb6b79b647a1aefa06130bacbd93 --project getting-started-tutorial-admin , !mlrun logs 2fb2eb6b79b647a1aefa06130bacbd93 --project getting-started-tutorial-admin
> 2021-01-24 19:10:18,154 [info] run executed, status=completed


<a id="gs-tutorial-1-step-run-func-on-cluster"></a>

<a id="gs-tutorial-1-run-function-on-cluster"></a>

## Step 3: Running the Function on a Cluster

You can also run MLRun functions on the cluster itself, as opposed to running them locally in the Jupyter pod, as done in the previous steps.
Running a function on the cluster allows you to leverage the cluster's resources and run more resource-intensive workloads.
MLRun helps you to easily run your code without the hassle of creating configuration files and building images.
To run an MLRun function on a cluster, just change the value of the `local` flag in the call to the `run` method to `False`.

In [16]:
from mlrun import mount_v3io

In [17]:
data_prep_func.apply(mount_v3io())
prep_data_run = data_prep_func.run(name='prep_data',
                                   handler='prep_data',
                                   inputs={'source_url': source_url},
                                   local=False)

> 2021-01-24 19:10:18,167 [info] starting run prep_data uid=11306e62a4674764a9197c8a4975b07f DB=http://mlrun-api:8080
> 2021-01-24 19:10:18,295 [info] Job is running in the background, pod: prep-data-ngkmb
> 2021-01-24 19:10:22,410 [info] run executed, status=completed
final state: completed


project,uid,iter,start,state,name,labels,inputs,parameters,results,artifacts
getting-started-tutorial-admin,...4975b07f,0,Jan 24 19:10:22,completed,prep_data,v3io_user=admin kind=job owner=admin host=prep-data-ngkmb,source_url,,num_rows=150,cleaned_data


to track results use .show() or .logs() or in CLI: 
!mlrun get run 11306e62a4674764a9197c8a4975b07f --project getting-started-tutorial-admin , !mlrun logs 11306e62a4674764a9197c8a4975b07f --project getting-started-tutorial-admin
> 2021-01-24 19:10:24,403 [info] run executed, status=completed


In [18]:
print(prep_data_run.outputs)

{'num_rows': 150, 'cleaned_data': 'store://artifacts/getting-started-tutorial-admin/prep_data_cleaned_data:11306e62a4674764a9197c8a4975b07f'}


<a id="gs-tutorial-1-step-ui-jobs-view"></a>

## Step 4: Viewing Jobs on the Dashboard (UI)

On the **Projects** dashboard page, select your project and then navigate to the project's jobs and workflow page by selecting the relevant link.
For this tutorial, after running the `prep_data` function three times (twice locally and once on the cluster), you should see three records with types local (**&lt;&gt;**) and job.
In this view you can track all jobs running in your project and view detailed job information.
Select a job name to display tabs with additional information such as an input data set, artifacts that were generated by the job, and execution results and logs. 

<img src="./images/Jobs.jpg" alt="Jobs" width="800"/>

<a id="gs-tutorial-1-step-schedule-jobs"></a>

## Step 5: Scheduling Jobs

To schedule a job, you can set the `schedule` parameter of the `run` method.
The scheduling is done by using a crontab format.
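
A crontab expression has five fields: minute, hour, day of month, month, and day of week. As a minimal sketch of how a single field is interpreted, the following hypothetical `expand_cron_field` helper handles only `*`, `*/step`, and plain numbers (real cron parsers support more syntax, such as ranges and lists):

```python
def expand_cron_field(field, low, high):
    """Return the values in [low, high] that a simple cron field matches."""
    if field == "*":
        return list(range(low, high + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(low, high + 1, step))
    return [int(field)]


minute, hour, day, month, weekday = "*/30 * * * *".split()
print(expand_cron_field(minute, 0, 59))  # prints [0, 30]
print(len(expand_cron_field(hour, 0, 23)))  # prints 24
```

The `*/30` minute field matches minutes 0 and 30, and the `*` hour field matches all 24 hours, so the expression triggers every 30 minutes.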

You can also schedule jobs from the dashboard: on the jobs and monitoring project page, you can create a new job using the **New Job** wizard.
At the end of the wizard flow you can set the job scheduling.
In the following example, the job is set to run every 30 minutes.

In [19]:
data_prep_func.apply(mount_v3io())
prep_data_run = data_prep_func.run(name='prep_data',
                                   handler='prep_data',
                                   inputs={'source_url': source_url},
                                   local=False,
                                   schedule='*/30 * * * *')

> 2021-01-24 19:10:24,418 [info] starting run prep_data uid=100fa88ccc8345b7a5957a93780a495b DB=http://mlrun-api:8080
> 2021-01-24 19:10:24,493 [info] task scheduled, {'schedule': '*/30 * * * *', 'project': 'getting-started-tutorial-admin', 'name': 'prep_data'}


In [20]:
print(mlrun.get_run_db().list_schedules(project_name))

schedules=[ScheduleOutput(name='prep_data', kind=<ScheduleKinds.job: 'job'>, scheduled_object={'task': {'spec': {'inputs': {'source_url': 'https://s3.wasabisys.com/iguazio/data/iris/iris.data.raw.csv'}, 'output_path': '/v3io/projects/getting-started-tutorial-admin/artifacts', 'function': 'getting-started-tutorial-admin/prep-data@3fcbd1ced979bc6224d31ea3a84b1be95f75844c', 'secret_sources': [], 'scrape_metrics': False, 'handler': 'prep_data'}, 'metadata': {'uid': '100fa88ccc8345b7a5957a93780a495b', 'name': 'prep_data', 'project': 'getting-started-tutorial-admin', 'labels': {'v3io_user': 'admin', 'kind': 'job', 'owner': 'admin'}, 'iteration': 0}, 'status': {'state': 'created'}}, 'schedule': '*/30 * * * *'}, cron_trigger=ScheduleCronTrigger(year=None, month='*', day='*', week=None, day_of_week='*', hour='*', minute='*/30', second=None, start_date=None, end_date=None, timezone=None, jitter=None), desired_state=None, labels={'v3io_user': 'admin', 'kind': 'job', 'owner': 'admin'}, creation_ti

In [21]:
mlrun.get_run_db().get_schedule(project_name, 'prep_data')

ScheduleOutput(name='prep_data', kind=<ScheduleKinds.job: 'job'>, scheduled_object={'task': {'spec': {'inputs': {'source_url': 'https://s3.wasabisys.com/iguazio/data/iris/iris.data.raw.csv'}, 'output_path': '/v3io/projects/getting-started-tutorial-admin/artifacts', 'function': 'getting-started-tutorial-admin/prep-data@3fcbd1ced979bc6224d31ea3a84b1be95f75844c', 'secret_sources': [], 'scrape_metrics': False, 'handler': 'prep_data'}, 'metadata': {'uid': '100fa88ccc8345b7a5957a93780a495b', 'name': 'prep_data', 'project': 'getting-started-tutorial-admin', 'labels': {'v3io_user': 'admin', 'kind': 'job', 'owner': 'admin'}, 'iteration': 0}, 'status': {'state': 'created'}}, 'schedule': '*/30 * * * *'}, cron_trigger=ScheduleCronTrigger(year=None, month='*', day='*', week=None, day_of_week='*', hour='*', minute='*/30', second=None, start_date=None, end_date=None, timezone=None, jitter=None), desired_state=None, labels={'v3io_user': 'admin', 'kind': 'job', 'owner': 'admin'}, creation_time=datetime

<img src="./images/func-schedule.JPG" alt="scheduled-jobs" width="1400"/>

> **Note:** Don't forget to remove the scheduled job.

Delete the job.

In [22]:
mlrun.get_run_db().delete_schedule(project_name, 'prep_data')

Verify that the scheduled job has been deleted.

In [23]:
#mlrun.get_run_db().get_schedule(project_name,'prep_data')

## Done!

Congratulations! You've completed Part 1 of the MLRun getting-started tutorial.
Proceed to [Part 2](02-model-training.ipynb) to learn how to train an ML model.