# Tutorial 1 - MLRun Basics

## Tutorial intro

The best way to learn how to work with Iguazio is through hands-on tutorials that cover the fundamentals of the platform and demonstrate how to create an end-to-end machine learning pipeline: collecting data, analyzing it, training a model, deploying and monitoring the model, and automating the whole process as a pipeline. <br>

The series consists of five tutorials. Note that each tutorial relies on the previous one.<br>
1. MLRun Basics
2. Model training
3. Model deployment
4. Create an automated pipeline
5. Working with CI/CD 


In this first tutorial you will learn the basics of working with functions in Iguazio using an open-source tool called MLRun. <br>
The tutorial starts with a general overview and then takes you through the following steps: <br>

* Step 1: Settings and basic configuration
* Step 2: Create a basic function and run it locally
* Step 3: Run the function on the cluster
* Step 4: View functions in the UI
* Step 5: Schedule functions

# Overview

## What is MLRun

MLRun is an end-to-end open-source MLOps solution for managing and automating your entire analytics and machine learning lifecycle, from data ingestion through model development to full pipeline deployment. <br>
MLRun runs as a built-in service in Iguazio and is well integrated with the other services in the platform. <br>
Its primary goal is to ease the development of machine learning pipelines at scale and to help organizations build a robust process for moving from the research phase to fully operational production.

## Challenge

As an ML developer or data scientist, you typically want to write code in your preferred local development environment (IDE) or web notebook, and then run the same code on a larger cluster using scale-out containers or functions. When you determine that the code is ready, you or someone else needs to transfer the code to an automated ML workflow (for example, using Kubeflow Pipelines). This pipeline should be secured and include capabilities such as logging and monitoring, as well as allow adjustments to relevant components and easy redeployment.

However, the implementation is challenging: various environments (“runtimes”) use different configurations, parameters, and data sources. In addition, multiple frameworks and platforms are used to focus on different stages of the development life cycle. This leads to constant development and DevOps/MLOps work.

Furthermore, as your project scales, you need greater computation power or GPUs, and you need to access large-scale data sets. This cannot work on laptops. You need a way to seamlessly run your code on a remote cluster and automatically scale it out.

## Why MLRun?

When running ML experiments, you should ideally be able to record and version your code, configuration, outputs, and associated inputs (lineage), so you can easily reproduce and explain your results. The fact that you probably need to use different types of storage (such as files and AWS S3 buckets) and various databases further complicates the implementation.

Wouldn’t it be great if you could write the code once, using your preferred development environment and simple “local” semantics, and then run it as-is on different platforms? Imagine a layer that automates the build process, execution, data movement, scaling, versioning, parameterization, outputs tracking, and more. A world of easily developed, published, or consumed data or ML “functions” that can be used to form complex and large-scale ML pipelines.

In addition, imagine a marketplace of ML functions that includes both open-source templates and your internally developed functions, to support code reuse across projects and companies and thus further accelerate your work.

## Basic components

MLRun has the following main components, which are usually grouped into “projects”:

* **Project** - a container for all your work on a particular activity. All the associated code, jobs, and artifacts are organized within the project. A project consists of metadata, source code, workflows, data and artifacts, models, triggers, and member management for user collaboration.

* **Function** - a software package with one or more methods and runtime-specific attributes (such as image, command, arguments, and environment).

* **Run** - contains information about an executed function. The run object is created as a result of running a function, and has attributes such as run parameters, inputs, and outputs, along with the execution status and results (including links to output artifacts).

* **Artifact** - versioned data artifacts (such as datasets, files, and models) that are produced or consumed by functions, runs, and workflows.

* **Workflow** - defines a pipeline of functions, or a directed acyclic graph (DAG), to execute using Kubeflow Pipelines.

* **UI** - displays and manages all experiments, artifacts, and code under their project.

## Step 1: Setup

### Setting up MLRun

MLRun is a built-in service in Iguazio. To start working with it, run the imports below and set up the following: <br>
* MLRun database - set it to the URL of the MLRun database/API service. You can find the URL on the services screen in the platform. <br>
* Artifact path - set `artifact_path` to the root folder in which you want to store your artifacts. <br>

Artifacts from each run are stored under the artifact path, which can be set globally through an environment variable (`MLRUN_ARTIFACT_PATH`) or through the config. If it's not already set, you can create a directory and use it in your runs. <br>
You can use `{{run.project}}` in the path to include the project name. <br>
Later on, when you run your jobs, you can use `{{run.uid}}` to include the UID of the specific run in the artifact path, which creates a unique directory per run. <br>
If the path doesn't include `{{run.uid}}`, each run of a job overwrites the artifacts of the previous run. <br>
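
As a rough illustration of how these placeholders behave, the sketch below expands `{{run.project}}` and `{{run.uid}}` with plain string replacement. The `expand_artifact_path` helper, the project name, and the UIDs are hypothetical and exist only for this example; MLRun performs the substitution itself when a run starts, and its actual implementation may differ.

```python
def expand_artifact_path(template: str, project: str, uid: str) -> str:
    """Illustrative stand-in for the template expansion that MLRun does itself."""
    return (template
            .replace('{{run.project}}', project)
            .replace('{{run.uid}}', uid))


# Hypothetical template that includes both placeholders
template = '/v3io/projects/{{run.project}}/artifacts/{{run.uid}}'

# Two runs of the same job resolve to two different directories,
# so the second run does not overwrite the artifacts of the first.
first = expand_artifact_path(template, 'my-project', 'uid-0001')
second = expand_artifact_path(template, 'my-project', 'uid-0002')

print(first)
print(second)
```

Without `{{run.uid}}` in the template, both calls would return the same directory, which is why the artifacts of consecutive runs would overwrite each other.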

### Prerequisite

If you have already installed MLRun, skip this step. Otherwise, run the script below. <br>
The script runs `pip install mlrun` with the MLRun version that matches the MLRun service running in the platform.

In [1]:
!/User/align_mlrun.sh

Both server & client are aligned (0.6.0rc9).


> **Note:** Restart the Jupyter kernel after running the above script.

Run imports

In [2]:
from os import environ, path
from mlrun import mlconf

### Artifacts setting

By default, all artifacts are stored in the "projects" data container, under a folder named after the project. <br>
You can change the artifact path by setting a different path: <br>
`mlconf.artifact_path = '/v3io/projects/<new name>'`

Show the artifact path

In [3]:
mlconf.artifact_path

'/v3io/projects/{{run.project}}/artifacts'

### Working from remote
This tutorial shows how to run your project in the platform's built-in Jupyter service. As a developer, however, you may prefer to work remotely from your own IDE (a local Jupyter, PyCharm, and so on). <br>
To learn how to work with a remote IDE, see https://mlrun.readthedocs.io/en/latest/remote.html

### Setting your project

Projects in the platform are used to package multiple functions, workflows, and artifacts.
Projects are created by using the `new_project` MLRun method, which receives the following parameters:

- **`name`** (Required) &mdash; the project name.
- **`context`** &mdash; the path to a local project directory (the project's context directory).
  The project directory contains a project-configuration file (default: **project.yaml**), which defines the project, and additional generated Python code.
  The project file is created when you save your project (using the `save` MLRun project method or when saving your first function within the project).
- **`functions`** &mdash; a list of function objects or links to function code or objects.
- **`init_git`** &mdash; set to `True` to perform Git initialization of the project directory (`context`).
  > **Note:** It's customary to store project code and definitions in a Git repository.

Projects are visible in the MLRun dashboard only after they're saved to the MLRun database, which happens whenever you run code for a project.

The following code creates a project named "getting-started-tutorial-&lt;V3IO_USERNAME&gt;", where **&lt;V3IO_USERNAME&gt;** is your current running username in the platform, and sets the project directory to a **conf** directory in the current tutorial directory (for example, **/User/getting-started-tutorial/conf**).

> **Note:** Platform projects are shared among all users of the parent tenant, to facilitate collaboration. Therefore,
>
> - Synchronize the execution of your projects with other users on your platform cluster, as needed, or use unique project names to avoid conflicts.
>   You can easily change the default project name for this tutorial by changing the definition of the `project_name` variable in the following code.
> - Don't include in your project proprietary information that you don't want to expose to other users.
>   Note that while projects are a useful tool, you can easily develop and run code in the platform without using projects.

In [4]:
from os import path, getenv
from mlrun import new_project

project_name = '-'.join(filter(None, ['getting-started-tutorial', getenv('V3IO_USERNAME', None)])).lower()
project_path = path.abspath('conf')
project = new_project(project_name, project_path, init_git=True)

print(f'Project path: {project_path}\nProject name: {project_name}')
print(f'Artifacts path: {mlconf.artifact_path}')

Project path: /User/nd/demos/getting-started-tutorial/conf
Project name: getting-started-tutorial-orz
Artifacts path: /v3io/projects/{{run.project}}/artifacts
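
The project-name expression in the cell above is compact, so here is a small standalone sketch of what it does. The `build_project_name` helper is hypothetical and only wraps the same expression so that it can be tried with and without a username.

```python
def build_project_name(base, username):
    # filter(None, ...) drops empty or None parts, so a missing
    # username does not leave a trailing '-' in the project name.
    return '-'.join(filter(None, [base, username])).lower()


# With a username, the name gets a per-user suffix (lower-cased)
with_user = build_project_name('getting-started-tutorial', 'Orz')

# Without a username (V3IO_USERNAME not set), only the base name remains
without_user = build_project_name('getting-started-tutorial', None)

print(with_user)
print(without_user)
```

The per-user suffix is what keeps project names unique when several users of the same tenant run the tutorial.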


<a id="step-2"></a>
## Step 2: Create a basic function

### Working with Functions 

An MLRun function is a software package with one or more methods and runtime-specific attributes (such as image, command, arguments, and environment). <br>
The function code is stored in the MLRun database and can be used to run jobs with a single function or as part of a pipeline. <br>
Each function is stored in the MLRun database with a unique hash code, and gets a new hash code whenever it changes. <br>
The function spec is kept as a YAML file and can be viewed through the API or the UI.<br>
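
The idea of a hash code that changes whenever the function changes can be sketched with a simple content hash. This is only an illustration under assumptions: the `spec_hash` helper, the choice of SHA-1 over a JSON dump, and the `my-registry/custom-image` name are all hypothetical, and MLRun's own hashing scheme may differ.

```python
import hashlib
import json


def spec_hash(spec: dict) -> str:
    # Serialize with sorted keys so that equal specs always give equal hashes
    payload = json.dumps(spec, sort_keys=True).encode('utf-8')
    return hashlib.sha1(payload).hexdigest()


spec_v1 = {'kind': 'job', 'image': 'mlrun/ml-models', 'handler': 'get_data'}

# Same content, different object: the hash stays the same
same_hash = spec_hash(dict(spec_v1)) == spec_hash(spec_v1)

# Any change to the spec (here, a hypothetical image name) gives a new hash
spec_v2 = dict(spec_v1, image='my-registry/custom-image')
changed_hash = spec_hash(spec_v2) != spec_hash(spec_v1)

print(same_hash, changed_hash)
```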

To work with functions, you need to be familiar with the following concepts:

* Context - MLRun introduces the concept of a runtime "context": the code can be set up to get parameters and inputs from the context, as well as log run outputs, artifacts, tags, and time-series metrics in the context.

* Parameters - the arguments that are passed to the function.

* Inputs - data objects, such as datasets, models, or files, which are passed to the function through a dedicated `inputs` argument. <br>
To send data items to a function, pass them through `inputs` and not as parameters.
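
To make the role of the context more concrete, the following sketch runs a handler against a tiny stand-in object. `StubContext` and `count_above` are hypothetical and written only for this illustration; the stub mimics just `get_param` and `log_result`, whereas in a real run MLRun creates the context and passes it to the handler for you.

```python
class StubContext:
    """Hypothetical stand-in for the MLRun context, for illustration only."""

    def __init__(self, parameters=None):
        self.parameters = parameters or {}
        self.results = {}

    def get_param(self, key, default=None):
        # Read a run parameter, falling back to a default value
        return self.parameters.get(key, default)

    def log_result(self, key, value):
        # Record a run result under the given key
        self.results[key] = value


def count_above(context, values):
    # The threshold is a run parameter that is read from the context
    threshold = context.get_param('threshold', 0.5)
    count = sum(1 for v in values if v > threshold)
    # Outputs are logged in the context instead of being returned
    context.log_result('count_above', count)


context = StubContext({'threshold': 0.5})
count_above(context, [0.1, 0.7, 0.9])
print(context.results)
```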


Let’s take a simple scenario. First, you have some code that reads either a CSV or a Parquet file and returns a DataFrame.

In [5]:
import pandas as pd

# Ingest a data set into the platform
def get_data(source_url):

    if source_url.endswith(".csv"):
        df = pd.read_csv(source_url)
    elif source_url.endswith(".parquet") or source_url.endswith(".pq"):
        df = pd.read_parquet(source_url)
    else:
        raise Exception(f"file type unhandled {source_url}")

    return df

Now, let's take this function and run it as an MLRun function. MLRun will be used to do the following:
* Handle the data read
* Log the data to the MLRun database

### Working with artifacts

An artifact is any data that is produced and/or consumed by functions or jobs.

The artifacts are stored in the project and are divided into three main types:

* Datasets - any data, such as tables and DataFrames.

* Plots - images, figures, and plotlines.

* Models - all trained models.

For detailed information about managing the different artifact types, see https://mlrun.readthedocs.io/en/latest/data-management-and-versioning.html


### Create and run your first MLRun function

To do this, add a `context` parameter to the function, which is used to log the artifacts. Because `source_url` is passed as an input, the function receives it as a data item and reads it into a DataFrame with `as_df()`. <br>
The function also saves the data as a CSV file by using `format='csv'`; other file formats, such as Parquet, can be used as well. <br>
The example logs the data as a dataset artifact, so the code now looks as follows:

In [6]:
def get_data(context, source_url, format='csv'):

    df = source_url.as_df()

    # Store the data set in your artifacts database
    context.log_dataset('source_data', df=df, format=format,
                        index=False)

As input, we will provide a CSV file from a cloud object store service called wasabisys:

In [7]:
# Set the source-data URL
source_url = 'https://s3.wasabisys.com/iguazio/data/iris/iris_dataset.csv'

### Convert the Notebook Code Into a Function

Use the MLRun `code_to_function` method to convert your notebook code into a project MLRun function: a function object with embedded code, which can run on the cluster. <br>
This will also come in handy later in the tutorials, when you create an automated pipeline. <br>
To identify the sections that need to be converted, use comment annotations and magic commands that start with `%nuclio` (not to be confused with the Nuclio serverless-function framework; the name is planned to change in future versions).
<br>
`%nuclio` magic commands and comment annotations (for example, `# nuclio: ignore`) provide non-intrusive hints as to how the notebook should be converted into a full function and spec.
Because a notebook often contains extra cells that should not be included in the actual function (prints, plots, tests, debug code, and so on), you can place `# nuclio: start-code` and `# nuclio: end-code` annotations before and after the relevant code cells, or simply add `# nuclio: ignore` at the top of each cell that you want to exclude.

>Note that you can use the `nuclio: start-code` and `nuclio: end-code` annotations only once in the same notebook.  <br>

If you want settings such as environment variables and package installations to appear automatically in the function spec, use the `%nuclio env` or `%nuclio cmd` commands, which copy themselves into the function spec. <br>

For more information about the annotations and magic commands, see https://github.com/nuclio/nuclio-jupyter#controlling-function-code-and-configuration <br>
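
The effect of the annotations can be pictured with a simplified model of cell selection. The `select_cells` helper below is hypothetical and is not the nuclio-jupyter implementation; it only mimics the behavior described above on a list of cell strings.

```python
START = '# nuclio: start-code'
END = '# nuclio: end-code'
IGNORE = '# nuclio: ignore'


def select_cells(cells):
    """Return the cells that would end up in the function code (simplified)."""
    selected = []
    # Without a start marker, cells are candidates from the first cell on
    inside = not any(cell.strip().startswith(START) for cell in cells)
    for cell in cells:
        text = cell.strip()
        if text.startswith(START):
            inside = True
        elif text.startswith(END):
            inside = False
        elif inside and not text.startswith(IGNORE):
            selected.append(cell)
    return selected


cells = [
    "print('debug output')",
    '# nuclio: start-code',
    'def get_data(context, source_url):\n    pass',
    "# nuclio: ignore\nprint('scratch work')",
    '# nuclio: end-code',
    'get_data_run.show()',
]

exported = select_cells(cells)
print(exported)
```

Only the `get_data` definition survives: the cells outside the start/end markers and the cell marked with `# nuclio: ignore` are left out.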

In [8]:
# nuclio: start-code

In [9]:
%nuclio config spec.image = "mlrun/ml-models"

%nuclio: setting spec.image to 'mlrun/ml-models'


In [10]:
def get_data(context, source_url, format='csv'):

    df = source_url.as_df()

    # Store the data set in your artifacts database
    context.log_dataset('source_data', df=df, format=format,
                        index=False)

In [11]:
# nuclio: end-code

The following code converts the code of your local `get_data` data-ingestion function into a `gen_data_func` project function. <br>
Note that you can use different engines to run your code, such as job (a Python process), spark, mpijob, nuclio, and dask. <br>
The example below uses `kind='job'` to run the code as a Python job. <br>
The `set_function` method then saves the function object in the project.

In [12]:
from mlrun import code_to_function

# Convert the annotated notebook code into a function object of kind 'job'
gen_data_func = code_to_function(name='get_data', kind='job')
# Save the function object in the project
project.set_function(gen_data_func)

<mlrun.runtimes.kubejob.KubejobRuntime at 0x7fd04bcb7490>

Next, call this function locally, meaning that it runs as part of the Jupyter pod. <br>
The execution results are stored in the MLRun database. <br>
The run examples in this step use the following parameters: <br>
* name = the job's name
* handler = the function handler, as defined above
* inputs = the URL of the dataset
* project = the project name <br>

By default, the artifact (the iris_dataset CSV file) is stored under the project's default artifact path (in this case, `/v3io/projects/<project name>/artifacts`). <br>
You can change this by using the `artifact_path` parameter to store the artifact in a different location.

Now run the function using the `run` method. Setting `local=True` runs the code "locally", within the Jupyter pod, meaning that it uses the environment variables, volumes, and image of this pod.
> Note that when running a "local" function, the function code is only saved in a temp folder and won't be saved in your project's ML Functions code repo.


In [13]:
get_data_run = gen_data_func.run(name='get_data',
                                 handler='get_data',
                                 inputs={'source_url': source_url},
                                 local=True)

> 2021-01-10 14:30:14,797 [info] starting run get_data uid=6272adfe98494221b2cd81dc5331faad DB=http://mlrun-api:8080


project,uid,iter,start,state,name,labels,inputs,parameters,results,artifacts
getting-started-tutorial-orz,...5331faad,0,Jan 10 14:30:14,completed,get_data,v3io_user=orzkind=owner=orzhost=jupyter-orz-6fd46d5d99-t2d5w,source_url,,,source_data


to track results use .show() or .logs() or in CLI: 
!mlrun get run 6272adfe98494221b2cd81dc5331faad --project getting-started-tutorial-orz , !mlrun logs 6272adfe98494221b2cd81dc5331faad --project getting-started-tutorial-orz
> 2021-01-10 14:30:15,246 [info] run executed, status=completed


### Getting info on the "run" object

Every run object (the result of a `.run()` call) has the following properties and methods:

`.uid()` - returns the unique ID <br>
`.state()` - returns the last known state <br>
`.show()` - shows the latest task state and data in a visual widget (with hyperlinks and hints) <br>
`.outputs` - returns a dict of the run results and artifact paths <br>
`.logs()` - returns the latest logs; use `watch=False` to disable interactive mode in running tasks <br>
`.artifact(key)` - returns the full artifact details <br>
`.output(key)` - returns a specific result or artifact (path) <br>
`.to_dict()`, `.to_yaml()`, `.to_json()` - convert the run object to dict/YAML/JSON

In [14]:
# example
get_data_run.state()

'completed'

In [15]:
get_data_run.outputs['source_data']

'store://getting-started-tutorial-orz/get_data_source_data#6272adfe98494221b2cd81dc5331faad'
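
The returned value is a store URI rather than a file path. The sketch below splits the string shown above into its visible parts. The `parse_store_uri` helper is hypothetical, and the reading of the parts (project name, artifact key, run UID) is inferred from this output only; MLRun resolves such URIs itself, so you normally don't need to parse them.

```python
def parse_store_uri(uri):
    """Split a 'store://<project>/<key>#<uid>' string into its parts."""
    prefix = 'store://'
    if not uri.startswith(prefix):
        raise ValueError(f'not a store URI: {uri}')
    # Everything after '#' is treated as the UID part
    path, _, uid = uri[len(prefix):].partition('#')
    # The first path segment is the project; the rest is the artifact key
    project, _, key = path.partition('/')
    return {'project': project, 'key': key, 'uid': uid}


parts = parse_store_uri(
    'store://getting-started-tutorial-orz/get_data_source_data'
    '#6272adfe98494221b2cd81dc5331faad'
)
print(parts)
```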

### Read the output

The dataset location is returned in the `outputs` field. You can therefore get the location by calling `get_data_run.outputs['source_data']`, and then use the `get_dataitem` function to get the dataset itself.

In [16]:
from mlrun.run import get_dataitem
dataset = get_dataitem(get_data_run.outputs['source_data'])

You can also get the data as a pandas DataFrame by calling `dataset.as_df()`.

In [17]:
dataset.as_df()

Unnamed: 0,sepal length (cm),sepal width (cm),petal length (cm),petal width (cm),label
0,5.1,3.5,1.4,0.2,0
1,4.9,3.0,1.4,0.2,0
2,4.7,3.2,1.3,0.2,0
3,4.6,3.1,1.5,0.2,0
4,5.0,3.6,1.4,0.2,0
...,...,...,...,...,...
145,6.7,3.0,5.2,2.3,2
146,6.3,2.5,5.0,1.9,2
147,6.5,3.0,5.2,2.0,2
148,6.2,3.4,5.4,2.3,2


### Saving the artifacts in a unique folder for each run

In the previous steps, each run of the function overwrote the existing artifact. <br>
In this step, the result (the source data file) is saved to a different folder for each job run. <br>
To do this, the artifact path is set to include the `{{run.uid}}` parameter. <br>
After the run, the `source_data` file should reside in a new folder under the artifact path.

In [18]:
out = mlconf.artifact_path

get_data_run = gen_data_func.run(name='get_data',
                                 handler=get_data,
                                 inputs={'source_url': source_url},
                                 project=project_name,
                                 local=True,
                                 artifact_path=path.join(out, '{{run.uid}}'))

> 2021-01-10 14:30:15,324 [info] starting run get_data uid=49091bc3a759416094f02047ce6c276c DB=http://mlrun-api:8080


project,uid,iter,start,state,name,labels,inputs,parameters,results,artifacts
getting-started-tutorial-orz,...ce6c276c,0,Jan 10 14:30:15,completed,get_data,v3io_user=orzkind=owner=orzhost=jupyter-orz-6fd46d5d99-t2d5w,source_url,,,source_data


to track results use .show() or .logs() or in CLI: 
!mlrun get run 49091bc3a759416094f02047ce6c276c --project getting-started-tutorial-orz , !mlrun logs 49091bc3a759416094f02047ce6c276c --project getting-started-tutorial-orz
> 2021-01-10 14:30:15,811 [info] run executed, status=completed


## Step 3: Run the Function on a Cluster

Now run the function on the cluster itself, as opposed to running it locally in the Jupyter pod as in the previous step. <br>
This lets you leverage the cluster resources and run more resource-intensive workloads. <br>
MLRun helps you run your code easily, without the hassle of creating YAML files and building images. <br>
To run on the cluster, all you need to do is change the `local` flag to `False`. <br>

When the function runs as a pod in Kubernetes, the output needs to be written to a shared file system where the artifact path resides. <br>
To do this, add `apply(mount_v3io())` to attach an Iguazio Data Science Platform data volume (a.k.a. a "v3io volume") to the project function. <br>
This connects your function to the platform's shared file system and allows you to pass data to and from the platform. <br>
You can find more information about the various mount options in the next tutorial.

In [19]:
from mlrun import mount_v3io

In [20]:
get_data_func = project.func('get-data').apply(mount_v3io())
get_data_run = get_data_func.run(name='get_data',
                                 handler='get_data',
                                 inputs={'source_url': source_url},
                                 local=False)

> 2021-01-10 14:30:15,837 [info] starting run get_data uid=dc9e6e5e17d24b559571fe8c4b116ccb DB=http://mlrun-api:8080
> 2021-01-10 14:30:15,990 [info] Job is running in the background, pod: get-data-5dr4f
> 2021-01-10 14:30:19,801 [info] run executed, status=completed
final state: completed


project,uid,iter,start,state,name,labels,inputs,parameters,results,artifacts
getting-started-tutorial-orz,...4b116ccb,0,Jan 10 14:30:19,completed,get_data,v3io_user=orzkind=jobowner=orzhost=get-data-5dr4f,source_url,,,source_data


to track results use .show() or .logs() or in CLI: 
!mlrun get run dc9e6e5e17d24b559571fe8c4b116ccb --project getting-started-tutorial-orz , !mlrun logs dc9e6e5e17d24b559571fe8c4b116ccb --project getting-started-tutorial-orz
> 2021-01-10 14:30:22,107 [info] run executed, status=completed


In [21]:
print(gen_data_func.to_yaml())

kind: job
metadata:
  name: get-data
  tag: ''
  project: getting-started-tutorial-orz
spec:
  command: ''
  args: []
  image: mlrun/ml-models
  volumes:
  - flexVolume:
      driver: v3io/fuse
      options:
        accessKey: 51316d7e-86f1-4fe3-84be-2ce0135f46a1
    name: v3io
  volume_mounts:
  - mountPath: /v3io
    name: v3io
    subPath: ''
  - mountPath: /User
    name: v3io
    subPath: users/orz
  env:
  - name: V3IO_API
    value: v3io-webapi.default-tenant.svc:8081
  - name: V3IO_USERNAME
    value: orz
  - name: V3IO_ACCESS_KEY
    value: 51316d7e-86f1-4fe3-84be-2ce0135f46a1
  default_handler: ''
  entry_points:
    get_data:
      name: get_data
      doc: ''
      parameters:
      - name: context
        default: ''
      - name: source_url
        default: ''
      - name: format
        default: csv
      outputs:
      - default: ''
      lineno: 3
  description: ''
  build:
    functionSourceCode: IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlcgoKZGVmIGdldF

## Step 4: View jobs in the UI

Go to the Iguazio dashboard, select your project on the projects screen, and then go to the jobs and workflow screen by clicking the link in the left-hand menu. <br>
In this case, after running `get_data` three times (twice locally and once on the cluster), you should see three records: two of type local ("<>") and one of type job. <br>
In this view you can track all the jobs running in your project, along with detailed information. <br>
Clicking the name of a job opens several tabs with additional information, such as the input dataset, the artifacts that the job generated, results, and logs. <br>

<img src="./images/Jobs.jpg" alt="Jobs" width="800"/>

## Step 5: Schedule jobs

To schedule a job, set the `schedule` parameter of the `run` method. The schedule is specified in crontab format. <br>
Another way to schedule a job is from the UI: on the jobs and monitoring screen, you can create a job using the "New Job" wizard and set the schedule at the end of the wizard. <br>
In the example below, the job is set to run every 30 minutes.

In [22]:
get_data_func = project.func('get-data').apply(mount_v3io())
get_data_run = get_data_func.run(name='get_data',
                                 handler='get_data',
                                 local=False,
                                 inputs={'source_url': source_url},
                                 schedule='*/30 * * * *')

> 2021-01-10 14:30:22,124 [info] starting run get_data uid=29b1601fb0b6470d8ba9d238b688a267 DB=http://mlrun-api:8080
> 2021-01-10 14:30:22,215 [info] task scheduled, {'schedule': '*/30 * * * *', 'project': 'getting-started-tutorial-orz', 'name': 'get_data'}
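
To read the schedule string used above, it helps to split it into the five standard crontab fields. The sketch below does this and expands the minute field. The `minutes_matching` helper is hypothetical and handles only the `*`, `*/N`, and single-number forms; it is not how MLRun parses schedules.

```python
def minutes_matching(field):
    """Expand a cron minute field of the form '*', '*/N' or 'M' into minutes."""
    if field == '*':
        return list(range(60))
    if field.startswith('*/'):
        step = int(field[2:])
        return list(range(0, 60, step))
    return [int(field)]


schedule = '*/30 * * * *'

# Standard crontab field order
minute, hour, day_of_month, month, day_of_week = schedule.split()

# '*/30' in the minute field means minute 0 and minute 30 of every hour
run_minutes = minutes_matching(minute)
print(run_minutes)
```

Because all the other fields are `*`, the job runs at these minutes of every hour, on every day.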


In [23]:
from mlrun import get_run_db
print(get_run_db().list_schedules(project_name))

schedules=[ScheduleOutput(name='get_data', kind=<ScheduleKinds.job: 'job'>, scheduled_object={'task': {'spec': {'inputs': {'source_url': 'https://s3.wasabisys.com/iguazio/data/iris/iris_dataset.csv'}, 'output_path': '/v3io/projects/getting-started-tutorial-orz/artifacts', 'function': 'getting-started-tutorial-orz/get-data@103f17c4662aa76b87deeb35554315a40fabd7be', 'secret_sources': [], 'scrape_metrics': False, 'handler': 'get_data'}, 'metadata': {'uid': '29b1601fb0b6470d8ba9d238b688a267', 'name': 'get_data', 'project': 'getting-started-tutorial-orz', 'labels': {'v3io_user': 'orz', 'kind': 'job', 'owner': 'orz'}, 'iteration': 0}, 'status': {'state': 'created'}}, 'schedule': '*/30 * * * *'}, cron_trigger=ScheduleCronTrigger(year=None, month='*', day='*', week=None, day_of_week='*', hour='*', minute='*/30', second=None, start_date=None, end_date=None, timezone=None, jitter=None), desired_state=None, labels={'v3io_user': 'orz', 'kind': 'job', 'owner': 'orz'}, creation_time=datetime.datetim

In [24]:
get_run_db().get_schedule(project_name, 'get_data')

ScheduleOutput(name='get_data', kind=<ScheduleKinds.job: 'job'>, scheduled_object={'task': {'spec': {'inputs': {'source_url': 'https://s3.wasabisys.com/iguazio/data/iris/iris_dataset.csv'}, 'output_path': '/v3io/projects/getting-started-tutorial-orz/artifacts', 'function': 'getting-started-tutorial-orz/get-data@103f17c4662aa76b87deeb35554315a40fabd7be', 'secret_sources': [], 'scrape_metrics': False, 'handler': 'get_data'}, 'metadata': {'uid': '29b1601fb0b6470d8ba9d238b688a267', 'name': 'get_data', 'project': 'getting-started-tutorial-orz', 'labels': {'v3io_user': 'orz', 'kind': 'job', 'owner': 'orz'}, 'iteration': 0}, 'status': {'state': 'created'}}, 'schedule': '*/30 * * * *'}, cron_trigger=ScheduleCronTrigger(year=None, month='*', day='*', week=None, day_of_week='*', hour='*', minute='*/30', second=None, start_date=None, end_date=None, timezone=None, jitter=None), desired_state=None, labels={'v3io_user': 'orz', 'kind': 'job', 'owner': 'orz'}, creation_time=datetime.datetime(2021, 1, 

<img src="./images/func-schedule.JPG" alt="scheduled-jobs" width="1400"/>

## Don't forget to remove the scheduled job

Delete the job

In [25]:
get_run_db().delete_schedule(project_name, 'get_data')

Verify that the scheduled job has been deleted

In [26]:
# Uncomment to verify; the call should fail if the schedule no longer exists
#get_run_db().get_schedule(project_name,'get_data')

## Save Your Project Configuration

Use the `save` MLRun project method to save your project definitions to a project-configuration file in your project directory (**conf**, in this tutorial).
The default name of the project file is **project.yaml**, but you can optionally change it by setting the `filepath` parameter of the `save` method.

In [27]:
project.save() 

## Done!

Congratulations! You've completed tutorial 1 of the Iguazio Data Science Platform.
Go to [Tutorial 2](tutorial-2-Model-training.ipynb) to learn about model training.