# Publishing a Pipeline

In the [previous lab](labdocs/Lab06A.md), you created a pipeline. Now you're going to publish it as a service.

## Connect to Your Workspace

The first thing you need to do is to connect to your workspace using the Azure ML SDK.

> **Note**: If the authenticated session with your Azure subscription has expired since you completed the previous exercise, you'll be prompted to reauthenticate.

In [1]:
import azureml.core
from azureml.core import Workspace

# Load the workspace from the saved config file
ws = Workspace.from_config()
print('Ready to use Azure ML {} to work with {}'.format(azureml.core.VERSION, ws.name))

Ready to use Azure ML 1.2.0 to work with myaml


## Publish the Pipeline

After you've created and tested a pipeline, you can publish it as a REST service. You ran the pipeline in the [previous lab](labdocs/Lab06A.md), so you can get a reference to that run and use it to publish the pipeline.

In [4]:
# Get the most recent run of the pipeline
experiment_name = 'diabetes-training-pipeline'
pipeline_experiment = ws.experiments.get(experiment_name)
list(pipeline_experiment.get_runs())[0]

Experiment,Id,Type,Status,Details Page,Docs Page
diabetes-training-pipeline,66fcea11-3038-4dc3-8930-3880c16fb897,azureml.PipelineRun,Completed,Link to Azure Machine Learning studio,Link to Documentation


In [5]:
# Get the most recent run of the pipeline
experiment_name = 'diabetes-training-pipeline'
pipeline_experiment = ws.experiments.get(experiment_name)
pipeline_run = list(pipeline_experiment.get_runs())[0]

# Publish the pipeline from the run
published_pipeline = pipeline_run.publish_pipeline(
    name="Diabetes_Training_Pipeline", description="Trains diabetes model", version="1.0")

published_pipeline

Name,Id,Status,Endpoint
Diabetes_Training_Pipeline,24bfc568-d27e-48e4-a12d-57ea07ad9909,Active,REST Endpoint


Note that the published pipeline has an endpoint, which you can see in the **Endpoints** page (on the **Pipeline Endpoints** tab) in [Azure Machine Learning studio](https://ml.azure.com). You can also find its URI as a property of the published pipeline object:

In [6]:
rest_endpoint = published_pipeline.endpoint
print(rest_endpoint)

https://northeurope.api.azureml.ms/pipelines/v1.0/subscriptions/46926bff-fe7d-4284-bc62-eafdda8d8f2c/resourceGroups/DataSienceSolutionAzure/providers/Microsoft.MachineLearningServices/workspaces/myaml/PipelineRuns/PipelineSubmit/24bfc568-d27e-48e4-a12d-57ea07ad9909


## Use the Pipeline Endpoint

To use the endpoint, client applications make a REST call over HTTP. The request must be authenticated, so it needs an authorization header. A production application would typically authenticate with a service principal, but to test the endpoint here, we'll use the authorization header from your current connection to your Azure workspace, which you can get using the following code:

In [7]:
from azureml.core.authentication import InteractiveLoginAuthentication

interactive_auth = InteractiveLoginAuthentication()
auth_header = interactive_auth.get_authentication_header()
print("Authentication header ready.")

Authentication header ready.
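
To make the term "authorization header" concrete, here is a minimal sketch of its usual shape: a dictionary with a single `Authorization` entry carrying a bearer token. The `bearer_header` helper and the placeholder token are hypothetical and are not part of the Azure ML SDK; in this lab the real header comes from `get_authentication_header()`.

```python
def bearer_header(access_token):
    """Return an HTTP header dict carrying a bearer token (illustrative helper)."""
    if not access_token:
        raise ValueError("access_token must be a non-empty string")
    return {"Authorization": "Bearer " + access_token}


# Placeholder value only; a real token is issued by Azure Active Directory
example_header = bearer_header("<access-token>")
print(example_header)
```

A dictionary of this shape is what the `headers` argument of `requests.post` expects.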


Now we're ready to call the REST interface. The pipeline runs asynchronously, so we'll get an identifier back, which we can use to track the pipeline experiment as it runs:

In [9]:
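import requests

# Submit the published pipeline through its REST endpoint
rest_endpoint = published_pipeline.endpoint
response = requests.post(rest_endpoint,
                         headers=auth_header,
                         json={"ExperimentName": experiment_name})
response.raise_for_status()  # Fail early if the request was rejected
run_id = response.json()["Id"]
run_id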

'71de73b6-31b8-4af4-ace4-33d9670a5c48'
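
The request body above contains only the experiment name. If a published pipeline defines pipeline parameters, the same request can, as far as the documented request format goes, carry values for them under a `ParameterAssignments` key. The sketch below only builds the JSON body and sends nothing. The `build_submit_payload` helper and the `reg_rate` parameter are hypothetical; the pipeline in this lab defines no parameters.

```python
import json


def build_submit_payload(experiment_name, parameters=None):
    """Build the JSON body for a pipeline submission request (illustrative helper)."""
    if not experiment_name:
        raise ValueError("experiment_name must be a non-empty string")
    payload = {"ExperimentName": experiment_name}
    if parameters:
        # Maps pipeline parameter names to the values for this run
        payload["ParameterAssignments"] = dict(parameters)
    return payload


# 'reg_rate' is a made-up parameter name used only for illustration
body = build_submit_payload("diabetes-training-pipeline", {"reg_rate": 0.05})
print(json.dumps(body, indent=2))
```

The resulting dictionary would be passed as the `json` argument of `requests.post`, in place of the literal used in the cell above.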

Since we have the run ID, we can use the **RunDetails** widget to view the experiment as it runs.

> **Note**: The pipeline should complete quickly, because each step was configured to allow output reuse. This was done primarily for convenience and to save time in this course. In reality, you'd likely want the first step to run every time in case the data has changed, and trigger the subsequent steps only if the output from step one changes.


**allow_reuse**

Indicates whether the step should reuse previous results when re-run with the same settings. Reuse is enabled by default. If the step contents (scripts/dependencies) as well as inputs and parameters remain unchanged, the output from the previous run of this step is reused. When reusing the step, instead of submitting the job to compute, the results from the previous run are immediately made available to any subsequent steps. If you use Azure Machine Learning datasets as inputs, reuse is determined by whether the dataset's definition has changed, not by whether the underlying data has changed.
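
The reuse rule can be pictured as a cache keyed on everything that defines a step. The following is a toy model, not Azure ML's actual implementation: `step_fingerprint` and `StepCache` are hypothetical names, and the dataset input is represented by its definition (a name and version string) rather than its data, mirroring the behavior described above.

```python
import hashlib
import json


def step_fingerprint(script_text, inputs, parameters):
    """Hash the script contents, inputs, and parameters of a step."""
    blob = json.dumps(
        {"script": script_text, "inputs": inputs, "parameters": parameters},
        sort_keys=True,
    )
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


class StepCache:
    """Toy cache: reuse a step's output when its fingerprint is unchanged."""

    def __init__(self):
        self._outputs = {}

    def run(self, script_text, inputs, parameters, execute):
        key = step_fingerprint(script_text, inputs, parameters)
        if key in self._outputs:
            return self._outputs[key], True  # reused, nothing executed
        output = execute()
        self._outputs[key] = output
        return output, False


calls = []


def train():
    calls.append(1)
    return "model-v1"


cache = StepCache()
first = cache.run("train.py contents", {"dataset": "diabetes:v1"}, {"reg": 0.1}, train)
second = cache.run("train.py contents", {"dataset": "diabetes:v1"}, {"reg": 0.1}, train)
third = cache.run("train.py contents", {"dataset": "diabetes:v1"}, {"reg": 0.2}, train)
print(first[1], second[1], third[1], len(calls))
```

In this model the second call is served from the cache because nothing changed, while the third call executes again because a parameter differs. New rows added to the data behind `diabetes:v1` would not change the fingerprint, which is why reuse can hide data changes.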

In [10]:
from azureml.pipeline.core.run import PipelineRun
from azureml.widgets import RunDetails

published_pipeline_run = PipelineRun(ws.experiments[experiment_name], run_id)
RunDetails(published_pipeline_run).show()

_PipelineWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', …

## Schedule the Pipeline

Suppose the clinic for the diabetes patients collects new data each week, and adds it to the dataset. You could run the pipeline every week to retrain the model with the new data.

In [11]:
from azureml.pipeline.core import ScheduleRecurrence, Schedule

# Submit the Pipeline every Monday at 00:00 UTC
recurrence = ScheduleRecurrence(frequency="Week", interval=1, week_days=["Monday"], time_of_day="00:00")
weekly_schedule = Schedule.create(ws, name="weekly-diabetes-training", 
                                  description="Based on time",
                                  pipeline_id=published_pipeline.id, 
                                  experiment_name=experiment_name, 
                                  recurrence=recurrence)
print('Pipeline scheduled.')

Pipeline scheduled.
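
To see what this recurrence means in practice, the sketch below computes the next trigger time for a weekly schedule using only the standard library. It is an illustration of the schedule's meaning, assuming times are in UTC as in the comment above; `next_weekly_run` is a hypothetical helper, not an SDK function.

```python
from datetime import datetime, timedelta, timezone

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def next_weekly_run(now, week_day, time_of_day):
    """Return the first moment after `now` that falls on week_day at time_of_day."""
    hour, minute = (int(part) for part in time_of_day.split(":"))
    days_ahead = (WEEKDAYS.index(week_day) - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so move to next week
        candidate += timedelta(days=7)
    return candidate


# A Monday evening: the next Monday 00:00 is one week later
now = datetime(2020, 4, 13, 20, 19, tzinfo=timezone.utc)
print(next_weekly_run(now, "Monday", "00:00"))
```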


You can retrieve the schedules that are defined in the workspace like this:

In [12]:
schedules = Schedule.list(ws)
schedules

[Pipeline(Name: weekly-diabetes-training,
 Id: c9e8c75e-61b4-48bf-81b3-377918f7a05d,
 Status: Active,
 Pipeline Id: 24bfc568-d27e-48e4-a12d-57ea07ad9909,
 Recurrence Details: Runs at 0:00 on Monday every Week)]

You can check the latest run like this:

In [13]:
pipeline_experiment = ws.experiments.get(experiment_name)
latest_run = list(pipeline_experiment.get_runs())[0]

latest_run.get_details()

{'runId': '411c465d-044c-496a-85af-71490974c070',
 'status': 'Completed',
 'startTimeUtc': '2020-04-13T20:19:34.459037Z',
 'endTimeUtc': '2020-04-13T20:19:39.234436Z',
 'properties': {'azureml.git.repository_uri': 'https://github.com/shoresh57/AI-ML-Workshop-Azure',
  'mlflow.source.git.repoURL': 'https://github.com/shoresh57/AI-ML-Workshop-Azure',
  'azureml.git.branch': 'master',
  'mlflow.source.git.branch': 'master',
  'azureml.git.commit': 'aaaab80ae2032fe66fde1dc3097315aba884e729',
  'mlflow.source.git.commit': 'aaaab80ae2032fe66fde1dc3097315aba884e729',
  'azureml.git.dirty': 'True',
  'azureml.runsource': 'azureml.PipelineRun',
  'runSource': 'Unavailable',
  'runType': 'Schedule',
  'azureml.parameters': '{}',
  'azureml.pipelineid': '24bfc568-d27e-48e4-a12d-57ea07ad9909'},
 'inputDatasets': [],
 'logFiles': {'logs/azureml/executionlogs.txt': 'https://myaml2155572532.blob.core.windows.net/azureml/ExperimentRun/dcid.411c465d-044c-496a-85af-71490974c070/logs/azureml/executionlog

### Disable the schedule

> It's best practice to disable schedules that are not in use. Each subscription is limited to 100,000 schedule triggers per month per region, and this figure is calculated from the projected trigger counts of all active schedules.
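
As a rough illustration of how active schedules count against the 100,000-trigger limit, the sketch below estimates monthly triggers per schedule from its frequency and interval. The exact way the service projects trigger counts is not described here, so the formula (a 31-day month, rounded up) and the `projected_monthly_triggers` helper are assumptions.

```python
import math

MONTHLY_TRIGGER_LIMIT = 100_000  # per month, per region, per subscription
MINUTES_PER_PERIOD = {"Minute": 1, "Hour": 60, "Day": 24 * 60, "Week": 7 * 24 * 60}


def projected_monthly_triggers(frequency, interval=1, days_in_month=31):
    """Estimate how often a schedule fires in one month (assumed formula)."""
    if frequency == "Month":
        return 1
    minutes_in_month = days_in_month * 24 * 60
    return math.ceil(minutes_in_month / (MINUTES_PER_PERIOD[frequency] * interval))


# One weekly schedule plus three forgotten per-minute schedules
active = [("Week", 1), ("Minute", 1), ("Minute", 1), ("Minute", 1)]
total = sum(projected_monthly_triggers(freq, interval) for freq, interval in active)
print(total, total > MONTHLY_TRIGGER_LIMIT)
```

Under these assumptions the weekly schedule contributes only a handful of triggers, while a few high-frequency schedules left active are enough to exceed the limit.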

In [22]:
# Print each schedule ID and keep the ID of the recurrence-based schedule
for schedule in schedules:
    print(schedule.id)
    if schedule.recurrence is not None:
        schedule_id = schedule.id

c9e8c75e-61b4-48bf-81b3-377918f7a05d


In [25]:
fetched_schedule = Schedule.get(ws, schedule_id)
print("Using schedule with id: {}".format(fetched_schedule.id))

Using schedule with id: c9e8c75e-61b4-48bf-81b3-377918f7a05d


In [26]:
fetched_schedule.disable(wait_for_provisioning=True)
fetched_schedule = Schedule.get(ws, schedule_id)
print("Disabled schedule {}. New status is: {}".format(fetched_schedule.id, fetched_schedule.status))

Provisioning status: Completed
Disabled schedule c9e8c75e-61b4-48bf-81b3-377918f7a05d. New status is: Disabled


In [27]:
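# List schedules again; Schedule.list returns only active schedules by default,
# so the disabled schedule no longer appears
schedules = Schedule.list(ws)
schedules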

[]

> **More Information**: You can find out more about scheduling pipelines in the [documentation](https://docs.microsoft.com/azure/machine-learning/how-to-schedule-pipelines).