# Scoring Pipeline

This notebook builds a scoring pipeline that applies a registered model to new data and produces a set of predictions.

In [1]:
from azureml.core import Workspace, Dataset, Datastore, Experiment
from azureml.data.data_reference import DataReference
from azureml.pipeline.core import PipelineData, PipelineParameter, Pipeline
from azureml.pipeline.steps import PythonScriptStep, DataTransferStep
from azureml.core.runconfig import RunConfiguration, CondaDependencies, DEFAULT_CPU_IMAGE
from azureml.core.compute import AmlCompute

In [2]:
DATASTOR = "workspaceblobstore"

In [3]:
ws = Workspace.from_config()
dstor = Datastore.get(ws, DATASTOR)
experiment = Experiment(ws, "airlift")
aml_compute_target = AmlCompute(ws, name="onenode-cpu")

If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.
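The warning above concerns unattended runs, where nobody can complete an interactive login. A minimal sketch of the service principal route could look like the following. The environment variable names (`AML_TENANT_ID`, `AML_SP_ID`, `AML_SP_PASSWORD`) and both helper functions are assumptions made for illustration and are not part of the original notebook; only `ServicePrincipalAuthentication` and `Workspace.from_config(auth=...)` come from azureml-sdk.

```python
import os


def read_sp_settings(env=os.environ):
    """Collect service principal settings from environment variables.

    The variable names are hypothetical; use whatever your scheduler
    or secret store actually provides.
    """
    names = {
        "tenant_id": "AML_TENANT_ID",
        "service_principal_id": "AML_SP_ID",
        "service_principal_password": "AML_SP_PASSWORD",
    }
    # Fail early with a clear message if any setting is absent or empty.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}


def get_workspace_unattended(env=os.environ):
    """Load the workspace from config.json without an interactive login."""
    # Imported lazily so read_sp_settings works without azureml installed.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(**read_sp_settings(env))
    return Workspace.from_config(auth=auth)
```

With this in place, `ws = get_workspace_unattended()` would stand in for `Workspace.from_config()` when the notebook runs on a schedule.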


In [4]:
conda_run_config = RunConfiguration(framework="python")
conda_run_config.target = aml_compute_target
conda_run_config.environment.docker.enabled = True
conda_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE
cd = CondaDependencies.create(
    pip_packages=["azureml-sdk[automl]", "scikit-learn", "pandas==0.24"],
    conda_packages=["numpy"],
)
conda_run_config.environment.python.conda_dependencies = cd

In [5]:
target_name_param = PipelineParameter(name="target_name", default_value="BOUGHT_CATEGORY_2")
scored_data = PipelineData(name="scored_data", datastore=dstor)
dataset_to_score = "AirliftDataset-Score"

In [6]:
score_step = PythonScriptStep(
    script_name="score.py",
    name="score_data",
    arguments=[
        "--model_id", target_name_param,
        "--dataset_to_score", dataset_to_score,
        "--output_scores", scored_data,
    ],
    compute_target=aml_compute_target,
    runconfig=conda_run_config,
    outputs=[scored_data],
)
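The notebook does not show `score.py`, so its contents are unknown. As a rough sketch of the script's command-line interface only, the three arguments passed by the step could be read with `argparse` as below. At run time the pipeline parameter arrives as a plain string and the `PipelineData` object as a directory path on the compute target. The function names and the `scores.csv` file name are hypothetical, and the model loading and scoring logic is deliberately left out.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the arguments that the score_data step passes to the script."""
    parser = argparse.ArgumentParser(
        description="Score a dataset with a registered model."
    )
    # Value of the target_name pipeline parameter, e.g. "BOUGHT_CATEGORY_3".
    parser.add_argument("--model_id", required=True)
    # Name of the registered dataset to score.
    parser.add_argument("--dataset_to_score", required=True)
    # Directory that backs the scored_data PipelineData output.
    parser.add_argument("--output_scores", required=True)
    return parser.parse_args(argv)


def prepare_output_path(output_dir, file_name="scores.csv"):
    """Create the output directory if needed and return the file path.

    The default file name is an assumption made for this sketch.
    """
    os.makedirs(output_dir, exist_ok=True)
    return os.path.join(output_dir, file_name)
```

Inside the script, `args = parse_args()` followed by `prepare_output_path(args.output_scores)` would give a location to write predictions to; anything written under that directory is what the pipeline stores as `scored_data`.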

In [7]:
#tosql_step = DataTransferStep()

In [8]:
pipeline = Pipeline(workspace=ws, steps=[score_step])

In [9]:
experiment.submit(pipeline, pipeline_params={"target_name": "BOUGHT_CATEGORY_3"})

Created step score_data [014b8aeb][1fd9e495-ded3-4bc0-b770-c34204f72bfa], (This step will run and generate new outputs)
Submitted pipeline run: 43247aee-9c8b-4708-b8e9-022c9e6c136c


| Experiment | Id | Type | Status | Details Page | Docs Page |
|---|---|---|---|---|---|
| airlift | 43247aee-9c8b-4708-b8e9-022c9e6c136c | azureml.PipelineRun | NotStarted | Link to Azure Portal | Link to Documentation |
