# Deploy to a Batch Endpoint

Once a model is trained, it can be deployed to serve predictions, either one request at a time in real time or in batches. This notebook runs batch inference with the model we trained earlier.

My workspace details (subscription ID, resource group, and workspace name) are all stored in a file called config.py. Make sure your own values are in that file before running the cells below.
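
For reference, a config.py along these lines would satisfy the cells below. The three attribute names match how `config` is used in this notebook; the values are placeholders I made up, so substitute your own.

```python
# config.py (sketch): the values below are placeholders, not real identifiers.
subscription_id = "<your-subscription-id>"
resource_group = "<your-resource-group>"
workspace = "<your-workspace-name>"
```

Since this file identifies your Azure resources, it is probably worth keeping it out of version control.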

In [18]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# My subscription id, resource group and workspace are all in the file below.
import config

In [19]:
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id=config.subscription_id,
    resource_group_name=config.resource_group,
    workspace_name=config.workspace,
)

In [20]:
model = ml_client.models.get(name="mlflow-titanic", version="1")

In [21]:
import datetime

endpoint_name = "batch-" + datetime.datetime.now().strftime("%m%d%H%M%f")
endpoint_name

'batch-08262104305772'
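
To make the timestamp suffix easier to read, here is a small sketch of the same naming scheme. `make_endpoint_name` is a hypothetical helper, and the year in the fixed example date is arbitrary because the format string does not include it. The format is month (`%m`), day (`%d`), hour (`%H`), minute (`%M`), and microseconds (`%f`); seconds are not part of it.

```python
import datetime


def make_endpoint_name(prefix="batch", now=None):
    """Build '<prefix>-<MMDDHHMM><microseconds>' like the cell above."""
    if now is None:
        now = datetime.datetime.now()
    return prefix + "-" + now.strftime("%m%d%H%M%f")


# A fixed timestamp (year chosen arbitrarily) that reproduces the name above.
fixed = datetime.datetime(2023, 8, 26, 21, 4, 30, 305772)
example_name = make_endpoint_name(now=fixed)
print(example_name)  # batch-08262104305772
```

The timestamp is there only to make the name unlikely to collide with an existing endpoint; any other unique suffix would do.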

In [22]:
from azure.ai.ml.entities import BatchEndpoint

# create a batch endpoint
endpoint = BatchEndpoint(
    name=endpoint_name,
    description="A batch endpoint for classifying survivors in titanic",
)

ml_client.batch_endpoints.begin_create_or_update(endpoint)

<azure.core.polling._poller.LROPoller at 0x7fabb4c2f8b0>
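
The `LROPoller` printed above means that `begin_create_or_update` only starts the operation and returns before the endpoint necessarily exists. If a later cell runs too early, it may fail because the endpoint is not ready yet. A minimal sketch of waiting on a poller follows; `wait_for` and `FakePoller` are hypothetical names used so the snippet runs without Azure, while `result()` and `done()` are methods of the real `LROPoller`.

```python
def wait_for(poller, timeout=None):
    """Block until a long-running operation finishes and return its result."""
    return poller.result(timeout=timeout)


class FakePoller:
    """Stand-in for azure.core.polling.LROPoller, for illustration only."""

    def __init__(self, value):
        self._value = value
        self._done = False

    def result(self, timeout=None):
        # A real poller would block here until the service reports completion.
        self._done = True
        return self._value

    def done(self):
        return self._done


poller = FakePoller("endpoint-created")
created = wait_for(poller)
print(created, poller.done())
```

With the real client, the equivalent would be `ml_client.batch_endpoints.begin_create_or_update(endpoint).result()`.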

In [29]:
from azure.ai.ml.entities import BatchDeployment, BatchRetrySettings
from azure.ai.ml.constants import BatchDeploymentOutputAction

deployment = BatchDeployment(
    name="classifier-titanic-mlflow",
    description="A titanic classifier",
    endpoint_name=endpoint.name,
    model=model,
    compute="sckaraman1",
    instance_count=1,
    max_concurrency_per_instance=2,
    mini_batch_size=1,
    output_action=BatchDeploymentOutputAction.APPEND_ROW,
    output_file_name="predictions.csv",
    retry_settings=BatchRetrySettings(max_retries=3, timeout=300),
    logging_level="info",
)
ml_client.batch_deployments.begin_create_or_update(deployment)

<azure.core.polling._poller.LROPoller at 0x7fabace4e350>
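
To get a feel for the parallelism settings above: as I understand them, `mini_batch_size` is the number of input files handed to one scoring call, and up to `instance_count * max_concurrency_per_instance` mini-batches can run at the same time. The sketch below only does that arithmetic. `plan_batches` is a hypothetical helper, the file count of 10 is made up, and the "waves" figure is a simplification that assumes every mini-batch takes the same time.

```python
import math


def plan_batches(n_files, mini_batch_size, instance_count, max_concurrency_per_instance):
    """Return (number of mini-batches, parallel slots, sequential waves)."""
    n_mini_batches = math.ceil(n_files / mini_batch_size)
    parallel_slots = instance_count * max_concurrency_per_instance
    waves = math.ceil(n_mini_batches / parallel_slots)
    return n_mini_batches, parallel_slots, waves


# Same settings as the deployment above, with an assumed 10 input files.
plan = plan_batches(
    n_files=10, mini_batch_size=1, instance_count=1, max_concurrency_per_instance=2
)
print(plan)  # (10, 2, 5)
```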

In [30]:
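# Make this deployment the endpoint's default, so that invoking the endpoint
# without naming a deployment routes the job to it.
endpoint.defaults = {}
endpoint.defaults["deployment_name"] = deployment.name

ml_client.batch_endpoints.begin_create_or_update(endpoint)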

<azure.core.polling._poller.LROPoller at 0x7fabacf83100>

## Let's prepare the data for prediction

In [1]:
# This cell is optional. I needed an unlabeled dataset to predict on, so I built one with the code below.
# Feel free to skip it, or adapt it to your own data.

# import pandas as pd
# import numpy as np

# df = pd.read_csv('data/titanic_ds.csv')

# df_toPredict = pd.DataFrame(columns=df.columns)
# df_toPredict = df_toPredict.drop('Survived', axis=1)

# for i in df_toPredict.columns:
#     df_toPredict[i] = np.random.choice(df[i].values, 10)
    
# df_toPredict.to_csv('data2/data_toPredict.csv')
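
The commented code above builds the unlabeled set by sampling each feature column independently, so the rows are synthetic combinations rather than real passengers. The sketch below shows the same idea on a toy table using only the standard library instead of pandas; `sample_unlabeled` is a hypothetical helper and the three rows are invented for illustration.

```python
import random

# Toy stand-in for data/titanic_ds.csv; the values are invented.
rows = [
    {"Survived": 0, "Pclass": 3, "Sex": "male", "Age": 22.0},
    {"Survived": 1, "Pclass": 1, "Sex": "female", "Age": 38.0},
    {"Survived": 1, "Pclass": 3, "Sex": "female", "Age": 26.0},
]


def sample_unlabeled(rows, n, label="Survived", seed=0):
    """Draw n synthetic rows by sampling each feature column independently."""
    rng = random.Random(seed)
    features = [col for col in rows[0] if col != label]
    # Sample with replacement from each column's observed values.
    columns = {col: rng.choices([r[col] for r in rows], k=n) for col in features}
    return [{col: columns[col][i] for col in features} for i in range(n)]


to_predict = sample_unlabeled(rows, n=10)
print(len(to_predict), sorted(to_predict[0]))
```

The label column is dropped before sampling, which mirrors the `drop('Survived', axis=1)` step.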

In [31]:
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

data_path = "./data2"
dataset_name = "titanic-data-unlabeled"

titanic_dataset_unlabeled = Data(
    path=data_path,
    type=AssetTypes.URI_FOLDER,
    description="An unlabeled dataset for titanic classification",
    name=dataset_name,
)
ml_client.data.create_or_update(titanic_dataset_unlabeled)

# Fetch the registered asset back so that its ID can be used as the job input.
titanic_dataset_unlabeled = ml_client.data.get(name=dataset_name, label="latest")

Uploading data2 (0.0 MBs): 100%|██████████| 319/319 [00:00<00:00, 33501.50it/s]

In [32]:
from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes

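# The folder asset registered above is the input to the batch job.
job_input = Input(type=AssetTypes.URI_FOLDER, path=titanic_dataset_unlabeled.id)

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
    input=job_input,
)

# Look up the job that the invocation started.
ml_client.jobs.get(job.name)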

| Experiment | Name | Type | Status | Details Page |
|---|---|---|---|---|
| batch-08262104305772 | batchjob-15f11832-cb7a-4138-9820-b48c06d26ab3 | pipeline | Preparing | Link to Azure Machine Learning studio |
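
The job above is still in the Preparing state, so the predictions are not available yet. One way to wait is to poll the job status until it reaches an end state. The sketch below is a minimal version of that loop; `wait_for_job` is a hypothetical helper, the set of terminal states is my assumption about the status strings the service reports, and the status sequence is simulated so the snippet runs without a workspace.

```python
import time

# Assumed end states; check the statuses your jobs actually report.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_job(get_status, poll_seconds=30, max_polls=120):
    """Call get_status() until it returns a terminal state, then return it."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("job did not reach a terminal state in time")


# Simulated status sequence standing in for repeated calls to the service.
fake_statuses = iter(["Preparing", "Running", "Completed"])
final_status = wait_for_job(lambda: next(fake_statuses), poll_seconds=0)
print(final_status)  # Completed
```

Against the real workspace, the status callback could be `lambda: ml_client.jobs.get(job.name).status`. Alternatively, `ml_client.jobs.stream(job.name)` streams the logs and blocks until the job ends.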
