## **Running batch predictions**

Now that we have a trained model, we can run predictions with it in two ways:
* Running batch transformations with a dataset stored in S3
* Running real-time inferences through APIs to a SageMaker Endpoint

This example uses the batch transformation method, running inference on data stored in S3.

### **Inference with SageMaker batch transformation**

In [None]:
# Location of your inference data in Amazon S3
# Change this for your own bucket name:
bucket = 'rodzanto2020ml'
# Change this for the location of the data to run inference on:
prefix = 'mediaset/final'
inference_files = 'infer'
# Change this for the name of your AutoML job:
auto_ml_job_name = 'automl-ms-shuf-sdk-24-21-26-31'

In [None]:
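import sagemaker
import boto3
from sagemaker import get_execution_role
# Used below for timestamped names and for polling the transform job
from time import gmtime, strftime, sleep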

region = boto3.Session().region_name
session = sagemaker.Session()
role = get_execution_role()

# This is the client we will use to interact with SageMaker Autopilot
sm = boto3.Session().client(service_name='sagemaker', region_name=region)

In [None]:
best_candidate = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)['BestCandidate']
best_candidate_name = best_candidate['CandidateName']
#print(best_candidate)
print("CandidateName: " + best_candidate_name)
print("FinalAutoMLJobObjectiveMetricName: " + best_candidate['FinalAutoMLJobObjectiveMetric']['MetricName'])
print("FinalAutoMLJobObjectiveMetricValue: " + str(best_candidate['FinalAutoMLJobObjectiveMetric']['Value']))
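
The cell above assumes the AutoML job has already produced a best candidate. As a minimal sketch, the hypothetical helper below (`summarize_best_candidate` is not part of boto3) reads the same fields from a `describe_auto_ml_job` response and raises a clear error when `BestCandidate` is missing, for example while the job is still running.

```python
def summarize_best_candidate(job_description):
    """Return (candidate name, metric name, metric value) from a
    describe_auto_ml_job response dict.

    Hypothetical helper: it only reads keys already used in this notebook,
    plus 'AutoMLJobStatus' for the error message.
    """
    candidate = job_description.get('BestCandidate')
    if candidate is None:
        status = job_description.get('AutoMLJobStatus', 'unknown')
        raise RuntimeError(
            "No best candidate available (job status: {})".format(status))
    metric = candidate['FinalAutoMLJobObjectiveMetric']
    return candidate['CandidateName'], metric['MetricName'], metric['Value']
```

In the notebook it could be called as `summarize_best_candidate(sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name))`.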

In [None]:
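# Build a unique model name from the best candidate name and a timestamp
timestamp_suffix = strftime('%d-%H-%M-%S', gmtime())
model_name = best_candidate_name + "-" + timestamp_suffix + "-model"
# Register the candidate's inference containers as a SageMaker model
model = sm.create_model(Containers=best_candidate['InferenceContainers'],
                        ModelName=model_name,
                        ExecutionRoleArn=role)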
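
Candidate names generated by Autopilot can be long, and appending a timestamp and suffix may push the model name past the service limit. The sketch below assumes a 63-character limit and an alphanumeric-plus-hyphen naming rule for `ModelName`; check both against the current `CreateModel` API reference. `build_model_name` is a hypothetical helper, not a SageMaker function.

```python
import re
from time import gmtime, strftime

# Assumed limit for SageMaker model names; verify against the CreateModel docs.
MAX_MODEL_NAME_LEN = 63

def build_model_name(candidate_name, suffix="-model", now=None):
    """Build '<candidate>-<dd-HH-MM-SS><suffix>', truncating the candidate
    part so the whole name fits within MAX_MODEL_NAME_LEN."""
    timestamp = strftime('%d-%H-%M-%S', now if now is not None else gmtime())
    tail = "-" + timestamp + suffix
    # Keep the timestamp and suffix intact; shorten only the candidate name.
    head = candidate_name[:MAX_MODEL_NAME_LEN - len(tail)].rstrip("-")
    name = head + tail
    if not re.fullmatch(r"[a-zA-Z0-9]([a-zA-Z0-9-]*[a-zA-Z0-9])?", name):
        raise ValueError("Invalid model name: {!r}".format(name))
    return name
```

With this helper, the cell above could set `model_name = build_model_name(best_candidate_name)` before calling `create_model`.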

In [None]:
transform_output = 's3://{}/{}/infer-results/'.format(bucket, prefix)

transformer = sagemaker.transformer.Transformer(model_name=model_name,
                                                instance_count=1,
                                                instance_type='ml.m5.xlarge',
                                                output_path=transform_output)

In [None]:
input_data_transform = 's3://{}/{}/{}'.format(bucket, prefix, inference_files)

transformer.transform(data=input_data_transform, split_type='Line', content_type='text/csv', wait=False)
print("Starting transform job {}".format(transformer._current_job_name))
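
Batch transform writes one result object per input object, named after the input file with `.out` appended, under the configured output path. As a rough sketch, the hypothetical helper below (`expected_output_uri` is not an SDK function) predicts where a given input file's predictions should land. It assumes the input objects sit directly under the input prefix and does not handle nested folders.

```python
import posixpath

def expected_output_uri(input_uri, output_path):
    """Return the S3 URI where batch transform is expected to write the
    predictions for a single input object (input file name + '.out')."""
    if not input_uri.startswith("s3://") or not output_path.startswith("s3://"):
        raise ValueError("Both arguments must be s3:// URIs")
    file_name = posixpath.basename(input_uri)
    if not file_name:
        raise ValueError("input_uri must point to an object, not a prefix")
    return output_path.rstrip("/") + "/" + file_name + ".out"
```

For example, an input object `data.csv` (a made-up file name) under the inference prefix would map to `data.csv.out` under `transform_output`.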

In [None]:
print('Batch Transform JobStatus')
print('------------------------------')

describe_response = sm.describe_transform_job(TransformJobName=transformer._current_job_name)
job_run_status = describe_response['TransformJobStatus']
print(strftime('%d-%H-%M-%S', gmtime()) + " - " + job_run_status)

# Poll every 30 seconds until the job reaches a terminal state
while job_run_status not in ('Failed', 'Completed', 'Stopped'):
    sleep(30)
    describe_response = sm.describe_transform_job(TransformJobName=transformer._current_job_name)
    job_run_status = describe_response['TransformJobStatus']
    print(strftime('%d-%H-%M-%S', gmtime()) + " - " + job_run_status)
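
The polling loop above runs until the job reaches a terminal state, with no upper bound and no report of why a job failed. The sketch below is one possible variant with a timeout that surfaces `FailureReason` from the `describe_transform_job` response. `wait_for_transform_job` and its parameters are hypothetical names; the `describe` callable is injected so the logic can be exercised without AWS access.

```python
from time import monotonic, sleep

TERMINAL_STATES = ('Failed', 'Completed', 'Stopped')

def wait_for_transform_job(describe, job_name, poll_seconds=30,
                           timeout_seconds=3600, sleep_fn=sleep, clock=monotonic):
    """Poll describe(TransformJobName=job_name) until the job reaches a
    terminal state, then return the last response.

    Raises RuntimeError if the job failed and TimeoutError if the deadline
    passes while the job is still running.
    """
    deadline = clock() + timeout_seconds
    while True:
        response = describe(TransformJobName=job_name)
        status = response['TransformJobStatus']
        if status == 'Failed':
            reason = response.get('FailureReason', 'no reason reported')
            raise RuntimeError("Transform job {} failed: {}".format(job_name, reason))
        if status in TERMINAL_STATES:
            return response
        if clock() >= deadline:
            raise TimeoutError("Transform job {} still {} after {}s".format(
                job_name, status, timeout_seconds))
        sleep_fn(poll_seconds)
```

In this notebook it could be called as `wait_for_transform_job(sm.describe_transform_job, transformer._current_job_name)`.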