A typical data science project lifecycle goes through many phases. The first phase is to understand the business and the data, followed by data preparation, model training, model evaluation, and production deployment. These phases are not simple. They’re tedious, repetitive, and limited to a few skilled individuals like researchers, data scientists, and ML engineers.

In practice, data scientists [spend a significant portion of their time doing repetitive ](https://cloudzone.io/what-is-automl-and-why-consider-it/)work like cleaning, sourcing, and preparing data for training a model.

The [demand for intelligent systems](https://business.linkedin.com/content/dam/me/business/en-us/talent-solutions/emerging-jobs-report/Emerging_Jobs_Report_U.S._FINAL.pdf) has continued to grow, yet most organizations lack the resource to keep up with the demand. The fact that many businesses are competing for the scarce talents in Artificial Intelligent (AI) field only made it worse. 

This is where AutoML becomes crucial. AutoML allows you to automate and apply ML to a business problem. It provides access to AI development for those without theoretical knowledge.

This post will look at Amazon’s AutoML offering (Amazon SageMaker Autopilot) and apply it to train a model for detecting credit card fraud.

### Amazon SageMaker Autopilot

Announced in 2019, [SageMaker Autopilot](https://medium.com/@samuelabiodun/training-a-model-to-detect-credit-card-fraud-with-amazon-sagemaker-autopilot-d49a6b667b2e) is Amazon’s AutoML offering that allows you to build, train end-to-end ML models. It simplifies your tasks by exploring your data, finds the optimum algorithm that best suits your data, and prepares the data for model tuning and training. 

The autopilot generates notebooks for each trial as it tries to pick the best algorithm for your data. The notebooks provide detailed visibility into how data was wrangled as well as the model selection process. In addition, the notebooks contain educational materials that allow you to learn about the training process and how to train your own experiments. It indirectly teaches you theoretical knowledge. 

![](https://d1.awsstatic.com/SageMaker/SageMaker%20reInvent%202020/Autopilot/product-page-diagram_SageMaker_Auto-Pilot_dk-bg%402x.e2d27caf8ec3224f1498d904aee630f61c847359.png)

#### Some Benefits of Amazon SageMaker Autopilot

Some of the added benefits you get with SageMaker Autopilot include:

- Automatic training of ML models
- Automatic feature engineering
- Provides visibility into the training process
- Step-by-step instructions on how the candidates were selected
- You have the flexibility to tune your models if needs be, manually
- You can use it from the UI and the SDK, thereby allowing room for automation.

While you can use AutoML to automate Machine Learning (ML) processes for a variety of purposes, such as price predictions, churn prediction, risk assessment, and more.  In this tutorial, we’ll use Amazon SageMaker Autopilot to train a model for detecting fraudulent transactions.

### Automatic Model Training with Amazon SageMaker Autopilot

There are specific characteristics with fraudulent transactions that can be learned to differentiate them from legitimate transactions. We’ll use an existing public dataset hosted on [Kaggle](https://www.kaggle.com/mlg-ulb/creditcardfraud). The dataset can be found [here](https://www.kaggle.com/mlg-ulb/creditcardfraud). The dataset contains transactions made with credit cards in September 2013 by European cardholders.

In [1]:
!mkdir ~/.kaggle
!echo '{"username":"your-user-name","key":"your-key"}' >> ~/.kaggle/kaggle.json 
!chmod 600 /home/ec2-user/.kaggle/kaggle.json

Install Kaggle library using pip

In [2]:
!pip install kaggle



Run the code below to download the dataset from Kaggle.  We will do a bit of pre-processing.
Download credit card fraud detection dataset

In [3]:
!kaggle datasets download -d  "arockiaselciaa/creditcardcsv"

creditcardcsv.zip: Skipping, found more recently modified local copy (use --force to force download)


In [4]:
!ls -la

total 522960
drwxr-xr-x  5 ec2-user ec2-user      4096 Mar  7 16:57 .
drwx------ 23 ec2-user ec2-user      4096 Mar  7 14:44 ..
-rw-rw-r--  1 ec2-user ec2-user     52071 Feb 21 14:47 AutoPilotFraudCase.ipynb
-rw-rw-r--  1 ec2-user ec2-user 150828752 Nov 15 08:11 creditcard.csv
-rw-rw-r--  1 ec2-user ec2-user  69155672 Mar  7 14:58 creditcardcsv.zip
-rw-rw-r--  1 ec2-user ec2-user     18821 Mar  7 16:57 FraudCase.ipynb
-rw-rw-r--  1 ec2-user ec2-user    113922 Mar  7 16:53 inference_results.csv
drwxrwxr-x  2 ec2-user ec2-user      4096 Mar  7 15:00 .ipynb_checkpoints
drwx------  2 root     root         16384 Feb 20 13:41 lost+found
-rw-rw-r--  1 ec2-user ec2-user 159545952 Feb 20 14:40 processed_cerditcard.csv
drwxr-xr-x  2 ec2-user ec2-user      4096 Feb 20 13:41 .sparkmagic
-rw-rw-r--  1 ec2-user ec2-user  31057189 Mar  7 16:11 test_data.csv
-rw-rw-r--  1 ec2-user ec2-user 124692809 Mar  7 16:11 train_data.csv


In [5]:
!unzip -o creditcardcsv.zip

Archive:  creditcardcsv.zip
  inflating: creditcard.csv          


In [1]:
import sagemaker
import boto3
from sagemaker import get_execution_role

region = boto3.Session().region_name

session = sagemaker.Session()
bucket = session.default_bucket()
prefix = 'sagemaker/autopilot-fraud-case'

role = get_execution_role()
sm = boto3.Session().client(service_name='sagemaker',region_name=region)
dataset_path = './creditcard.csv'

Let's check the dataset to see what kind of data it contains. The AutoPilot  process usually takes some time, and it's a goo practice inspect the dataset starting a job

In [2]:
import pandas as pd

df = pd.read_csv(dataset_path)
pd.set_option('display.max_columns', 500)     # Make sure we can see all of the columns
pd.set_option('display.max_rows', 10)         # Keep the output on one page
df

Unnamed: 0,Time,V1,V2,V3,V4,V5,V6,V7,V8,V9,V10,V11,V12,V13,V14,V15,V16,V17,V18,V19,V20,V21,V22,V23,V24,V25,V26,V27,V28,Amount,Class
0,0.0,-1.359807,-0.072781,2.536347,1.378155,-0.338321,0.462388,0.239599,0.098698,0.363787,0.090794,-0.551600,-0.617801,-0.991390,-0.311169,1.468177,-0.470401,0.207971,0.025791,0.403993,0.251412,-0.018307,0.277838,-0.110474,0.066928,0.128539,-0.189115,0.133558,-0.021053,149.62,0
1,0.0,1.191857,0.266151,0.166480,0.448154,0.060018,-0.082361,-0.078803,0.085102,-0.255425,-0.166974,1.612727,1.065235,0.489095,-0.143772,0.635558,0.463917,-0.114805,-0.183361,-0.145783,-0.069083,-0.225775,-0.638672,0.101288,-0.339846,0.167170,0.125895,-0.008983,0.014724,2.69,0
2,1.0,-1.358354,-1.340163,1.773209,0.379780,-0.503198,1.800499,0.791461,0.247676,-1.514654,0.207643,0.624501,0.066084,0.717293,-0.165946,2.345865,-2.890083,1.109969,-0.121359,-2.261857,0.524980,0.247998,0.771679,0.909412,-0.689281,-0.327642,-0.139097,-0.055353,-0.059752,378.66,0
3,1.0,-0.966272,-0.185226,1.792993,-0.863291,-0.010309,1.247203,0.237609,0.377436,-1.387024,-0.054952,-0.226487,0.178228,0.507757,-0.287924,-0.631418,-1.059647,-0.684093,1.965775,-1.232622,-0.208038,-0.108300,0.005274,-0.190321,-1.175575,0.647376,-0.221929,0.062723,0.061458,123.50,0
4,2.0,-1.158233,0.877737,1.548718,0.403034,-0.407193,0.095921,0.592941,-0.270533,0.817739,0.753074,-0.822843,0.538196,1.345852,-1.119670,0.175121,-0.451449,-0.237033,-0.038195,0.803487,0.408542,-0.009431,0.798278,-0.137458,0.141267,-0.206010,0.502292,0.219422,0.215153,69.99,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
284802,172786.0,-11.881118,10.071785,-9.834783,-2.066656,-5.364473,-2.606837,-4.918215,7.305334,1.914428,4.356170,-1.593105,2.711941,-0.689256,4.626942,-0.924459,1.107641,1.991691,0.510632,-0.682920,1.475829,0.213454,0.111864,1.014480,-0.509348,1.436807,0.250034,0.943651,0.823731,0.77,0
284803,172787.0,-0.732789,-0.055080,2.035030,-0.738589,0.868229,1.058415,0.024330,0.294869,0.584800,-0.975926,-0.150189,0.915802,1.214756,-0.675143,1.164931,-0.711757,-0.025693,-1.221179,-1.545556,0.059616,0.214205,0.924384,0.012463,-1.016226,-0.606624,-0.395255,0.068472,-0.053527,24.79,0
284804,172788.0,1.919565,-0.301254,-3.249640,-0.557828,2.630515,3.031260,-0.296827,0.708417,0.432454,-0.484782,0.411614,0.063119,-0.183699,-0.510602,1.329284,0.140716,0.313502,0.395652,-0.577252,0.001396,0.232045,0.578229,-0.037501,0.640134,0.265745,-0.087371,0.004455,-0.026561,67.88,0
284805,172788.0,-0.240440,0.530483,0.702510,0.689799,-0.377961,0.623708,-0.686180,0.679145,0.392087,-0.399126,-1.933849,-0.962886,-1.042082,0.449624,1.962563,-0.608577,0.509928,1.113981,2.897849,0.127434,0.265245,0.800049,-0.163298,0.123205,-0.569159,0.546668,0.108821,0.104533,10.00,0


In [3]:
df = df.drop(['Amount', 'Time'], axis = 1)

### Prepare the dataset
 In this step, we’ll do a couple of things.

* Split the data into train and test data set
* The test data will be used to perform inference later on
* Upload the splits to s3 bucket


In [4]:
train_data = df.sample(frac=0.8, random_state=200)
test_data = df.drop(train_data.index)

test_data_no_y = test_data.drop(columns=['Class'])
test_data_no_y.head()

Unnamed: 0,V1,V2,V3,V4,V5,V6,V7,V8,V9,V10,V11,V12,V13,V14,V15,V16,V17,V18,V19,V20,V21,V22,V23,V24,V25,V26,V27,V28
0,-1.359807,-0.072781,2.536347,1.378155,-0.338321,0.462388,0.239599,0.098698,0.363787,0.090794,-0.5516,-0.617801,-0.99139,-0.311169,1.468177,-0.470401,0.207971,0.025791,0.403993,0.251412,-0.018307,0.277838,-0.110474,0.066928,0.128539,-0.189115,0.133558,-0.021053
2,-1.358354,-1.340163,1.773209,0.37978,-0.503198,1.800499,0.791461,0.247676,-1.514654,0.207643,0.624501,0.066084,0.717293,-0.165946,2.345865,-2.890083,1.109969,-0.121359,-2.261857,0.52498,0.247998,0.771679,0.909412,-0.689281,-0.327642,-0.139097,-0.055353,-0.059752
9,-0.338262,1.119593,1.044367,-0.222187,0.499361,-0.246761,0.651583,0.069539,-0.736727,-0.366846,1.017614,0.83639,1.006844,-0.443523,0.150219,0.739453,-0.54098,0.476677,0.451773,0.203711,-0.246914,-0.633753,-0.120794,-0.38505,-0.069733,0.094199,0.246219,0.083076
10,1.449044,-1.176339,0.91386,-1.375667,-1.971383,-0.629152,-1.423236,0.048456,-1.720408,1.626659,1.199644,-0.67144,-0.513947,-0.095045,0.23093,0.031967,0.253415,0.854344,-0.221365,-0.387226,-0.009302,0.313894,0.02774,0.500512,0.251367,-0.129478,0.04285,0.016253
13,1.069374,0.287722,0.828613,2.71252,-0.178398,0.337544,-0.096717,0.115982,-0.221083,0.46023,-0.773657,0.323387,-0.011076,-0.178485,-0.655564,-0.199925,0.124005,-0.980496,-0.982916,-0.153197,-0.036876,0.074412,-0.071407,0.104744,0.548265,0.104094,0.021491,0.021293


In [5]:
train_file = 'train_data.csv'
train_data.to_csv(train_file, index=False, header=True)
train_path = session.upload_data(path=train_file, key_prefix=prefix + "/train")
print(f"Upload file to {train_path}")

Upload file to s3://sagemaker-eu-west-1-xxxxxxxxxxxxxx/sagemaker/autopilot-fraud-case/train/train_data.csv


In [6]:
test_file = 'test_data.csv'
test_data_no_y.to_csv(test_file, index=False, header=False)
test_path = session.upload_data(path=test_file, key_prefix=prefix + "/test")
print(f"Upload file to {test_path}")

Upload file to s3://sagemaker-eu-west-1-xxxxxxxxxxxxxx/sagemaker/autopilot-fraud-case/test/test_data.csv


## Create a SageMaker Autopilot Job
To find the best performing model, we first need to configure the Autopilot's job, input data, and output data. SageMaker Autopilot analyses the dataset to develop a list of ML pipelines that should be tried on the data. It also performs features engineering on the data, such as feature transformation, on each dataset feature. Finally, it performs model tuning, where the top-performing pipeline is selected along.

In [7]:

auto_ml_job_config = {
        'CompletionCriteria': {
            'MaxCandidates': 5
        }
    }

input_data_config = [{
      'DataSource': {
        'S3DataSource': {
          'S3DataType': 'S3Prefix',
          'S3Uri': 's3://{}/{}/train'.format(bucket, prefix)
        }
      },
      'TargetAttributeName': 'Class'
    }
  ]

output_data_config = {
    'S3OutputPath': 's3://{}/{}/output/'.format(bucket, prefix)
  }

In [8]:
from time import gmtime, strftime, sleep
timestamp_suffix = strftime('%d-%H-%M-%S', gmtime())

auto_ml_job_name = f'automl-fraudcase-{timestamp_suffix}'
print(f"AutoMLJobName:{auto_ml_job_name}")

AutoMLJobName:automl-fraudcase-20-16-55-10


In [9]:
role = get_execution_role()
sm = boto3.Session().client(service_name='sagemaker',region_name=region)
sm.create_auto_ml_job(AutoMLJobName=auto_ml_job_name,
                      InputDataConfig=input_data_config,
                      OutputDataConfig=output_data_config,
                      AutoMLJobConfig=auto_ml_job_config,
                      RoleArn=role)

{'AutoMLJobArn': 'arn:aws:sagemaker:eu-west-1:xxxxxxxxxxxxxx:automl-job/automl-fraudcase-20-16-55-10',
 'ResponseMetadata': {'RequestId': 'cde1a0cd-2c9d-4bce-8faa-61f65aecb5ef',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'cde1a0cd-2c9d-4bce-8faa-61f65aecb5ef',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '99',
   'date': 'Sat, 20 Mar 2021 16:55:12 GMT'},
  'RetryAttempts': 0}}

You can monitor the progress by running the following code. 

In [10]:
print('JobStatus  Secondary Status')

print('-------------------------------------')
describe_response = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)
print(f"{describe_response['AutoMLJobStatus']} - {describe_response['AutoMLJobSecondaryStatus']}")

job_run_status = describe_response['AutoMLJobStatus']
      
while job_run_status not in ('Failed', 'Completed', 'Stopped'):
      
      describe_response = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)
      job_run_status = describe_response['AutoMLJobStatus']

      print(f"{describe_response['AutoMLJobStatus']} - {describe_response['AutoMLJobSecondaryStatus']}")
      sleep(30)

JobStatus  Secondary Status
-------------------------------------
InProgress - Starting
InProgress - Starting
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - AnalyzingData
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - FeatureEngineering
InProgress - Feature

## Training Results
If you want to see the training result, you can return information to the SageMaker job using the `describe_auto_ml_job` API.

In [11]:
best_candidate = sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)['BestCandidate']
best_candidate_name = best_candidate['CandidateName']

print(best_candidate)

print('\n')
print(f"CandidateName:{best_candidate_name}")
print(f"FinalAutoMLJobObjectiveMetricName: {best_candidate['FinalAutoMLJobObjectiveMetric']['MetricName']}")
print(f"FinalAutoMLJobObjectiveMetricValue: {str(best_candidate['FinalAutoMLJobObjectiveMetric']['Value'])}")

{'CandidateName': 'tuning-job-1-74d298e7a88e49eeab-002-1390c5d8', 'FinalAutoMLJobObjectiveMetric': {'MetricName': 'validation:f1', 'Value': 0.8329600095748901}, 'ObjectiveStatus': 'Succeeded', 'CandidateSteps': [{'CandidateStepType': 'AWS::SageMaker::ProcessingJob', 'CandidateStepArn': 'arn:aws:sagemaker:eu-west-1:xxxxxxxxxxxxxx:processing-job/db-1-71d0bf747fe54368865c7f8ed3005f07e7c076b4588946b9bdb5d08af0', 'CandidateStepName': 'db-1-71d0bf747fe54368865c7f8ed3005f07e7c076b4588946b9bdb5d08af0'}, {'CandidateStepType': 'AWS::SageMaker::TrainingJob', 'CandidateStepArn': 'arn:aws:sagemaker:eu-west-1:xxxxxxxxxxxxxx:training-job/automl-fra-dpp1-1-4b44f9f9b51d4eb5a141ac2fbb7880368706514d90d94', 'CandidateStepName': 'automl-fra-dpp1-1-4b44f9f9b51d4eb5a141ac2fbb7880368706514d90d94'}, {'CandidateStepType': 'AWS::SageMaker::TransformJob', 'CandidateStepArn': 'arn:aws:sagemaker:eu-west-1:xxxxxxxxxxxxxx:transform-job/automl-fra-dpp1-csv-1-452c37f03db0411099e89cad078a212212637eb1c', 'CandidateStepNa

## Perform Batch Inference Using the Best Candidate
Having trained some models, here, we'll perform a batch inference job using the best candidate presented by the SageMaker Autopilot. The steps are as follows:
Create a model from the best candidate
Leverage batch inference using the Amazon SageMaker Autopilot job


In [12]:
model_name = 'automl-fraudcase-model-' + timestamp_suffix

model = sm.create_model(Containers=best_candidate['InferenceContainers'],
                            ModelName=model_name,
                            ExecutionRoleArn=role)

print('Model ARN corresponding to the best candidate is : {}'.format(model['ModelArn']))

Model ARN corresponding to the best candidate is : arn:aws:sagemaker:eu-west-1:xxxxxxxxxxxxxx:model/automl-fraudcase-model-20-16-55-10


In [13]:
transform_job_name = 'automl-fraudcase-transform-' + timestamp_suffix

transform_input = {
        'DataSource': {
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': test_path
            }
        },
        'ContentType': 'text/csv',
        'CompressionType': 'None',
        'SplitType': 'Line'
    }

transform_output = {
        'S3OutputPath': 's3://{}/{}/inference-results'.format(bucket, prefix),
    }

transform_resources = {
        'InstanceType': 'ml.m5.4xlarge',
        'InstanceCount': 1
    }

sm.create_transform_job(TransformJobName = transform_job_name,
                        ModelName = model_name,
                        TransformInput = transform_input,
                        TransformOutput = transform_output,
                        TransformResources = transform_resources
)



{'TransformJobArn': 'arn:aws:sagemaker:eu-west-1:xxxxxxxxxxxxxx:transform-job/automl-fraudcase-transform-20-16-55-10',
 'ResponseMetadata': {'RequestId': 'a3bea6f4-249c-48dc-affe-ab9592804155',
  'HTTPStatusCode': 200,
  'HTTPHeaders': {'x-amzn-requestid': 'a3bea6f4-249c-48dc-affe-ab9592804155',
   'content-type': 'application/x-amz-json-1.1',
   'content-length': '115',
   'date': 'Sat, 20 Mar 2021 17:30:19 GMT'},
  'RetryAttempts': 0}}

In [14]:
print ('JobStatus')
print('----------')


describe_response = sm.describe_transform_job(TransformJobName = transform_job_name)
job_run_status = describe_response['TransformJobStatus']
print (job_run_status)

while job_run_status not in ('Failed', 'Completed', 'Stopped'):
    describe_response = sm.describe_transform_job(TransformJobName = transform_job_name)
    job_run_status = describe_response['TransformJobStatus']
    print (job_run_status)
    sleep(30)

JobStatus
----------
InProgress
InProgress
InProgress
InProgress
InProgress
InProgress
InProgress
InProgress
InProgress
InProgress
InProgress
InProgress
Completed


In [15]:
s3_output_key = '{}/inference-results/test_data.csv.out'.format(prefix);
local_inference_results_path = 'inference_results.csv'

s3 = boto3.resource('s3')
inference_results_bucket = s3.Bucket(session.default_bucket())
inference_results_bucket.download_file(s3_output_key, local_inference_results_path);

data = pd.read_csv(local_inference_results_path, sep=';')
pd.set_option('display.max_rows', 10)         # Keep the output on one page
data


Unnamed: 0,0
0,0
1,0
2,0
3,0
4,0
...,...
56955,0
56956,0
56957,0
56958,0


In [16]:
candidates = sm.list_candidates_for_auto_ml_job(AutoMLJobName=auto_ml_job_name, SortBy='FinalObjectiveMetricValue')['Candidates']
index = 1
for candidate in candidates:
  print(f"{str(index)} {candidate['CandidateName']} {str(candidate['FinalAutoMLJobObjectiveMetric']['Value'])}")
  index += 1

1 tuning-job-1-74d298e7a88e49eeab-002-1390c5d8 0.8329600095748901
2 tuning-job-1-74d298e7a88e49eeab-003-28129b77 0.8329600095748901
3 tuning-job-1-74d298e7a88e49eeab-005-9c6bbeb2 0.7474700212478638
4 tuning-job-1-74d298e7a88e49eeab-001-1c4725b4 0.6255506873130798
5 tuning-job-1-74d298e7a88e49eeab-004-7a5f2215 0.3401360511779785


In [17]:
sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)['AutoMLJobArtifacts']['CandidateDefinitionNotebookLocation']


's3://sagemaker-eu-west-1-xxxxxxxxxxxxxx/sagemaker/autopilot-fraud-case/output/automl-fraudcase-20-16-55-10/sagemaker-automl-candidates/pr-1-767914be3a75482092d6f355391b55fcd738bcb28ae644f2bf212a108f/notebooks/SageMakerAutopilotCandidateDefinitionNotebook.ipynb'

In [18]:
sm.describe_auto_ml_job(AutoMLJobName=auto_ml_job_name)['AutoMLJobArtifacts']['DataExplorationNotebookLocation']

's3://sagemaker-eu-west-1-xxxxxxxxxxxxxx/sagemaker/autopilot-fraud-case/output/automl-fraudcase-20-16-55-10/sagemaker-automl-candidates/pr-1-767914be3a75482092d6f355391b55fcd738bcb28ae644f2bf212a108f/notebooks/SageMakerAutopilotDataExplorationNotebook.ipynb'

In [19]:
# Clean up
s3 = boto3.resource('s3')
bucket = s3.Bucket(bucket)

job_outputs_prefix = '{}/output/{}'.format(prefix,auto_ml_job_name)
bucket.objects.filter(Prefix=job_outputs_prefix).delete()

[{'ResponseMetadata': {'RequestId': '16JRCRZP5C0MDQ6E',
   'HostId': 'pt97LeaMBJlMV2aovp6HMRrLXWh40oidij7WU3HVAK49tZhe8et9grA53lgEYBZmgsBud0CdwvI=',
   'HTTPStatusCode': 200,
   'HTTPHeaders': {'x-amz-id-2': 'pt97LeaMBJlMV2aovp6HMRrLXWh40oidij7WU3HVAK49tZhe8et9grA53lgEYBZmgsBud0CdwvI=',
    'x-amz-request-id': '16JRCRZP5C0MDQ6E',
    'date': 'Sat, 20 Mar 2021 17:36:23 GMT',
    'content-type': 'application/xml',
    'transfer-encoding': 'chunked',
    'server': 'AmazonS3',
    'connection': 'close'},
   'RetryAttempts': 0},
  'Deleted': [{'Key': 'sagemaker/autopilot-fraud-case/output/automl-fraudcase-20-16-55-10/transformed-data/dpp3/csv/train/chunk_85.csv.out'},
   {'Key': 'sagemaker/autopilot-fraud-case/output/automl-fraudcase-20-16-55-10/transformed-data/dpp0/csv/validation/chunk_19.csv.out'},
   {'Key': 'sagemaker/autopilot-fraud-case/output/automl-fraudcase-20-16-55-10/transformed-data/dpp2/rpb/validation/chunk_2.csv.out'},
   {'Key': 'sagemaker/autopilot-fraud-case/output/automl-