### SageMaker large-scale prediction

In [1]:
import sagemaker
import boto3

sagemaker_session = sagemaker.Session()
account_id =  boto3.client('sts').get_caller_identity().get('Account')
region = boto3.session.Session().region_name


#role = sagemaker.get_execution_role()
role="arn:aws:iam::{}:role/service-role/AmazonSageMaker-ExecutionRole-20190118T115449".format(account_id)


In [2]:
pytorch_custom_image_name="ppi-extractor:cpu-1.0.0-202101050141"
instance_type = "ml.m5.large" 

In [3]:
docker_repo = "{}.dkr.ecr.{}.amazonaws.com/{}".format(account_id, region, pytorch_custom_image_name)

### Step 1: Convert pubtator format to inference json

The input PubTator files look like the sample below. They are converted to JSON for inference.

```text
20791654|a|Liver scan characteristics and liver function tests of 72 patients with proved hepatic malignancy (54 metastatic, 18 primary) were evaluated. Well-defined focal defects were observed in 83% of patients with metastatic and 77% of patients with primary liver carcinoma. In 10% of the patients with metastatic liver disease the distribution of radioactivity was normal. Four or more biochemical liver function tests were normal in 33% of metastatic and 29% of primary liver cancer patients. Hepatic enlargement was present in the scan in 94% of the patients with liver metastases; however, data obtained from 104 necropsies of patients with hepatic metastases showed that only 46% had hepatomegaly. We recommend, therefore, that a liver scan should be performed before major tumour surgery in every patient with known malignancy regardless of normal liver size or normal liver function tests.
20791654	58	66	patients	Species	9606
20791654	193	201	patients	Species	9606
20791654	229	237	patients	Species	9606
20791654	282	290	patients	Species	9606
20791654	478	486	patients	Species	9606
20791654	546	554	patients	Species	9606
20791654	624	632	patients	Species	9606
20791654	796	803	patient	Species	9606

20791817|a|5-Aminosalicylic acid given to rats as a single intravenous injection led to necrosis of the proximal convoluted tubules and of the renal papilla. These two lesions developed at the same time and the cortical lesions did not appear to be a consequence of the renal papillary necrosis. Since the compound possesses the molecular structure both of a phenacetin derivative and of a salicylate these observations may be relevant to the problem of renal damage incident to abuse of analgesic compounds and suggest the possibility that in this syndrome cortical lesions may develop independently of renal papillary necrosis.
20791817	31	35	rats	Species	10116

```
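To make the layout of the sample concrete, here is a minimal parsing sketch. It is not the project's `pubtator_annotations_inference_transformer.py`; the `Annotation` and `Abstract` classes and the `parse_pubtator` function are hypothetical names introduced for illustration. It assumes text lines have the form `<pmid>|a|<text>` (or `|t|` for a title) and annotation lines are tab-separated with character offsets into that text, as in the sample above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Annotation:
    start: int
    end: int
    mention: str
    entity_type: str
    normalised_id: str


@dataclass
class Abstract:
    pmid: str
    text: str
    annotations: List[Annotation] = field(default_factory=list)


def parse_pubtator(content: str) -> List[Abstract]:
    """Parse PubTator-style text into one Abstract per PMID (illustrative only)."""
    abstracts: Dict[str, Abstract] = {}
    for line in content.splitlines():
        if not line.strip():
            continue  # blank lines separate documents
        head = line.split("|", 2)
        if len(head) == 3 and head[1] in ("t", "a"):
            # Text line: <pmid>|t|<title> or <pmid>|a|<abstract>
            doc = abstracts.setdefault(head[0], Abstract(pmid=head[0], text=""))
            doc.text = (doc.text + " " + head[2]).strip()
            continue
        parts = line.split("\t")
        if len(parts) >= 6:
            # Annotation line: pmid, start, end, mention, type, normalised id
            pmid, start, end, mention, entity_type, normalised_id = parts[:6]
            doc = abstracts.setdefault(pmid, Abstract(pmid=pmid, text=""))
            doc.annotations.append(
                Annotation(int(start), int(end), mention, entity_type, normalised_id)
            )
    return list(abstracts.values())


# Shortened version of the second sample record; the offsets are unchanged.
sample = (
    "20791817|a|5-Aminosalicylic acid given to rats as a single intravenous injection.\n"
    "20791817\t31\t35\trats\tSpecies\t10116\n"
)
doc = parse_pubtator(sample)[0]
ann = doc.annotations[0]
print(doc.pmid, ann.mention, doc.text[ann.start:ann.end])
```

Slicing the text with the annotation offsets is a quick way to check that the offsets and the text line up before generating inference records.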

In [4]:
import datetime
date_fmt = datetime.datetime.today().strftime("%Y%m%d%H")

In [5]:
#s3_input_pubtator = "s3://aegovan-data/pubmed_json_parts_annotation_iseries/pubmed19n0550.json.txt"
s3_input_pubtator = "s3://aegovan-data/pubmed_json_parts_annotation_iseries/"
s3_id_mapping_file="s3://aegovan-data/settings/HUMAN_9606_idmapping.dat"

s3_output_pubmed_asbtract = f"s3://aegovan-data/pubmed_asbtract/inference_multi_{date_fmt}/"

In [6]:
from sagemaker.network import NetworkConfig
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.processing import ScriptProcessor

script_processor = ScriptProcessor(
    image_uri=docker_repo,
    command=["python"],
    env={'mode': 'python', 'PYTHONPATH': '/opt/ml/code'},
    role=role,
    instance_type=instance_type,
    instance_count=10,
    max_runtime_in_seconds=172800,
    volume_size_in_gb=50,
    network_config=NetworkConfig(enable_network_isolation=False),
    base_job_name="ppi-large-inference-data-prep",
)


sm_local_input_pubtator_txt = "/opt/ml/processing/input/data/json"
sm_local_input_idmapping = "/opt/ml/processing/input/data/mapping"
sm_local_output = "/opt/ml/processing/output"


script_processor.run(
    code='source/datatransformer/pubtator_annotations_inference_transformer.py',
    arguments=[
        sm_local_input_pubtator_txt,
        sm_local_output,
        "{}/{}".format(sm_local_input_idmapping, s3_id_mapping_file.split("/")[-1]),
    ],
    inputs=[
        # The PubTator files are split across the instances by S3 key.
        ProcessingInput(
            source=s3_input_pubtator,
            destination=sm_local_input_pubtator_txt,
            s3_data_distribution_type="ShardedByS3Key",
        ),
        # Every instance receives a full copy of the ID mapping file.
        ProcessingInput(
            source=s3_id_mapping_file,
            destination=sm_local_input_idmapping,
            s3_data_distribution_type="FullyReplicated",
        ),
    ],
    outputs=[
        ProcessingOutput(
            source=sm_local_output,
            destination=s3_output_pubmed_asbtract,
            output_name='inferenceabstracts',
        )
    ],
)

Parameter 'session' will be renamed to 'sagemaker_session' in SageMaker Python SDK v2.



Job Name:  ppi-large-inference-data-prep-2021-01-05-03-07-48-497
Inputs:  [{'InputName': 'input-1', 'S3Input': {'S3Uri': 's3://aegovan-data/pubmed_json_parts_annotation_iseries/', 'LocalPath': '/opt/ml/processing/input/data/json', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'ShardedByS3Key', 'S3CompressionType': 'None'}}, {'InputName': 'input-2', 'S3Input': {'S3Uri': 's3://aegovan-data/settings/HUMAN_9606_idmapping.dat', 'LocalPath': '/opt/ml/processing/input/data/mapping', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'code', 'S3Input': {'S3Uri': 's3://sagemaker-us-east-2-324346001917/ppi-large-inference-data-prep-2021-01-05-03-07-48-497/input/code/pubtator_annotations_inference_transformer.py', 'LocalPath': '/opt/ml/processing/input/code', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}

2021-01-05 03:12:45,100 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0412.json.txt with records 13774
2021-01-05 03:12:45,188 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0457.json.txt with records 2
2021-01-05 03:12:44,835 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0065.json.txt with records 5059
2021-01-05 03:12:45,294 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0091.json.txt with records 1334
2021-01-05 03:12:45,501 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0141.json.txt with records 25
2021-01-05 03:12:45,513 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0171.json.txt with records 4
2021-01-05 03:12:45,737 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0719.json.txt with records 15798

2021-01-05 03:13:01,564 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0779.json.txt with records 15817
2021-01-05 03:13:02,025 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0970.json.txt with records 14003
2021-01-05 03:13:01,749 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0466.json.txt with records 3
2021-01-05 03:13:02,015 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0072.json.txt with records 6696
2021-01-05 03:13:02,377 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0287.json.txt with records 16998
2021-01-05 03:13:01,676 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0273.json.txt with records 6080
2021-01-05 03:13:02,582 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0862.json.txt with records 1

2021-01-05 03:13:17,238 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0042.json.txt with records 3212
2021-01-05 03:13:17,430 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0147.json.txt with records 8
2021-01-05 03:13:16,784 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0045.json.txt with records 12675
2021-01-05 03:13:17,739 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0890.json.txt with records 11342
2021-01-05 03:13:18,391 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0843.json.txt with records 15839
2021-01-05 03:13:18,076 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0410.json.txt with records 14890
2021-01-05 03:13:17,684 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0227.json.txt with records 

2021-01-05 03:13:42,610 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0811.json.txt with records 15221
2021-01-05 03:13:42,703 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0507.json.txt with records 15747
2021-01-05 03:13:43,599 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0874.json.txt with records 10730
2021-01-05 03:13:43,708 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0155.json.txt with records 11
2021-01-05 03:13:43,504 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0271.json.txt with records 7188
2021-01-05 03:13:43,646 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0094.json.txt with records 4133
2021-01-05 03:13:44,214 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0706.json.txt with records 

2021-01-05 03:13:59,848 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0585.json.txt with records 15292
2021-01-05 03:14:00,824 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0699.json.txt with records 14519
2021-01-05 03:14:00,691 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0056.json.txt with records 13622
2021-01-05 03:14:02,067 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0339.json.txt with records 11999
2021-01-05 03:14:01,900 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0234.json.txt with records 949
2021-01-05 03:14:02,041 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0164.json.txt with records 32
2021-01-05 03:14:02,147 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0883.json.txt with records 

2021-01-05 03:14:22,475 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0879.json.txt with records 14739
2021-01-05 03:14:22,743 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0539.json.txt with records 9863
2021-01-05 03:14:23,034 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0256.json.txt with records 19006
2021-01-05 03:14:22,732 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0801.json.txt with records 15017
2021-01-05 03:14:23,461 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0968.json.txt with records 10342
2021-01-05 03:14:23,598 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0692.json.txt with records 13456
2021-01-05 03:14:23,267 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0274.json.txt with reco

2021-01-05 03:14:43,854 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0053.json.txt with records 4365
2021-01-05 03:14:44,286 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0298.json.txt with records 10651
2021-01-05 03:14:44,398 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0384.json.txt with records 3702
2021-01-05 03:14:45,191 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0174.json.txt with records 1
2021-01-05 03:14:45,096 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0837.json.txt with records 14380
2021-01-05 03:14:45,299 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0593.json.txt with records 17308
2021-01-05 03:14:45,254 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0411.json.txt with records 1

2021-01-05 03:15:02,517 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0123.json.txt with records 719
2021-01-05 03:15:02,113 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0517.json.txt with records 16115
2021-01-05 03:15:02,288 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0819.json.txt with records 15014
2021-01-05 03:15:03,031 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0516.json.txt with records 14235
2021-01-05 03:15:03,106 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0704.json.txt with records 14407
2021-01-05 03:15:03,403 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0780.json.txt with records 16338
2021-01-05 03:15:03,317 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0284.json.txt with recor

2021-01-05 03:15:20,873 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0449.json.txt with records 2
2021-01-05 03:15:20,656 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0566.json.txt with records 15704
2021-01-05 03:15:20,937 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0713.json.txt with records 13190
2021-01-05 03:15:21,082 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0047.json.txt with records 4570
2021-01-05 03:15:21,408 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0599.json.txt with records 17300
2021-01-05 03:15:22,466 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0821.json.txt with records 15048
2021-01-05 03:15:22,116 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0217.json.txt with records 

2021-01-05 03:15:46,256 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0511.json.txt with records 17741
2021-01-05 03:15:47,057 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0965.json.txt with records 9908
2021-01-05 03:15:46,840 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0546.json.txt with records 13369
2021-01-05 03:15:47,381 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0025.json.txt with records 268
2021-01-05 03:15:47,176 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0308.json.txt with records 11296
2021-01-05 03:15:48,207 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0039.json.txt with records 302
2021-01-05 03:15:47,908 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0816.json.txt with records 

2021-01-05 03:16:06,629 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0240.json.txt with records 376
2021-01-05 03:16:06,191 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0808.json.txt with records 13183
2021-01-05 03:16:06,548 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0115.json.txt with records 1009
2021-01-05 03:16:06,739 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0455.json.txt with records 2
2021-01-05 03:16:06,478 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0368.json.txt with records 14548
2021-01-05 03:16:06,222 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0591.json.txt with records 13953
2021-01-05 03:16:06,265 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0519.json.txt with records 14

2021-01-05 03:16:22,805 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0194.json.txt with records 7
2021-01-05 03:16:23,202 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0935.json.txt with records 11085
2021-01-05 03:16:22,826 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0924.json.txt with records 11508
2021-01-05 03:16:24,000 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0111.json.txt with records 1202
2021-01-05 03:16:23,353 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0680.json.txt with records 16457
2021-01-05 03:16:23,923 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0016.json.txt with records 391
2021-01-05 03:16:24,443 - __main__ - INFO - Processed file /opt/ml/processing/input/data/json/pubmed19n0867.json.txt with records 14
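
The per-file record counts in the job log can be totalled as a sanity check on the data preparation step. The sketch below is an assumption-laden helper, not part of the project: `records_per_file` and `LOG_PATTERN` are hypothetical names, and the pattern simply matches the `Processed file ... with records N` message shown above. The captured notebook output is truncated in places, so this should be run against the complete job log rather than the excerpt here.

```python
import re
from typing import Dict

# Matches the "Processed file <path> with records <n>" message in the job log.
# Trailing terminal colour codes, if present, are ignored because \d+ stops at them.
LOG_PATTERN = re.compile(r"Processed file (\S+) with records (\d+)")


def records_per_file(log_text: str) -> Dict[str, int]:
    """Map each processed file name to the record count reported in the log."""
    counts: Dict[str, int] = {}
    for path, n in LOG_PATTERN.findall(log_text):
        counts[path.rsplit("/", 1)[-1]] = int(n)
    return counts


log_excerpt = (
    "2021-01-05 03:12:45,100 - __main__ - INFO - Processed file "
    "/opt/ml/processing/input/data/json/pubmed19n0412.json.txt with records 13774\n"
    "2021-01-05 03:12:45,188 - __main__ - INFO - Processed file "
    "/opt/ml/processing/input/data/json/pubmed19n0457.json.txt with records 2\n"
)
counts = records_per_file(log_excerpt)
print(len(counts), "files,", sum(counts.values()), "records")
```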

### Step 2: Run predictions

In [11]:
prepare_models=False

In [12]:
jobs = [
"ppi-bert-2021-01-02-08-56-37-716",
"ppi-bert-2021-01-02-08-56-29-913",
"ppi-bert-2021-01-02-08-56-19-909",
"ppi-bert-2021-01-02-08-56-14-194",
"ppi-bert-2021-01-02-08-56-10-043",
"ppi-bert-2021-01-02-08-56-05-246",
"ppi-bert-2021-01-02-08-55-52-783",
"ppi-bert-2021-01-02-08-55-44-461",
"ppi-bert-2021-01-02-08-55-34-954",
"ppi-bert-2021-01-02-08-55-25-173"

]

s3_model_path_format = "s3://aegovan-data/results/{}/output/model.tar.gz"

s3_model_paths = [s3_model_path_format.format(j) for j in jobs]

In [13]:
s3_output_ensemble_models= "s3://aegovan-data/ensemble_models/{}".format("ppi-bert-2021-01-02-08")

### Prepare ensemble models
TODO: This is just a hack to untar a set of compressed model archives and upload them to a single S3 location. Running a processing job just for this is overkill.
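
As a rough illustration of the untar step, the sketch below extracts every `*.tar.gz` found under an input directory into its own subdirectory of a destination directory. It is not the contents of `ensemble_inference_prepare_models.py`; the `unpack_models` function and the choice to name each target after the archive's parent directory are assumptions made for the example.

```python
import tarfile
from pathlib import Path
from typing import List


def unpack_models(input_dir: str, dest_dir: str) -> List[Path]:
    """Extract each model archive under input_dir into its own folder in dest_dir."""
    extracted: List[Path] = []
    for archive in sorted(Path(input_dir).rglob("*.tar.gz")):
        # Assumption: each archive sits in its own numbered directory (0, 1, ...),
        # so the parent directory name is reused for the output folder.
        target = Path(dest_dir) / archive.parent.name
        target.mkdir(parents=True, exist_ok=True)
        with tarfile.open(archive, "r:gz") as tar:
            # Only extract archives from a trusted source.
            tar.extractall(path=target)
        extracted.append(target)
    return extracted
```

Because the folder names come from the local input layout, running this over several batches that reuse the same numbering would write to the same output folders, so a real job would need names that are unique across runs.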

In [None]:
def get_processing_inputs_s3_local_path(s3_model_paths, sm_local_input):
    # Map each S3 model path to its own numbered local input directory
    inputs = []
    for i, s3_path in enumerate(s3_model_paths):
        p = ProcessingInput(
            source=s3_path,
            destination="{}/{}".format(sm_local_input.rstrip("/"), i)
        )
        inputs.append(p)
    return inputs


In [None]:
from sagemaker.network import NetworkConfig
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.processing import ScriptProcessor


sm_local_input = "/opt/ml/processing/input/models"
sm_local_output = "/opt/ml/processing/output"

script_processor = ScriptProcessor(
    image_uri=docker_repo,
    command=["python"],
    env={'mode': 'python', 'PYTHONPATH': '/opt/ml/code'},
    role=role,
    instance_type=instance_type,
    instance_count=1,
    max_runtime_in_seconds=172800,
    volume_size_in_gb=50,
    network_config=NetworkConfig(enable_network_isolation=False),
    base_job_name="ppi-ensemble-model-packer",
)


In [None]:

if prepare_models:
    # Workaround for the limit on the number of inputs per processing job:
    # submit the model archives in chunks, one job per chunk.
    chunk_size = 5
    for i in range(0, len(s3_model_paths), chunk_size):
        script_processor.run(
            code='source/algorithms/ensemble_inference_prepare_models.py',
            arguments=[
                "--input-dir",
                sm_local_input,
                "--dest-dir",
                sm_local_output,
            ],
            inputs=get_processing_inputs_s3_local_path(
                s3_model_paths[i:i + chunk_size], sm_local_input
            ),
            outputs=[
                ProcessingOutput(
                    source=sm_local_output,
                    destination=s3_output_ensemble_models,
                    output_name='models',
                )
            ],
        )
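
The chunking used in the loop above can be pulled out into a small generator, which makes the batch boundaries easy to check before any job is submitted. This is only a sketch; `chunked` is a hypothetical helper, and the job names in the usage lines are placeholders.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of items, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Placeholder paths standing in for the ten model archives.
example_paths = ["s3://bucket/results/job-{}/output/model.tar.gz".format(n) for n in range(10)]
for batch in chunked(example_paths, 5):
    print(len(batch), batch[0])
```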



### Run ensemble prediction

In [7]:
s3_output_predictions = "s3://aegovan-data/pubmed_asbtract/predictions_multi_{}_{}/".format("ppi-bert-2021-01-02-08_m",date_fmt)

In [8]:
pytorch_custom_image_name="ppi-extractor:gpu-1.0.0-202101050141"
instance_type = "ml.p3.16xlarge" 

In [None]:
#temp
#s3_output_pubmed_asbtract = f"s3://aegovan-data/pubmed_asbtract/inference_multi_2020123123/"

In [9]:
docker_repo = "{}.dkr.ecr.{}.amazonaws.com/{}".format(account_id, region, pytorch_custom_image_name)

In [None]:
from sagemaker.network import NetworkConfig
from sagemaker.processing import ProcessingInput, ProcessingOutput
from sagemaker.processing import ScriptProcessor

script_processor = ScriptProcessor(
    image_uri=docker_repo,
    command=["python"],
    env={'mode': 'python', 'PYTHONPATH': '/opt/ml/code'},
    role=role,
    instance_type=instance_type,
    instance_count=4,
    max_runtime_in_seconds=172800,
    volume_size_in_gb=250,
    network_config=NetworkConfig(enable_network_isolation=False),
    base_job_name="ppi-ensemble-inference",
)


sm_local_input_models = "/opt/ml/processing/input/data/models"
sm_local_input_data = "/opt/ml/processing/input/data/jsonlines"
sm_local_output = "/opt/ml/processing/output"



script_processor.run(
    code='source/algorithms/main_predict.py',
    arguments=[
        "PpiMulticlassDatasetFactory",
        sm_local_input_data,
        sm_local_input_models,
        sm_local_output,
    ],
    inputs=[
        # The inference files from Step 1 are split across the instances by S3 key.
        ProcessingInput(
            source=s3_output_pubmed_asbtract,
            destination=sm_local_input_data,
            s3_data_distribution_type="ShardedByS3Key",
        ),
        # Every instance receives a full copy of the ensemble models.
        ProcessingInput(
            source=s3_output_ensemble_models,
            destination=sm_local_input_models,
            s3_data_distribution_type="FullyReplicated",
        ),
    ],
    outputs=[
        ProcessingOutput(
            source=sm_local_output,
            destination=s3_output_predictions,
            output_name='predictions',
        )
    ],
)




Parameter 'session' will be renamed to 'sagemaker_session' in SageMaker Python SDK v2.



Job Name:  ppi-ensemble-inference-2021-01-05-03-26-12-799
Inputs:  [{'InputName': 'input-1', 'S3Input': {'S3Uri': 's3://aegovan-data/pubmed_asbtract/inference_multi_2021010514/', 'LocalPath': '/opt/ml/processing/input/data/jsonlines', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'ShardedByS3Key', 'S3CompressionType': 'None'}}, {'InputName': 'input-2', 'S3Input': {'S3Uri': 's3://aegovan-data/ensemble_models/ppi-bert-2021-01-02-08', 'LocalPath': '/opt/ml/processing/input/data/models', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}, {'InputName': 'code', 'S3Input': {'S3Uri': 's3://sagemaker-us-east-2-324346001917/ppi-ensemble-inference-2021-01-05-03-26-12-799/input/code/main_predict.py', 'LocalPath': '/opt/ml/processing/input/code', 'S3DataType': 'S3Prefix', 'S3InputMode': 'File', 'S3DataDistributionType': 'FullyReplicated', 'S3CompressionType': 'None'}}]
Outputs:  [{'OutputName': '