# TAPAS deployment via SageMaker-Neuron

## Overview

This notebook creates an instance of ```TAPAS_Deployer``` and runs the steps needed to build, deploy, and test a mini variant of TAPAS for tabular question answering. For details, refer to the source files in ```./source``` and ```./entrypoint```, which were refactored to be easy to read.

## How to use this notebook
- Create an AWS account.
- Create an IAM role with the following access permissions: ```AmazonSageMakerFullAccess, EC2InstanceProfileForImageBuilderECRContainerBuilds, AWSAppRunnerServicePolicyForECRAccess```
- Start a new notebook instance in SageMaker using the role created above.
- Clone this repository and run this notebook.
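
For SageMaker to use the role from the steps above, the role also needs a trust relationship that lets the SageMaker service assume it. The snippet below is a minimal sketch of that trust policy document, assuming the role is used only by SageMaker. The helper name `sagemaker_trust_policy` and the constant `MANAGED_POLICIES` are illustrative and not part of this repository; the policy names are the ones listed above. Creating the role and attaching the policies (in the IAM console or through an SDK) is not shown.

```python
import json

# Managed policy names copied from the list above.
MANAGED_POLICIES = [
    "AmazonSageMakerFullAccess",
    "EC2InstanceProfileForImageBuilderECRContainerBuilds",
    "AWSAppRunnerServicePolicyForECRAccess",
]


def sagemaker_trust_policy():
    """Return a trust policy that lets the SageMaker service assume the role.

    Hypothetical helper for illustration only.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sagemaker.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be pasted into the role's trust relationship.
    print(json.dumps(sagemaker_trust_policy(), indent=2))
    print("Policies to attach:", ", ".join(MANAGED_POLICIES))
```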

## Some notes for Scrub
- ```Deployer``` is a generic class template from which many models can be built and deployed directly.
- ```TAPAS_Deployer``` inherits from ```Deployer```; other models can be added the same way with minimal effort.
- To avoid timeouts and random kernel restarts, the running code is kept separate from the notebook that runs it.
- Everything in ```./source``` can be imported and used as an API.
- Some pytest integration test samples are included in ```./tests```.
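
The base-class pattern described in these notes can be sketched roughly as follows. This is not the code in ```./source```: only the method names `get_model_and_tokeniser` and `trace_model` come from the cells in this notebook, while the attributes and the toy `EchoDeployer` subclass are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class Deployer(ABC):
    """Illustrative stand-in for a generic deployer base class.

    It holds shared state and declares the steps each model must implement.
    """

    def __init__(self):
        self.model = None          # set by get_model_and_tokeniser()
        self.tokeniser = None      # set by get_model_and_tokeniser()
        self.traced_model = None   # set by trace_model()

    @abstractmethod
    def get_model_and_tokeniser(self):
        """Load the model and its tokeniser."""

    @abstractmethod
    def trace_model(self):
        """Produce the artifact that will be deployed."""


class EchoDeployer(Deployer):
    """Hypothetical toy subclass: the 'model' just upper-cases text."""

    def get_model_and_tokeniser(self):
        self.model = lambda text: text.upper()
        self.tokeniser = str.split

    def trace_model(self):
        if self.model is None:
            raise RuntimeError("call get_model_and_tokeniser() first")
        # A real subclass would compile the model here; the toy reuses it.
        self.traced_model = self.model


if __name__ == "__main__":
    deployer = EchoDeployer()
    deployer.get_model_and_tokeniser()
    deployer.trace_model()
    print(deployer.traced_model("which row has the highest value?"))
```

A new model would then only need its own subclass that fills in the abstract steps, which is the sense in which other models can be added with little effort.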

### Install local dependencies

In [1]:
!pip install --upgrade --no-cache-dir torch-neuron neuron-cc[tensorflow] torchvision torch torch-scatter --extra-index-url=https://pip.repos.neuron.amazonaws.com
!pip install --upgrade --no-cache-dir 'transformers==4.6.0'

Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com, https://pip.repos.neuron.amazonaws.com
Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com


### Prepare deployer

In [2]:
from source.tapas import TAPAS_Deployer
tapas_deployer = TAPAS_Deployer()

### Retrieve the model from the Hugging Face Hub and prepare its tokeniser

In [3]:
tapas_deployer.get_model_and_tokeniser()

### Trace the model for deployment on a Neuron instance

In [4]:
tapas_deployer.trace_model()

  self.indices = torch.as_tensor(indices)
  self.num_segments = torch.as_tensor(num_segments, device=indices.device)
  batch_size = torch.prod(torch.tensor(list(index.batch_shape())))
  batch_size = torch.prod(torch.tensor(list(index.batch_shape())))
  [torch.as_tensor([-1], dtype=torch.long), torch.as_tensor(vector_shape, dtype=torch.long)], dim=0
  flat_values = values.reshape(flattened_shape.tolist())
  torch.as_tensor(index.batch_shape(), dtype=torch.long),
  torch.as_tensor(index.batch_shape(), dtype=torch.long),
  torch.as_tensor([index.num_segments], dtype=torch.long),
  torch.as_tensor([index.num_segments], dtype=torch.long),
  torch.as_tensor(vector_shape, dtype=torch.long),
  output_values = segment_means.view(new_shape.tolist())
  batch_shape, dtype=torch.long
  batch_shape, dtype=torch.long
  num_segments = torch.as_tensor(num_segments)  # create a rank 0 tensor (scalar) containing num_segments (e.g. 64)
  new_shape = [int(x) for x in new_tensor.tolist()]
  multiples = torc

06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=8 blocks=1 instructions=8
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:06:17 2023
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:06:17 2023
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: Total count: 26
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: Save: 13
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: TensorCopy: 8
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: TensorScalar: 2
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: Memset: 2
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: Load: 1
06/11/2023 03:06:17 PM INFO [WalrusDriver.0]: ru_maxrss:  849mb (delta=0mb)
06/11/2023 03:06:17 PM INFO

INFO:Neuron:Compiling function _NeuronGraph$294 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/1/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/1/graph_def.neff --io-config {"inputs": {}, "outputs": ["TapasModel_7/TapasEmbeddings_27/prim_Constant/Const:0"]} --verbose 1'
06/11/2023 03:06:18 PM INFO 32742 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/1/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/1/graph_def.neff --io-config '{"inputs": {}, "outputs": ["TapasModel_7/TapasEmbeddings_27/prim_Constant/Const:0"]}' --verbose 1
06/11/2023 03:06:19 PM INFO 32742 [root]: Intermediate fil

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where


INFO:Neuron:Compiling function _NeuronGraph$295 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/18/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/18/graph_def.neff --io-config {"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"]} --verbose 1'
06/11/2023 03:06:21 PM INFO 349 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/18/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/18/graph_def.neff --io-config '{"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"

06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=5 blocks=1 instructions=4
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:06:24 2023
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:06:24 2023
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: Total count: 32
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: Save: 12
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: Load: 12
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: TensorCopy: 6
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: TensorScalar: 2
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: ru_maxrss:  881mb (delta=0mb)
06/11/2023 03:06:24 PM INFO [WalrusDriver.0]: Walrus pass: unroll succeeded!
06/11

INFO:Neuron:Compiling function _NeuronGraph$296 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/24/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/24/graph_def.neff --io-config {"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"]} --verbose 1'
06/11/2023 03:06:25 PM INFO 479 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/24/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/24/graph_def.neff --io-config '{"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"

06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=5 blocks=1 instructions=4
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:06:28 2023
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:06:28 2023
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: Total count: 32
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: Save: 12
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: Load: 12
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: TensorCopy: 6
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: TensorScalar: 2
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: ru_maxrss:  881mb (delta=0mb)
06/11/2023 03:06:28 PM INFO [WalrusDriver.0]: Walrus pass: unroll succeeded!
06/11

INFO:Neuron:Compiling function _NeuronGraph$297 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/30/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/30/graph_def.neff --io-config {"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"]} --verbose 1'
06/11/2023 03:06:29 PM INFO 610 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/30/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/30/graph_def.neff --io-config '{"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"

06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=5 blocks=1 instructions=4
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:06:33 2023
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:06:33 2023
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: Total count: 32
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: Save: 12
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: Load: 12
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: TensorCopy: 6
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: TensorScalar: 2
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: ru_maxrss:  881mb (delta=0mb)
06/11/2023 03:06:33 PM INFO [WalrusDriver.0]: Walrus pass: unroll succeeded!
06/11

INFO:Neuron:Compiling function _NeuronGraph$298 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/36/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/36/graph_def.neff --io-config {"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"]} --verbose 1'
06/11/2023 03:06:33 PM INFO 742 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/36/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/36/graph_def.neff --io-config '{"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"

06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=5 blocks=1 instructions=4
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:06:37 2023
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:06:37 2023
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: Total count: 32
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: Save: 12
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: Load: 12
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: TensorCopy: 6
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: TensorScalar: 2
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: ru_maxrss:  882mb (delta=0mb)
06/11/2023 03:06:37 PM INFO [WalrusDriver.0]: Walrus pass: unroll succeeded!
06/11

INFO:Neuron:Compiling function _NeuronGraph$299 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/42/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/42/graph_def.neff --io-config {"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"]} --verbose 1'
06/11/2023 03:06:38 PM INFO 877 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/42/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/42/graph_def.neff --io-config '{"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"

06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=5 blocks=1 instructions=4
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:06:41 2023
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:06:41 2023
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: Total count: 32
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: Save: 12
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: Load: 12
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: TensorCopy: 6
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: TensorScalar: 2
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: ru_maxrss:  882mb (delta=0mb)
06/11/2023 03:06:41 PM INFO [WalrusDriver.0]: Walrus pass: unroll succeeded!
06/11

INFO:Neuron:Compiling function _NeuronGraph$300 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/48/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/48/graph_def.neff --io-config {"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"]} --verbose 1'
06/11/2023 03:06:42 PM INFO 1009 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/48/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/48/graph_def.neff --io-config '{"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0

06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=5 blocks=1 instructions=4
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:06:45 2023
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:06:45 2023
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: Total count: 32
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: Save: 12
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: Load: 12
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: TensorCopy: 6
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: TensorScalar: 2
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: ru_maxrss:  882mb (delta=0mb)
06/11/2023 03:06:45 PM INFO [WalrusDriver.0]: Walrus pass: unroll succeeded!
06/11

INFO:Neuron:Compiling function _NeuronGraph$301 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/54/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/54/graph_def.neff --io-config {"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0"]} --verbose 1'
06/11/2023 03:06:46 PM INFO 1142 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/54/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/54/graph_def.neff --io-config '{"inputs": {"tensor.1:0": [[3, 512, 7], "int64"]}, "outputs": ["TapasModel_7/TapasEmbeddings_27/aten_select/Reshape:0

06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=5 blocks=1 instructions=4
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:06:50 2023
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:06:50 2023
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: Total count: 32
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: Save: 12
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: Load: 12
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: TensorCopy: 6
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: TensorScalar: 2
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: ru_maxrss:  882mb (delta=0mb)
06/11/2023 03:06:50 PM INFO [WalrusDriver.0]: Walrus pass: unroll succeeded!
06/11

INFO:Neuron:Compiling function _NeuronGraph$302 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/60/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/60/graph_def.neff --io-config {"inputs": {}, "outputs": ["prim_Constant/Const:0"]} --verbose 1'
06/11/2023 03:06:51 PM INFO 1270 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/60/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/60/graph_def.neff --io-config '{"inputs": {}, "outputs": ["prim_Constant/Const:0"]}' --verbose 1
06/11/2023 03:06:51 PM INFO 1270 [root]: Intermediate files stored in /home/ec2-user/SageMaker/ssh/scrub/compilation_ar

06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=16 blocks=1 instructions=11
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:07:08 2023
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:07:08 2023
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: Total count: 117
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: TensorScalarPtr: 48
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: Load: 25
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: Save: 24
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: TensorCopy: 14
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: TensorScalar: 6
06/11/2023 03:07:08 PM INFO [WalrusDriver.0]: ru_maxrss:  882mb (delta=0mb)
06/11/2023 0

06/11/2023 03:07:08 PM INFO 1726 [job.Kelper.2]: neuroncc version is 1.15.0.0+eec0c3604, neff version is 1.0 (features 0)
06/11/2023 03:07:08 PM INFO 1726 [job.Kelper.2]: wrote /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/100/graph_def.neff
06/11/2023 03:07:08 PM INFO 1726 [pipeline.compile.0]: Finished job job.Kelper.2 with state 0
06/11/2023 03:07:08 PM INFO 1726 [pipeline.compile.0]: Finished pipeline compile
06/11/2023 03:07:08 PM INFO 1726 [pipeline.compile.0]: Job finished
06/11/2023 03:07:08 PM INFO 1726 [pipeline.custom.0]: Finished job pipeline.compile.0 with state 0
06/11/2023 03:07:08 PM INFO 1726 [pipeline.custom.0]: Starting job job.SaveTemps.0 state state 0
06/11/2023 03:07:08 PM INFO 1726 [pipeline.custom.0]: Finished job job.SaveTemps.0 with state 0
06/11/2023 03:07:08 PM INFO 1726 [pipeline.custom.0]: Finished pipeline custom
06/11/2023 03:07:08 PM INFO 1726 [pipeline.custom.0]: Job finished
06/11/2023 03:07:08 PM INFO 1726 [root]: Compiler status PASS
INFO

06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=10 blocks=1 instructions=5
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:07:12 2023
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:07:12 2023
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: Total count: 3872
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: Save: 1548
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: Load: 1548
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: TensorCopy: 582
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: TensorScalar: 194
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: ru_maxrss:  882mb (delta=0mb)
06/11/2023 03:07:12 PM INFO [WalrusDriver.0]: Walrus pass: unroll succe

Analyzing dependencies of sg00/Block1
0%   10   20   30   40   50   60   70   80   90   100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************


06/11/2023 03:07:13 PM INFO [Stargazer.0]: [Sailfish] Data race analysis found no races, run time: 0:00:00
06/11/2023 03:07:13 PM INFO [Stargazer.0]: [Sailfish] Remove redundant edges
06/11/2023 03:07:13 PM INFO [Stargazer.0]: Data race checker engines
06/11/2023 03:07:13 PM INFO [Stargazer.0]: Transitive reduction start 
06/11/2023 03:07:13 PM INFO [Stargazer.0]: Transitive reduction removed 2 redundant edges, time: 0:00:00
06/11/2023 03:07:13 PM INFO [Stargazer.0]: Sync Critical Load Chains Start
06/11/2023 03:07:13 PM INFO [Stargazer.0]: Sync Critical Load Chains added 0 new Load-2-Load syncs
06/11/2023 03:07:13 PM INFO [Stargazer.0]: Sync Critical Load Chains Done.0:00:00
06/11/2023 03:07:13 PM INFO [Stargazer.0]: Out wavegraph bin file is wavegraph-bin.json
06/11/2023 03:07:13 PM INFO [Stargazer.0]: Writing NN JSON to file 'wavegraph-bin.json'
06/11/2023 03:07:14 PM INFO [Stargazer.0]: Virtual memory peak = 4390720 K bytes
06/11/2023 03:07:14 PM INFO [Stargazer.0]: PASSED - Total 

06/11/2023 03:07:14 PM INFO 1873 [job.WalrusDriver.3]: IR signature: 8d028021dc438232e84a544fcb9dfb70718ec7f96cacb52227c22bdcc522c7c5 for sg00/walrus_bir.out.json
06/11/2023 03:07:14 PM INFO 1873 [job.WalrusDriver.3]: Job finished
06/11/2023 03:07:14 PM INFO 1873 [pipeline.compile.0]: Finished job job.WalrusDriver.3 with state 0
06/11/2023 03:07:14 PM INFO 1873 [pipeline.compile.0]: Starting job job.Backend.3 state state 0
06/11/2023 03:07:14 PM INFO 1873 [job.Backend.3]: Replay this job by calling: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile --framework TENSORFLOW --state '{"model": ["/home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/102/graph_def.pb"], "tensormap": "tensor_map.json", "bir": "walrus_bir.out.json", "state_dir": "/home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/102/sg00", "state_id": "sg00"}' --pipeline Backend --enable-experimental-bir-backend
06/11/2023 03:07:14 PM INFO 1873 [job.Backend.3]: IR signature: d3e22f255d64

06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=10 blocks=1 instructions=6
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:07:18 2023
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:07:18 2023
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: Total count: 61
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: TensorScalarPtr: 24
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: Load: 13
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: Save: 12
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: TensorCopy: 8
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: TensorScalar: 2
06/11/2023 03:07:18 PM INFO [WalrusDriver.0]: Memset: 2
06/11/2023 03:07:18 PM INFO [Walrus

INFO:Neuron:Compiling function _NeuronGraph$311 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/106/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/106/graph_def.neff --io-config {"inputs": {"tensor.1:0": [[3, 512], "int64"], "1:0": [[3, 512, 256], "float32"], "2:0": [[3, 512, 256], "float32"], "3:0": [[3, 512, 256], "float32"], "4:0": [[3, 512, 256], "float32"], "5:0": [[3, 512, 256], "float32"], "6:0": [[3, 512, 256], "float32"], "7:0": [[3, 512, 256], "float32"], "8:0": [[3, 512, 256], "float32"], "9:0": [[3, 512, 256], "float32"], "tensor.9:0": [[3, 512, 7], "int64"], "tensor.25:0": [[], "int64"], "tensor.39:0": [[], "int64"], "tensor.59:0": [[], "int64"], "tensor.75:0": [[], "int64"]}, "outputs": ["TapasModel_7/TapasPooler_29/Tanh_11/aten_tanh/Tanh:0", "at

06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=5 blocks=1 instructions=2
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:07:29 2023
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:07:29 2023
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: Total count: 151
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: Shuffle: 96
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: TensorCopy: 48
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: Save: 6
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: Load: 1
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: ru_maxrss:  1604mb (delta=0mb)
06/11/2023 03:07:29 PM INFO [WalrusDriver.0]: Walrus pass: unroll succeeded!
06/11/20

INFO:Neuron:Compiling function _NeuronGraph$313 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/114/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/114/graph_def.neff --io-config {"inputs": {"0:0": [[6144], "float32"]}, "outputs": ["aten_zeros/zeros:0"]} --verbose 1'
06/11/2023 03:07:30 PM INFO 2490 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/114/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/114/graph_def.neff --io-config '{"inputs": {"0:0": [[6144], "float32"]}, "outputs": ["aten_zeros/zeros:0"]}' --verbose 1
06/11/2023 03:07:30 PM INFO 2490 [root]: Intermediate files stored in

06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=5 blocks=1 instructions=2
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:07:33 2023
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:07:33 2023
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: Total count: 151
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: Shuffle: 96
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: TensorCopy: 48
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: Save: 6
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: Load: 1
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: ru_maxrss:  1604mb (delta=0mb)
06/11/2023 03:07:33 PM INFO [WalrusDriver.0]: Walrus pass: unroll succeeded!
06/11/20

INFO:Neuron:Compiling function _NeuronGraph$314 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/121/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/121/graph_def.neff --io-config {"inputs": {"0:0": [[6144], "float32"]}, "outputs": ["aten_zeros/zeros:0"]} --verbose 1'
06/11/2023 03:07:34 PM INFO 2605 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/121/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/121/graph_def.neff --io-config '{"inputs": {"0:0": [[6144], "float32"]}, "outputs": ["aten_zeros/zeros:0"]}' --verbose 1
06/11/2023 03:07:34 PM INFO 2605 [root]: Intermediate files stored in

06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=5 blocks=1 instructions=2
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:07:36 2023
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:07:36 2023
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: Total count: 151
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: Shuffle: 96
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: TensorCopy: 48
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: Save: 6
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: Load: 1
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: ru_maxrss:  1604mb (delta=0mb)
06/11/2023 03:07:36 PM INFO [WalrusDriver.0]: Walrus pass: unroll succeeded!
06/11/20

INFO:Neuron:Compiling function _NeuronGraph$315 with neuron-cc
INFO:Neuron:Compiling with command line: '/home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/127/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /home/ec2-user/SageMaker/ssh/scrub/compilation_artifacts/127/graph_def.neff --io-config {"inputs": {"0:0": [[6144], "float32"], "1:0": [[6144], "float32"], "2:0": [[6144], "float32"], "3:0": [[6144], "float32"], "tensor.1:0": [[3, 2048], "int64"], "tensor.7:0": [[], "int64"], "tensor.9:0": [[], "int64"], "tensor.23:0": [[], "int64"]}, "outputs": ["aten_view/Reshape:0", "aten_reshape/Reshape:0", "aten_expand_2/Cast_1:0", "aten_zeros/zeros:0", "Identity:0", "aten_reshape_1/Reshape:0", "aten_expand_3/Cast_1:0", "aten_zeros_1/zeros:0"]} --verbose 1'
06/11/2023 03:07:37 PM INFO 2728 [root]: /home/ec2-user/anaconda3/envs/amazonei_pytorch_latest_p37/bin/neuron-cc compile /home/e

06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: max_allowed_parallelism=24
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: Running walrus pass: unroll
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: Input to unroll: modules=1 functions=1 allocs=26 blocks=1 instructions=20
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: INFO (Unroll) Start unrolling at Sun Jun 11 15:07:46 2023
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: INFO (Unroll) DONE unrolling Sun Jun 11 15:07:46 2023
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: Instruction count after Unroll: 
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: Total count: 36
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: Matmult: 19
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: TensorCopy: 8
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: Load: 6
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: Save: 2
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: TensorScalarPtr: 1
06/11/2023 03:07:46 PM INFO [WalrusDriver.0]: ru_maxrss:  1605mb (delta=0mb)
06/11/2023 03:07:46 

INFO:Neuron:Number of arithmetic operators (post-compilation) before = 582, compiled = 86, percent compiled = 14.78%
INFO:Neuron:The neuron partitioner created 25 sub-graphs
INFO:Neuron:Neuron successfully compiled 15 sub-graphs, Total fused subgraphs = 25, Percent of model sub-graphs successfully compiled = 60.0%
INFO:Neuron:Compiled these operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 11
INFO:Neuron: => aten::ScalarImplicit: 3
INFO:Neuron: => aten::add: 2
INFO:Neuron: => aten::arange: 3
INFO:Neuron: => aten::expand: 1
INFO:Neuron: => aten::linear: 1
INFO:Neuron: => aten::min: 1
INFO:Neuron: => aten::mul: 3
INFO:Neuron: => aten::reshape: 1
INFO:Neuron: => aten::select: 9
INFO:Neuron: => aten::size: 11
INFO:Neuron: => aten::slice: 18
INFO:Neuron: => aten::sub: 1
INFO:Neuron: => aten::to: 9
INFO:Neuron: => aten::unsqueeze: 3
INFO:Neuron: => aten::view: 6
INFO:Neuron: => aten::zeros: 3
INFO:Neuron:Not compiled operators (and operator counts) to Neuron:
INFO:Neuron:

graph(%self.1 : __torch__.torch_neuron.runtime.___torch_mangle_229.AwsNeuronGraphModule,
      %63 : Long(3, 512, strides=[512, 1], requires_grad=0, device=cpu),
      %tensor.1 : Long(3, 512, strides=[512, 1], requires_grad=0, device=cpu),
      %tensor.9 : Long(3, 512, 7, strides=[3584, 7, 1], requires_grad=0, device=cpu)):
  %_NeuronGraph#133 : __torch__.torch_neuron.decorators.___torch_mangle_228.NeuronModuleV2 = prim::GetAttr[name="_NeuronGraph#133"](%self.1)
  %_NeuronGraph#121 : __torch__.torch_neuron.decorators.___torch_mangle_227.NeuronModuleV2 = prim::GetAttr[name="_NeuronGraph#121"](%self.1)
  %_NeuronGraph#114 : __torch__.torch_neuron.decorators.___torch_mangle_226.NeuronModuleV2 = prim::GetAttr[name="_NeuronGraph#114"](%self.1)
  %_NeuronGraph#108 : __torch__.torch_neuron.decorators.___torch_mangle_225.NeuronModuleV2 = prim::GetAttr[name="_NeuronGraph#108"](%self.1)
  %_NeuronGraph#104 : __torch__.torch_neuron.decorators.___torch_mangle_224.NeuronModuleV2 = prim::GetAttr[n
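
The two summary lines in the trace log above carry the figures that matter most for this deployment: how many operators and sub-graphs actually compiled for Neuron. As a minimal sketch, the ratios can be recomputed from the log text with the standard library; the helper name `compile_summary` is hypothetical and not part of `./source`.

```python
import re

# Two summary lines copied verbatim from the trace log above.
LOG = """\
INFO:Neuron:Number of arithmetic operators (post-compilation) before = 582, compiled = 86, percent compiled = 14.78%
INFO:Neuron:Neuron successfully compiled 15 sub-graphs, Total fused subgraphs = 25, Percent of model sub-graphs successfully compiled = 60.0%
"""

OPS_RE = re.compile(r"before = (\d+), compiled = (\d+)")
SUBGRAPH_RE = re.compile(r"compiled (\d+) sub-graphs, Total fused subgraphs = (\d+)")


def compile_summary(log_text):
    """Return the operator and sub-graph compile ratios found in a trace log."""
    ops = OPS_RE.search(log_text)
    subgraphs = SUBGRAPH_RE.search(log_text)
    if ops is None or subgraphs is None:
        raise ValueError("compile summary lines not found in log")
    total_ops, compiled_ops = map(int, ops.groups())
    compiled_sg, total_sg = map(int, subgraphs.groups())
    return {
        "operator_ratio": compiled_ops / total_ops,
        "subgraph_ratio": compiled_sg / total_sg,
    }


summary = compile_summary(LOG)
print(f"operators compiled: {summary['operator_ratio']:.2%}")
print(f"sub-graphs compiled: {summary['subgraph_ratio']:.2%}")
```

An operator ratio below 50% is the condition that the `torch.neuron.trace` warning quoted in the final notes refers to.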

### Upload the traced model into S3

In [5]:
tapas_deployer.upload_model_to_s3()

neuron_compiled_model.pt
Uploaded model to S3: s3://sagemaker-eu-north-1-058095970122/inf1_compiled_model/model/model.tar.gz
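
The output above suggests that the traced file `neuron_compiled_model.pt` is packed into `model.tar.gz` before it is uploaded. The following is a rough sketch of that step, not the code in `./source`: the function names are hypothetical, and the `boto3` import is deferred so the packing part runs without AWS credentials.

```python
import os
import tarfile
import tempfile


def pack_model(model_path, archive_path):
    """Pack a traced model file into a gzip-compressed tarball."""
    with tarfile.open(archive_path, "w:gz") as archive:
        # Store the file at the archive root so it unpacks straight into the model directory.
        archive.add(model_path, arcname=os.path.basename(model_path))
    return archive_path


def s3_uri(bucket, key):
    """Build the s3:// URI for a bucket and object key."""
    return f"s3://{bucket}/{key}"


def upload(archive_path, bucket, key):
    """Upload the archive with boto3 (assumption: boto3 and AWS credentials are available)."""
    import boto3  # imported lazily so the rest of the sketch runs offline

    boto3.client("s3").upload_file(archive_path, bucket, key)
    return s3_uri(bucket, key)


# Offline demonstration with a placeholder file standing in for the traced model.
with tempfile.TemporaryDirectory() as workdir:
    model_file = os.path.join(workdir, "neuron_compiled_model.pt")
    with open(model_file, "wb") as handle:
        handle.write(b"placeholder bytes, not a real model")
    archive_file = pack_model(model_file, os.path.join(workdir, "model.tar.gz"))
    with tarfile.open(archive_file, "r:gz") as archive:
        packed_names = archive.getnames()

print(packed_names)
print(s3_uri("sagemaker-eu-north-1-058095970122", "inf1_compiled_model/model/model.tar.gz"))
```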


### Build the Docker image that hosts the deployed model
The full set of build instructions is in ```./Dockerfile```.

In [6]:
tapas_deployer.build_ecr_image()

https://docs.docker.com/engine/reference/commandline/login/#credentials-store



Login Succeeded
Sending build context to Docker daemon  359.9MB
Step 1/4 : FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference-neuron:1.7.1-neuron-py36-ubuntu18.04
 ---> 388bfe7d2429
Step 2/4 : RUN pip install "pandas==1.1.5"
 ---> Using cache
 ---> b0e18d821bee
Step 3/4 : RUN pip install --upgrade --no-cache-dir torch-neuron neuron-cc[tensorflow] torchvision torch torch-scatter --extra-index-url=https://pip.repos.neuron.amazonaws.com
 ---> Using cache
 ---> 1b39d27c2e1c
Step 4/4 : RUN pip install --upgrade --no-cache-dir 'transformers==4.6.0'
 ---> Using cache
 ---> 09efc9662482
Successfully built 09efc9662482
Successfully tagged inference-to-deploy:latest


https://docs.docker.com/engine/reference/commandline/login/#credentials-store



Login Succeeded
The push refers to repository [058095970122.dkr.ecr.eu-north-1.amazonaws.com/inference-to-deploy]
abc090cd2ab4: Preparing
46af0ecae94b: Preparing
7e57f08b8a1d: Preparing
5d32c72ad027: Preparing
f4d9427d752b: Preparing
62b8cb6215cb: Preparing
877b36f2f41c: Preparing
4beb4e23ce0b: Preparing
629f76e0ebfa: Preparing
55aafc4b4134: Preparing
5b505b65f8e8: Preparing
4383d9750962: Preparing
120ef6a75dae: Preparing
d1a63e051735: Preparing
1353ff378dc3: Preparing
3d7573db3c3f: Preparing
2858c813c4e4: Preparing
629f76e0ebfa: Waiting
e8b427e8fb51: Preparing
877b36f2f41c: Waiting
6b9a1856b2e9: Preparing
79ec63999885: Preparing
4beb4e23ce0b: Waiting
5dcdeb94f6a5: Preparing
b8b74f1e44f0: Preparing
bf9a431aeda6: Preparing
5f08512fd434: Preparing
c7bb31fc0e08: Preparing
62b8cb6215cb: Waiting
55aafc4b4134: Waiting
50858308da3d: Preparing
5b505b65f8e8: Waiting
3d7573db3c3f: Waiting
e8b427e8fb51: Waiting
4383d9750962: Waiting
bf9a431aeda6: Waiting
79ec63999885: Waiting
6b9a1856b2e9: Waitin
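
The build output above tags the image locally as `inference-to-deploy:latest` and pushes it to an ECR repository whose address follows the usual `<account>.dkr.ecr.<region>.amazonaws.com/<repository>` pattern. A small sketch of how those names fit together is given below; the function names are hypothetical, the registry login step is omitted, and the commands are only assembled, not executed.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Compose the fully qualified ECR image URI."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def docker_commands(account_id, region, repository, tag="latest"):
    """Return the build, tag, and push commands as argument lists (not executed here)."""
    local_name = f"{repository}:{tag}"
    remote_name = ecr_image_uri(account_id, region, repository, tag)
    return [
        ["docker", "build", "-t", local_name, "."],
        ["docker", "tag", local_name, remote_name],
        ["docker", "push", remote_name],
    ]


# Values taken from the push output above.
for command in docker_commands("058095970122", "eu-north-1", "inference-to-deploy"):
    print(" ".join(command))
```

Each argument list could be passed to `subprocess.run(command, check=True)` on a machine with Docker installed and an authenticated registry session.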

### Deploy the built image, using the entrypoint ```./entrypoint/inference.py``` to define how the container starts and how it responds to queries

In [None]:
tapas_deployer.deploy_ecr_image()

--------
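
SageMaker's PyTorch serving stack conventionally looks for four handler functions in the entrypoint: `model_fn`, `input_fn`, `predict_fn`, and `output_fn`. Whether `./entrypoint/inference.py` follows exactly this layout is an assumption, and the JSON payload shape and model filename handling below are hypothetical, so treat this as an illustrative skeleton only.

```python
import json


def model_fn(model_dir):
    """Load the traced model from the unpacked model.tar.gz (needs torch and torch-neuron)."""
    import os

    import torch
    import torch_neuron  # noqa: F401  (registers the Neuron ops before loading)

    return torch.jit.load(os.path.join(model_dir, "neuron_compiled_model.pt"))


def input_fn(request_body, request_content_type):
    """Decode a JSON request into a table and a list of questions."""
    if request_content_type != "application/json":
        raise ValueError(f"unsupported content type: {request_content_type}")
    payload = json.loads(request_body)
    # Hypothetical payload shape: {"table": {column: [cells, ...]}, "queries": [str, ...]}
    return payload["table"], payload["queries"]


def predict_fn(input_data, model):
    """Run the loaded model; `model` is whatever model_fn returned."""
    table, queries = input_data
    return model(table, queries)


def output_fn(prediction, accept):
    """Encode the prediction as a JSON response body."""
    return json.dumps({"answers": prediction})


def stub_model(table, queries):
    """Stand-in model for an offline dry run: answers every query with the first city."""
    return [table["City"][0] for _ in queries]


body = json.dumps({"table": {"City": ["Oslo", "Rome"]}, "queries": ["Which city is first?"]})
response = output_fn(predict_fn(input_fn(body, "application/json"), stub_model), "application/json")
print(response)
```

A real TAPAS handler would also tokenise the table and questions before calling the model; that step is left out here.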

### Test endpoint

In [None]:
print(tapas_deployer.test_endpoint())
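
For reference, a deployed endpoint can also be queried outside the deployer through the `sagemaker-runtime` client's `invoke_endpoint` call. The sketch below assumes a JSON payload with hypothetical `table` and `queries` keys and uses a fake client so that it runs offline; it does not show what `test_endpoint()` does internally.

```python
import io
import json


def query_endpoint(runtime_client, endpoint_name, table, queries):
    """Send a table and questions to an endpoint and decode the JSON reply."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"table": table, "queries": queries}),
    )
    return json.loads(response["Body"].read())


class FakeRuntime:
    """Offline stand-in for boto3.client("sagemaker-runtime")."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        request = json.loads(Body)
        answers = ["n/a" for _ in request["queries"]]
        return {"Body": io.BytesIO(json.dumps({"answers": answers}).encode("utf-8"))}


result = query_endpoint(
    FakeRuntime(),
    "hypothetical-endpoint-name",
    {"City": ["Oslo", "Rome"]},
    ["Which city is first?"],
)
print(result)
```

Against a live endpoint, `FakeRuntime()` would be replaced with `boto3.client("sagemaker-runtime")`.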

### Delete the endpoint after testing it

In [None]:
tapas_deployer.terminate()
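
Deleting an endpoint by hand usually means removing three resources: the endpoint, its endpoint configuration, and the model or models behind it. The following sketch uses the corresponding `boto3` SageMaker client calls with a fake client for offline use; the helper name is hypothetical and `terminate()` may well do this differently.

```python
def delete_endpoint_resources(sagemaker_client, endpoint_name):
    """Delete an endpoint, its endpoint config, and the models the config referenced."""
    endpoint = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sagemaker_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
    sagemaker_client.delete_endpoint_config(EndpointConfigName=config_name)
    for name in model_names:
        sagemaker_client.delete_model(ModelName=name)
    return config_name, model_names


class FakeSageMaker:
    """Offline stand-in for boto3.client("sagemaker") that records deletions."""

    def __init__(self):
        self.deleted = []

    def describe_endpoint(self, EndpointName):
        return {"EndpointConfigName": EndpointName + "-config"}

    def describe_endpoint_config(self, EndpointConfigName):
        return {"ProductionVariants": [{"ModelName": "hypothetical-model"}]}

    def delete_endpoint(self, EndpointName):
        self.deleted.append(("endpoint", EndpointName))

    def delete_endpoint_config(self, EndpointConfigName):
        self.deleted.append(("config", EndpointConfigName))

    def delete_model(self, ModelName):
        self.deleted.append(("model", ModelName))


client = FakeSageMaker()
delete_endpoint_resources(client, "hypothetical-endpoint")
print(client.deleted)
```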

### Final notes for Scrub..
- This deployer successfully builds and deploys CPU and Neuron instances.
- If run on an ```inf1``` instance, the deployer will test entrypoints locally to make sure CPU and Neuron inference work as expected in the deployed endpoints.
- The Neuron deployer works as expected when testing with classic BERT models.
- For TAPAS mini specifically, the tracing step always emits the following warning:
```
WARNING:Neuron:torch.neuron.trace was unable to compile > 50% of the operators in the compiled model!
WARNING:Neuron:Please review the torch.neuron.analyze_model output and if you believe you are seeing a failure
WARNING:Neuron:Lodge an issue on https://github.com/aws/aws-neuron-sdk/issues if you believe the model is not compiling as expected
```
- The warning reflects how little of the model compiles for Neuron (the trace log above reports 14.78% of arithmetic operators compiled), and in practice the traced TAPAS models crash at random during inference with "Unknown Reasons".
- Other BERT models deployed through the API included here work well for both Neuron deployment and inference.
- The Neuron service always tries to run predictions through the Neuron model first and falls back to the CPU if the Neuron model fails.
- Resolving the randomness of TAPAS Neuron tracing would likely take a fair amount of time, so I am including a typical Neuron deployment build here with CPU fallback.
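
The CPU fallback described in the notes above can be pictured as a thin wrapper around two callables. This is a minimal sketch with hypothetical names, not the code in `./entrypoint/inference.py`; it only catches Python-level exceptions, so a hard crash of the serving process would need separate handling.

```python
import logging

logger = logging.getLogger("neuron_fallback")


def predict_with_fallback(neuron_model, cpu_model, *inputs):
    """Try the Neuron model first; on any exception, rerun the inputs on the CPU model."""
    try:
        return neuron_model(*inputs), "neuron"
    except Exception:
        logger.exception("Neuron inference failed; falling back to CPU")
        return cpu_model(*inputs), "cpu"


def broken_neuron_model(question):
    """Stand-in for a traced model that fails at inference time."""
    raise RuntimeError("simulated Neuron failure")


def cpu_model(question):
    """Stand-in for the untraced CPU model."""
    return f"cpu answer to: {question}"


result = predict_with_fallback(broken_neuron_model, cpu_model, "Which city is first?")
print(result)
```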

Thanks for the clear test, and please let me know if you have any questions.