model_neuron.trace(
...     model=model,
...     num_texts=num_texts,
...     num_beams=num_beams,
...     max_encoder_length=max_encoder_length,
...     max_decoder_length=max_decoder_length,
... )
/root/pytorch_venv/lib64/python3.7/site-packages/transformers/models/bart/modeling_bart.py:716: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constantin the future. This means that the trace might not generalize to other inputs!
  assert key_padding_mask is None or key_padding_mask.shape == (bsz, src_len)
/root/pytorch_venv/lib64/python3.7/site-packages/transformers/models/bart/modeling_bart.py:718: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constantin the future. This means that the trace might not generalize to other inputs!
  assert attn_weights.size() == (bsz * self.num_heads, tgt_len, src_len)
/root/pytorch_venv/lib64/python3.7/site-packages/transformers/models/bart/modeling_bart.py:736: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constantin the future. This means that the trace might not generalize to other inputs!
  assert attn_output.size() == (bsz * self.num_heads, tgt_len, self.head_dim)
/root/pytorch_venv/lib64/python3.7/site-packages/transformers/models/bart/modeling_bart.py:287: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constantin the future. This means that the trace might not generalize to other inputs!
  if torch.isinf(x).any() or torch.isnan(x).any():
INFO:Neuron:There are 2 ops of 1 different types in the TorchScript that are not compiled by neuron-cc: aten::embedding, (For more information see https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/compiler/neuron-cc/neuron-cc-ops/neuron-cc-ops-pytorch.html)
INFO:Neuron:Number of arithmetic operators (pre-compilation) before = 431, fused = 423, percent fused = 98.14%
/root/pytorch_venv/lib64/python3.7/site-packages/torch/tensor.py:593: RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  'incorrect results).', category=RuntimeWarning)
WARNING:tensorflow:From /root/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/ops/aten.py:1703: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
INFO:Neuron:Compiling function _NeuronGraph$200 with neuron-cc
INFO:Neuron:Compiling with command line: '/root/pytorch_venv/bin/neuron-cc compile /tmp/tmpp1r0j8z4/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpp1r0j8z4/graph_def.neff --io-config {"inputs": {"0:0": [[1, 32, 512], "float32"], "1:0": [[1, 32], "bool"]}, "outputs": ["BartEncoder_1/aten_transpose_1/transpose:0"]} --verbose 35'
...
Compiler status PASS
INFO:Neuron:Number of arithmetic operators (post-compilation) before = 431, compiled = 423, percent compiled = 98.14%
INFO:Neuron:The neuron partitioner created 1 sub-graphs
INFO:Neuron:Neuron successfully compiled 1 sub-graphs, Total fused subgraphs = 1, Percent of model sub-graphs successfully compiled = 100.0%
INFO:Neuron:Compiled these operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 78
INFO:Neuron: => aten::add: 12
INFO:Neuron: => aten::add_: 36
INFO:Neuron: => aten::bmm: 12
INFO:Neuron: => aten::contiguous: 24
INFO:Neuron: => aten::dropout: 25
INFO:Neuron: => aten::layer_norm: 12
INFO:Neuron: => aten::masked_fill: 6
INFO:Neuron: => aten::matmul: 36
INFO:Neuron: => aten::mul: 30
INFO:Neuron: => aten::silu: 6
INFO:Neuron: => aten::size: 24
INFO:Neuron: => aten::softmax: 6
INFO:Neuron: => aten::t: 36
INFO:Neuron: => aten::transpose: 32
INFO:Neuron: => aten::unsqueeze: 12
INFO:Neuron: => aten::view: 36
INFO:Neuron:Not compiled operators (and operator counts) to Neuron:
INFO:Neuron: => aten::ScalarImplicit: 1 [supported]
INFO:Neuron: => aten::add: 1 [supported]
INFO:Neuron: => aten::arange: 1 [supported]
INFO:Neuron: => aten::embedding: 2 [not supported]
INFO:Neuron: => aten::eq: 1 [supported]
INFO:Neuron: => aten::mul: 1 [supported]
INFO:Neuron: => aten::size: 1 [supported]
/root/common/wrapper.py:63: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  causal_mask = torch.tensor(mask, dtype=torch.float)
/root/pytorch_venv/lib64/python3.7/site-packages/transformers/models/bart/modeling_bart.py:718: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constantin the future. This means that the trace might not generalize to other inputs!
  assert attn_weights.size() == (bsz * self.num_heads, tgt_len, src_len)
/root/pytorch_venv/lib64/python3.7/site-packages/transformers/models/bart/modeling_bart.py:736: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constantin the future. This means that the trace might not generalize to other inputs!
  assert attn_output.size() == (bsz * self.num_heads, tgt_len, self.head_dim)
/root/pytorch_venv/lib64/python3.7/site-packages/transformers/models/bart/modeling_bart.py:716: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constantin the future. This means that the trace might not generalize to other inputs!
  assert key_padding_mask is None or key_padding_mask.shape == (bsz, src_len)
INFO:Neuron:There are 2 ops of 1 different types in the TorchScript that are not compiled by neuron-cc: aten::embedding, (For more information see https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/compiler/neuron-cc/neuron-cc-ops/neuron-cc-ops-pytorch.html)
INFO:Neuron:Number of arithmetic operators (pre-compilation) before = 787, fused = 782, percent fused = 99.36%
/root/pytorch_venv/lib64/python3.7/site-packages/torch/tensor.py:593: RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  'incorrect results).', category=RuntimeWarning)
INFO:Neuron:Compiling function _NeuronGraph$510 with neuron-cc
INFO:Neuron:Compiling with command line: '/root/pytorch_venv/bin/neuron-cc compile /tmp/tmp8y21m91b/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp8y21m91b/graph_def.neff --io-config {"inputs": {"0:0": [[4, 32], "int64"], "1:0": [[4, 32, 512], "float32"], "2:0": [[32, 512], "float32"], "3:0": [[4, 32, 512], "float32"], "tensor.70:0": [[], "int64"]}, "outputs": ["aten_add/add:0"]} --verbose 35'
...
02/16/2023 03:12:48 PM ERROR 9070 [IRVerifier]: Tensor has undef value TongaSB partitions[0] uint8 %BartDecoder_9/DecoderLayer_33/Attention_14/aten_unsqueeze_1/ExpandDims:0[32, 4]
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: ***************************************************************
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: An Internal Compiler Error has occurred
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: ***************************************************************
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]:
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: Error message: Incorrect IR by
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]:
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: Error class: AssertionError
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: Error location: Unknown
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: Command line: /root/pytorch_venv/bin/neuron-cc compile /tmp/tmp8y21m91b/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp8y21m91b/graph_def.neff --io-config '{"inputs": {"0:0": [[4, 32], "int64"], "1:0": [[4, 32, 512], "float32"], "2:0": [[32, 512], "float32"], "3:0": [[4, 32, 512], "float32"], "tensor.70:0": [[], "int64"]}, "outputs": ["aten_add/add:0"]}' --verbose 35
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]:
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: Internal details:
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/CommandDriver.py", line 224, in neuroncc.driver.CommandDriver.CommandDriver.run
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/commands/CompileCommand.py", line 576, in neuroncc.driver.commands.CompileCommand.CompileCommand.run
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/commands/CompileCommand.py", line 554, in neuroncc.driver.commands.CompileCommand.CompileCommand.runPipeline
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/commands/CompileCommand.py", line 558, in neuroncc.driver.commands.CompileCommand.CompileCommand.runPipeline
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/Job.py", line 289, in neuroncc.driver.Job.SingleInputJob.run
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/Pipeline.py", line 30, in neuroncc.driver.Pipeline.Pipeline.runSingleInput
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/Job.py", line 289, in neuroncc.driver.Job.SingleInputJob.run
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/Pipeline.py", line 30, in neuroncc.driver.Pipeline.Pipeline.runSingleInput
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/Job.py", line 289, in neuroncc.driver.Job.SingleInputJob.run
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/jobs/Frontend.py", line 427, in neuroncc.driver.jobs.Frontend.Frontend.runSingleInput
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/jobs/Frontend.py", line 377, in neuroncc.driver.jobs.Frontend.Frontend.runTVMFrontend
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/jobs/Frontend.py", line 378, in neuroncc.driver.jobs.Frontend.Frontend.runTVMFrontend
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/jobs/Frontend.py", line 388, in neuroncc.driver.jobs.Frontend.Frontend.runTVMFrontend
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/driver/jobs/Frontend.py", line 358, in neuroncc.driver.jobs.Frontend.Frontend.runTVMFrontend.tensorize
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/Frontend.py", line 47, in neuroncc.starfish.penguin.Frontend.tensorizeRelay
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/Frontend.py", line 48, in neuroncc.starfish.penguin.Frontend.tensorizeRelay
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/Frontend.py", line 69, in neuroncc.starfish.penguin.Frontend.tensorizeRelay
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/Compile.py", line 262, in neuroncc.starfish.penguin.Compile.compile_cu
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 378, in neuroncc.starfish.penguin.DotTransform.PassManager.transformFunction
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 124, in neuroncc.starfish.penguin.DotTransform.DotTransform.runOnFunction
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 178, in neuroncc.starfish.penguin.DotTransform.DotTransform.run_with_exception_handling
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 160, in neuroncc.starfish.penguin.DotTransform.DotTransform.run_with_exception_handling
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 186, in neuroncc.starfish.penguin.DotTransform.DotTransform.timed_run_
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 188, in neuroncc.starfish.penguin.DotTransform.DotTransform.timed_run_
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 189, in neuroncc.starfish.penguin.DotTransform.DotTransform.timed_run_
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 206, in neuroncc.starfish.penguin.DotTransform.DotTransform.run_
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 207, in neuroncc.starfish.penguin.DotTransform.DotTransform.run_
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 289, in neuroncc.starfish.penguin.DotTransform.DotTransform.transformFunction
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/targets/tonga/Tonga.py", line 352, in neuroncc.starfish.penguin.targets.tonga.Tonga.TongaLowering.verify
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: File "neuroncc/starfish/penguin/DotTransform.py", line 275, in neuroncc.starfish.penguin.DotTransform.DotTransform.verify
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]:
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: Version information:
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: Neuron Compiler version 1.13.5.0+7dcf000a6
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]:
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: HWM version 1.13.0.0-0
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: NEFF version Dynamic
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: TVM version 1.13.0.0+0
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: NumPy version 1.18.5
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: MXNet not available
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: TF not available
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]:
02/16/2023 03:12:49 PM ERROR 9070 [neuron-cc]: Artifacts stored in: /tmp/tmp8y21m91b
Compiler status ERROR
INFO:Neuron:Compile command returned: 1
WARNING:Neuron:torch.neuron.trace failed on _NeuronGraph$510; falling back to native python function call
ERROR:Neuron:neuron-cc failed with the following command line call:
/root/pytorch_venv/bin/neuron-cc compile /tmp/tmp8y21m91b/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp8y21m91b/graph_def.neff --io-config '{"inputs": {"0:0": [[4, 32], "int64"], "1:0": [[4, 32, 512], "float32"], "2:0": [[32, 512], "float32"], "3:0": [[4, 32, 512], "float32"], "tensor.70:0": [[], "int64"]}, "outputs": ["aten_add/add:0"]}' --verbose 35
Traceback (most recent call last):
  File "/root/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py", line 392, in op_converter
    item, inputs, compiler_workdir=sg_workdir, **kwargs)
  File "/root/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/decorators.py", line 229, in trace
    'neuron-cc failed with the following command line call:\n{}'.format(command))
subprocess.SubprocessError: neuron-cc failed with the following command line call:
/root/pytorch_venv/bin/neuron-cc compile /tmp/tmp8y21m91b/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmp8y21m91b/graph_def.neff --io-config '{"inputs": {"0:0": [[4, 32], "int64"], "1:0": [[4, 32, 512], "float32"], "2:0": [[32, 512], "float32"], "3:0": [[4, 32, 512], "float32"], "tensor.70:0": [[], "int64"]}, "outputs": ["aten_add/add:0"]}' --verbose 35
INFO:Neuron:Number of arithmetic operators (post-compilation) before = 787, compiled = 0, percent compiled = 0.0%
INFO:Neuron:The neuron partitioner created 1 sub-graphs
INFO:Neuron:Neuron successfully compiled 0 sub-graphs, Total fused subgraphs = 1, Percent of model sub-graphs successfully compiled = 0.0%
INFO:Neuron:Compiled these operators (and operator counts) to Neuron:
INFO:Neuron:Not compiled operators (and operator counts) to Neuron:
INFO:Neuron: => aten::Int: 156 [supported]
INFO:Neuron: => aten::ScalarImplicit: 2 [supported]
INFO:Neuron: => aten::add: 24 [supported]
INFO:Neuron: => aten::add_: 62 [supported]
INFO:Neuron: => aten::arange: 2 [supported]
INFO:Neuron: => aten::bmm: 24 [supported]
INFO:Neuron: => aten::contiguous: 48 [supported]
INFO:Neuron: => aten::detach: 1 [supported]
INFO:Neuron: => aten::dropout: 37 [supported]
INFO:Neuron: => aten::embedding: 2 [not supported]
INFO:Neuron: => aten::eq: 2 [supported]
INFO:Neuron: => aten::layer_norm: 18 [supported]
INFO:Neuron: => aten::masked_fill: 6 [supported]
INFO:Neuron: => aten::matmul: 61 [supported]
INFO:Neuron: => aten::mul: 62 [supported]
INFO:Neuron: => aten::silu: 6 [supported]
INFO:Neuron: => aten::size: 50 [supported]
INFO:Neuron: => aten::softmax: 12 [supported]
INFO:Neuron: => aten::sum: 1 [supported]
INFO:Neuron: => aten::t: 61 [supported]
INFO:Neuron: => aten::to: 1 [supported]
INFO:Neuron: => aten::transpose: 63 [supported]
INFO:Neuron: => aten::unsqueeze: 13 [supported]
INFO:Neuron: => aten::view: 73 [supported]
Traceback (most recent call last):
  File "", line 6, in
  File "/root/common/wrapper.py", line 123, in trace
    self.decoder = torch_neuron.trace(decoder, inputs)
  File "/root/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py", line 195, in trace
    cu.stats_post_compiler(neuron_graph)
  File "/root/pytorch_venv/lib64/python3.7/site-packages/torch_neuron/convert.py", line 503, in stats_post_compiler
    "No operations were successfully partitioned and compiled to neuron for this model - aborting trace!")
RuntimeError: No operations were successfully partitioned and compiled to neuron for this model - aborting trace!