saved_model_aot_compile.py removes unwanted tensors from signature #54296

amirjamez · 2022-02-08T02:27:45Z

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
TensorFlow installed from (source or binary): Binary
TensorFlow version (use command below): 2.4.1
Python version: 3.6.9
CUDA/cuDNN version: 11.1
GPU model and memory: 12 GB

Describe the current behavior

When I load the frozen graph using tf.saved_model.load("saved_model.pb"), the signature shows all the tensor names, but when .local/lib/python3.6/site-packages/tensorflow/python/tools/saved_model_aot_compile.py does the _prune_removed_feed_nodes(signature_def, graph_def), it prunes multiple tensors that it does not find in graph_dev.node and as a result my final deployed model is incorrect:

def _prune_removed_feed_nodes(signature_def, graph_def):
  """Identify the inputs in the signature no longer in graph_def, prune them.

  Args:
    signature_def: A `SignatureDef` instance.
    graph_def: A `GraphDef` instance.

  Returns:
    A new pruned `SignatureDef`.
  """
  node_names = set([n.name for n in graph_def.node])
  new_signature_def = meta_graph_pb2.SignatureDef()
  new_signature_def.CopyFrom(signature_def)
  for (k, v) in signature_def.inputs.items():
    tensor_name, _ = _parse_tensor_name(v.name)
    if tensor_name not in node_names:
    ¦ logging.warn(
    ¦   ¦ 'Signature input key \'{}\', tensor name \'{}\', has been pruned '
    ¦   ¦ 'while freezing the graph.  Removing it from the compiled signatures.'
    ¦   ¦ .format(k, tensor_name))
    ¦ del new_signature_def.inputs[k]
  return new_signature_def

Here are my other TF related packages:

tensorboard             2.6.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit  1.7.0
tensorflow              2.4.1
tensorflow-addons       0.11.2
tensorflow-estimator    2.4.0
tensorflow-probability  0.12.2
tf-agents               0.7.1
tf-estimator-nightly    2.4.0.dev2020102201

Also, 2.4.1 was used to create the model:

>>> imported
<tensorflow.python.saved_model.load.Loader._recreate_base_user_object.<locals>._UserObject object at 0x7f6bcc4f7a58>
>>> imported.tensorflow_version
'2.4.1'

Here is the debug output:

coreClock: 1.3285GHz coreCount: 56 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 511.41GiB/s
2022-02-06 21:56:34.286796: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-02-06 21:56:34.290840: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-02-06 21:56:34.290891: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2022-02-06 21:56:34.293434: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-02-06 21:56:34.293855: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-02-06 21:56:34.296799: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2022-02-06 21:56:34.297732: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2022-02-06 21:56:34.297943: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2022-02-06 21:56:34.299759: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-02-06 21:56:34.299802: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-02-06 21:56:35.075324: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-02-06 21:56:35.075387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0
2022-02-06 21:56:35.075400: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N
2022-02-06 21:56:35.078385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11119 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-12GB, pci bus id: $
000:82:00.0, compute capability: 6.0)
2022-02-06 21:56:35.096700: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 2194810000 Hz
2022-02-06 21:56:35.236522: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:592] model_pruner failed: Invalid argument: Graph does not contain terminal node StatefulPartitionedCall_2.
2022-02-06 21:56:35.247615: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:928] Optimization results for grappler item: graph_to_optimize
  model_pruner: Graph size after: 38 nodes (-2), 48 edges (0), time = 0.987ms.
  implementation_selector: Graph size after: 38 nodes (0), 48 edges (0), time = 0.562ms.
  function_optimizer: Graph size after: 343 nodes (305), 581 edges (533), time = 22.39ms.
  common_subgraph_elimination: Graph size after: 303 nodes (-40), 541 edges (-40), time = 3.508ms.
  constant_folding: Graph size after: 227 nodes (-76), 387 edges (-154), time = 55.232ms.
  shape_optimizer: shape_optimizer did nothing. time = 0.41ms.
  arithmetic_optimizer: Graph size after: 238 nodes (11), 398 edges (11), time = 4.174ms.
  layout: Graph size after: 238 nodes (0), 398 edges (0), time = 5.838ms.
  remapper: Graph size after: 238 nodes (0), 398 edges (0), time = 1.459ms.
  loop_optimizer: Graph size after: 238 nodes (0), 397 edges (-1), time = 1.714ms.
  dependency_optimizer: Graph size after: 156 nodes (-82), 221 edges (-176), time = 3.391ms.
  memory_optimizer: Graph size after: 156 nodes (0), 221 edges (0), time = 6.835ms.
  model_pruner: Invalid argument: Graph does not contain terminal node StatefulPartitionedCall_2.
  implementation_selector: Graph size after: 156 nodes (0), 221 edges (0), time = 0.468ms.
  function_optimizer: function_optimizer did nothing. time = 0.127ms.
  common_subgraph_elimination: Graph size after: 146 nodes (-10), 211 edges (-10), time = 1.021ms.
  constant_folding: Graph size after: 146 nodes (0), 211 edges (0), time = 3.151ms.
  shape_optimizer: shape_optimizer did nothing. time = 0.133ms.
  arithmetic_optimizer: Graph size after: 146 nodes (0), 211 edges (0), time = 2.64ms.
  remapper: Graph size after: 146 nodes (0), 211 edges (0), time = 0.8ms.
  dependency_optimizer: Graph size after: 146 nodes (0), 211 edges (0), time = 1.752ms.

2022-02-06 21:56:35.281080: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-02-06 21:56:35.282047: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:82:00.0 name: Tesla P100-PCIE-12GB computeCapability: 6.0
coreClock: 1.3285GHz coreCount: 56 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 511.41GiB/s
2022-02-06 21:56:35.282083: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-02-06 21:56:35.282137: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-02-06 21:56:35.282155: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2022-02-06 21:56:35.282172: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-02-06 21:56:35.282190: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-02-06 21:56:35.282208: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2022-02-06 21:56:35.282226: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2022-02-06 21:56:35.282243: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2022-02-06 21:56:35.283971: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-02-06 21:56:35.284299: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2022-02-06 21:56:35.285214: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:82:00.0 name: Tesla P100-PCIE-12GB computeCapability: 6.0
coreClock: 1.3285GHz coreCount: 56 deviceMemorySize: 11.91GiB deviceMemoryBandwidth: 511.41GiB/s
2022-02-06 21:56:35.285237: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2022-02-06 21:56:35.285258: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2022-02-06 21:56:35.285277: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2022-02-06 21:56:35.285295: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2022-02-06 21:56:35.285311: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2022-02-06 21:56:35.285328: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2022-02-06 21:56:35.285345: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2022-02-06 21:56:35.285363: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2022-02-06 21:56:35.287113: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2022-02-06 21:56:35.287143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2022-02-06 21:56:35.287153: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0
2022-02-06 21:56:35.287161: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N
2022-02-06 21:56:35.288959: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 11119 MB memory) -> physical GPU (device: 0, name: Tesla P100-PCIE-12GB, pci bus id: 0000:82:00.0, compute capability: 6.0)
INFO:tensorflow:Restoring parameters from /home/llvm-project/llvm/lib/Analysis/models/inliner/variables/variables
2022-02-06 21:56:35.354948: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:196] None of the MLIR optimization passes are enabled (registered 0 passes)
WARNING:tensorflow:From /home/.local/lib/python3.6/site-packages/tensorflow/python/tools/saved_model_aot_compile.py:332: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.convert_variables_to_constants`
WARNING:tensorflow:From /home/.local/lib/python3.6/site-packages/tensorflow/python/framework/convert_to_constants.py:856: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.compat.v1.graph_util.extract_sub_graph`
WARNING:tensorflow:Signature input key 'XXX', tensor name 'action_XXX', has been pruned while freezing the graph.  Removing it from the compiled signatures.
WARNING:tensorflow:Signature input key 'discount', tensor name 'action_discount', has been pruned while freezing the graph.  Removing it from the compiled signatures.
WARNING:tensorflow:Signature input key 'XXX', tensor name 'action_XXX', has been pruned while freezing the graph.  Removing it from the compiled signatures.
WARNING:tensorflow:Signature input key 'reward', tensor name 'action_reward', has been pruned while freezing the graph.  Removing it from the compiled signatures.
WARNING:tensorflow:Signature input key 'step_type', tensor name 'action_step_type', has been pruned while freezing the graph.  Removing it from the compiled signatures.
WARNING:tensorflow:Signature input key 'inlining_default', tensor name 'action_inlining_default', has been pruned while freezing the graph.  Removing it from the compiled signatures.
INFO:tensorflow:Writing graph def to: /tmp/saved_model_clilxj7nh6h/frozen_graph.pb
INFO:tensorflow:Writing config_pbtxt to: /tmp/saved_model_clilxj7nh6h/config.pbtxt
INFO:tensorflow:Generating XLA AOT artifacts in: /home/llvm-project/build/lib/Analysis

The text was updated successfully, but these errors were encountered:

tilakrayal · 2022-02-08T12:31:15Z

@amirjamez ,
In order to expedite the trouble-shooting process, could you please provide the complete code and dataset to reproduce the issue reported here.

amirjamez · 2022-02-10T05:46:38Z

@tilakrayal Received an answer from the group they are working on this project: google/ml-compiler-opt#14

google-ml-butler · 2022-02-10T05:46:40Z

Are you satisfied with the resolution of your issue?
Yes
No

amirjamez added the type:bug Bug label Feb 8, 2022

google-ml-butler bot assigned tilakrayal Feb 8, 2022

amirjamez mentioned this issue Feb 8, 2022

Adding new features under InlineModelFeatureMaps.h results in TF model pruner to remove them at deployment google/ml-compiler-opt#14

Closed

tilakrayal added the TF 2.4 for issues related to TF 2.4 label Feb 8, 2022

amirjamez closed this as completed Feb 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

saved_model_aot_compile.py removes unwanted tensors from signature #54296

saved_model_aot_compile.py removes unwanted tensors from signature #54296

amirjamez commented Feb 8, 2022 •

edited

tilakrayal commented Feb 8, 2022

amirjamez commented Feb 10, 2022

google-ml-butler bot commented Feb 10, 2022

saved_model_aot_compile.py removes unwanted tensors from signature #54296

saved_model_aot_compile.py removes unwanted tensors from signature #54296

Comments

amirjamez commented Feb 8, 2022 • edited

tilakrayal commented Feb 8, 2022

amirjamez commented Feb 10, 2022

google-ml-butler bot commented Feb 10, 2022

amirjamez commented Feb 8, 2022 •

edited