tf.neuron.saved_model.compile fails with batch_size > 1 #60

Closed · tahouse opened this issue Jan 8, 2020 · 2 comments

@tahouse commented Jan 8, 2020

Following the tutorial here, compilation succeeds with batch_size=1, but when I change batch_size to 2 in the script below, compilation fails.

import tensorflow as tf


tf.keras.backend.set_learning_phase(0)
tf.keras.backend.set_image_data_format('channels_last')
model = tf.keras.applications.ResNet50(weights='imagenet')
sess = tf.keras.backend.get_session()
inputs = {'input': model.inputs[0]}
outputs = {'output': model.outputs[0]}

# save the model using tf.saved_model.simple_save
modeldir = "./resnet50/1"
tf.saved_model.simple_save(sess, modeldir, inputs, outputs)

# compile the model for Inferentia
neuron_modeldir = "./resnet50_inf2/1"
tf.neuron.saved_model.compile(modeldir, neuron_modeldir, batch_size=2)

Output from compilation...

$ python compile.py
WARNING:tensorflow:From /home/ubuntu/anaconda3/envs/aws_neuron_tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
2020-01-08 20:10:40.144936: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
2020-01-08 20:10:40.150407: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3000005000 Hz
2020-01-08 20:10:40.151658: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x563e0cf6be80 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-01-08 20:10:40.151678: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
WARNING:tensorflow:From compile.py:7: The name tf.keras.backend.get_session is deprecated. Please use tf.compat.v1.keras.backend.get_session instead.

WARNING:tensorflow:From compile.py:14: simple_save (from tensorflow.python.saved_model.simple_save) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.simple_save.
WARNING:tensorflow:From /home/ubuntu/anaconda3/envs/aws_neuron_tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/python/saved_model/signature_def_utils_impl.py:201: build_tensor_info (from tensorflow.python.saved_model.utils_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.utils.build_tensor_info or tf.compat.v1.saved_model.build_tensor_info.
WARNING:tensorflow:From /home/ubuntu/anaconda3/envs/aws_neuron_tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/python/neuron/python/saved_model.py:136: load (from tensorflow.python.saved_model.loader_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This function will only be available through the v1 compatibility library as tf.compat.v1.saved_model.loader.load or tf.compat.v1.saved_model.load. There will be a new function for importing SavedModels in Tensorflow 2.0.
INFO:tensorflow:fusing subgraph neuron_op_d6f098c01c780733 with neuron-cc
WARNING:tensorflow:Failed to fuse subgraph neuron_op_d6f098c01c780733 with '/home/ubuntu/anaconda3/envs/aws_neuron_tensorflow_p36/bin/neuron-cc compile /tmp/tmpjj5xgykv/neuron_op_d6f098c01c780733/graph_def.pb --framework TENSORFLOW --pipeline compile SaveTemps --output /tmp/tmpjj5xgykv/neuron_op_d6f098c01c780733/graph_def.neff --io-config "{\"inputs\": {\"input_10/_0:0\": [[2, 224, 224, 3], \"float32\"]}, \"outputs\": [\"probs/Softmax:0\"]}"'
INFO:tensorflow:Number of operations in TensorFlow session: 4638
INFO:tensorflow:Number of operations after tf.neuron optimizations: 555
INFO:tensorflow:Number of operations placed on Neuron runtime: 0
INFO:tensorflow:Successfully converted ./resnet50/1 to ./resnet50_inf2/1

The following are the latest versions of the Neuron packages, running on a c5.9xlarge instance with DLAMI v26 (Ubuntu):

(aws_neuron_tensorflow_p36) ubuntu@ip-172-31-0-4:~$ pip list | grep neuron
neuron-cc                          1.0.5939.0+5849551057
tensorboard-neuron                 1.15.0.1.0.315.0
tensorflow-neuron                  1.15.0.1.0.803.0
You are using pip version 10.0.1, however version 19.3.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

(aws_neuron_tensorflow_p36) ubuntu@ip-172-31-0-4:~$ apt list | grep neuron

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

aws-neuron-runtime/unknown,now 1.0.4751.0 amd64 [installed]
aws-neuron-runtime-base/unknown,now 1.0.4587.0 amd64 [installed]
aws-neuron-tools/unknown,now 1.0.4587.0 amd64 [installed]
tensorflow-model-server-neuron/unknown,now 1.15.0.1.0.803.0 all [installed]

I updated the Neuron packages following the DLAMI release notes:

#!/bin/bash

sudo apt-get update
sudo apt-get -y install aws-neuron-runtime-base
sudo apt-get -y install aws-neuron-runtime
sudo apt-get -y install aws-neuron-tools
sudo apt-get -y install tensorflow-model-server-neuron

source activate aws_neuron_tensorflow_p36
conda install numpy=1.17.2 --yes --quiet
conda update tensorflow-neuron

Is this the most up-to-date API for setting batch_size? https://github.com/aws/aws-neuron-sdk/blob/master/docs/tensorflow-neuron/api-compilation-python-api.md

@aws-taylor (Contributor) commented:

Hello Tyler,

We have been able to reproduce this issue on our end and have opened a ticket internally. In general, we support batch sizes up to 4; however, this particular network triggered an edge case in our memory allocator. We will work to fix this in an upcoming release.

Regards,
Taylor
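
For reference, the following is a minimal sketch of probing which batch sizes compile for this model, using only the tf.neuron.saved_model.compile call from the script above (the output paths are hypothetical):

import tensorflow as tf

# Try compiling the same SavedModel at the batch sizes mentioned above (1-4).
# As the log in this issue shows, a failed fuse is reported as a WARNING rather
# than an exception, so check the "Number of operations placed on Neuron runtime"
# line in each run's output to see whether that batch size actually compiled.
modeldir = "./resnet50/1"
for batch_size in (1, 2, 3, 4):
    neuron_modeldir = "./resnet50_inf_bs%d/1" % batch_size  # hypothetical output path
    tf.neuron.saved_model.compile(modeldir, neuron_modeldir, batch_size=batch_size)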

@jeffhataws (Contributor) commented:

Due to current compiler limitations, batching behaves differently for FP32 than for FP16, because FP32 takes more memory to store the input data; as a result, only ResNet50 FP32 batch=1 is compilable. Please see https://github.com/aws/aws-neuron-sdk/blob/master/docs/technotes/performance-tuning.md for more information on how to increase the batch size with an FP16 graph.
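
As a rough sketch of that suggestion (assuming the pretrained FP32 weights cast cleanly to FP16 when loaded; the linked technote may recommend a different casting method, and the FP16 paths below are hypothetical), one way to build and compile an FP16 ResNet50 with a larger batch size is:

import tensorflow as tf

tf.keras.backend.set_learning_phase(0)
tf.keras.backend.set_floatx('float16')                    # build the whole graph in FP16
tf.keras.backend.set_image_data_format('channels_last')

# The pretrained FP32 weights are cast to the graph dtype as they are loaded.
model = tf.keras.applications.ResNet50(weights='imagenet')
sess = tf.keras.backend.get_session()
inputs = {'input': model.inputs[0]}
outputs = {'output': model.outputs[0]}

# save the FP16 model, then compile it for Inferentia with a larger batch size
modeldir = "./resnet50_fp16/1"                            # hypothetical path
tf.saved_model.simple_save(sess, modeldir, inputs, outputs)

neuron_modeldir = "./resnet50_fp16_inf/1"                 # hypothetical path
tf.neuron.saved_model.compile(modeldir, neuron_modeldir, batch_size=4)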
