TensorFlow v2.5 introduced the PluggableDevice mechanism, which enables modular, plug-and-play integration of device-specific code. AMD adopted PluggableDevice when implementing the TensorFlow-ZenDNN plug-in for AMD EPYC™ CPUs and released it alongside TensorFlow 2.12 (see the blog announcement). The TensorFlow-ZenDNN plug-in adds custom kernel implementations and operations specific to AMD EPYC™ CPUs to TensorFlow through its kernel and op registration C APIs.
The TensorFlow-ZenDNN plug-in is a supplemental package to be installed alongside the standard TensorFlow package, starting from TF version 2.12 onwards. From a TensorFlow developer's perspective, the plug-in approach simplifies the process of leveraging ZenDNN optimizations.
The following is a high-level block diagram of the TensorFlow-ZenDNN plug-in package, which uses ZenDNN as the core inference library:
This document shows how to implement, build, install, and run the TensorFlow-ZenDNN plug-in for AMD CPUs.
- Linux
| Tools/Frameworks | Version |
|---|---|
| Bazel | >=3.1 |
| Git | >=1.8 |
| Python | >=3.9 and <=3.11 |
| TensorFlow | >=2.12 |
Note: Make sure you have an active conda environment with TensorFlow v2.14 installed.
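As a quick sanity check, the Python-version constraint from the table above can be verified at runtime. This helper is purely illustrative and not part of the plug-in:

```python
import sys

def python_version_supported(version=sys.version_info):
    """Return True if the interpreter satisfies the >=3.9 and <=3.11 requirement."""
    return (3, 9) <= (version[0], version[1]) <= (3, 11)

print(python_version_supported((3, 9, 0)))   # supported version -> True
print(python_version_supported((3, 12, 0)))  # too new for this release -> False
```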
1. Download the wheel file from here.
$ pip install tensorflow_zendnn_plugin-0.2.0-cp39-cp39-linux_x86_64.whl
$ git clone https://github.com/amd/ZenDNN-tensorflow-plugin.git
$ cd ZenDNN-tensorflow-plugin/
Note: Alternatively, configure and build the TensorFlow-ZenDNN plug-in manually by following steps 3-6 below.
The setup script will configure, build, and install the TensorFlow-ZenDNN plug-in. It will also set the necessary environment variables for ZenDNN execution; however, the optimal values for these variables should be verified empirically.
ZenDNN-tensorflow-plugin$ source scripts/TensorFlow_ZenDNN_plugin_setup.sh
ZenDNN-tensorflow-plugin$ pip install -r requirements.txt
ZenDNN-tensorflow-plugin$ ./configure
You have bazel 5.3.0 installed.
Please specify the location of python. [Default is /home/user/anaconda3/envs/tf-2.14-zendnn-plugin-env/bin/python]:
Found possible Python library paths:
/home/user/anaconda3/envs/tf-2.14-zendnn-plugin-env/lib/python3.9/site-packages
Please input the desired Python library path to use. Default is [/home/user/anaconda3/envs/tf-2.14-zendnn-plugin-env/lib/python3.9/site-packages]
Do you wish to build TensorFlow plug-in with MPI support? [y/N]:
No MPI support will be enabled for TensorFlow plug-in.
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
Configuration finished
ZenDNN-tensorflow-plugin$ bazel build -c opt //tensorflow_plugin/tools/pip_package:build_pip_package --verbose_failures --spawn_strategy=standalone
ZenDNN-tensorflow-plugin$ bazel-bin/tensorflow_plugin/tools/pip_package/build_pip_package .
Note: This will generate and save the Python wheel file for the TensorFlow-ZenDNN plug-in into the current directory (i.e., ZenDNN-tensorflow-plugin/).
ZenDNN-tensorflow-plugin$ pip install tensorflow_zendnn_plugin-0.2.0-cp39-cp39-linux_x86_64.whl
This completes the build and installation from source.
$ export TF_ENABLE_ZENDNN_OPTS=1
$ export TF_ENABLE_ONEDNN_OPTS=0
Note: To disable ZenDNN optimizations during inference, set the corresponding ZenDNN environment variable: `export TF_ENABLE_ZENDNN_OPTS=0`.
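Because TensorFlow reads these variables at import time, they must be set before `import tensorflow` runs. A minimal sketch (the helper name is ours, not part of the plug-in):

```python
import os

def set_zendnn_opts(enabled: bool) -> None:
    # TensorFlow reads these variables when it is imported, so call this
    # helper before `import tensorflow`.
    os.environ["TF_ENABLE_ZENDNN_OPTS"] = "1" if enabled else "0"
    os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0" if enabled else "1"

set_zendnn_opts(True)
# import tensorflow as tf  # import TensorFlow only after setting the flags
print(os.environ["TF_ENABLE_ZENDNN_OPTS"], os.environ["TF_ENABLE_ONEDNN_OPTS"])
```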
ZenDNN-tensorflow-plugin$ python tests/softmax.py
2023-07-13 15:10:56.178652: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-07-13 15:10:57.097190: I tensorflow/core/common_runtime/direct_session.cc:380] Device mapping: no known devices.
2023-07-13 15:10:57.097539: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:380] MLIR V1 optimization pass is not enabled
random_normal/RandomStandardNormal: (RandomStandardNormal): /job:localhost/replica:0/task:0/device:CPU:0
2023-07-13 15:10:57.098688: I tensorflow/core/common_runtime/placer.cc:114] random_normal/RandomStandardNormal: (RandomStandardNormal): /job:localhost/replica:0/task:0/device:CPU:0
random_normal/mul: (Mul): /job:localhost/replica:0/task:0/device:CPU:0
2023-07-13 15:10:57.098698: I tensorflow/core/common_runtime/placer.cc:114] random_normal/mul: (Mul): /job:localhost/replica:0/task:0/device:CPU:0
random_normal: (AddV2): /job:localhost/replica:0/task:0/device:CPU:0
2023-07-13 15:10:57.098705: I tensorflow/core/common_runtime/placer.cc:114] random_normal: (AddV2): /job:localhost/replica:0/task:0/device:CPU:0
Softmax: (Softmax): /job:localhost/replica:0/task:0/device:CPU:0
2023-07-13 15:10:57.098715: I tensorflow/core/common_runtime/placer.cc:114] Softmax: (Softmax): /job:localhost/replica:0/task:0/device:CPU:0
random_normal/shape: (Const): /job:localhost/replica:0/task:0/device:CPU:0
2023-07-13 15:10:57.098722: I tensorflow/core/common_runtime/placer.cc:114] random_normal/shape: (Const): /job:localhost/replica:0/task:0/device:CPU:0
random_normal/mean: (Const): /job:localhost/replica:0/task:0/device:CPU:0
2023-07-13 15:10:57.098728: I tensorflow/core/common_runtime/placer.cc:114] random_normal/mean: (Const): /job:localhost/replica:0/task:0/device:CPU:0
random_normal/stddev: (Const): /job:localhost/replica:0/task:0/device:CPU:0
2023-07-13 15:10:57.098734: I tensorflow/core/common_runtime/placer.cc:114] random_normal/stddev: (Const): /job:localhost/replica:0/task:0/device:CPU:0
2023-07-13 15:10:57.125282: I tensorflow/core/util/port.cc:142] ZenDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ZENDNN_OPTS=0`.
[0.05660784 0.09040404 0.03201076 0.11204024 0.2344563 0.162052
0.09466095 0.11205972 0.0752109 0.03049729]
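For reference, the Softmax op exercised by the test script normalizes a vector into probabilities that sum to 1. A plain-Python sketch of the computation (illustrative only, independent of TensorFlow and ZenDNN):

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([0.5, -1.2, 2.0, 0.0])
print(probs)
print(sum(probs))  # probabilities sum to 1.0 (up to floating-point rounding)
```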
- TensorFlow's Pluggable Device blog
- AMD-TensorFlow blog
- Download TensorFlow-ZenDNN Plug-in binary
- TensorFlow-ZenDNN Plug-in User Guide
- Compared to the current TensorFlow-ZenDNN direct-integration releases, this release provides on-par performance for models such as RefineDet, Inception, and VGG variants, and sub-optimal performance for models such as ResNet, MobileNet, and EfficientNet.
- TensorFlow-ZenDNN plug-in v0.2 is supported with ZenDNN v4.1. Please see Section 2.7 of the ZenDNN User Guide for performance-tuning guidelines.
Note: AMD recommends using the TF-ZenDNN direct integration binaries available on the AMD ZenDNN developer resources page for optimal inference performance.
- Please email zendnnsupport@amd.com for questions, issues, and feedback.
- Please submit your questions, feature requests, and bug reports on the GitHub issues page.