doc and example update for ITEX support (#1360)
lvliang-intel committed Oct 29, 2022
1 parent 74b3b38 commit 6ab5570
Showing 139 changed files with 4,376 additions and 90 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -17,8 +17,6 @@ Intel® Neural Compressor
Intel® Neural Compressor, formerly known as Intel® Low Precision Optimization Tool, is an open-source Python library that runs on Intel CPUs and GPUs. It delivers unified interfaces across multiple deep-learning frameworks for popular network compression technologies such as quantization, pruning, and knowledge distillation. The tool supports automatic, accuracy-driven tuning strategies to help users quickly find the best quantized model, implements several weight-pruning algorithms to generate pruned models with a predefined sparsity goal, and supports knowledge distillation to distill knowledge from a teacher model into a student model.
Intel® Neural Compressor is a critical AI software component in the [Intel® oneAPI AI Analytics Toolkit](https://software.intel.com/content/www/us/en/develop/tools/oneapi/ai-analytics-toolkit.html).

> **Note:**
> GPU support is under development.

**Visit the Intel® Neural Compressor online document website at: <https://intel.github.io/neural-compressor>.**

@@ -107,6 +105,10 @@ Intel® Neural Compressor supports systems based on [Intel 64 architecture or co
* Intel Xeon Scalable processor (formerly Skylake, Cascade Lake, Cooper Lake, and Icelake)
* Future Intel Xeon Scalable processor (code name Sapphire Rapids)

Intel® Neural Compressor supports the following Intel GPUs built on Intel's Xe architecture:

* [Intel® Data Center GPU Flex Series](https://www.intel.com/content/www/us/en/products/docs/discrete-gpus/data-center-gpu/flex-series/overview.html)

### Validated Software Environment

* OS version: CentOS 8.4, Ubuntu 20.04
@@ -2,6 +2,7 @@ Step-by-Step
============

This document describes how to enable the TensorFlow SavedModel format with Intel® Neural Compressor for performance-only tuning.
This example can run on Intel CPUs and GPUs.


## Prerequisite
@@ -17,14 +18,32 @@ pip install intel-tensorflow
```
> Note: Supported TensorFlow versions are >= 2.4.0.
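The minimum-version requirement above can be checked programmatically. A minimal sketch (the helper name is hypothetical, not part of this example); note that comparing numeric tuples avoids the string-comparison pitfall where "2.10.0" would sort below "2.4.0":

```python
def meets_min_version(installed, minimum="2.4.0"):
    """Return True if the installed version string is >= the minimum.

    Compares numeric tuples rather than strings, since string
    comparison would wrongly rank "2.10.0" below "2.4.0".
    """
    def to_tuple(version):
        return tuple(int(part) for part in version.split(".")[:3])
    return to_tuple(installed) >= to_tuple(minimum)

# Usage against the running TensorFlow build (import assumed available):
# import tensorflow as tf
# assert meets_min_version(tf.__version__)
```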
### 3. Install Intel Extension for TensorFlow if needed
#### Tuning the model on Intel GPU (mandatory)
Intel Extension for TensorFlow must be installed to tune the model on Intel GPUs.

```shell
pip install --upgrade intel-extension-for-tensorflow[gpu]
```
For more details, please follow the procedure in [install-gpu-drivers](https://github.com/intel-innersource/frameworks.ai.infrastructure.intel-extension-for-tensorflow.intel-extension-for-tensorflow/blob/master/docs/install/install_for_gpu.md#install-gpu-drivers).

#### Tuning the model on Intel CPU (experimental)
Intel Extension for TensorFlow for Intel CPUs is currently experimental; it is not mandatory for tuning the model on Intel CPUs.

```shell
pip install --upgrade intel-extension-for-tensorflow[cpu]
```

### 4. Prepare Pretrained model
Download the model from TensorFlow Hub.

Image recognition:
- [mobilenetv1](https://hub.tensorflow.google.cn/google/imagenet/mobilenet_v1_075_224/classification/5)
- [mobilenetv2](https://hub.tensorflow.google.cn/google/imagenet/mobilenet_v2_035_224/classification/5)
- [efficientnet_v2_b0](https://hub.tensorflow.google.cn/google/imagenet/efficientnet_v2_imagenet1k_b0/classification/2)

## Write Yaml config file
The examples directory provides mobilenet_v1.yaml, mobilenet_v2.yaml, and efficientnet_v2_b0.yaml for tuning the models on Intel CPUs; the 'framework' field in these yaml files is set to 'tensorflow'. To run this example on Intel GPUs, 'framework' should be set to 'tensorflow_itex' and the 'device' field should be set to 'gpu'; mobilenet_v1_itex.yaml, mobilenet_v2_itex.yaml, and efficientnet_v2_b0_itex.yaml are prepared for the GPU case. Most items can be removed so that only the mandatory ones are kept for tuning. A calibration dataloader is also implemented, and the evaluation field lets neural_compressor create the evaluation function internally.
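The CPU-to-GPU yaml change described above (framework 'tensorflow' → 'tensorflow_itex', device 'cpu' → 'gpu') can be scripted. A minimal sketch using plain text processing (the helper name is hypothetical; it assumes the input yaml uses 'framework: tensorflow' and 'device: cpu', as the CPU configs here do):

```python
def convert_yaml_to_gpu(text):
    """Rewrite a CPU tuning yaml for the ITEX GPU backend:
    set framework to tensorflow_itex and device to gpu."""
    out = []
    for line in text.splitlines():
        # Strip trailing comments before matching the key.
        key_part = line.split("#", 1)[0].strip()
        if key_part.startswith("framework:"):
            line = line.replace("tensorflow", "tensorflow_itex", 1)
        elif key_part.startswith("device:"):
            line = line.replace("cpu", "gpu", 1)
        out.append(line)
    return "\n".join(out)
```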

## Run Command
@@ -19,6 +19,8 @@ model: # mandatory. neural_compres
  name: efficientnet_v2_b0
  framework: tensorflow # mandatory. supported values are tensorflow, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: cpu # optional. default value is cpu, other value is gpu.

quantization: # optional. tuning constraints on model-wise for advanced users to reduce tuning space.
  calibration:
    sampling_size: 5, 10, 50, 100 # optional. default value is the size of the whole dataset. sets how much of the calibration dataset is used. exclusive with the iterations field.
@@ -0,0 +1,89 @@
#
# Copyright (c) 2021 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

version: 1.0

model: # mandatory. neural_compressor uses this model name and framework name to decide where to save tuning history and deploy yaml.
  name: efficientnet_v2_b0
  framework: tensorflow_itex # mandatory. supported values are tensorflow, tensorflow_itex, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: gpu # optional. set to cpu if intel-extension-for-tensorflow[cpu] is installed; set to gpu if intel-extension-for-tensorflow[gpu] is installed.

quantization: # optional. tuning constraints on model-wise for advanced users to reduce tuning space.
  calibration:
    sampling_size: 5, 10, 50, 100 # optional. default value is the size of the whole dataset. sets how much of the calibration dataset is used. exclusive with the iterations field.
    dataloader:
      dataset:
        ImagenetRaw:
          data_path: /path/to/calibration/dataset # NOTE: modify to calibration dataset location if needed
          image_list: /path/to/calibration/label # data file, records image names and their labels
      transform:
        PaddedCenterCrop:
          size: 224
          crop_padding: 32
        Resize:
          size: 224
          interpolation: bicubic
        Normalize:
          mean: [123.675, 116.28, 103.53]
          std: [58.395, 57.12, 57.375]

evaluation: # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
  accuracy: # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
    metric:
      topk: 1 # built-in metrics are topk, map, f1; allow user to register new metric.
    dataloader:
      batch_size: 32
      dataset:
        ImagenetRaw:
          data_path: /path/to/evaluation/dataset # NOTE: modify to evaluation dataset location if needed
          image_list: /path/to/evaluation/label # data file, records image names and their labels
      transform:
        PaddedCenterCrop:
          size: 224
          crop_padding: 32
        Resize:
          size: 224
          interpolation: bicubic
        Normalize:
          mean: [123.675, 116.28, 103.53]
          std: [58.395, 57.12, 57.375]
  performance: # optional. used to benchmark performance of passing model.
    iteration: 100
    configs:
      cores_per_instance: 4
      num_of_instance: 7
    dataloader:
      batch_size: 1
      dataset:
        ImagenetRaw:
          data_path: /path/to/evaluation/dataset # NOTE: modify to evaluation dataset location if needed
          image_list: /path/to/evaluation/label # data file, records image names and their labels
      transform:
        PaddedCenterCrop:
          size: 224
          crop_padding: 32
        Resize:
          size: 224
          interpolation: bicubic
        Normalize:
          mean: [123.675, 116.28, 103.53]
          std: [58.395, 57.12, 57.375]

tuning:
  accuracy_criterion:
    relative: 0.01 # optional. default criterion is relative; the other option is absolute. this example allows a relative accuracy loss of 1%.
  exit_policy:
    timeout: 0 # optional. tuning timeout (seconds). default value is 0 which means early stop. combine with the max_trials field to decide when to exit.
  random_seed: 9527 # optional. random seed for deterministic tuning.
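The Normalize transform in the yaml above subtracts the per-channel mean and divides by the per-channel std. A minimal sketch of that arithmetic on a single RGB pixel (the function name is illustrative, not part of the library):

```python
# Per-channel constants from the yaml's Normalize transform.
MEAN = [123.675, 116.28, 103.53]
STD = [58.395, 57.12, 57.375]

def normalize_pixel(rgb, mean=MEAN, std=STD):
    """Apply the Normalize transform to one RGB pixel:
    (value - mean) / std, channel by channel."""
    return [(v - m) / s for v, m, s in zip(rgb, mean, std)]
```

A pixel equal to the mean maps to zero in every channel, so the calibration data is roughly centered around zero after this step.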
@@ -17,6 +17,8 @@ model: # mandatory. used to specif
  name: mobilenet_v1
  framework: tensorflow # mandatory. supported values are tensorflow, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: cpu # optional. default value is cpu, other value is gpu.

quantization: # optional. tuning constraints on model-wise for advanced users to reduce tuning space.
  calibration:
    sampling_size: 20, 50 # optional. default value is 100. sets how many samples should be used in calibration.
@@ -0,0 +1,73 @@
#
# Copyright (c) 2021 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

model: # mandatory. used to specify model specific information.
  name: mobilenet_v1
  framework: tensorflow_itex # mandatory. supported values are tensorflow, tensorflow_itex, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: gpu # optional. set to cpu if intel-extension-for-tensorflow[cpu] is installed; set to gpu if intel-extension-for-tensorflow[gpu] is installed.

quantization: # optional. tuning constraints on model-wise for advanced users to reduce tuning space.
  calibration:
    sampling_size: 20, 50 # optional. default value is 100. sets how many samples should be used in calibration.
    dataloader:
      batch_size: 10
      dataset:
        ImageRecord:
          root: /path/to/calibration/dataset # NOTE: modify to calibration dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224
  model_wise: # optional. tuning constraints on model-wise for advanced users to reduce tuning space.
    activation:
      algorithm: minmax
    weight:
      granularity: per_channel

evaluation: # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
  accuracy: # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
    metric:
      topk: 1 # built-in metrics are topk, map, f1; allow user to register new metric.
    dataloader:
      batch_size: 32
      dataset:
        ImageRecord:
          root: /path/to/evaluation/dataset # NOTE: modify to evaluation dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224
  performance: # optional. used to benchmark performance of passing model.
    iteration: 100
    configs:
      cores_per_instance: 4
      num_of_instance: 7
    dataloader:
      batch_size: 1
      dataset:
        ImageRecord:
          root: /path/to/evaluation/dataset # NOTE: modify to evaluation dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224

tuning:
  accuracy_criterion:
    relative: 0.01 # optional. default criterion is relative; the other option is absolute. this example allows a relative accuracy loss of 1%.
  exit_policy:
    timeout: 0 # optional. tuning timeout (seconds). default value is 0 which means early stop. combine with the max_trials field to decide when to exit.
  random_seed: 9527 # optional. random seed for deterministic tuning.
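The relative accuracy_criterion above accepts a quantized model only if its accuracy drop, measured relative to the FP32 baseline, stays within the allowed loss (1% here). A minimal sketch of that check (the function name is hypothetical, not the library's API):

```python
def meets_accuracy_criterion(baseline_acc, tuned_acc, relative_loss=0.01):
    """Return True if the tuned model's accuracy drop, relative to the
    FP32 baseline, is within the allowed relative loss (1% by default,
    matching the yaml above)."""
    return (baseline_acc - tuned_acc) <= relative_loss * baseline_acc
```

For a baseline accuracy of 0.71, any tuned accuracy of at least 0.71 * 0.99 = 0.7029 would pass.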
@@ -17,6 +17,8 @@ model: # mandatory. used to specif
  name: mobilenet_v2
  framework: tensorflow # mandatory. supported values are tensorflow, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: cpu # optional. default value is cpu, other value is gpu.

quantization: # optional. tuning constraints on model-wise for advanced users to reduce tuning space.
  calibration:
    sampling_size: 20, 50 # optional. default value is 100. sets how many samples should be used in calibration.
@@ -0,0 +1,82 @@
#
# Copyright (c) 2021 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

model: # mandatory. used to specify model specific information.
  name: mobilenet_v2
  framework: tensorflow_itex # mandatory. supported values are tensorflow, tensorflow_itex, pytorch, pytorch_ipex, onnxrt_integer, onnxrt_qlinear or mxnet; allow new framework backend extension.

device: gpu # optional. set to cpu if intel-extension-for-tensorflow[cpu] is installed; set to gpu if intel-extension-for-tensorflow[gpu] is installed.

quantization: # optional. tuning constraints on model-wise for advanced users to reduce tuning space.
  calibration:
    sampling_size: 20, 50 # optional. default value is 100. sets how many samples should be used in calibration.
    dataloader:
      batch_size: 10
      dataset:
        ImageRecord:
          root: /path/to/calibration/dataset # NOTE: modify to calibration dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224
  model_wise: # optional. tuning constraints on model-wise for advanced users to reduce tuning space.
    activation:
      algorithm: minmax
    weight:
      granularity: per_channel

  op_wise: {
    'MobilenetV2/expanded_conv/depthwise/depthwise': {
      'activation': {'dtype': ['fp32']},
    },
    'MobilenetV2/Conv_1/Conv2D': {
      'activation': {'dtype': ['fp32']},
    }
  }

evaluation: # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
  accuracy: # optional. required if user doesn't provide eval_func in neural_compressor.Quantization.
    metric:
      topk: 1 # built-in metrics are topk, map, f1; allow user to register new metric.
    dataloader:
      batch_size: 32
      dataset:
        ImageRecord:
          root: /path/to/evaluation/dataset # NOTE: modify to evaluation dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224
  performance: # optional. used to benchmark performance of passing model.
    iteration: 100
    configs:
      cores_per_instance: 4
      num_of_instance: 7
    dataloader:
      batch_size: 1
      dataset:
        ImageRecord:
          root: /path/to/evaluation/dataset # NOTE: modify to evaluation dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224

tuning:
  accuracy_criterion:
    relative: 0.01 # optional. default criterion is relative; the other option is absolute. this example allows a relative accuracy loss of 1%.
  exit_policy:
    timeout: 0 # optional. tuning timeout (seconds). default value is 0 which means early stop. combine with the max_trials field to decide when to exit.
  random_seed: 9527 # optional. random seed for deterministic tuning.
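The op_wise section in the yaml above pins two MobilenetV2 ops to fp32 activations while the rest of the model falls back to the model-wise quantized default. A minimal sketch of how such a per-op override lookup behaves (the function and default value are illustrative assumptions, not the library's internals):

```python
# Per-op overrides mirroring the yaml's op_wise section.
OP_WISE = {
    "MobilenetV2/expanded_conv/depthwise/depthwise": {"activation": {"dtype": ["fp32"]}},
    "MobilenetV2/Conv_1/Conv2D": {"activation": {"dtype": ["fp32"]}},
}

def activation_dtype(op_name, op_wise=OP_WISE, default="int8"):
    """Return the activation dtype for an op: the op_wise override
    if one exists, otherwise the model-wise default."""
    override = op_wise.get(op_name)
    if override:
        return override["activation"]["dtype"][0]
    return default
```

Keeping numerically sensitive ops (such as the first depthwise convolution) in fp32 this way is a common tactic when full quantization costs too much accuracy.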
