diff --git a/docs/source/api-documentation/adaptor/onnxrt.rst b/docs/source/api-documentation/adaptor/onnxrt.rst
index 2a44557b9cd..312103aaea5 100644
--- a/docs/source/api-documentation/adaptor/onnxrt.rst
+++ b/docs/source/api-documentation/adaptor/onnxrt.rst
@@ -1,4 +1,4 @@
-ONNXRT
+ONNX Runtime
 ==============
 
 .. autoapisummary::
diff --git a/docs/source/api-documentation/apis.rst b/docs/source/api-documentation/apis.rst
index 2028c25edb9..0e2765e156d 100644
--- a/docs/source/api-documentation/apis.rst
+++ b/docs/source/api-documentation/apis.rst
@@ -6,7 +6,7 @@ The following API information is available:
 .. toctree::
    :maxdepth: 1
 
-   new_api
+   basic_api
    adaptor
    strategy
    model
diff --git a/docs/source/api-documentation/basic_api.rst b/docs/source/api-documentation/basic_api.rst
new file mode 100644
index 00000000000..05116644e14
--- /dev/null
+++ b/docs/source/api-documentation/basic_api.rst
@@ -0,0 +1,14 @@
+User facing APIs
+################
+
+The following user facing API information is available:
+
+.. toctree::
+   :maxdepth: 1
+
+   basic_api/quantization
+   basic_api/mix_precision
+   basic_api/benchmark
+   basic_api/objective
+   basic_api/training
+   basic_api/config
\ No newline at end of file
diff --git a/docs/source/api-documentation/new_api/benchmark.rst b/docs/source/api-documentation/basic_api/benchmark.rst
similarity index 100%
rename from docs/source/api-documentation/new_api/benchmark.rst
rename to docs/source/api-documentation/basic_api/benchmark.rst
diff --git a/docs/source/api-documentation/new_api/config.rst b/docs/source/api-documentation/basic_api/config.rst
similarity index 100%
rename from docs/source/api-documentation/new_api/config.rst
rename to docs/source/api-documentation/basic_api/config.rst
diff --git a/docs/source/api-documentation/new_api/mix_precision.rst b/docs/source/api-documentation/basic_api/mix_precision.rst
similarity index 100%
rename from docs/source/api-documentation/new_api/mix_precision.rst
rename to docs/source/api-documentation/basic_api/mix_precision.rst
diff --git a/docs/source/api-documentation/new_api/objective.rst b/docs/source/api-documentation/basic_api/objective.rst
similarity index 100%
rename from docs/source/api-documentation/new_api/objective.rst
rename to docs/source/api-documentation/basic_api/objective.rst
diff --git a/docs/source/api-documentation/new_api/quantization.rst b/docs/source/api-documentation/basic_api/quantization.rst
similarity index 100%
rename from docs/source/api-documentation/new_api/quantization.rst
rename to docs/source/api-documentation/basic_api/quantization.rst
diff --git a/docs/source/api-documentation/new_api/training.rst b/docs/source/api-documentation/basic_api/training.rst
similarity index 100%
rename from docs/source/api-documentation/new_api/training.rst
rename to docs/source/api-documentation/basic_api/training.rst
diff --git a/docs/source/api-documentation/new_api.rst b/docs/source/api-documentation/new_api.rst
deleted file mode 100644
index be3bc874733..00000000000
--- a/docs/source/api-documentation/new_api.rst
+++ /dev/null
@@ -1,14 +0,0 @@
-New user facing APIs
-###########
-
-The new user facing APIs information is available:
-
-.. toctree::
-   :maxdepth: 1
-
-   new_api/quantization
-   new_api/mix_precision
-   new_api/benchmark
-   new_api/objective
-   new_api/training
-   new_api/config
\ No newline at end of file
diff --git a/examples/README.md b/examples/README.md
index 51dc4cba240..98cbc12f84d 100644
--- a/examples/README.md
+++ b/examples/README.md
@@ -535,7 +535,7 @@ Intel® Neural Compressor validated examples with multiple compression techniques
 SD Diffusion
 Text to Image
 Post-Training Static Quantization
-fx
+fx
@@ -1109,7 +1109,7 @@ Intel® Neural Compressor validated examples with multiple compression techniques
 Emotion FERPlus
 Body Analysis
 Post-Training Static Quantization
-qlinearops
+qlinearops
 Ultra Face
diff --git a/examples/onnxrt/README.md b/examples/onnxrt/README.md
deleted file mode 100644
index 90e43865e5e..00000000000
--- a/examples/onnxrt/README.md
+++ /dev/null
@@ -1,100 +0,0 @@
-ONNX model quantization
-======
-
-Currently Neural Compressor supports dynamic and static quantization for onnx models.
-
-## Dynamic quantization
-
-Dynamic quantization calculates the quantization parameter (scale and zero point) for activations dynamically.
-
-
-### How to use
-
-Users can use dynamic quantization method by these ways:
-
-1. Write configuration in yaml file
-
-```yaml
-model:
-  name: model_name
-  framework: onnxrt_integerops
-
-quantization:
-  approach: post_training_dynamic_quant
-
-# other items are omitted
-```
-
-2. Write configuration with python code
-
-```python
-from neural_compressor import conf
-from neural_compressor.experimental import Quantization
-conf.model.framework = 'onnxrt_integerops'
-conf.quantization.approach = 'post_training_dynamic_quant'
-
-quantizer = Quantization(conf)
-```
-
-## Static quantization
-
-Static quantization leverages the calibration data to calculates the quantization parameter of activations. There are two ways to represent quantized ONNX models: operator oriented with QLinearOps and tensor oriented (QDQ format).
-
-### How to use
-
-#### Operator oriented with QLinearOps
-
-Users can quantize ONNX models with QLinearOps by these ways:
-
-1. Write configuration in yaml file
-
-```yaml
-model:
-  name: model_name
-  framework: onnxrt_qlinearops
-
-quantization:
-  approach: post_training_static_quant
-
-# other items are omitted
-```
-
-2. Write configuration with python code
-
-```python
-from neural_compressor import conf
-from neural_compressor.experimental import Quantization
-conf.model.framework = 'onnxrt_qlinearops'
-conf.quantization.approach = 'post_training_static_quant'
-
-quantizer = Quantization(conf)
-```
-
-#### Tensor oriented (QDQ format)
-
-Users can quantize ONNX models with QDQ format by these ways:
-
-1. Write configuration in yaml file
-
-```yaml
-model:
-  name: model_name
-  framework: onnxrt_qdqops
-
-quantization:
-  approach: post_training_static_quant
-
-# other items are omitted
-```
-
-2. Write configuration with python code
-
-```python
-from neural_compressor import conf
-from neural_compressor.experimental import Quantization
-conf.model.framework = 'onnxrt_qdqops'
-conf.quantization.approach = 'post_training_static_quant'
-
-quantizer = Quantization(conf)
-```
-
diff --git a/examples/onnxrt/body_analysis/onnx_model_zoo/arcface/quantization/ptq/README.md b/examples/onnxrt/body_analysis/onnx_model_zoo/arcface/quantization/ptq/README.md
index 98391e58e86..94e87cbd028 100644
--- a/examples/onnxrt/body_analysis/onnx_model_zoo/arcface/quantization/ptq/README.md
+++ b/examples/onnxrt/body_analysis/onnx_model_zoo/arcface/quantization/ptq/README.md
@@ -4,8 +4,9 @@
 This example load a face recognition model from [ONNX Model Zoo](https://github.com/onnx/models) and confirm its accuracy and speed based on [Refined MS-Celeb-1M](https://s3.amazonaws.com/onnx-model-zoo/arcface/dataset/faces_ms1m_112x112.zip). You need to download this dataset yourself.
 
 ### Environment
-onnx: 1.11.0
-onnxruntime: 1.10.0
+onnx: 1.12.0
+onnxruntime: 1.13.1
+> Validated framework versions can be found in the main README.
 
 ### Prepare model
 Download model from [ONNX Model Zoo](https://github.com/onnx/models)
diff --git a/examples/onnxrt/body_analysis/onnx_model_zoo/emotion_ferplus/quantization/ptq/README.md b/examples/onnxrt/body_analysis/onnx_model_zoo/emotion_ferplus/quantization/ptq/README.md
index 76a7fbc031b..f0438fa6bf2 100644
--- a/examples/onnxrt/body_analysis/onnx_model_zoo/emotion_ferplus/quantization/ptq/README.md
+++ b/examples/onnxrt/body_analysis/onnx_model_zoo/emotion_ferplus/quantization/ptq/README.md
@@ -4,8 +4,9 @@
 This example load an model converted from [ONNX Model Zoo](https://github.com/onnx/models) and confirm its accuracy and speed based on [Emotion FER dataset](https://www.kaggle.com/c/challenges-in-representation-learning-facial-expression-recognition-challenge/data). You need to download this dataset yourself.
 
 ### Environment
-onnx: 1.11.0
-onnxruntime: 1.10.0
+onnx: 1.12.0
+onnxruntime: 1.13.1
+> Validated framework versions can be found in the main README.
 
 ### Prepare model
 Download model from [ONNX Model Zoo](https://github.com/onnx/models)
diff --git a/examples/onnxrt/body_analysis/onnx_model_zoo/ultraface/quantization/ptq/README.md b/examples/onnxrt/body_analysis/onnx_model_zoo/ultraface/quantization/ptq/README.md
index 38e1880112e..f3461280171 100644
--- a/examples/onnxrt/body_analysis/onnx_model_zoo/ultraface/quantization/ptq/README.md
+++ b/examples/onnxrt/body_analysis/onnx_model_zoo/ultraface/quantization/ptq/README.md
@@ -4,8 +4,9 @@
 This example load an model converted from [ONNX Model Zoo](https://github.com/onnx/models) and confirm its accuracy and speed based on [WIDER FACE dataset (Validation Images)](http://shuoyang1213.me/WIDERFACE/). You need to download this dataset yourself.
 
 ### Environment
-onnx: 1.11.0
-onnxruntime: 1.10.0
+onnx: 1.12.0
+onnxruntime: 1.13.1
+> Validated framework versions can be found in the main README.
 
 ### Prepare model
 Download model from [ONNX Model Zoo](https://github.com/onnx/models)
diff --git a/examples/onnxrt/image_recognition/onnx_model_zoo/fcn/quantization/ptq/README.md b/examples/onnxrt/image_recognition/onnx_model_zoo/fcn/quantization/ptq/README.md
index 52dedf3882a..64463aaa3be 100644
--- a/examples/onnxrt/image_recognition/onnx_model_zoo/fcn/quantization/ptq/README.md
+++ b/examples/onnxrt/image_recognition/onnx_model_zoo/fcn/quantization/ptq/README.md
@@ -4,8 +4,9 @@
 This example load an object detection model converted from [ONNX Model Zoo](https://github.com/onnx/models) and confirm its accuracy and speed based on [MS COCO 2017 dataset](https://cocodataset.org/#download). You need to download this dataset yourself.
 
 ### Environment
-onnx: 1.9.0
-onnxruntime: 1.10.0
+onnx: 1.12.0
+onnxruntime: 1.13.1
+> Validated framework versions can be found in the main README.
 
 ### Prepare model
 Download model from [ONNX Model Zoo](https://github.com/onnx/models)
diff --git a/examples/onnxrt/image_recognition/onnx_model_zoo/squeezenet/quantization/ptq/README.md b/examples/onnxrt/image_recognition/onnx_model_zoo/squeezenet/quantization/ptq/README.md
index 6a61b3816c6..65e34145779 100644
--- a/examples/onnxrt/image_recognition/onnx_model_zoo/squeezenet/quantization/ptq/README.md
+++ b/examples/onnxrt/image_recognition/onnx_model_zoo/squeezenet/quantization/ptq/README.md
@@ -4,8 +4,9 @@
 This example load an image classification model from [ONNX Model Zoo](https://github.com/onnx/models) and confirm its accuracy and speed based on [ILSVR2012 validation Imagenet dataset](http://www.image-net.org/challenges/LSVRC/2012/downloads). You need to download this dataset yourself.
 
 ### Environment
-onnx: 1.9.0
-onnxruntime: 1.10.0
+onnx: 1.12.0
+onnxruntime: 1.13.1
+> Validated framework versions can be found in the main README.
 
 ### Prepare model
 Download model from [ONNX Model Zoo](https://github.com/onnx/models)
diff --git a/examples/onnxrt/image_recognition/unet/quantization/ptq/readme.md b/examples/onnxrt/image_recognition/unet/quantization/ptq/readme.md
index 4c28538295a..a307318d318 100644
--- a/examples/onnxrt/image_recognition/unet/quantization/ptq/readme.md
+++ b/examples/onnxrt/image_recognition/unet/quantization/ptq/readme.md
@@ -3,8 +3,9 @@
 This is an experimental example to quantize unet model. We use dummy data to do quantization and evaluation, so the accuracy is not guaranteed.
 
 ### Environment
-onnx: 1.12.0
-onnxruntime: 1.12.1
+onnx: 1.12.0
+onnxruntime: 1.13.1
+> Validated framework versions can be found in the main README.
 
 ### Prepare model
diff --git a/examples/onnxrt/nlp/onnx_model_zoo/BiDAF/quantization/ptq/README.md b/examples/onnxrt/nlp/onnx_model_zoo/BiDAF/quantization/ptq/README.md
index aa6ffab30ba..d6ddbe83db5 100644
--- a/examples/onnxrt/nlp/onnx_model_zoo/BiDAF/quantization/ptq/README.md
+++ b/examples/onnxrt/nlp/onnx_model_zoo/BiDAF/quantization/ptq/README.md
@@ -1,10 +1,11 @@
 # Evaluate performance of ONNX Runtime(BiDAF)
-This example load a a neural network for answering a query about a given context paragraph. It is converted from [ONNX Model Zoo](https://github.com/onnx/models) and confirm its accuracy and speed based on [SQuAD v1.1](https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/).
+This example loads a neural network for answering a query about a given context paragraph. It is converted from [ONNX Model Zoo](https://github.com/onnx/models), and its accuracy and speed are confirmed based on [SQuAD v1.1](https://rajpurkar.github.io/SQuAD-explorer/explore/1.1/dev/).
 
 ### Environment
-onnx: 1.11.0
-onnxruntime: 1.10.0
+onnx: 1.12.0
+onnxruntime: 1.13.1
+> Validated framework versions can be found in the main README.
 
 ### Prepare model
 Download model from [ONNX Model Zoo](https://github.com/onnx/models)
diff --git a/examples/onnxrt/nlp/onnx_model_zoo/bert-squad/quantization/ptq/readme.md b/examples/onnxrt/nlp/onnx_model_zoo/bert-squad/quantization/ptq/readme.md
index 5b3dfaa8879..d2d4285b897 100644
--- a/examples/onnxrt/nlp/onnx_model_zoo/bert-squad/quantization/ptq/readme.md
+++ b/examples/onnxrt/nlp/onnx_model_zoo/bert-squad/quantization/ptq/readme.md
@@ -4,8 +4,9 @@
 This example load a language translation model and confirm its accuracy and speed based on [SQuAD]((https://rajpurkar.github.io/SQuAD-explorer/)) task.
 
 ### Environment
-onnx: 1.9.0
-onnxruntime: 1.10.0
+onnx: 1.12.0
+onnxruntime: 1.13.1
+> Validated framework versions can be found in the main README.
 
 ### Prepare dataset
 You should download SQuAD dataset from [SQuAD dataset link](https://rajpurkar.github.io/SQuAD-explorer/).
diff --git a/examples/onnxrt/nlp/onnx_model_zoo/gpt2/quantization/ptq/readme.md b/examples/onnxrt/nlp/onnx_model_zoo/gpt2/quantization/ptq/readme.md
index 4aa47acaf9e..dc6a8b4933e 100644
--- a/examples/onnxrt/nlp/onnx_model_zoo/gpt2/quantization/ptq/readme.md
+++ b/examples/onnxrt/nlp/onnx_model_zoo/gpt2/quantization/ptq/readme.md
@@ -4,9 +4,10 @@
 This example load a language translation model and confirm its accuracy and speed based on [WikiText](https://blog.einstein.ai/the-wikitext-long-term-dependency-language-modeling-dataset/) dataset.
 
 ### Environment
-onnx: 1.7.0
-onnxruntime: 1.8.0
+onnx: 1.12.0
+onnxruntime: 1.13.1
 transformers: 3.2.0
+> Validated framework versions can be found in the main README.
 
 ### Prepare dataset
 Please download [WikiText-2 dataset](https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip).
diff --git a/examples/onnxrt/nlp/onnx_model_zoo/mobilebert/quantization/ptq/readme.md b/examples/onnxrt/nlp/onnx_model_zoo/mobilebert/quantization/ptq/readme.md
index 648a6f411e5..84f9a566299 100644
--- a/examples/onnxrt/nlp/onnx_model_zoo/mobilebert/quantization/ptq/readme.md
+++ b/examples/onnxrt/nlp/onnx_model_zoo/mobilebert/quantization/ptq/readme.md
@@ -4,8 +4,9 @@
 This example load a language translation model and confirm its accuracy and speed based on [SQuAD]((https://rajpurkar.github.io/SQuAD-explorer/)) task.
 
 ### Environment
-onnx: 1.9.0
-onnxruntime: 1.10.0
+onnx: 1.12.0
+onnxruntime: 1.13.1
+> Validated framework versions can be found in the main README.
 
 ### Prepare dataset
 Download pretrained bert model. We will refer to `vocab.txt` file.
diff --git a/examples/onnxrt/object_detection/onnx_model_zoo/DUC/quantization/ptq/README.md b/examples/onnxrt/object_detection/onnx_model_zoo/DUC/quantization/ptq/README.md
index bcba841e86f..43002a6d95d 100644
--- a/examples/onnxrt/object_detection/onnx_model_zoo/DUC/quantization/ptq/README.md
+++ b/examples/onnxrt/object_detection/onnx_model_zoo/DUC/quantization/ptq/README.md
@@ -4,8 +4,9 @@
 This example load an object detection model converted from [ONNX Model Zoo](https://github.com/onnx/models) and confirm its accuracy and speed based on [cityscapes dataset](https://www.cityscapes-dataset.com/downloads/). You need to download this dataset yourself.
 
 ### Environment
-onnx: 1.9.0
-onnxruntime: 1.10.0
+onnx: 1.12.0
+onnxruntime: 1.13.1
+> Validated framework versions can be found in the main README.
 
 ### Prepare model
 Download model from [ONNX Model Zoo](https://github.com/onnx/models)
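The removed `examples/onnxrt/README.md` above refers to "quantization parameters (scale and zero point)" that dynamic quantization computes on the fly and static quantization derives from calibration data. As background, here is a minimal pure-Python sketch of how such parameters can be derived from an observed value range for asymmetric uint8 quantization; it is illustrative only, not Neural Compressor's implementation, and the function names are hypothetical:

```python
def quant_params(values, qmin=0, qmax=255):
    """Compute scale and zero point from the observed value range.

    The range is widened to include 0.0 so that a real zero maps
    exactly to an integer zero point.
    """
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid div-by-zero for all-zero input
    zero_point = round(qmin - lo / scale)
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Map floats to clamped uint8 integers using the given parameters."""
    return [min(max(round(v / scale) + zero_point, qmin), qmax) for v in values]


# "Dynamic" quantization computes these per batch; "static" quantization
# would compute them once from a calibration set and reuse them.
scale, zp = quant_params([-1.0, 0.0, 2.0])
q = quantize([-1.0, 0.0, 2.0], scale, zp)  # -> [0, 85, 255]
```

The distinction in the removed README maps onto where `quant_params` runs: at inference time on each activation tensor (dynamic) or ahead of time on calibration data (static).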