diff --git a/README.md b/README.md
index b90a560..9899bfa 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,5 @@
-
-
+
------------
@@ -13,31 +12,29 @@
 MQBench is an open-source model quantization toolkit based on PyTorch fx.
 
 The vision of MQBench is to provide:
-- **SOTA Algorithms**. With MQBench, the hardware vendors and researchers can benefit from the latest research progress in academia.
-- **Powerful Toolkits**. With the toolkit, quantization node can be inserted to the original PyTorch module automatically with respect to the specific hardware. After training, the quantized model can be smoothly converted to the format that can inference on the real device.
+- **SOTA Algorithms**. With MQBench, hardware vendors and researchers can benefit from the latest research progress in academia.
+- **Powerful Toolkits**. With the toolkit, quantization nodes can be inserted into the original PyTorch module automatically for the specific hardware. After training, the quantized model can be smoothly converted to a format that can run inference on the real device.
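+
+A minimal sketch of that workflow (the four API calls are the ones used throughout the MQBench docs; the model and input shape are placeholders):
+
+```python
+import torchvision.models as models
+from mqbench.prepare_by_platform import prepare_by_platform, BackendType
+from mqbench.utils.state import enable_calibration, enable_quantization
+from mqbench.convert_deploy import convert_deploy
+
+model = models.__dict__["resnet18"](pretrained=True).eval()
+model = prepare_by_platform(model, BackendType.Tensorrt)  # insert fake-quant nodes
+enable_calibration(model)   # observers gather statistics during forward passes
+# ... run a few calibration batches here ...
+enable_quantization(model)  # switch to fake-quantized forward
+convert_deploy(model, BackendType.Tensorrt, {'data': [1, 3, 224, 224]})
+```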
 
 ## Installation
-```
+```shell
 git clone git@github.com:ModelTC/MQBench.git
 cd MQBench
 python setup.py install
 ```
-
 ## Documentation
 
 MQBench aims to support (1) various deployable quantization algorithms and (2) hardware backend libraries to facilitate the development of the community.
 
-For the detailed information, please refer to [mqbench documentation](http://mqbench.tech/assets/docs/html/) or [latest version](https://mqbench.readthedocs.io/en/main/).
-
+For detailed information, please refer to the [MQBench documentation](https://mqbench.readthedocs.io/en/latest/).
 
 ## Citation
 
 If you use this toolkit or benchmark in your research, please cite this project.
 
-```
+```latex
 @article{MQBench,
   title  = {MQBench: Towards Reproducible and Deployable Model Quantization Benchmark},
   author = {Yuhang Li* and Mingzhu Shen* and Jian Ma* and Yan Ren* and Mingxin Zhao* and
@@ -47,7 +44,6 @@ If you use this toolkit or benchmark in your research, please cite this project.
 }
 ```
-
 ## License
 
-This project is released under the [Apache 2.0 license](LICENSE).
\ No newline at end of file
+This project is released under the [Apache 2.0 license](LICENSE).
diff --git a/docs/source/benchmark/NaturalLanguageProcessing/Benchmark.rst b/docs/source/benchmark/NaturalLanguageProcessing/Benchmark.rst
index 7311451..fa7c426 100644
--- a/docs/source/benchmark/NaturalLanguageProcessing/Benchmark.rst
+++ b/docs/source/benchmark/NaturalLanguageProcessing/Benchmark.rst
@@ -1,4 +1,31 @@
 Natural Language Processing Benchmark
 =====================================
 
+Based on MQBench, we provide a sentence classification benchmark on GLUE,
+reporting results of bert-base-uncased on 8 GLUE tasks.
-To be finished.
\ No newline at end of file
+
+Post-training Quantization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+- Backend: Academic
+- W_calibration: MinMax
+- A_calibration: EMAQuantile
+
++------------+---------------------------+-----------------+-----------------+
+| Task       | Metrics                   | FP32 results    | INT8 results    |
++============+===========================+=================+=================+
+| **mrpc**   | **acc/f1**                | **87.75/91.35** | **87.75/91.20** |
++------------+---------------------------+-----------------+-----------------+
+| **mnli**   | **acc m/mm**              | **84.94/84.76** | **84.69/84.59** |
++------------+---------------------------+-----------------+-----------------+
+| **cola**   | **Matthews corr**         | **59.60**       | **59.41**       |
++------------+---------------------------+-----------------+-----------------+
+| **sst2**   | **acc**                   | **93.35**       | **92.78**       |
++------------+---------------------------+-----------------+-----------------+
+| **stsb**   | **Pearson/Spearman corr** | **89.70/89.28** | **89.36/89.22** |
++------------+---------------------------+-----------------+-----------------+
+| **qqp**    | **f1/acc**                | **87.82/90.91** | **87.46/90.72** |
++------------+---------------------------+-----------------+-----------------+
+| **rte**    | **acc**                   | **72.56**       | **71.84**       |
++------------+---------------------------+-----------------+-----------------+
+| **qnli**   | **acc**                   | **91.84**       | **91.32**       |
++------------+---------------------------+-----------------+-----------------+
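+
+As a hedged sketch, the calibration setting above corresponds to an
+``extra_qconfig_dict`` roughly like the one below. The observer names and the
+config key follow our reading of the MQBench source; verify them against your
+installed version.
+
+.. code-block:: python
+
+    from mqbench.prepare_by_platform import prepare_by_platform, BackendType
+
+    extra_qconfig_dict = {
+        'w_observer': 'MinMaxObserver',       # W_calibration: MinMax
+        'a_observer': 'EMAQuantileObserver',  # A_calibration: EMAQuantile
+    }
+    # `model` is your FP32 model, e.g. a GLUE-finetuned bert-base-uncased
+    model = prepare_by_platform(model, BackendType.Academic,
+                                {'extra_qconfig_dict': extra_qconfig_dict})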
diff --git a/docs/source/get_started/index.rst b/docs/source/get_started/index.rst
index 76da192..a6cc35c 100644
--- a/docs/source/get_started/index.rst
+++ b/docs/source/get_started/index.rst
@@ -7,6 +7,6 @@ This tutorial will give details about the whole work-through to do quantization
   :titlesonly:
 
   setup
-  quick_start_academic
   quick_start_deploy
+  quick_start_academic
   support_matrix
\ No newline at end of file
diff --git a/docs/source/get_started/quick_start_academic.rst b/docs/source/get_started/quick_start_academic.rst
index 23f75f6..c528e72 100644
--- a/docs/source/get_started/quick_start_academic.rst
+++ b/docs/source/get_started/quick_start_academic.rst
@@ -1,15 +1,12 @@
 Quick Start -- Embrace Best Research Experience
 =================================================
 
-This page is for researchers **who want to validate their marvelous quantization idea using MQBench**,
-if you want to get started with deployment using MQBench, check :doc:`quick_start_deploy`.
+This page is for researchers **who want to validate their marvelous quantization idea using MQBench**. If you want to get started with deployment, check :doc:`quick_start_deploy`.
 
-MQBench is a benchmark, a framework and a good tool for researchers. MQBench is designed easy-to-use for researchers,
-for example, you can easily custom Academic Backend by provide a extra config dict to conduct any experiment.
+MQBench is a benchmark, a framework, and a handy tool for researchers. It is designed to be easy to use: for example, you can customize the Academic Backend by providing an extra config dict to conduct any experiment.
 
 We provide step-by-step instructions and detailed comments below to help you finish deploying the **PyTorch ResNet18** model to a **Custom Academic** Backend.
 
-Before starting, you should install MQBench first. Now we start the tour.
-
+Before starting, you should have completed the MQBench setup in :doc:`setup`. Now we start the tour.
 
 **1**. **To begin with, let's import MQBench and prepare the FP32 model.**
 
@@ -27,7 +24,7 @@ **2**. **Then we learn the extra configuration to customize the Academic Backend.**
 
 You can also learn this section through the MQBench `source code `_.
-Learn which option you can choose below config through our :doc:`../user_guide/internal/learn_config`
+Learn all options through our :doc:`../user_guide/internal/learn_config`
 
 .. code-block:: python
 
     {
@@ -52,7 +49,7 @@ Learn which option you can choose below config through our :doc:`../user_guide/internal/learn_config`
         }
     }
 
-**3**. **The next step prepares to conduct the experiment, take PTQ as example.**
+**3**. **The next step prepares to conduct the experiment, taking PTQ as an example.**
 
 .. code-block:: python
 
@@ -73,6 +70,6 @@
     # do forward procedures
     ...
 
-**You already know all basics about how to validate your marvelous quantization idea with MQBench, congratulations!**
+**You now know all the basics of validating your marvelous quantization idea with MQBench, congratulations!**
 
 Now you can follow our advanced :doc:`user guide <../user_guide/index>` and :doc:`developer guide <../developer_guide/index>` to know more about MQBench.
diff --git a/docs/source/get_started/quick_start_deploy.rst b/docs/source/get_started/quick_start_deploy.rst
index 4eed855..ffcd90e 100644
--- a/docs/source/get_started/quick_start_deploy.rst
+++ b/docs/source/get_started/quick_start_deploy.rst
@@ -27,14 +27,30 @@ Before starting, you should install MQBench first. Now we start the tour.
     model = models.__dict__["resnet18"](pretrained=True)   # use vision pre-defined model
     model.eval()
 
-**2**. **The next step prepares to quantize the model.**
+**2**. **Choose your backend.**
 
 .. code-block:: python
 
-    model = prepare_by_platform(model, BackendType.Tensorrt)   #! line 1. trace model and add quant nodes for model on Tensorrt Backend
-    enable_calibration(model)                                  #! line 2. turn on calibration, ready for gathering data
+    # backend options
+    backend = BackendType.Tensorrt
+    # backend = BackendType.SNPE
+    # backend = BackendType.PPLW8A16
+    # backend = BackendType.NNIE
+    # backend = BackendType.Vitis
+    # backend = BackendType.ONNX_QNN
+    # backend = BackendType.PPLCUDA
+    # backend = BackendType.OPENVINO
+    # backend = BackendType.Tengine_u8
+    # backend = BackendType.Tensorrt_NLP
+
+**3**. **Prepare to quantize the model.**
+
+.. code-block:: python
+
+    model = prepare_by_platform(model, backend)   #! line 1. trace model and add quant nodes for the chosen backend
 
     # calibration loop
+    enable_calibration(model)                     #! line 2. turn on calibration, ready for gathering data
     for i, batch in enumerate(data):
         # do forward procedures
         ...
@@ -46,9 +62,13 @@
         # do forward procedures
         ...
 
+**4**. **Export the quantized model.**
+
+.. code-block:: python
+
     # define dummy data for model export.
     input_shape={'data': [10, 3, 224, 224]}
-    convert_deploy(model, BackendType.Tensorrt, input_shape)   #! line 4. remove quant nodes, ready for deploying to real-world hardware
+    convert_deploy(model, backend, input_shape)   #! line 4. remove quant nodes, ready for deploying to real-world hardware
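+
+In the MQBench version we inspected, ``convert_deploy`` also takes optional
+``output_path`` and ``model_name`` keyword arguments to control where the
+deployable files are written; treat the exact signature as an assumption and
+check it in your installed version.
+
+.. code-block:: python
+
+    # hedged sketch: kwargs assumed from the MQBench source we inspected
+    convert_deploy(model, backend, input_shape,
+                   output_path='./deploy', model_name='resnet18_int8')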
 
 If you want to know more about deploying to a customized backend, check :doc:`../user_guide/internal/learn_config` and :doc:`../user_guide/howtodeploy`
diff --git a/docs/source/get_started/setup.rst b/docs/source/get_started/setup.rst
index fff70ad..fad4c2a 100644
--- a/docs/source/get_started/setup.rst
+++ b/docs/source/get_started/setup.rst
@@ -1,7 +1,9 @@
-Installation
-============
+Setup MQBench
+=============
 
-MQBench only depend on PyTorch 1.8.1,following `pytorch.org `_ or use requirements file to install.
+**1**. **Install**
+
+MQBench only depends on PyTorch 1.8.1; follow `pytorch.org `_ or use the requirements file to install it.
 
 .. code-block:: shell
     :linenos:
 
@@ -10,4 +12,22 @@
diff --git a/docs/source/user_guide/PTQ/advanced_ptq.rst b/docs/source/user_guide/PTQ/advanced_ptq.rst
--- a/docs/source/user_guide/PTQ/advanced_ptq.rst
+++ b/docs/source/user_guide/PTQ/advanced_ptq.rst
-`_, and you can find algorithm details in :doc:`../algorithm/advanced_ptq`.
+MQBench provides a simple API for advanced PTQ; learn our step-by-step instructions to quantize your model.
+
+**1**. **Prepare the FP32 model first.**
 
 .. code-block:: python
-    :linenos:
 
     import torchvision.models as models
     from mqbench.convert_deploy import convert_deploy
@@ -18,54 +15,71 @@
     # first, initialize the FP32 model with pretrained parameters.
     model = models.__dict__["resnet18"](pretrained=True)
+    model.eval()
 
-    # then, we will trace the original model using torch.fx and \
-    # insert fake quantize nodes according to different hardware backends (e.g. TensorRT).
-    model = prepare_by_platform(model, BackendType.Tensorrt)
+**2**. **Configure advanced PTQ and the backend.**
 
-    # before training, we recommend to enable observers for calibration in several batches, and then enable quantization.
-    model.eval()
+.. code-block:: python
+
+    # configuration
+    ptq_reconstruction_config = {
+        'pattern': 'block',                    #? 'layer' for AdaRound or 'block' for BRECQ
+        'scale_lr': 4.0e-5,                    #? learning rate for the step size of activations
+        'warm_up': 0.2,                        #? first 0.2 * max_count iters run without regularization to floor or ceil
+        'weight': 0.01,                        #? loss weight of the regularization term
+        'max_count': 20000,                    #? number of optimization iterations
+        'b_range': [20, 2],                    #? beta decaying range
+        'keep_gpu': True,                      #? keep calibration data on GPU (False: CPU)
+        'round_mode': 'learned_hard_sigmoid',  #? way to reconstruct the weight; currently only learned_hard_sigmoid is supported
+        'prob': 0.5,                           #? drop probability of QDrop
+    }
+
+    # backend options
+    backend = BackendType.Tensorrt
+    # backend = BackendType.SNPE
+    # backend = BackendType.PPLW8A16
+    # backend = BackendType.NNIE
+    # backend = BackendType.Vitis
+    # backend = BackendType.ONNX_QNN
+    # backend = BackendType.PPLCUDA
+    # backend = BackendType.OPENVINO
+    # backend = BackendType.Tengine_u8
+    # backend = BackendType.Tensorrt_NLP
+
+**3**. **Prepare to quantize the model.**
+
+.. code-block:: python
+
+    # trace model and add quant nodes for model on backend
+    model = prepare_by_platform(model, backend)
+
+    # calibration loop
     enable_calibration(model)
-    calibration_flag = True
-
-    # set config
-    config_dict = {
-        pattern: 'block',
-        warm_up: 0.2,
-        weight: 0.01,
-        max_count: 10000,
-        b_range: [20, 2],
-        keep_gpu: True,
-        round_mode: learned_hard_sigmoid,
-        prob: 1.0
-    }
+    for i, batch in enumerate(data):
+        # do forward procedures
+        ...
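+
+    # at this point the observers have gathered activation/weight statistics
+    # from the calibration batches above; ptq_reconstruction below re-uses a
+    # stacked copy of that data to optimize weight rounding per block/layer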
 
     # ptq_reconstruction loop
     stacked_tensor = []
-    # add calibration data to stack
+    # add ptq_reconstruction data to stack
     for i, batch_data in enumerate(data):
         if i == cali_batchsize:
             break
         stacked_tensor.append(batch_data)
 
-    # start calibration
-    enable_quantization(model)
-    model = ptq_reconstruction(model, stacked_tensor, config_dict)
+    # start ptq_reconstruction
+    model = ptq_reconstruction(model, stacked_tensor, ptq_reconstruction_config)
 
-    # do evaluation
-    ...
+    # evaluation loop
+    for i, batch in enumerate(data):
+        # do forward procedures
+        ...
 
-    # deploy model, remove fake quantize nodes, and dump quantization params like clip ranges.
-    convert_deploy(model.eval(), BackendType.Tensorrt, input_shape_dict={'data': [10, 3, 224, 224]})
+**4**. **Export the quantized model.**
 
-MQBench examples
-^^^^^^^^^^^^^^^^^
+.. code-block:: python
 
-We follow the `PyTorch official example `_ to build the example of Model Quantization Benchmark for ImageNet classification task, you can run advanced ptq easily.
+    # deploy model, remove fake quantize nodes, and dump quantization params like clip ranges.
+    input_shape={'data': [10, 3, 224, 224]}
+    convert_deploy(model, backend, input_shape)
 
-1. Clone and install MQBench;
-2. Prepare the ImageNet dataset from `the official website `_ and move validation images to labeled subfolders, using the following `shell script `_;
-3. Download pre-trained models from our `release `_;
-4. Check out `/path-of-MQBench/application/imagenet_example/PTQ/configs` and find yaml file you want to reproduce;
-5. Replace `/path-of-pretained` and `/path-of-imagenet` in yaml file;
-6. Change directory, `cd /path-of-MQBench/application/imagenet_example/PTQ/ptq`;
-7. Exec `python ptq.py -\-config /path-of-config.yaml`.
+You can find algorithm details in :doc:`../algorithm/advanced_ptq`. We also provide an example `here `_.
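+
+Note: the snippets above assume ``ptq_reconstruction`` is already imported. In
+the MQBench source we checked it is exposed as below, but verify the import
+path in your installed version.
+
+.. code-block:: python
+
+    from mqbench.advanced_ptq import ptq_reconstruction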
diff --git a/docs/source/user_guide/PTQ/naive.rst b/docs/source/user_guide/PTQ/naive.rst
index 2082fcb..d233486 100644
--- a/docs/source/user_guide/PTQ/naive.rst
+++ b/docs/source/user_guide/PTQ/naive.rst
@@ -1,21 +1,43 @@
 Naive PTQ
 =========
 
-MQBench provides a simple API for naive PTQ, learn our step-by-step instructions to quantize your model. You can also see :doc:`../../get_started/quick_start_academic` for more details.
+MQBench provides a simple API for naive PTQ; learn our step-by-step instructions to quantize your model.
+
+**1**. **To begin with, let's import MQBench and prepare the FP32 model.**
 
 .. code-block:: python
-    :linenos:
 
-    import torchvision.models as models                          # PyTorch model
+    import torchvision.models as models                          # example model
     from mqbench.prepare_by_platform import prepare_by_platform  # add quant nodes for specific Backend
     from mqbench.prepare_by_platform import BackendType          # contain various Backend, like TensorRT, NNIE, etc.
     from mqbench.utils.state import enable_calibration           # turn on calibration algorithm, determine scale, zero_point, etc.
     from mqbench.utils.state import enable_quantization          # turn on actually quantization, like FP32 -> INT8
+    from mqbench.convert_deploy import convert_deploy            # remove quant nodes for deploy
 
     model = models.__dict__["resnet18"](pretrained=True)         # use vision pre-defined model
    model.eval()
 
-    model = prepare_by_platform(model, BackendType.Tensorrt)     #! line 1. trace model and add quant nodes for model on Tensorrt Backend
+**2**. **Choose your backend.**
+
+.. code-block:: python
+
+    # backend options
+    backend = BackendType.Tensorrt
+    # backend = BackendType.SNPE
+    # backend = BackendType.PPLW8A16
+    # backend = BackendType.NNIE
+    # backend = BackendType.Vitis
+    # backend = BackendType.ONNX_QNN
+    # backend = BackendType.PPLCUDA
+    # backend = BackendType.OPENVINO
+    # backend = BackendType.Tengine_u8
+    # backend = BackendType.Tensorrt_NLP
+
+**3**. **Prepare to quantize the model.**
+
+.. code-block:: python
+
+    model = prepare_by_platform(model, backend)   #! line 1. trace model and add quant nodes for the chosen backend
     enable_calibration(model)                     #! line 2. turn on calibration, ready for gathering data
 
     # calibration loop
@@ -30,4 +52,12 @@
         # do forward procedures
         ...
 
+**4**. **Export the quantized model.**
+
+.. code-block:: python
+
+    # define dummy data for model export.
+    input_shape={'data': [10, 3, 224, 224]}
+    convert_deploy(model, backend, input_shape)   #! line 4. remove quant nodes, ready for deploying to real-world hardware
+
 Now you know how to conduct naive PTQ with MQBench. If you want to know more about customizing the backend, check :doc:`../internal/learn_config`.
\ No newline at end of file
diff --git a/docs/source/user_guide/PTQ/nlp.rst b/docs/source/user_guide/PTQ/nlp.rst
new file mode 100644
index 0000000..18b4d9e
--- /dev/null
+++ b/docs/source/user_guide/PTQ/nlp.rst
@@ -0,0 +1,4 @@
+PTQ of BERT
+===========
+
+MQBench provides an example for the BERT model. See `nlp_example `_ for more details.
diff --git a/docs/source/user_guide/QAT/detection.rst b/docs/source/user_guide/QAT/detection.rst
index ed7b4b9..8d452c7 100644
--- a/docs/source/user_guide/QAT/detection.rst
+++ b/docs/source/user_guide/QAT/detection.rst
@@ -1,21 +1,21 @@
-Object detection with Mqbench
+Object detection with MQBench
 ================================
+
-This part, we introduce how to quantize an object detection model using mqbench.
+In this part, we introduce how to quantize an object detection model using MQBench.
 
 Getting Started
------------------
+---------------
 
 **1**. **Clone the repositories.**
 
-.. code-block:: python
+.. code-block:: shell
 
     git clone https://github.com/ModelTC/MQBench.git
     git clone https://github.com/ModelTC/EOD.git
 
-
 **2**. **Quantization aware training.**
 
-.. code-block:: python
+.. code-block:: shell
 
     # Prepare your float pretrained model.
     cd eod/scripts
@@ -26,64 +26,59 @@ Getting Started
 
 **We have several examples of QAT config in the EOD repository:**
 
 For retinanet-tensorrt:
-    - float pretrained config file: configs/det/retinanet/retinanet-r18-improve.yaml
-    - qat config file: configs/det/retinanet/retinanet-r18-improve_quant_trt.yaml
+    - float pretrained config file: retinanet-r18-improve.yaml
+    - qat config file: retinanet-r18-improve_quant_trt_qat.yaml
 
 For yolox-tensorrt:
-    - float pretrained config file: configs/det/retinanet/yolox_s_ret_a1_comloc.yaml
-    - qat config file: configs/det/retinanet/yolox_s_ret_a1_comloc_quant_trt.yaml
+    - float pretrained config file: yolox_s_ret_a1_comloc.yaml
+    - qat config file: yolox_s_ret_a1_comloc_quant_trt_qat.yaml
 
 For yolox-vitis:
-    - float pretrained config file: configs/det/yolox/yolox_fpga.yaml
-    - qat config file: configs/det/yolox/yolox_fpga_quant_vitis.yaml
-
+    - float pretrained config file: yolox_fpga.yaml
+    - qat config file: yolox_fpga_quant_vitis_qat.yaml
 
 **Something important in the config file:**
 
-    - deploy_backend: Choose your deploy backend supported in mqbench.
+    - deploy_backend: Choose your deploy backend supported in MQBench.
     - ptq_only: If True, only PTQ will be executed. If False, QAT will be executed after PTQ calibration.
-    - extra_qconfig_dict: Choose your quantization config supported in mqbench.
+    - extra_qconfig_dict: Choose your quantization config supported in MQBench.
     - leaf_module: Prevent the torch.fx tool from entering the module.
     - extra_quantizer_dict: Add some QAT modules.
    - resume_model: The path to your float pretrained model.
    - tocaffe_friendly: It is recommended to set it to true, which will affect the output onnx model.
 
-
 **3**. **Resume training during QAT.**
 
-.. code-block:: python
+.. code-block:: shell
 
     cd eod/scripts
     # just set resume_model in config file to your model, we will do all the rest.
-    sh train_quant.sh
+    sh train_qat.sh
 
 **4**. **Evaluate your quantized model.**
 
-.. code-block:: python
+.. code-block:: shell
 
     cd eod/scripts
     # set resume_model in config file to your model
-    # add -e to train_quant.sh
-    sh train_quant.sh
-
+    # add -e to train_qat.sh
+    sh train_qat.sh
 
 **5**. **Deploy.**
 
-.. code-block:: python
+.. code-block:: shell
 
     cd eod/scripts
     # Follow the prompts to set config in qat_deploy.sh.
-    sh quant_deploy.sh
-
-
+    sh qat_deploy.sh
 
-Introduction of EOD-Mqbench Project
+Introduction of EOD-MQBench Project
 ----------------------------------------
 
-The training codes start in eod/commands/train.py. The delpoy codes start in eod/commands/quant_deploy.py.
+Code related to quantization is in eod/tasks/quant.
 
-When you set the runner type to quant in config file, QuantRunner will be executed in eod/runner/quant_runner.py.
+When you set the runner type to quant in the config file, QuantRunner will be executed in eod/tasks/quant/runner/quant_runner.py. (A condensed sketch of this flow appears at the end of this section.)
 
 1. First, build your float model in self.build_model().
 2. Load your float pretrained model/quantized model in self.load_ckpt().
 3. Quantize your model in self.quantize_model().
 4. Set your optimization and lr scheduler in self.build_trainer().
 5. PTQ and eval in self.calibrate()
 6. Train in self.train()
-
 **Something important:**
 
-    - Your model should be splited into network and post-processing. Fx should only trace the network.
+    - Your model should be split into network and post-processing. Fx should only trace the network.
     - Quantized model should be saved with the key of qat, as shown in self.save(). This will be used in self.resume_model_from_fp() and self.resume_model_from_quant().
     - We disable the EMA in QAT. If your ckpt has an EMA state, we will load the EMA state into the model, as shown in self.load_ckpt().
     - Be careful when your quantized model has extra learnable parameters. You can check this in the optimizer, e.g. eod/tasks/det/plugins/yolov5/utils/optimizer_helper.py. LSQ has been checked.
     - When you are going to deploy the model, self.model.deploy should be set to True, as shown in eod/apis/quant_deploy.py. This will remove redundant nodes in your model.
-
-
-
-
-
-
-
-
-
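+
+A condensed sketch of the runner flow described above (method names come from
+the numbered list; the body is our paraphrase, not actual EOD code):
+
+.. code-block:: python
+
+    class QuantRunner:
+        def run(self):
+            self.build_model()     # 1. build the float model
+            self.load_ckpt()       # 2. load float pretrained / quantized weights
+            self.quantize_model()  # 3. trace and insert quant nodes
+            self.build_trainer()   # 4. optimizer and lr scheduler
+            self.calibrate()       # 5. PTQ and eval
+            self.train()           # 6. QAT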
diff --git a/docs/source/user_guide/QAT/naive.rst b/docs/source/user_guide/QAT/naive.rst
index fa1762d..dc17228 100644
--- a/docs/source/user_guide/QAT/naive.rst
+++ b/docs/source/user_guide/QAT/naive.rst
@@ -1,10 +1,11 @@
 Naive QAT
-============
+=========
 
-The training only requires some additional operations compared to ordinary fine-tune.
+Quantization-aware training only requires a few additional operations compared to ordinary fine-tuning.
+
+**1**. **Prepare the FP32 model first.**
 
 .. code-block:: python
-    :linenos:
 
     import torchvision.models as models
     from mqbench.convert_deploy import convert_deploy
@@ -15,22 +16,39 @@
     model = models.__dict__["resnet18"](pretrained=True)
     model.train()
 
-    # then, we will trace the original model using torch.fx and \
-    # insert fake quantize nodes according to different hardware backends (e.g. TensorRT).
-    model = prepare_qat_fx_by_platform(model, BackendType.Tensorrt)
+**2**. **Choose your backend.**
 
-    # before training, we recommend to enable observers for calibration in several batches, and then enable quantization.
-    model.eval()
-    enable_calibration(model)
+.. code-block:: python
+
+    # backend options
+    backend = BackendType.Tensorrt
+    # backend = BackendType.SNPE
+    # backend = BackendType.PPLW8A16
+    # backend = BackendType.NNIE
+    # backend = BackendType.Vitis
+    # backend = BackendType.ONNX_QNN
+    # backend = BackendType.PPLCUDA
+    # backend = BackendType.OPENVINO
+    # backend = BackendType.Tengine_u8
+    # backend = BackendType.Tensorrt_NLP
+
+**3**. **Prepare to quantize the model.**
+
+.. code-block:: python
+
+    # trace model and add quant nodes for model on backend
+    model = prepare_by_platform(model, backend)
 
     # calibration loop
+    model.eval()
+    enable_calibration(model)
     for i, batch in enumerate(data):
         # do forward procedures
         ...
 
+    # training loop
     model.train()
     enable_quantization(model)
-    # training loop
     for i, batch in enumerate(data):
         # do forward procedures
         ...
@@ -38,7 +56,12 @@
         # do backward and optimization
         ...
 
-    # deploy model, remove fake quantize nodes and dump quantization params like clip ranges.
-    convert_deploy(model.eval(), BackendType.Tensorrt, input_shape_dict={'data': [10, 3, 224, 224]})
+**4**. **Export the quantized model.**
+
+.. code-block:: python
+
+    # switch to eval mode and define dummy data for model export.
+    model.eval()
+    input_shape={'data': [10, 3, 224, 224]}
+    convert_deploy(model, backend, input_shape)
 
 Now you know how to conduct naive QAT with MQBench. If you want to know more about customizing the backend, check :doc:`../internal/learn_config`.
\ No newline at end of file
diff --git a/docs/source/user_guide/howtoptq.rst b/docs/source/user_guide/howtoptq.rst
index 6aba8ca..30c904d 100644
--- a/docs/source/user_guide/howtoptq.rst
+++ b/docs/source/user_guide/howtoptq.rst
@@ -6,3 +6,4 @@ How to conduct PTQ
 
    Naive PTQ
    Advanced PTQ
+   NLP PTQ
diff --git a/docs/source/user_guide/internal/learn_config.rst b/docs/source/user_guide/internal/learn_config.rst
index 0aa252a..93c7ee1 100644
--- a/docs/source/user_guide/internal/learn_config.rst
+++ b/docs/source/user_guide/internal/learn_config.rst
@@ -1,6 +1,9 @@
 Learn MQBench configuration
 ===========================
 
+Configuration
+^^^^^^^^^^^^^
+
 MQBench provides a primary API **prepare_by_platform** for users to quantize their model.
 MQBench contains many backend presets for **hardware alignment**, but you may want to customize your own backend.
 We provide a guide to the MQBench configuration below; it will be helpful.
@@ -37,8 +40,6 @@
 
     prepared = prepare_by_platform(model, backend, extra_config)
 
-**3.** **Now MQBench support this Observers and Quantizers**
-
 Observer
 ^^^^^^^^