docs/source/3x/PT_WeightOnlyQuant.md (19 additions & 0 deletions)
@@ -178,6 +178,8 @@ model = convert(model, config) # after this step, the model is ready for W4A8 inference
| not_use_best_mse (bool) | Whether to skip using the iteration with the best mean squared error | False |
| dynamic_max_gap (int) | The dynamic maximum gap | -1 |
| scale_dtype (str) | The data type of the quantization scale to be used; different kernels support different choices | "float16" |
| scheme (str) | A preset scheme that defines the quantization configurations | "W4A16" |
| layer_config (dict) | Layer-wise quantization configuration | None |

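The two new rows above (`scheme` and `layer_config`) are constructor arguments of `AutoRoundConfig`. A minimal sketch of selecting a preset scheme, assuming the `neural_compressor.torch.quantization` import path used by the other examples in this document:

```python
# Minimal sketch: pick a preset quantization scheme for AutoRound.
# Assumes the import path used elsewhere in this document.
from neural_compressor.torch.quantization import AutoRoundConfig

quant_config = AutoRoundConfig(scheme="W4A16")  # preset scheme from the table above
```

Per-layer overrides via `layer_config` are shown in example 3 below.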
``` python
# Quantization code
```

@@ -283,6 +285,23 @@

``` python
quant_config = RTNConfig()
lm_head_config = RTNConfig(dtype="fp32")
quant_config.set_local("lm_head", lm_head_config)
```
3. Example of using `layer_config` for AutoRound
**Contributor:** If `set_local` does not work in the current implementation, we should call this out and let users know that the AutoRound-specific `layer_config` should be used instead of the `set_local` API, since in `AutoRoundConfig` none of the three options works.

**Contributor Author:** The code will be raised in another PR.

**Contributor:** Do you mean you will implement `set_local` support by converting it to `layer_config`, so that any of the three options is valid after your other PR is merged?

**Contributor Author:** Sure. Currently, only the third option is supported. Options 1 and 2 will be implemented in phase 2.

```python
# Use the AutoRound-specific `layer_config` instead of the `set_local` API.
# Import added for completeness, matching the other examples in this document.
from neural_compressor.torch.quantization import AutoRoundConfig

# Per-layer options can be a dict of fields or a preset scheme string, e.g.:
# layer_config = {
#     "layer1": {
#         "data_type": "int",
#         "bits": 3,
#         "group_size": 128,
#         "sym": True,
#     },
#     "layer2": "W8A16",
# }
layer_config = {"lm_head": {"data_type": "int"}}
quant_config = AutoRoundConfig(layer_config=layer_config)
```
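For context, a hedged sketch of how the resulting config would typically be applied with the `prepare`/`convert` flow shown earlier in this document; `run_fn` is a hypothetical user-supplied calibration callable, not part of the library:

```python
# A minimal sketch, assuming the prepare/convert flow used elsewhere in
# this document and a user-supplied calibration function `run_fn`.
from neural_compressor.torch.quantization import prepare, convert

model = prepare(model, quant_config)
run_fn(model)  # hypothetical: run calibration/tuning data through the model
model = convert(model)
```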

### Saving and Loading
