
[Relay][Frontend][QNN] fix access param_debug_name_map to node output name in fx-quantized graph node replacement #16217

Merged

Conversation


@PineApple777 PineApple777 commented Dec 8, 2023

This PR fixes a KeyError raised when inline_input_quant_params_for_fx replaces prim::GetAttr nodes with prim::Constant: the lookup into param_debug_name_map is done with the "name" attribute of prim::GetAttr, which is not always a key of that map. The failure occurs when each sub-model is quantized separately with quantize_fx, because the zero-point and scale variable names then differ from the prim::GetAttr "name" attribute. The fix is to look up param_debug_name_map with node.output().debugName() instead: the output debug name carries a .<number> suffix that distinguishes attributes sharing the same name, and it is exactly the key stored in param_debug_name_map.
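A rough sketch of the change in inline_input_quant_params_for_fx (python/tvm/relay/frontend/qnn_torch.py) is shown below; this is an illustrative reconstruction based on the traceback further down, not the verbatim diff:

# sketch: inside inline_input_quant_params_for_fx, for each prim::GetAttr node
for node in graph.findAllNodes("prim::GetAttr", recurse=True):
    out_name = node.output().debugName()
    if "_scale" in out_name or "_zero_point" in out_name:
        # before: key built by walking the attribute chain, e.g.
        #   full_attr = param_debug_name_map[get_full_attr_name(node)]
        # after: key taken from the node output's debug name, which matches the map keys
        full_attr = param_debug_name_map[out_name]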

  • simple example
import torch
import tvm.relay as relay
from torch.ao.quantization import quantize_fx, get_default_qconfig_mapping, get_default_qconfig

class SimpleExample(torch.nn.Module):
    def __init__(self, in_feature, out_feature):
        super(SimpleExample, self).__init__()
        self.simple_dense_1 = torch.nn.Linear(in_feature, out_feature)
        self.simple_dense_2 = torch.nn.Linear(out_feature, out_feature)
        
    def forward(self, x):
        x = self.simple_dense_1(x)
        x = self.simple_dense_2(x)
        return x
    
class SimpleModel(torch.nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.layer = torch.nn.ModuleList()
        self.layer.append(SimpleExample(128, 128))
        self.layer.append(SimpleExample(128, 128))
        self.layer.append(SimpleExample(128, 128))

    def forward(self, x):
        x = self.layer[0](x)
        x = self.layer[1](x)
        x = self.layer[2](x)
        return x
    
model = SimpleModel()
random_sample = torch.randn([128, 128])

default_qconfig_mapping = get_default_qconfig_mapping()
qconfig = get_default_qconfig()
default_qconfig_mapping.set_global(qconfig)

# quantize per layer
for child_names, child_mods in model.named_children():
    for child_name, child_mod in child_mods.named_children():
        mod_int8 = quantize_fx.prepare_fx(child_mod, qconfig_mapping=default_qconfig_mapping, example_inputs=random_sample)
        qmod = quantize_fx.convert_fx(mod_int8, qconfig_mapping=default_qconfig_mapping)
        model.layer[int(child_name)] = qmod

# run random data through the converted model (in a full flow, calibration runs between prepare_fx and convert_fx)
for i in range(100):
    calib_sample = torch.randn([128, 128])
    model(calib_sample)

scripted_model = torch.jit.trace(model, example_inputs=random_sample)

input_infos = [
    ("x", ([128, 128], "float32"))
]
tvm_model = relay.frontend.from_pytorch(scripted_model, input_infos=input_infos, keep_quantized_weight=True)

When quantization is performed for each sub-model, different zero-point and scale values are calibrated for each layer.

  • simple example error log
Traceback (most recent call last):
  File "example.py", line 67, in <module>
    tvm_model = relay.frontend.from_pytorch(scripted_model, input_infos=input_infos, keep_quantized_weight=True)
  File "/home/sunwook/tvm/python/tvm/relay/frontend/pytorch.py", line 5385, in from_pytorch
    qnn_torch.inline_input_quant_params_for_fx(graph, tensors, param_debug_name_map)
  File "/home/sunwook/tvm/python/tvm/relay/frontend/qnn_torch.py", line 555, in inline_input_quant_params_for_fx
    full_attr = param_debug_name_map[get_full_attr_name(node)]
KeyError: 'layer.0.simple_dense_1_input_zero_point_0'

simple_dense_1_input_zero_point_0.7 can be found in param_debug_name_map, but the original code recursively builds the full attribute name through the child modules, e.g. layer.0.simple_dense_1_input_zero_point_0, which is not a key in the map.

  • converted torchscript
...
// simple_dense_1_input_zero_point_0.7 != layer.0.simple_dense_1_input_zero_point_0
%simple_dense_1_input_zero_point_0.7 : Tensor = prim::GetAttr[name="simple_dense_1_input_zero_point_0"](%_0)
%simple_dense_1_input_scale_0.7 : Tensor = prim::GetAttr[name="simple_dense_1_input_scale_0"](%_0)
%18 : QUInt8(128, 128, strides=[128, 1], requires_grad=0, device=cpu) = aten::quantize_per_tensor(%x.1, %simple_dense_1_input_scale_0.7, %simple_dense_1_input_zero_point_0.7, %13)
...
// simple_dense_1_input_zero_point_0.9 != layer.1.simple_dense_1_input_zero_point_0
%simple_dense_1_input_zero_point_0.9 : Tensor = prim::GetAttr[name="simple_dense_1_input_zero_point_0"](%_1)
%simple_dense_1_input_scale_0.9 : Tensor = prim::GetAttr[name="simple_dense_1_input_scale_0"](%_1)
%33 : QUInt8(128, 128, strides=[128, 1], requires_grad=0, device=cpu) = aten::quantize_per_tensor(%x.3, %simple_dense_1_input_scale_0.9, %simple_dense_1_input_zero_point_0.9, %28)
...
  • keys in param_debug_name_map
dict_keys(['simple_dense_1_input_zero_point_0', 'simple_dense_1_input_scale_0', 'simple_dense_1_input_zero_point_0.9', 'simple_dense_1_input_scale_0.9', 'simple_dense_1_input_zero_point_0.7', 'simple_dense_1_input_scale_0.7'])
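For reference, the mismatch can be inspected directly on the traced module with a short snippet like the one below (illustrative only, not part of the PR; it reuses scripted_model from the example above and assumes the traced module exposes the inlined_graph property):

# compare each GetAttr node's attribute name with its output debug name
for node in scripted_model.inlined_graph.findAllNodes("prim::GetAttr", recurse=True):
    out_name = node.output().debugName()
    if "_zero_point" in out_name or "_scale" in out_name:
        # node.s("name") -> e.g. "simple_dense_1_input_zero_point_0"
        # out_name       -> e.g. "simple_dense_1_input_zero_point_0.7" (a key of param_debug_name_map)
        print(node.s("name"), "->", out_name)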

@PineApple777 PineApple777 changed the title from "[Relay][Frontend][QNN] fix getting full node attribute name in fx-based quantized graphs" to "[Relay][Frontend][QNN] Fix getting full node attribute name in fx-based quantized graphs" Dec 8, 2023
@PineApple777 PineApple777 changed the title from "[Relay][Frontend][QNN] Fix getting full node attribute name in fx-based quantized graphs" to "[Relay][Frontend][QNN] fix getting full node attribute name in quantized fx-graph" Dec 23, 2023
@PineApple777 PineApple777 changed the title from "[Relay][Frontend][QNN] fix getting full node attribute name in quantized fx-graph" to "[Relay][Frontend][QNN] fix access param_debug_name_map to node output name in fx-quantized graph node replacement" Dec 24, 2023
@PineApple777 PineApple777 marked this pull request as ready for review December 24, 2023 12:02
@PineApple777
Contributor Author

This PR is a fairly simple bugfix for a mismatch when accessing parameter keys. Could you please review this PR @masahi

@masahi masahi merged commit 506eff2 into apache:main Dec 27, 2023
25 checks passed