Redundant scaling in quantize.py #2

@SubSir

Description

I noticed a small logic error in the quantization function in model/quantize.py. It does not affect the final output in practice, because scale is typically 1 when the function is called.

Location

File: model/quantize.py
Line: [L268](https://github.com/actypedef/ARCQuant/blob/main/model/quantize.py#L268)

return torch.cat([q_x, q_error_k], dim=1) * scale, scale_x, scale

The Issue

The return value is already multiplied by scale here. However, in model/qLinearLayer.py at [L71](https://github.com/actypedef/ARCQuant/blob/main/model/qLinearLayer.py#L71), the output is multiplied by scale again:

y = F.linear(qx, self.W) * scale * self.scale
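
To show why this is harmless when scale is 1 but would be wrong otherwise, here is a minimal sketch. It is not the ARCQuant code: `quantize_toy`, `linear_toy`, and `linear_toy_fixed` are hypothetical stand-ins that use plain floats instead of tensors, treat quantization as lossless, and leave out `self.scale`. The "fixed" variant is only one possible way to remove the duplication.

```python
# Toy stand-ins (hypothetical, not the ARCQuant code): floats replace tensors.

def quantize_toy(x, scale):
    """Mimics the return at quantize.py L268: output is pre-multiplied by scale."""
    q_x = x  # assumption: pretend quantization is lossless
    return q_x * scale, scale

def linear_toy(x, weight, scale):
    """Mimics qLinearLayer.py L71: the layer multiplies by scale a second time."""
    qx, s = quantize_toy(x, scale)
    return qx * weight * s  # scale is now applied twice overall

def linear_toy_fixed(x, weight, scale):
    """One possible fix (an assumption): apply scale exactly once, in the layer."""
    qx = x  # quantized value returned without the extra scale factor
    return qx * weight * scale

x, w = 3.0, 2.0
# With scale == 1 both versions agree, so the duplication is invisible.
print(linear_toy(x, w, 1.0), linear_toy_fixed(x, w, 1.0))
# With scale != 1 the original path is off by an extra factor of scale.
print(linear_toy(x, w, 0.5), linear_toy_fixed(x, w, 0.5))
```

With any scale other than 1, the duplicated path differs from the single-scale path by exactly one extra factor of scale.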
