
Add more supported base_configs to QATConfig #2992

@andrewor14

Description


Today we have:

  • Int8DynamicActivationInt4WeightConfig
  • Int4WeightOnlyConfig
  • Float8DynamicActivationFloat8WeightConfig
  • Float8DynamicActivationInt4WeightConfig
  • NVFP4InferenceConfig

We should add:

(For lowering to ExecuTorch:)

  • Int8DynamicActivationIntxWeightConfig
  • IntxWeightOnlyConfig

(Requested by Axolotl; see axolotl-ai-cloud/axolotl#3107:)

  • Int8DynamicActivationInt8WeightConfig
  • Int4DynamicActivationInt4WeightConfig
  • Int8WeightOnlyConfig

Today, supported base configs are registered in `_infer_fake_quantize_configs`:

def _infer_fake_quantize_configs(
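The registration is essentially an isinstance-based dispatch from a base (post-training) config type to a pair of fake-quantize configs, so supporting a new base_config means adding one more branch. A minimal standalone sketch of that shape (the class definitions below are illustrative stand-ins, not the actual torchao internals):

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Stand-in for a torchao base config; the real one lives in torchao.quantization.
@dataclass
class Int8DynamicActivationInt4WeightConfig:
    group_size: int = 32

# Stand-in for a QAT fake-quantize config.
@dataclass
class FakeQuantizeConfig:
    dtype: str
    group_size: Optional[int] = None
    is_dynamic: bool = False

def _infer_fake_quantize_configs(
    base_config,
) -> Tuple[Optional[FakeQuantizeConfig], Optional[FakeQuantizeConfig]]:
    """Map a base config to (activation, weight) fake-quantize configs."""
    if isinstance(base_config, Int8DynamicActivationInt4WeightConfig):
        act = FakeQuantizeConfig(dtype="int8", is_dynamic=True)
        weight = FakeQuantizeConfig(dtype="int4", group_size=base_config.group_size)
        return act, weight
    # Each newly supported base_config adds another branch here.
    raise ValueError(f"Unsupported base config: {type(base_config)}")

act, weight = _infer_fake_quantize_configs(
    Int8DynamicActivationInt4WeightConfig(group_size=32)
)
```

Unsupported configs fail loudly rather than silently falling through, which is the behavior users hit today with the configs listed above.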

Example usage:

# Not supported yet today
base_config = Int8DynamicActivationIntxWeightConfig(group_size=32)
# Prepare: insert fake-quantize ops for training
quantize_(model, QATConfig(base_config, step="prepare"))
train(model)
# Convert: swap fake-quantized modules for real quantized ones
quantize_(model, QATConfig(base_config, step="convert"))
