@Satrat commented on Jan 30, 2024

These modifiers were previously not FSDP-compatible because they updated module weights directly, which fails once FSDP has sharded the parameters across ranks. This PR wraps the weight updates in an apply call so they work with FSDP.
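
In short, rather than assigning to a module's weight directly, the modifier now passes an update function to the module's apply call; FSDP overrides apply to gather the full, unsharded parameters before running the function on each submodule. A minimal sketch of the pattern (illustrative only, not the exact diff; scale and update_weight are hypothetical names standing in for the per-channel scales the modifiers actually compute):

import torch

scale = 2.0  # hypothetical smoothing scale; the real modifiers compute these

def update_weight(module: torch.nn.Module):
    # Runs on every submodule; under FSDP, apply() materializes the full
    # parameters before this function is called, so in-place updates are safe.
    if isinstance(module, torch.nn.Linear):
        module.weight.data.div_(scale)

# Before: layer.weight.data.div_(scale)       # breaks when the weight is sharded
# After:  parent_module.apply(update_weight)  # FSDP-safe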

Slack thread for the issue: https://neuralmagic.slack.com/archives/C064P557R8B/p1706330735831899

Testing

recipe.yaml

test_stage:
  obcq_modifiers:
    LogarithmicEqualizationModifier:
      mappings: [
        [["re:.*q_proj", "re:.*k_proj", "re:.*v_proj"], "re:.*input_layernorm"],
        [["re:.*gate_proj", "re:.*up_proj"], "re:.*post_attention_layernorm"],
      ] 
    QuantizationModifier:
      ignore:
        # These operations don't make sense to quantize
        - LlamaRotaryEmbedding
        - LlamaRMSNorm
        - SiLUActivation
        - MatMulOutput_QK
        - MatMulOutput_PV
        # Skip quantizing the layers with the most sensitive activations
        - model.layers.1.mlp.down_proj 
        - model.layers.30.mlp.down_proj
        - model.layers.31.mlp.down_proj
        - model.layers.28.mlp.down_proj  
        - model.layers.29.mlp.down_proj   
      post_oneshot_calibration: false
      scheme_overrides:
        Linear:
          weights:
            num_bits: 8
            symmetric: true
            strategy: channel
        MatMulLeftInput_QK:
          input_activations:
            num_bits: 8
            symmetric: true
        MatMulLeftInput_PV:
          input_activations:
            num_bits: 8
            symmetric: true
        Embedding:
          input_activations: null
          weights:
            num_bits: 8
            symmetric: false
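
For reference, the mappings entries above use a "re:" prefix to mark regex patterns that are matched against module names. A quick sketch of how the first balance-layer pattern resolves, assuming standard Hugging Face Llama module naming and prefix-anchored matching:

import re

# Module names as they appear in a HF LlamaForCausalLM (assumed for illustration)
names = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.input_layernorm",
    "model.layers.0.mlp.gate_proj",
]

pattern = "re:.*q_proj"[len("re:"):]  # strip the "re:" marker
print([n for n in names if re.match(pattern, n)])
# -> ['model.layers.0.self_attn.q_proj']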

Run quantization:
With FSDP: accelerate launch --config_file integrations/huggingface-transformers/finetuning/example_fsdp_config.yaml test_quant.py
Without FSDP: python test_quant.py
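
The FSDP config file referenced above ships with the repo; a minimal accelerate FSDP config of the same general shape looks roughly like the following (values are illustrative, not a copy of example_fsdp_config.yaml):

compute_environment: LOCAL_MACHINE
distributed_type: FSDP
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_offload_params: false
  fsdp_sharding_strategy: 1  # FULL_SHARD
  fsdp_state_dict_type: FULL_STATE_DICT
  fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer
machine_rank: 0
main_training_function: main
mixed_precision: 'no'
num_machines: 1
num_processes: 2
use_cpu: false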

test_quant.py

from sparseml.transformers.finetune.text_generation import oneshot

# One-shot application of recipe.yaml to a Llama 2 7B checkpoint,
# calibrating on the train split of open_platypus.
model = "mgoin/llama2-7b-gsm8k-pt"
dataset_name = "open_platypus"
concatenate_data = False
output_dir = "./debug_smoothing"
recipe = "recipe.yaml"
overwrite_output_dir = True
splits = {"calibration": "train"}

oneshot(
    model_name_or_path=model,
    dataset_name=dataset_name,
    output_dir=output_dir,
    recipe=recipe,
    overwrite_output_dir=overwrite_output_dir,
    concatenate_data=concatenate_data,
    splits=splits,
)

@Satrat marked this pull request as ready for review on January 30, 2024.
@bfineran merged commit 7ede036 into main on February 15, 2024.
@bfineran deleted the fsdp_log_modifier branch on February 15, 2024.