[fix]load hadamard_block_size from config #3797

rsmallblue · 2025-09-02T08:14:19Z

This PR changes hadamard_block_size so that it is loaded from the quantization_config section of config.json.
Example:

{
  "architectures": [
    "Ernie4_5_MoeForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "dtype": "bfloat16",
  "hidden_act": "silu",
  "hidden_size": 8192,
  "intermediate_size": 28672,
  "max_position_embeddings": 131072,
  "model_type": "ernie4_5_moe",
  "num_attention_heads": 64,
  "num_key_value_heads": 8,
  "num_hidden_layers": 54,
  "pad_token_id": 0,
  "rms_norm_eps": 1e-05,
  "use_cache": false,
  "vocab_size": 103424,
  "rope_theta": 500000,
  "use_rmsnorm": true,
  "use_bias": false,
  "moe_num_experts": 64,
  "moe_layer_start_index": 3,
  "moe_intermediate_size": 3584,
  "moe_capacity": [64,64,64],
  "moe_gate": "topk",
  "moe_k": 8,
  "moe_layer_interval": 1,
  "moe_use_aux_free": true,
  "num_nextn_predict_layers": 1,
  "tie_word_embeddings": false,
  "is_quantized": true,
  "quantization_config":{
    "dense_quant_type":"block_wise_fp8",
    "moe_quant_type":"w4a8",
    "quantization":"mix_quant",
    "is_permuted": true,
    "hadamard_block_size": 512
  }
}
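For readers who want to see how such a value could be picked up, here is a minimal sketch, not the FastDeploy implementation. The helper name `read_hadamard_block_size` and the fallback constant `DEFAULT_HADAMARD_BLOCK_SIZE` are hypothetical, and the fallback of 128 is only an assumption for illustration.

```python
import json

# Hypothetical fallback used only in this sketch; the project may use a different default.
DEFAULT_HADAMARD_BLOCK_SIZE = 128


def read_hadamard_block_size(config_text: str, default: int = DEFAULT_HADAMARD_BLOCK_SIZE) -> int:
    """Return quantization_config.hadamard_block_size, or `default` if it is absent."""
    config = json.loads(config_text)
    # Tolerate a missing or null quantization_config section.
    quant_cfg = config.get("quantization_config") or {}
    return int(quant_cfg.get("hadamard_block_size", default))


# Trimmed version of the config.json shown above.
config_text = """
{
  "is_quantized": true,
  "quantization_config": {
    "dense_quant_type": "block_wise_fp8",
    "moe_quant_type": "w4a8",
    "quantization": "mix_quant",
    "is_permuted": true,
    "hadamard_block_size": 512
  }
}
"""

block_size = read_hadamard_block_size(config_text)
print(block_size)
```

With the config above the helper returns the configured value; a config without the key falls back to the default passed in.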

paddle-bot · 2025-09-02T08:14:24Z

Thanks for your contribution!

yangjianfengo1 · 2025-09-02T08:28:50Z

LGTM

K11OntheBoat · 2025-09-02T08:30:09Z

fastdeploy/model_executor/layers/moe/fused_moe_cutlass_backend.py

            self.moe_quant_type,
            used_in_ep_low_latency,
            estimate_total_token_nums,
+            getattr(layer, "hadamard_block_size", 512),


The default value is 128 elsewhere, but here it is 512. These should be made consistent.
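One way to keep the fallback consistent is to route every lookup through a single shared constant. The following is a rough sketch under stated assumptions: `DEFAULT_HADAMARD_BLOCK_SIZE` and `resolve_hadamard_block_size` are hypothetical names that do not come from the PR, and the value 128 is taken from the review comment rather than verified against the codebase.

```python
from types import SimpleNamespace

# Hypothetical single source of truth for the fallback (value assumed from the review comment).
DEFAULT_HADAMARD_BLOCK_SIZE = 128


def resolve_hadamard_block_size(layer) -> int:
    """Return the layer's hadamard_block_size, or the shared default if it is unset."""
    return getattr(layer, "hadamard_block_size", DEFAULT_HADAMARD_BLOCK_SIZE)


# Stand-ins for real layer objects, used only to exercise the helper.
configured_layer = SimpleNamespace(hadamard_block_size=512)
unconfigured_layer = SimpleNamespace()

print(resolve_hadamard_block_size(configured_layer))
print(resolve_hadamard_block_size(unconfigured_layer))
```

Call sites would then use the helper instead of repeating a literal in each `getattr` call, so the default cannot drift between files.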

yangjianfengo1 approved these changes Sep 2, 2025

K11OntheBoat reviewed Sep 2, 2025

rsmallblue force-pushed the w4a8-config branch from 0f92f67 to d2abad4 Compare September 2, 2025 09:00

rsmallblue changed the title ~~load hadamard_block_size from config~~ [fix]load hadamard_block_size from config Sep 2, 2025

rsmallblue force-pushed the w4a8-config branch from d2abad4 to 6131d29 Compare September 3, 2025 07:06

load hadamard_block_size from config

3cf7c70

rsmallblue force-pushed the w4a8-config branch from 2c2d625 to 3cf7c70 Compare September 4, 2025 06:27

YuanRisheng added the skip-ci: coverage label Sep 5, 2025

RichardWooSJTU approved these changes Sep 5, 2025

RichardWooSJTU merged commit 2cf5516 into PaddlePaddle:develop Sep 5, 2025
25 of 28 checks passed
