
Conversation

@rsmallblue (Collaborator) commented Sep 2, 2025

This PR modifies hadamard_block_size so that it is loaded from config.json (from the quantization_config section).
Example:

{
  "architectures": [
    "Ernie4_5_MoeForCausalLM"
  ],
  "bos_token_id": 1,
  "eos_token_id": 2,
  "dtype": "bfloat16",
  "hidden_act": "silu",
  "hidden_size": 8192,
  "intermediate_size": 28672,
  "max_position_embeddings": 131072,
  "model_type": "ernie4_5_moe",
  "num_attention_heads": 64,
  "num_key_value_heads": 8,
  "num_hidden_layers": 54,
  "pad_token_id": 0,
  "rms_norm_eps": 1e-05,
  "use_cache": false,
  "vocab_size": 103424,
  "rope_theta": 500000,
  "use_rmsnorm": true,
  "use_bias": false,
  "moe_num_experts": 64,
  "moe_layer_start_index": 3,
  "moe_intermediate_size": 3584,
  "moe_capacity": [64,64,64],
  "moe_gate": "topk",
  "moe_k": 8,
  "moe_layer_interval": 1,
  "moe_use_aux_free": true,
  "num_nextn_predict_layers": 1,
  "tie_word_embeddings": false,
  "is_quantized": true,
  "quantization_config":{
    "dense_quant_type":"block_wise_fp8",
    "moe_quant_type":"w4a8",
    "quantization":"mix_quant",
    "is_permuted": true,
    "hadamard_block_size": 512
  }
}
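
For reference, a minimal sketch of the loading pattern, using a hypothetical helper name (illustrative only, not the actual FastDeploy code). Note the fallback default, which the review below flags as inconsistent (128 vs. 512) across call sites:

import json

# Fallback used when quantization_config omits hadamard_block_size.
# (The review discussion below notes the default should be unified;
# 512 matches the getattr fallback in the reviewed snippet.)
DEFAULT_HADAMARD_BLOCK_SIZE = 512

def load_hadamard_block_size(config_path: str) -> int:
    """Read hadamard_block_size from the quantization_config section."""
    with open(config_path) as f:
        config = json.load(f)
    quant_cfg = config.get("quantization_config") or {}
    return quant_cfg.get("hadamard_block_size", DEFAULT_HADAMARD_BLOCK_SIZE)

For the example config.json above, this returns 512.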

@paddle-bot (bot) commented Sep 2, 2025

Thanks for your contribution!

@yangjianfengo1 (Contributor) commented:
LGTM

self.moe_quant_type,
used_in_ep_low_latency,
estimate_total_token_nums,
getattr(layer, "hadamard_block_size", 512),
Review comment (Collaborator) on the lines above:

The default value elsewhere is 128, but here it is 512; these need to be unified.
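
One possible shape for that unification, sketched with a hypothetical constant name rather than the project's actual code:

# Hypothetical sketch: hoist the fallback into one shared constant so
# call sites such as getattr(layer, "hadamard_block_size", 512) cannot
# drift apart (128 in some places, 512 in others).
HADAMARD_BLOCK_SIZE_DEFAULT = 128  # whichever value the project settles on

def get_hadamard_block_size(layer) -> int:
    # Same getattr pattern as the reviewed snippet, but with the
    # shared constant as the fallback.
    return getattr(layer, "hadamard_block_size", HADAMARD_BLOCK_SIZE_DEFAULT)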

@rsmallblue changed the title from "load hadamard_block_size from config" to "[fix] load hadamard_block_size from config" on Sep 2, 2025.
@RichardWooSJTU merged commit 2cf5516 into PaddlePaddle:develop on Sep 5, 2025 (25 of 28 checks passed).