add loralinear #10385
Conversation
Thanks for your contribution!
Codecov Report
Attention: Patch coverage is

Additional details and impacted files

@@            Coverage Diff             @@
##           develop    #10385      +/-   ##
===========================================
- Coverage    49.08%    48.95%    -0.13%
===========================================
  Files          763       767        +4
  Lines       125673    126153      +480
===========================================
+ Hits         61689     61764       +75
- Misses       63984     64389      +405
        self.disable_lora = False
        if mp_moe or is_distributed:
            for p in self.parameters():
                p.is_distributed = is_distributed
This is for EP (expert parallelism): `is_distributed` marks parameters that must not be synchronized at the start of training, and `mp_moe` is used by unified checkpoint (uc).
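For context, a minimal sketch (not the PR's trainer code) of how an `is_distributed` flag is typically consumed: parameters carrying it are skipped when initial weights are broadcast across ranks, so each expert-parallel rank keeps its own expert weights. The helper name `sync_initial_params` is hypothetical.

```python
import paddle
import paddle.distributed as dist

def sync_initial_params(model):
    """Hypothetical helper: broadcast initial weights from rank 0,
    skipping expert-parallel parameters marked is_distributed=True."""
    for p in model.parameters():
        if getattr(p, "is_distributed", False):
            # EP parameters differ per rank and must not be overwritten.
            continue
        dist.broadcast(p, src=0)
```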
              level=self.args.fp16_opt_level,
              dtype=self.amp_dtype,
-             excluded_layers=[QuantizationLinear] + self._decorate_exclude_layers(model),
+             excluded_layers=[QuantizationLinear, ColumnParallelQuantizationLinear, RowParallelQuantizationLinear]
This prevents the fp32 quantization scales from being cast to bf16.
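For reference, a minimal sketch of how `excluded_layers` keeps a layer's fp32 parameters out of AMP O2 casting, assuming a recent Paddle release where `paddle.amp.decorate` accepts `excluded_layers`; the model and excluded layer type here are placeholders, not the PR's trainer setup.

```python
import paddle
from paddle.nn import Linear

model = paddle.nn.Sequential(Linear(16, 16))

# Layers listed in excluded_layers keep their parameters (e.g. fp32
# quantization scales) in float32 instead of being cast to bf16 under O2.
model = paddle.amp.decorate(
    models=model,
    level="O2",
    dtype="bfloat16",
    excluded_layers=[Linear],  # in the PR: QuantizationLinear and its parallel variants
)
```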
      # Optimize for skip unused shard files for supper large model
-     if sharded_metadata is not None and quantization_linear_list is None:
+     if sharded_metadata is not None:
Skip the parameter shards that don't need to be read, to speed up loading.
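A minimal sketch of the idea, assuming the usual sharded-checkpoint index layout where `weight_map` maps parameter names to shard file names; this is not the exact PaddleNLP loader code.

```python
def select_needed_shards(sharded_metadata, expected_keys):
    """Return only the shard files that contain parameters the model expects,
    so unused shard files of a very large checkpoint are never opened."""
    weight_map = sharded_metadata["weight_map"]  # param name -> shard file name
    needed_files = {weight_map[k] for k in expected_keys if k in weight_map}
    return sorted(needed_files)
```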
  @@ -1,4 +1,4 @@
- # Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
+ # Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
This one probably doesn't need to change.
ok
        new_weight += self.lora_A @ self.lora_B * self.scaling
        self.quantize_weight(new_weight)
        self.merged = True
        mp_moe = getattr(self.quant_weight, "mp_moe", False)
I don't quite understand why the MoE parameters must be flagged here.
This is needed by unified checkpoint.
  @@ -0,0 +1,154 @@
+ # Copyright (c) 2025 PaddlePaddle Authors. All Rights Reserved.
+ #
Where does hadamard_utils.py come from?
It came from the Slim folks; the parts we don't need have now been removed.
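For reference, one common way to build a randomized Hadamard matrix is a normalized Sylvester Hadamard matrix with random ±1 column signs; the PR's `random_hadamard_matrix` in hadamard_utils.py may differ in construction and supported sizes, so treat this as a sketch only.

```python
import numpy as np
import paddle

def random_hadamard_matrix(size, dtype="float32", seed=0):
    """Sketch: normalized Sylvester Hadamard matrix with random column signs.
    Assumes size is a power of two."""
    assert size & (size - 1) == 0, "sketch assumes a power-of-two size"
    h = np.array([[1.0]])
    while h.shape[0] < size:
        # Sylvester construction: H_{2n} = [[H_n, H_n], [H_n, -H_n]]
        h = np.block([[h, h], [h, -h]])
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=size)
    # Normalize so the transform is orthonormal, then randomize the signs.
    h = (h / np.sqrt(size)) * signs[None, :]
    return paddle.to_tensor(h, dtype=dtype)
```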
  from .hadamard_utils import random_hadamard_matrix


  def quantize_tensorwise(x, quantization_config=None, bit_length=8, state=0, training=False, act_scale=None):
Do the quantization methods used by QAT really need a separate file here? Wouldn't it be better to keep all the quantization methods together?
The QAT methods are fairly complex and quite a bit more will be added later, so they go into a separate qat_utils.
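A minimal sketch of tensor-wise abs-max fake quantization, to show the core of what such a helper does; the PR's `quantize_tensorwise` additionally handles the Hadamard rotation, running activation scales, QAT state and training mode, all omitted here.

```python
import paddle

def quantize_tensorwise_sketch(x, bit_length=8):
    """Sketch: quantize a whole tensor with a single abs-max scale."""
    qmax = 2 ** (bit_length - 1) - 1          # 127 for int8
    scale = paddle.max(paddle.abs(x)) / qmax  # one scale for the whole tensor
    q = paddle.clip(paddle.round(x / scale), -qmax, qmax)
    return q, scale

# Dequantize with q * scale to recover an approximation of x.
```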
paddlenlp/quantization/qat_utils.py (Outdated)
        if quantization_config.apply_hadamard:
            target_x = x @ infohub.hadamard[x.shape[-1]][0]
        else:
            target_x = x.clone()
What is the reason for the clone here?
Removed.
        input_grad = None

        if not quant_weight.stop_gradient:
            weight_grad = paddle.einsum("bsh,bsd->hd", x, grad_output)
Paddle's einsum has pitfalls in some scenarios; please check whether einsum is the right choice here.
It did have a problem! einsum is much slower than matmul, so I switched to matmul.
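For illustration, the two equivalent formulations of the weight gradient: the original einsum contracts the batch and sequence dims, while the matmul version flattens them first (the faster path the author switched to). Shapes here are illustrative only.

```python
import paddle

x = paddle.randn([2, 16, 64])             # [batch, seq, hidden]
grad_output = paddle.randn([2, 16, 32])   # [batch, seq, out_dim]

# einsum: contract batch and sequence dims directly.
grad_einsum = paddle.einsum("bsh,bsd->hd", x, grad_output)

# matmul: flatten batch*seq, then compute x^T @ grad_output.
grad_matmul = paddle.matmul(
    x.reshape([-1, x.shape[-1]]),                      # [batch*seq, hidden]
    grad_output.reshape([-1, grad_output.shape[-1]]),  # [batch*seq, out_dim]
    transpose_x=True,                                  # -> [hidden, out_dim]
)
# Both produce the same [hidden, out_dim] weight gradient.
```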
wawltor left a comment
LGTM
PR types
New features

PR changes
APIs

Description
loralinear
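Since the description is brief, here is a minimal sketch of the LoRA linear idea the PR implements: a frozen base weight plus a trainable low-rank update `lora_A @ lora_B` scaled by `lora_alpha / r`. This is a simplified illustration, not the PR's actual class (which also covers quantized weights and parallel layouts).

```python
import paddle
import paddle.nn as nn

class LoRALinearSketch(nn.Layer):
    def __init__(self, in_features, out_features, r=8, lora_alpha=16):
        super().__init__()
        self.weight = self.create_parameter([in_features, out_features])
        self.weight.stop_gradient = True   # base weight stays frozen
        self.lora_A = self.create_parameter([in_features, r])
        # lora_B starts at zero so the initial output matches the base layer.
        self.lora_B = self.create_parameter(
            [r, out_features], default_initializer=nn.initializer.Constant(0.0)
        )
        self.scaling = lora_alpha / r

    def forward(self, x):
        return x @ self.weight + (x @ self.lora_A @ self.lora_B) * self.scaling
```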