add fp8linear #10488

Closed

wanderHZ wants to merge 4 commits into develop from wanderHZ:fp8linear

Conversation

wanderHZ (Contributor)

PR types

New features

PR changes

APIs

Description

qat-fp8linear: add an FP8 linear layer for quantization-aware training (QAT).
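
For readers unfamiliar with the technique, the sketch below shows the general shape of a QAT FP8 linear layer: weights and activations are fake-quantized into the FP8 E4M3 range on the forward pass, while a straight-through estimator lets gradients flow during training. This is a hypothetical illustration under assumed names (`fake_quant`, `FP8Linear`) and a uniform rounding grid, not the code in this PR.

```python
import paddle
import paddle.nn as nn

FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3

def fake_quant(x, qmax=FP8_E4M3_MAX, epsilon=1e-8):
    # Per-tensor scale so that max |x| maps onto the FP8 range.
    scale = paddle.max(paddle.abs(x)) / qmax + epsilon
    # Uniform rounding grid as a stand-in for the (nonuniform) FP8 value
    # grid; a real kernel would cast to float8_e4m3 and back instead.
    q = paddle.clip(paddle.round(x / scale), -qmax, qmax) * scale
    # Straight-through estimator: forward sees q, backward sees identity.
    return x + (q - x).detach()

class FP8Linear(nn.Layer):
    """Toy QAT linear layer: fake-quantizes both input and weight."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = self.create_parameter(shape=[in_features, out_features])

    def forward(self, x):
        return paddle.matmul(fake_quant(x), fake_quant(self.weight))
```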

paddle-bot (bot) commented Apr 24, 2025

Thanks for your contribution!

CLAassistant commented Apr 24, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ wanderHZ
❌ liukebin


liukebin does not appear to be a GitHub user. You need a GitHub account to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

lugimzzz (Contributor) commented Apr 24, 2025 on the following lines:

```python
target_x = x
block_size = 1

if act_scale is not None:
```

```python
if act_scale is not None:
    if training:
        scale = paddle.max(paddle.abs(target_x)) / qmax + quantization_config.epsilon
        if state < quantization_config.skip_first_act_scale_step:
            # Warm-up: keep a running mean of the observed scales.
            act_scale.set_value((state * act_scale + scale) / (state + 1))
        else:
            # Afterwards: exponential moving average of the scale.
            act_scale.set_value(
                (1 - quantization_config.moving_rate) * act_scale
                + quantization_config.moving_rate * scale
            )
            scale = act_scale
    else:
        scale = act_scale
else:
    scale = paddle.max(paddle.abs(target_x)) / qmax + quantization_config.epsilon
```

I suggest changing the quantization-scale statistics to this scheme. With the previous approach, experiments showed sudden loss spikes after training for a while, or the loss simply failed to converge.
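
For intuition, here is a tiny standalone numeric sketch of the suggested statistics (the step values and the two config fields are placeholders mirroring `quantization_config` above): the first `skip_first_act_scale_step` steps build a running mean, after which the exponential moving average damps outlier scales such as the spike at step 3.

```python
# Standalone sketch of the two-phase scale update (hypothetical values).
skip_first_act_scale_step = 3
moving_rate = 0.1

act_scale = 0.0
for state, scale in enumerate([2.0, 4.0, 3.0, 10.0, 3.2]):
    if state < skip_first_act_scale_step:
        # Warm-up: running mean of the per-step max-abs scales.
        act_scale = (state * act_scale + scale) / (state + 1)
    else:
        # Steady state: EMA; the 10.0 spike moves act_scale only
        # from 3.0 to 3.7 instead of jumping to 10.0.
        act_scale = (1 - moving_rate) * act_scale + moving_rate * scale
    print(state, round(act_scale, 4))
# Output: 0 2.0, 1 3.0, 2 3.0, 3 3.7, 4 3.65
```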

codecov bot commented May 8, 2025

Codecov Report

Attention: Patch coverage is 9.94475% with 163 lines in your changes missing coverage. Please review.

Project coverage is 48.66%. Comparing base (e8a19d3) to head (170a4d7).
Report is 30 commits behind head on develop.

| Files with missing lines                      | Patch % | Missing lines |
| --------------------------------------------- | ------- | ------------- |
| paddlenlp/quantization/qat_utils.py           | 7.28%   | 140 ⚠️        |
| paddlenlp/utils/optimizer.py                  | 7.69%   | 12 ⚠️         |
| paddlenlp/quantization/quantization_utils.py  | 12.50%  | 7 ⚠️          |
| paddlenlp/quantization/quantization_linear.py | 0.00%   | 4 ⚠️          |
Additional details and impacted files
```diff
@@             Coverage Diff             @@
##           develop   #10488      +/-   ##
===========================================
- Coverage    48.67%   48.66%   -0.01%
===========================================
  Files          768      768
  Lines       126915   127101     +186
===========================================
+ Hits         61777    61860      +83
- Misses       65138    65241     +103
```

wanderHZ closed this May 8, 2025
wanderHZ deleted the fp8linear branch May 8, 2025 09:05