@ajrasane
Contributor

What does this PR do?

Type of change:
Example update

Overview:

  • Added a --compress flag that real-quantizes the weights with mtq.compress().
  • mtq.compress() is supported only for the FP8 and NVFP4 formats.
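
The restriction above suggests the flag should be rejected for other formats. The following is a minimal sketch of such a guard, not the code from this PR: the helper names (`build_parser`, `should_compress`, `maybe_compress`) and the format strings in `COMPRESS_SUPPORTED_FORMATS` are assumptions for illustration, and the real quantize.py may spell its format values differently. The compression step is passed in as a callable so the sketch runs without ModelOpt installed.

```python
import argparse

# Assumption: illustrative format names; the strings accepted by the real
# quantize.py may differ.
COMPRESS_SUPPORTED_FORMATS = frozenset({"fp8", "nvfp4"})


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical subset of the quantize.py options relevant to --compress."""
    parser = argparse.ArgumentParser(description="Sketch of --compress handling")
    parser.add_argument("--format", required=True, help="Quantization format, e.g. fp8")
    parser.add_argument(
        "--compress",
        action="store_true",
        help="Real-quantize the weights after calibration",
    )
    return parser


def should_compress(args: argparse.Namespace) -> bool:
    """Return True when compression is requested and the format allows it."""
    if not args.compress:
        return False
    if args.format.lower() not in COMPRESS_SUPPORTED_FORMATS:
        supported = ", ".join(sorted(COMPRESS_SUPPORTED_FORMATS))
        raise ValueError(
            f"--compress is only supported for: {supported} (got {args.format!r})"
        )
    return True


def maybe_compress(model, args: argparse.Namespace, compress_fn):
    """Call compress_fn(model) when requested.

    compress_fn stands in for mtq.compress, which is assumed here to
    modify the model in place.
    """
    if should_compress(args):
        compress_fn(model)
    return model
```

In a real script, `compress_fn` would presumably be `mtq.compress` from `modelopt.torch.quantization`, called after calibration and before the checkpoint is saved. Failing early on an unsupported format avoids running a full calibration pass only to hit the error at the end.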

Usage

python quantize.py \
    --model flux-dev --model-dtype BFloat16 --trt-high-precision-dtype BFloat16 \
    --format fp8 --batch-size 1 --calib-size 32 --quantize-mha \
    --n-steps 20 --quantized-torch-ckpt-save-path ./flux_dev_fp8_autodeploy.pt --collect-method default \
    --compress

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: No
  • Did you add or update any necessary documentation?: No
  • Did you update Changelog?: No

Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
@ajrasane ajrasane requested a review from a team as a code owner October 28, 2025 05:32
@ajrasane ajrasane requested a review from Edwardf0t1 October 28, 2025 05:32
@ajrasane ajrasane self-assigned this Oct 28, 2025
@ajrasane ajrasane requested a review from cjluo-nv October 28, 2025 05:32
@codecov
codecov bot commented Oct 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.38%. Comparing base (41de55f) to head (aeab1d0).
⚠️ Report is 5 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #473   +/-   ##
=======================================
  Coverage   73.38%   73.38%           
=======================================
  Files         180      180           
  Lines       18110    18110           
=======================================
  Hits        13290    13290           
  Misses       4820     4820           

@ajrasane ajrasane merged commit 41f2bf4 into main Oct 28, 2025
26 checks passed
@ajrasane ajrasane deleted the ajrasane/diffusers_compress branch October 28, 2025 20:29
