@jambayk jambayk commented Nov 7, 2025

Describe your changes

  • SelectiveMixedPrecision assigns the embeddings the same bits as the weights they are tied to when the model ties its weights.
  • Quantization passes use the "auto" data type only if the HF model handler itself doesn't specify a torch dtype. This respects the data type override when one is provided.
  • Quantization passes give user-provided overrides precedence over the mixed precision overrides. Mixed precision overrides are added by a pass, so there would otherwise be no way to override its decisions.
  • The TorchScript export loads the temporary model using onnx, since the ir external data loader doesn't load constants with external data correctly, which leads to invalid models.
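
For reviewers, the first three behaviors can be sketched roughly as below. This is an illustration only, not the code in this PR: the helper names (`resolve_torch_dtype`, `tie_embedding_bits`, `merge_overrides`), the override dict layout, and the layer names `model.embed_tokens` and `lm_head` are all assumptions made for the example, as is treating a missing handler dtype as `None`.

```python
from typing import Optional


def resolve_torch_dtype(handler_dtype: Optional[str]) -> str:
    """Hypothetical helper: keep the handler's dtype if set, else use "auto"."""
    return handler_dtype if handler_dtype is not None else "auto"


def tie_embedding_bits(
    overrides: dict,
    tied: bool,
    embed: str = "model.embed_tokens",  # assumed layer name
    head: str = "lm_head",  # assumed layer name
) -> dict:
    """Hypothetical helper: copy the head's bits to the embedding when tied."""
    result = {name: dict(cfg) for name, cfg in overrides.items()}
    if tied and "bits" in result.get(head, {}):
        result.setdefault(embed, {})["bits"] = result[head]["bits"]
    return result


def merge_overrides(mixed_precision: dict, user: dict) -> dict:
    """Hypothetical helper: merge per-layer overrides; user entries win."""
    merged = {name: dict(cfg) for name, cfg in mixed_precision.items()}
    for name, cfg in user.items():
        merged.setdefault(name, {}).update(cfg)
    return merged


# Overrides as a mixed precision pass might have produced them (made-up values).
mixed = tie_embedding_bits(
    {"lm_head": {"bits": 8}, "model.layers.0.mlp.down_proj": {"bits": 8}},
    tied=True,
)
# The user-provided value for down_proj replaces the mixed precision value.
final = merge_overrides(mixed, {"model.layers.0.mlp.down_proj": {"bits": 4}})
```

Merging per layer and per key means a user override only replaces the settings it names; anything else chosen by the mixed precision pass is left in place.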

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link

@jambayk jambayk merged commit 0de6d53 into main Nov 7, 2025
11 checks passed
@jambayk jambayk deleted the jambayk/quant-dtype branch November 7, 2025 17:49
xiaoyu-work pushed a commit that referenced this pull request Nov 10, 2025
…overrides (#2246)
