wenscarl
Contributor

@wenscarl wenscarl commented Sep 25, 2024

  1. Fix type mismatches and inconsistencies in the scale factors used by FP8 operations.
  2. Refactor the q-dot-dq (direct quantization) pattern into smaller, more granular components to allow greater flexibility.

@kaixih @levskaya
"Support fp8 direct quantization" (praxis#69) depends on this PR.
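
For readers unfamiliar with the pattern, here is a minimal sketch of what splitting q-dot-dq into separate quantize, dot, and dequantize steps can look like. It is not the code from this PR: the helper names (`compute_scale`, `quantize`, `dequantize`, `q_dot_dq`) are hypothetical, per-tensor scaling from the current amax is an assumption, and the matmul upcasts to float32 as a stand-in for a real FP8 GEMM. Scales are kept in float32 throughout, which is one way to avoid the kind of scale dtype mismatch described in item 1.

```python
import jax.numpy as jnp

# Largest finite value representable in float8_e4m3fn.
E4M3_MAX = float(jnp.finfo(jnp.float8_e4m3fn).max)


def compute_scale(x, fp8_max=E4M3_MAX):
    """Per-tensor scale, always float32 regardless of the input dtype."""
    amax = jnp.max(jnp.abs(x)).astype(jnp.float32)
    # Guard against an all-zero tensor producing a zero scale.
    return jnp.maximum(amax, 1e-12) / fp8_max


def quantize(x, scale, dtype=jnp.float8_e4m3fn):
    """Divide by the scale, saturate to the FP8 range, then cast to FP8."""
    fp8_max = float(jnp.finfo(dtype).max)
    scaled = x.astype(jnp.float32) / scale
    return jnp.clip(scaled, -fp8_max, fp8_max).astype(dtype)


def dequantize(y, scale, out_dtype):
    """Multiply by the scale and cast back to the requested dtype."""
    return (y.astype(jnp.float32) * scale).astype(out_dtype)


def q_dot_dq(lhs, rhs, out_dtype=jnp.float32):
    """Quantize both operands, multiply, then dequantize the result once."""
    lhs_scale = compute_scale(lhs)
    rhs_scale = compute_scale(rhs)
    lhs_q = quantize(lhs, lhs_scale)
    rhs_q = quantize(rhs, rhs_scale)
    # Stand-in for an FP8 matmul: upcast the FP8 values and accumulate in float32.
    acc = jnp.dot(lhs_q.astype(jnp.float32), rhs_q.astype(jnp.float32))
    # The output scale is the product of the two input scales.
    return dequantize(acc, lhs_scale * rhs_scale, out_dtype)
```

Keeping the three steps as separate functions means a caller can, for example, quantize an operand once and reuse it, or apply the combined output scale at a different point, which the fused pattern does not allow.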

wenscarl

This comment was marked as off-topic.

@levskaya
Collaborator

I'm not able to do a deep review of the fp8 quant details, but everything looks fine on the surface - you just need to correct one trailing space issue that's blocking our formatting presubmits.

I can't tell if this is ready to go in, but if it is let me know and I can merge it in!

@wenscarl
Contributor Author

> I'm not able to do a deep review of the fp8 quant details, but everything looks fine on the surface - you just need to correct one trailing space issue that's blocking our formatting presubmits.
>
> I can't tell if this is ready to go in, but if it is let me know and I can merge it in!

Thanks for the review. The formatting is fixed, and based on our e2e tests it's ready to go in.

@wenscarl wenscarl requested a review from levskaya September 27, 2024 18:51
@levskaya
Collaborator

@wenscarl - I'm seeing actual test failures here. Is it just a tolerance issue or something more serious?

@wenscarl
Contributor Author

@levskaya It was a typo, and it's fixed now. All tests pass.

@copybara-service copybara-service bot merged commit b9bbc98 into google:main Oct 1, 2024
17 of 18 checks passed
copybara-service bot pushed a commit to google/praxis that referenced this pull request Oct 2, 2024
PiperOrigin-RevId: 681480827