
Does this framework support "what you serve is what you train" for weight only quantization? #179

Closed
chenho74 opened this issue Jun 5, 2023 · 5 comments

Comments


chenho74 commented Jun 5, 2023

I'd like to keep activations as float while quantizing the weights to int8 or int4. From what I can see, this is currently done with a float * float dot product. Is an int * float dot product available?

@chenho74 chenho74 changed the title Does this framework support option to keep activations as float? Does this framework support "what you serve is what you train" for weight only quantization? Jun 5, 2023
@lukaszlew
Collaborator

Yes, in the config you can select quantization for each side individually.


chenho74 commented Jun 5, 2023

Thanks for the speedy reply. Would you mind pointing me to the config field that toggles this?

@lukaszlew
Collaborator

https://github.com/google/aqt/blob/main/aqt/common/aqt_config.py#L362

As you can see, lhs and rhs have completely separate quantization configs.


chenho74 commented Jun 5, 2023

Yes, but from what I see in the underlying dot-product code, a float * float dot product is used when the activation is not quantized: https://github.com/google/aqt/blob/main/aqt/jax/aqt_dot_general.py#L91. Is this fake quantization, or is this also the arithmetic used at serving time?
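For concreteness, a minimal NumPy sketch (an illustration only, not AQT's actual code) of what "fake" weight-only quantization looks like: weights are rounded to a symmetric int8 grid and then dequantized back to float, so the matmul itself is still float * float while the weight values are restricted to the quantized grid.

```python
import numpy as np

def fake_quant_int8(w):
    """Round to a symmetric per-tensor int8 grid, then dequantize to float."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q.astype(np.float32) * scale  # back to float for the matmul

rng = np.random.default_rng(0)
w = rng.normal(size=(16, 8)).astype(np.float32)   # weights
x = rng.normal(size=(4, 16)).astype(np.float32)   # activations stay float

w_fq = fake_quant_int8(w)
y = x @ w_fq  # float * float dot product, but w_fq carries only int8-grid values
```

If serving dequantizes the stored int8 weights with the same scale before the matmul, it reproduces exactly these training-time numerics, which is the "what you serve is what you train" property being asked about.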

@lukaszlew
Collaborator

Sorry that I missed your comment.
For the dot product to be accelerated, both sides need to have the same type.
You are probably most concerned about weight loading.
We will implement that in AQTv2 within a month.
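To illustrate the same-type requirement (again a hypothetical NumPy sketch, not AQT code): with int8 weights and float activations you can either dequantize the weights and run a float matmul, or quantize the activations on the fly so both sides share an integer type and the hardware can run an accelerated int8 * int8 matmul with an int32 accumulator.

```python
import numpy as np

def quant_int8(a):
    """Symmetric per-tensor int8 quantization; returns integer values and scale."""
    scale = np.max(np.abs(a)) / 127.0
    q = np.clip(np.round(a / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.array([[0.5, -1.0], [2.0, 0.25]], dtype=np.float32)
x = np.array([[1.0, -0.5]], dtype=np.float32)

# Weights are quantized offline; activations arrive as float.
w_q, w_s = quant_int8(w)

# Path 1: dequantize the weights and do a float * float matmul.
y_float = x @ (w_q.astype(np.float32) * w_s)

# Path 2: quantize activations on the fly so both sides share a type,
# run the matmul in integers, then rescale the int32 accumulator.
x_q, x_s = quant_int8(x)
y_int = (x_q.astype(np.int32) @ w_q.astype(np.int32)).astype(np.float32) * (x_s * w_s)
```

Path 1 keeps the training-time numerics but gets no integer speedup; Path 2 is what an accelerated serving kernel would do, at the cost of also quantizing the activations.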
