prototype quant_logger tool for logging weights and activations #3987
Conversation
Stack from ghstack (oldest at bottom):
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3987
Note: links to docs will display an error until the docs builds have been completed. ⏳ No failures, 8 pending as of commit dcf4af7 with merge base d6d423e. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
Seems useful to me, don't have any major feedback or concerns.
Thanks, this is cool; in general it will be useful for filtering layers from quantization. Thoughts / questions:
thanks @sayakpaul, here are my thoughts. Also, once we finalize the design I plan to include doc-site content and a short tutorial in this PR to explain everything better.
Currently they are named differently because:

```python
# log parameter info - one liner
log_parameter_info(model)

# log activations - two+ lines
add_activation_loggers(model)
for datum in dataset:
    model(datum)

# potential API we could add
# would call `model(datum)` under the hood, less flexible
log_activation_info(model, datum)
```
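For context, the two-step activation flow could be sketched with standard PyTorch forward hooks. This is an illustrative stand-in, not the PR's implementation; the function name `add_activation_loggers` is reused from the discussion, but the body and the shape-logging behavior here are assumptions:

```python
import torch

def add_activation_loggers(model):
    # Illustrative sketch: register forward hooks that record each Linear
    # layer's output shape into a shared dict, keyed by module name.
    logs = {}

    def make_hook(name):
        def hook(module, inputs, output):
            logs.setdefault(name, []).append(tuple(output.shape))
        return hook

    handles = [
        m.register_forward_hook(make_hook(n))
        for n, m in model.named_modules()
        if isinstance(m, torch.nn.Linear)  # only log layers of interest
    ]
    return logs, handles

model = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.ReLU())
logs, handles = add_activation_loggers(model)
for _ in range(2):  # run calibration data through the model
    model(torch.randn(3, 8))
for h in handles:  # clean up the hooks afterwards
    h.remove()
print(logs)  # {'0': [(3, 4), (3, 4)]}
```

Separating hook registration from the forward loop is what gives the user control over which data is fed through the model.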
Basic things did work, but I did not test compile very thoroughly; I deleted the compile tests before cleaning up the code to keep things simple. I'd be interested to hear which use cases would need compile support for things like numerical debugging.
I think this is very easy for a single model, and unclear how to generalize to arbitrary models, as the definition of "layer group", "outlier", etc. can all change. Thoughts on whether we provide an example of this for
Makes sense. Your current reasoning explains it. No issues.
SG!
Works for me!
`counter = [0]`
Why do we use list[int] for counter?
This is because `counter[0]` is mutable: the hook closure can increment the list element in place, whereas rebinding a plain `int` inside the closure would not update the outer variable.
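A minimal torch-free sketch of the pattern (the names here are illustrative, not taken from the PR's code):

```python
def make_counter_hook():
    # A one-element list acts as a mutable cell: the nested hook can do
    # counter[0] += 1 without `nonlocal`. Writing `count += 1` on a plain
    # int inside the hook would instead raise UnboundLocalError, because
    # the assignment makes `count` local to the hook.
    counter = [0]

    def hook(*args, **kwargs):
        counter[0] += 1  # mutate in place; closure and caller share the list

    return counter, hook

counter, hook = make_counter_hook()
for _ in range(3):
    hook()
print(counter[0])  # 3
```

Using `nonlocal` would work equally well here; the list idiom is simply the older, version-agnostic spelling of the same trick.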
Summary:
Simple tool to log activation and weight statistics and shapes. No dependencies other than PyTorch; advanced use cases are left to the user to implement via overrides. Features included in this PR:
summary of proposed API:
usage example on a diffusers model:
output of running the example above:
full output:
I want to check this in so we can have a reproducible tool for getting activation shapes for a blog we are working on with @sayakpaul on diffusion quantization with mxfp8 and nvfp4. This should be useful in general for various quantization debugging use cases.
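To illustrate the weight-statistics side, here is a hypothetical sketch of what the `log_parameter_info` API named in the discussion could look like; the body and the choice of statistics are assumptions, not the PR's implementation:

```python
import torch

def log_parameter_info(model):
    # Hypothetical body: print shape, dtype, and a couple of simple
    # statistics for every parameter in the model.
    for name, p in model.named_parameters():
        print(
            f"{name}: shape={tuple(p.shape)} dtype={p.dtype} "
            f"mean={p.float().mean().item():.4f} absmax={p.abs().max().item():.4f}"
        )

model = torch.nn.Sequential(torch.nn.Linear(8, 4), torch.nn.ReLU())
log_parameter_info(model)
# prints one line per parameter, e.g.
# 0.weight: shape=(4, 8) dtype=torch.float32 mean=... absmax=...
# 0.bias: shape=(4,) dtype=torch.float32 mean=... absmax=...
```

Because it only walks `named_parameters()`, a helper like this needs no forward pass and no dependencies beyond PyTorch, which matches the "no dependencies other than PyTorch" goal in the summary.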
This lives in the prototype folder, so there are no BC (backwards compatibility) guarantees for now. Once we have alignment on the general interface, I will also add docs as part of this PR.
Test Plan: