Disable Upcasting and Downcasting to BF16/FP16 around `matmul` and `linear` operations for the nvFuser Executor #2054
Comments
There are many pieces that could affect how a matmul looks in the trace, e.g. grad transform rules and autocast. (Note: I don't know how the autocast transform is used in Thunder, but looking at the code I think it helps downcast inputs to reduced precision, which is exactly what we want. cc'ing @IvanYashchuk / @tfogal, who might know better.) Overall, Thunder doesn't apply any type-promotion logic for torch matmul/linear at its decomposition level. So generally I don't think we have any issue with Thunder for now; i.e., Thunder will be able to present inputs to matmul/linear in the proper dtype that nvFuser wants to see. Example below showing how.
Which gives trace of:
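(The code example and trace referenced above did not survive the page extraction. As a stand-in, here is a minimal, dependency-free sketch of the point being made; `ToyTrace` is a hypothetical stand-in, not Thunder's real trace format: at the decomposition level there is no type promotion, so a bf16 matmul's inputs reach the operation in bf16, with no casts inserted.)

```python
from dataclasses import dataclass, field

@dataclass
class ToyTrace:
    """Toy symbolic trace: records each op together with its operand dtypes.
    This is an illustration only, not Thunder's actual trace machinery."""
    ops: list = field(default_factory=list)

    def matmul(self, a_dtype: str, b_dtype: str) -> str:
        # No type promotion at decomposition: the op is recorded with
        # exactly the dtypes it received.
        self.ops.append(f"t0 = matmul(a: {a_dtype}, b: {b_dtype})")
        return a_dtype

def trace_bf16_matmul() -> ToyTrace:
    tr = ToyTrace()
    out_dtype = tr.matmul("bf16", "bf16")
    tr.ops.append(f"return t0  # {out_dtype}")
    return tr

for line in trace_bf16_matmul().ops:
    print(line)
```

In this toy trace the matmul appears directly in bf16, which is the shape nvFuser would want to see.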
This is a great and accurate summary, Jie! Thunder's autocast could be renamed "auto-downcast", as this is what it does today, and only for linear, matmul, and sdpa. It should be applied behind the scenes automatically if the jitted function is called under
It is true that Thunder upcasts to FP32 in general, except for matmul, linear, conv, sdpa, and maybe some other special operations.
Waiting on Priya to verify this works once
PRs Lightning-AI/lightning-thunder#318 and Lightning-AI/lightning-thunder#207 enable matmul and linear for the nvFuser executor in Thunder.
In Thunder, the current behavior is to explicitly upcast to FP32 and downcast back to FP16/BF16 around a set of fusion operations. In the case of `matmul` and `linear` operations, this would accidentally suggest not using TensorCores for the operations, and therefore this casting behavior needs to be changed. Please consult with @jjsjann123 about the appropriate course of action!