Do not fail on lack of default precision set. #6139
I discovered that some models in the suite do not have a precision set. Instead of failing the script, we now log the case and use the default precision, since no additional machinery should run for Inductor anyway. Additionally, I wrapped the exception messages in ValueError so that the log is not polluted with errors about str not inheriting from the Exception class. ecg@, note that this needs to be hooked in "somewhere". I'm not sure where, as there was a revert in #6134.
Thanks Greg -- I had missed that your change was reverted as part of another revert. Please land this (including the PR description when merging, otherwise annoyingly we lose that info in the repo) and I'll try to re-land your reverted change.
One comment.
Thank you for finding these things!
          
Given that Jack is looking into this at the bridge level, I'll wait for him to finish that investigation, since that's the right place to fix this.
I discovered that some models in the suite do not have the default precision set. Instead of failing the script, we just log the case and do nothing, since no additional machinery should run for Inductor anyway. Additionally, I wrapped the exception messages in ValueError so that the log is not polluted with errors about str not inheriting from the Exception class. @cota, note that this needs to be hooked in "somewhere". I'm not sure where, as there was a revert in pytorch#6134, but in general it can safely be done before moving the model to the device.
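The log-and-continue behavior described above could look roughly like the following sketch. It is not the code from this PR: the `default_precision` attribute, the `resolve_precision` helper, the model classes, and the set of supported precisions are all hypothetical names chosen for illustration.

```python
import logging

logger = logging.getLogger("precision_sketch")

# Hypothetical set of precisions this sketch knows how to apply.
SUPPORTED_PRECISIONS = {"fp32", "fp16", "bf16"}


class ModelWithDefault:
    # Hypothetical attribute; the real suite may name this differently.
    default_precision = "fp16"


class ModelWithoutDefault:
    pass


class ModelWithAmp:
    default_precision = "amp"


def resolve_precision(model):
    """Return the model's default precision, or None if it has none."""
    precision = getattr(model, "default_precision", None)
    if precision is None:
        # Missing default: log the case and do nothing instead of failing.
        logger.warning("%s has no default precision set; nothing to apply.",
                       type(model).__name__)
        return None
    if precision == "amp":
        # Raise a real exception type, not a bare string.
        raise ValueError("AMP is not handled in this sketch")
    if precision not in SUPPORTED_PRECISIONS:
        raise ValueError(f"Unknown precision: {precision}")
    return precision
```

In this sketch a caller would run `resolve_precision(model)` before moving the model to the device and skip any precision patching when it returns `None`.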
  elif precision == "amp":
-   raise f"AMP for PT/XLA:GPU is not implemented yet for torchbench models"
+   raise ValueError(
+       f"AMP for PT/XLA:GPU is not implemented yet for torchbench models")
@miladm @frgossen @golechwierowicz Isn't AMP supported on PT/XLA:GPU?
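For context on why the diff wraps the message in ValueError: in Python 3, raising a plain string is itself an error, so the intended message is replaced by a TypeError about exceptions needing to derive from BaseException. The small standalone sketch below shows the difference; the helper names are illustrative and are not part of the PR.

```python
MESSAGE = "AMP for PT/XLA:GPU is not implemented yet for torchbench models"


def raise_plain_string():
    # Old pattern: a str is not an exception, so Python raises TypeError instead.
    raise MESSAGE


def raise_value_error():
    # New pattern: the message is carried by a proper exception type.
    raise ValueError(MESSAGE)


def describe(fn):
    """Call fn and return (exception type name, exception message)."""
    try:
        fn()
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None


old_result = describe(raise_plain_string)
new_result = describe(raise_value_error)

print(old_result)  # TypeError, with a message about BaseException
print(new_result)  # ValueError, with the intended AMP message
```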