🚀 The feature, motivation and pitch
AOT Inductor looks like the upcoming means of running inference from native code for models trained in PyTorch, and the replacement for TorchScript export to native code. This interface is clearly still in prototype status, but based on what is present right now, it is problematic for many users.
torch._export.aot_compile, as currently defined, produces a .so, presumably invoking nvcc for GPU models and likely a host compiler for CPU models. This is problematic for integration into many native build tools, because the export process takes over building of the inference library. Cross-compilation is impossible, as is passing flags to the build tools. A rough sketch of current usage is shown below.
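For context, a minimal sketch of what the prototype flow looks like today, assuming the prototype signature behaves roughly as shown (it returns a path to a shared library built for the local machine):

```python
import torch

class Model(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x)

model = Model().eval()
example_inputs = (torch.randn(8, 16),)

with torch.no_grad():
    # Compilation happens inside this call: Inductor generates C++/CUDA
    # source, then drives nvcc and/or the host compiler itself and hands
    # back a path to the finished shared library.
    so_path = torch._export.aot_compile(model, example_inputs)

print(so_path)  # e.g. a .so built for the machine running the export
```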
A much friendlier interface would yield source code that could be fed into an existing build system rather than directly producing a library. That way PyTorch would not have to manage build tools in any capacity. There is likely a lot of complexity here, because code generation will want to hardcode many platform-specific details, e.g. GPU type and CPU instruction sets. It would likely be wise to let callers specify the platform constraints and capabilities on the aoti_compile interface, rather than attempting to infer them automatically from the local machine; a hypothetical shape for such an interface is sketched below.
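A hypothetical sketch only: TargetSpec and aoti_codegen do not exist in PyTorch. They are made-up names meant to illustrate a source-emitting interface that takes explicit platform constraints instead of probing the local machine.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence

@dataclass
class TargetSpec:
    # The caller states the target platform explicitly, enabling
    # cross-compilation by the caller's own toolchain.
    device: str                      # "cuda" or "cpu"
    cuda_arch: Optional[str] = None  # e.g. "sm_80" when device == "cuda"
    cpu_isa: Optional[str] = None    # e.g. "avx2", "neon"

def aoti_codegen(model, example_inputs: Sequence, target: TargetSpec) -> Dict[str, str]:
    """Would return generated sources (filename -> contents), leaving
    compilation, linking, and compiler-flag selection to the caller's
    existing build system (CMake, Bazel, ...)."""
    raise NotImplementedError("illustrative only, not an existing API")
```

With something like this, the build system owns the compile step end to end, and PyTorch only owns code generation.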
Alternatives
No response
Additional context
No response
cc @ezyang @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @aakhundov @avikchaudhuri @gmagogsfm @zhxchen17 @tugsbayasgalan @angelayi @suo @ydwu4 @anijain2305 @peterbell10 @msaroufim @wconstab @bdhirsh @zou3519