[aotinductor] Forward fix a performance regression
Summary: Forward fix a performance regression caused by #110510. Once a model has been run, all the kernel pointers are initialized, so removing the if-nullptr check causes those loadKernel calls to be executed again unnecessarily when the forward function is rerun. Another way to do this is to codegen the loadKernel calls in the initializer, which I may do in a later PR.
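The guard being restored here is a lazy-initialization pattern: load each kernel the first time it is needed, then reuse the cached handle on every later forward call. A minimal Python sketch of the idea, with a hypothetical `KernelCache` and a stand-in for the expensive `loadKernel` call (names are illustrative, not from the PR):

```python
class KernelCache:
    """Lazily load each kernel once and reuse the handle on later runs."""

    def __init__(self):
        self._kernels = {}   # kernel name -> loaded handle
        self.load_count = 0  # tracks how often the expensive load runs

    def _load_kernel(self, cubin_path, mangled_name):
        # Stand-in for the expensive loadKernel() call in the C++ wrapper.
        self.load_count += 1
        return (cubin_path, mangled_name)

    def get(self, name, cubin_path, mangled_name):
        # The if-nullptr check from the fix: only load on first use.
        if name not in self._kernels:
            self._kernels[name] = self._load_kernel(cubin_path, mangled_name)
        return self._kernels[name]


cache = KernelCache()
for _ in range(3):  # three "forward" runs of the same model
    cache.get("triton_poi_fused_add_0", "kernel.cubin", "_Z6kernel")
print(cache.load_count)  # the kernel was loaded only once
```

Without the guard, every rerun of forward would pay the load cost again, which is the regression this commit fixes.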

ghstack-source-id: d2d5531df77c4e69c38e0e13c21278ca6943f0f0
Pull Request resolved: #110800
desertfire committed Oct 7, 2023
1 parent d84bcb9 commit c5567da
8 changes: 6 additions & 2 deletions torch/_inductor/codegen/wrapper.py
@@ -2021,13 +2021,17 @@ def generate_load_kernel_once(
         self, name: str, mangled_name: str, cubin_path: str, shared_mem: int
     ):
         if V.graph.aot_mode:
+            self.writeline(f"if (kernels.{name} == nullptr) {{")
             self.writeline(
-                f"""kernels.{name} = loadKernel("{cubin_path}", "{mangled_name}", {shared_mem}, this->cubin_dir_);"""
+                f""" kernels.{name} = loadKernel("{cubin_path}", "{mangled_name}", {shared_mem}, this->cubin_dir_);"""
             )
+            self.writeline("}")
         else:
+            self.writeline(f"if ({name} == nullptr) {{")
             self.writeline(
-                f"""{name} = loadKernel("{cubin_path}", "{mangled_name}", {shared_mem});"""
+                f""" {name} = loadKernel("{cubin_path}", "{mangled_name}", {shared_mem});"""
             )
+            self.writeline("}")

     def generate_args_decl(self, call_args):
         dynamic_symbols = V.graph.sizevars.free_symbols()
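To see what the patched codegen emits, here is a simplified, stand-alone sketch of `generate_load_kernel_once` that appends lines to a list instead of calling `self.writeline` (the `aot_mode` flag and argument values are illustrative):

```python
def generate_load_kernel_once(lines, name, mangled_name, cubin_path,
                              shared_mem, aot_mode):
    # Simplified version of the patched codegen method: wrap the
    # loadKernel call in an if-nullptr guard so reruns skip the load.
    if aot_mode:
        lines.append(f"if (kernels.{name} == nullptr) {{")
        lines.append(
            f""" kernels.{name} = loadKernel("{cubin_path}", "{mangled_name}", {shared_mem}, this->cubin_dir_);"""
        )
        lines.append("}")
    else:
        lines.append(f"if ({name} == nullptr) {{")
        lines.append(
            f""" {name} = loadKernel("{cubin_path}", "{mangled_name}", {shared_mem});"""
        )
        lines.append("}")


out = []
generate_load_kernel_once(out, "triton_poi_0", "_Zmangled",
                          "model.cubin", 1024, aot_mode=True)
print("\n".join(out))
```

The generated C++ now tests the cached pointer before loading, so only the first forward pass pays the `loadKernel` cost.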
