This repository has been archived by the owner on May 22, 2023. It is now read-only.

[BYOC] Fix FuseOpsByPattern and RunCodegen for calling the same extern multiple times #441

Merged (2 commits) on Feb 16, 2023


@masahi masahi commented Feb 15, 2023

FuseOpsByPattern and RunCodegen both have a bug when the same extern function is called multiple times, as in:

@R.function
def main(
    data: R.Tensor((16, 32, 32, 16), dtype="float16"),
    weight1: R.Tensor((16, 3, 3, 16), dtype="float16"),
    weight2: R.Tensor((16, 3, 3, 16), dtype="float16"),
) -> R.Tensor((16, 32, 32, 16), dtype="float16"):
    with R.dataflow():
        lv = fused_relax_nn_conv2d_tensorrt(data, weight1)
        gv = fused_relax_nn_conv2d_tensorrt(lv, weight2)
        R.output(gv)
    return gv

FuseOpsByPattern: The second call fails at builder_->GetContextIRModule()->Lookup(gvar), because the gvar has already been removed from the module during the visit to the first call, at

builder_->GetContextIRModule()->Remove(GetRef<GlobalVar>(gvar));
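
The lookup-after-remove failure can be reproduced in miniature. The following is a toy sketch, not TVM code: a plain dict stands in for the IRModule, and the function names `inline_calls_buggy`, `inline_calls_deferred`, and `make_module` are hypothetical. Deferring the removal until all calls have been visited is shown only as one possible remedy; it is an assumption and not necessarily the change made in this PR.

```python
# Toy model (assumption): a dict stands in for the IRModule, name -> body.
def inline_calls_buggy(module, calls):
    """Remove the callee while visiting; a second call to it then fails."""
    out = []
    for callee in calls:
        body = module[callee]  # analogous to Lookup(gvar)
        del module[callee]     # analogous to Remove(gvar) during the visit
        out.append(body)
    return out


def inline_calls_deferred(module, calls):
    """Look up freely during the visit; remove each callee once, afterwards."""
    visited = set()
    out = []
    for callee in calls:
        out.append(module[callee])
        visited.add(callee)
    for callee in visited:
        del module[callee]
    return out


def make_module():
    return {
        "fused_relax_nn_conv2d_tensorrt": "conv2d_body",
        "main": "main_body",
    }


# Two calls to the same callee, as in the main function above.
calls = ["fused_relax_nn_conv2d_tensorrt", "fused_relax_nn_conv2d_tensorrt"]

try:
    inline_calls_buggy(make_module(), calls)
    buggy_failed = False
except KeyError:
    # The second lookup hits an entry the first visit already removed.
    buggy_failed = True

mod = make_module()
result = inline_calls_deferred(mod, calls)
```

In the deferred variant both calls resolve to the same body, and the callee is removed from the module exactly once.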

RunCodegen: The second call to fused_relax_nn_conv2d_tensorrt is not replaced with the extern func created during the first call. The kCodegen attribute is removed at

func = (*RemoveFuncAttrFunc)(func, attr::kCodegen);

during the visit to the first call, which in turn makes the condition at

auto opt_codegen = func->GetAttr<String>(attr::kCodegen);
if (opt_codegen) {

false during the second visit.
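
The attribute-stripping problem can be modelled the same way. This is again a toy sketch under stated assumptions, not the TVM implementation: functions are dicts with an `attrs` entry, the `"Codegen"` key stands in for attr::kCodegen, and `run_codegen_buggy` and `run_codegen_cached` are hypothetical names. Caching the generated extern per callee is one plausible way to let later calls reuse it; whether the PR does exactly this is not stated in the description.

```python
# Toy model (assumption): each function is a dict carrying an "attrs" dict.
def run_codegen_buggy(funcs, calls):
    """Strip the attribute on first visit; later visits miss the check."""
    out = []
    for callee in calls:
        func = funcs[callee]
        if "Codegen" in func["attrs"]:
            func["attrs"].pop("Codegen")  # removed during the first visit
            out.append("extern:" + callee)
        else:
            out.append("call:" + callee)  # second visit falls through
    return out


def run_codegen_cached(funcs, calls):
    """Remember the extern created per callee and reuse it on later calls."""
    extern_cache = {}
    out = []
    for callee in calls:
        if callee not in extern_cache:
            func = funcs[callee]
            if "Codegen" in func["attrs"]:
                func["attrs"].pop("Codegen")
                extern_cache[callee] = "extern:" + callee
        out.append(extern_cache.get(callee, "call:" + callee))
    return out


def make_funcs():
    return {"fused_relax_nn_conv2d_tensorrt": {"attrs": {"Codegen": "tensorrt"}}}


calls = ["fused_relax_nn_conv2d_tensorrt", "fused_relax_nn_conv2d_tensorrt"]
buggy = run_codegen_buggy(make_funcs(), calls)
cached = run_codegen_cached(make_funcs(), calls)
```

With the cache, both call sites end up pointing at the same extern entry, which mirrors the kernel sharing the new test case is meant to demonstrate.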

Both issues are fixed, and a new CUTLASS test case demonstrates kernel sharing between multiple call_tir calls that have the same callee kernel.

@vinx13 @yelite

@vinx13 vinx13 merged commit d08f87e into tlc-pack:relax Feb 16, 2023