[TIR] Schedule Primitive: Add-Unit-Loop #11575

Merged: 2 commits, Jun 5, 2022

junrushao (Member):

In TE, a unit loop can be introduced by fusing an empty list of loops on a stage. This PR adds the TIR counterpart, made more explicit as a new schedule primitive that adds a unit loop without affecting any existing functionality.
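One way to see why fusing an empty list of loops yields a unit loop: the extent of a fused loop is the product of the fused extents, and the empty product is 1. The sketch below is a plain-Python toy model of that arithmetic only; `fused_extent` and `run_with_unit_loop` are hypothetical names for illustration, not TVM APIs, and this is not how TE or TIR implement the primitive.

```python
import math


def fused_extent(extents):
    """Extent of the single loop obtained by fusing loops with these extents."""
    # math.prod of an empty sequence is 1, which is the "unit loop" case.
    return math.prod(extents)


def run_with_unit_loop(body):
    """Run `body` inside an explicit extent-1 loop and count the iterations."""
    iterations = 0
    for _ in range(fused_extent([])):  # fusing zero loops gives extent 1
        body()
        iterations += 1
    return iterations
```

Wrapping a statement in such a loop executes it exactly once, which is why adding a unit loop leaves the computation unchanged while giving the scheduler a loop to work with.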

Review thread on src/tir/schedule/primitive.h: resolved.
junrushao merged commit 9d2c9a7 into apache:main on Jun 5, 2022
junrushao added a commit to junrushao/tvm that referenced this pull request Jun 5, 2022
Following apache#11575, this PR allows CUDA thread binding for TIR programs like:

```python
from tvm.script import tir as T


@T.prim_func
def zero_dim_add(
    A: T.Buffer[(), "float32"],
    B: T.Buffer[(), "float32"],
    C: T.Buffer[(), "float32"],
) -> None:
    with T.block("C"):
        vi = T.axis.spatial(1, 0)
        C[()] = A[()] + B[()]
```

where there is no loop available to be bound to threadIdx/blockIdx.
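A minimal sketch of how the zero-dimensional program above might be scheduled once a unit loop can be added. It assumes `sch` is a `tvm.tir.Schedule` built from `zero_dim_add`, and that the primitive from this PR is exposed in Python as `add_unit_loop`; the helper name `bind_zero_dim_block` and its defaults are hypothetical.

```python
def bind_zero_dim_block(sch, block_name="C", thread_axis="threadIdx.x"):
    """Wrap a loop-free block in a unit loop and bind that loop to a thread axis.

    `sch` is assumed to behave like tvm.tir.Schedule, i.e. to provide
    get_block, add_unit_loop and bind.
    """
    block = sch.get_block(block_name)  # the block has no surrounding loops
    loop = sch.add_unit_loop(block)    # new extent-1 loop around the block
    sch.bind(loop, thread_axis)        # the unit loop gives bind() a target
    return loop


# Hypothetical usage, assuming TVM is installed and zero_dim_add is defined:
#   import tvm
#   sch = tvm.tir.Schedule(zero_dim_add)
#   bind_zero_dim_block(sch)
```

Taking the schedule as a parameter keeps the helper independent of how the `PrimFunc` is built, so the same steps apply to any block that has no enclosing loop.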
spectrometerHBH pushed a commit that referenced this pull request Jun 5, 2022
juda pushed a commit to juda/tvm that referenced this pull request Jun 21, 2022
juda pushed a commit to juda/tvm that referenced this pull request Jun 21, 2022