[opt] Add more accurate alias analysis for ExternalPtrStmt #3859

Merged 2 commits on Dec 23, 2021

strongoier
Contributor

This PR is a follow-up to #2952. Making alias analysis more accurate unblocks future optimizations on ExternalPtrStmt (many of the current optimization passes focus only on GlobalPtrStmt).

That said, this PR already brings some benefit. For example, consider the following code snippet:

import numpy as np
import taichi as ti

ti.init()

@ti.kernel
def test(a: ti.any_arr(element_dim=1)):
    for i in range(10):
        a[i][0] = i
        a[i][1] = i * 2
        a[i][2] = a[i][0]  # reads back the value stored two lines above

a = np.zeros((10, 3))
test(a)
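
For reference, here is a pure-Python sketch of what the kernel above writes, assuming the same 10x3 shape. The name reference_kernel is hypothetical and not part of Taichi. It makes explicit that column 2 always ends up equal to column 0, which is why reading a[i][0] back from memory is redundant.

```python
def reference_kernel(a):
    """Plain-Python equivalent of the Taichi kernel `test` (illustrative only)."""
    for i in range(10):
        a[i][0] = float(i)
        a[i][1] = float(i * 2)
        a[i][2] = a[i][0]  # same value that was just stored in a[i][0]


# 10 rows of 3 zeros, mirroring np.zeros((10, 3))
a = [[0.0] * 3 for _ in range(10)]
reference_kernel(a)

# Column 2 is always a copy of column 0.
columns_match = all(row[2] == row[0] for row in a)
```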

Before this PR, the final CHI IR was:

kernel {
  $0 = offloaded range_for(0, 10) grid_dim=0 block_dim=32 
  body {
    <i32> $1 = loop $0 index 0
    <*f64> $2 = arg[0]
    <i32> $3 = const [0]
    <*f64> $4 = external_ptr <$2>, [$1, $3]
    <f64> $5 = cast_value<f64> $1
    $6 : global store [$4 <- $5]
    <i32> $7 = const [1]
    <i32> $8 = bit_shl $1 $7
    <*f64> $9 = external_ptr <$2>, [$1, $7]
    <f64> $10 = cast_value<f64> $8
    $11 : global store [$9 <- $10]
    <f64> $12 = global load $4
    <i32> $13 = const [2]
    <*f64> $14 = external_ptr <$2>, [$1, $13]
    $15 : global store [$14 <- $12]
  }
}

After this PR, the final CHI IR is:

kernel {
  $0 = offloaded range_for(0, 10) grid_dim=0 block_dim=32 
  body {
    <i32> $1 = loop $0 index 0
    <*f64> $2 = arg[0]
    <i32> $3 = const [0]
    <*f64> $4 = external_ptr <$2>, [$1, $3]
    <f64> $5 = cast_value<f64> $1
    $6 : global store [$4 <- $5]
    <i32> $7 = const [1]
    <i32> $8 = bit_shl $1 $7
    <*f64> $9 = external_ptr <$2>, [$1, $7]
    <f64> $10 = cast_value<f64> $8
    $11 : global store [$9 <- $10]
    <i32> $12 = const [2]
    <*f64> $13 = external_ptr <$2>, [$1, $12]
    $14 : global store [$13 <- $5]
  }
}

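Comparing the two listings, the redundant global load ($12 = global load $4 in the first listing) is now eliminated. With the more accurate alias analysis, the store to $9 (indices [$1, $7]) is known not to overwrite $4 (indices [$1, $3]), so the value $5 already stored to $4 is passed directly to the final store.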
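
The reasoning behind this elimination can be sketched in a few lines of Python. This is a minimal illustration, not Taichi's actual implementation. The names AliasResult, Const, Var, ExternalPtr, alias_analysis and forward_load are all hypothetical, and the sketch assumes in-bounds indices into a densely laid out array. It also conservatively treats pointers with different base arguments as possibly aliasing.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple, Union


class AliasResult(Enum):
    SAME = "same"            # definitely the same address
    UNCERTAIN = "uncertain"  # may or may not overlap
    DIFFERENT = "different"  # definitely distinct addresses


@dataclass(frozen=True)
class Const:
    value: int


@dataclass(frozen=True)
class Var:
    name: str  # an SSA value such as the loop index "$1"


Index = Union[Const, Var]


@dataclass(frozen=True)
class ExternalPtr:
    base: str                   # which kernel argument the pointer indexes
    indices: Tuple[Index, ...]


def alias_analysis(p: ExternalPtr, q: ExternalPtr) -> AliasResult:
    # Assumption: different arguments might still share memory, so stay conservative.
    if p.base != q.base or len(p.indices) != len(q.indices):
        return AliasResult.UNCERTAIN
    uncertain = False
    for x, y in zip(p.indices, q.indices):
        if isinstance(x, Const) and isinstance(y, Const):
            if x.value != y.value:
                # One dimension provably differs, so the addresses differ.
                return AliasResult.DIFFERENT
        elif x != y:
            uncertain = True  # different SSA values: cannot decide statically
    return AliasResult.UNCERTAIN if uncertain else AliasResult.SAME


def forward_load(stores: List[Tuple[ExternalPtr, str]],
                 load_ptr: ExternalPtr) -> Optional[str]:
    """Walk earlier stores backwards; return the value to reuse, or None."""
    for ptr, value in reversed(stores):
        result = alias_analysis(ptr, load_ptr)
        if result is AliasResult.SAME:
            return value  # the load would read exactly this stored value
        if result is AliasResult.UNCERTAIN:
            return None   # this store might clobber the address: keep the load
    return None


i = Var("$1")
p4 = ExternalPtr("arg0", (i, Const(0)))  # a[i][0]
p9 = ExternalPtr("arg0", (i, Const(1)))  # a[i][1]
stores = [(p4, "$5"), (p9, "$10")]
forwarded = forward_load(stores, p4)
print(forwarded)  # $5
```

In this sketch, an analysis that answered "uncertain" for every pair of distinct pointers would stop at the store to a[i][1] and keep the load. Proving that the two pointers differ lets the search continue back to the store of $5.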

Contributor

@ailzhang ailzhang left a comment

Awesome!! This might help ndarray perf as well!

@strongoier strongoier merged commit 458c31c into taichi-dev:master Dec 23, 2021
@strongoier strongoier deleted the alias-external-ptr branch December 23, 2021 07:54