Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hoist data pointer loads from array allocations #43547

Closed
wants to merge 22 commits into from

Conversation

pchintalapudi
Copy link
Member

When arrays are indexed into, the data pointer is loaded at the indexing site. Typically GVN moves this data pointer load out of loops, but we can do this more eagerly for 2D/3D array allocations and 1D array allocations that do not escape, enabling more of the optimization pipeline to come into play (e.g. loop vectorization).

This PR depends on #43487 for the initial array identification routine as well as the ArrayOpt pass, and therefore #43057 for the escape analysis improvements.

Example:

function f(d1, d2, d3)
           s = 0
           a = zeros(Int, d1, d2, d3)
           for i in eachindex(a)
               s += a[i]
           end
           s
       end

#43487 :
Godbolt: https://godbolt.org/z/9sr36nsvq
Benchmark:

julia> @benchmark f(100,100,100)
BenchmarkTools.Trial: 3347 samples with 1 evaluation.
 Range (min … max):  975.691 μs …   7.981 ms  ┊ GC (min … max):  0.00% …  0.00%
 Time  (median):       1.015 ms               ┊ GC (median):     0.00%
 Time  (mean ± σ):     1.488 ms ± 836.460 μs  ┊ GC (mean ± σ):  10.71% ± 16.41%

  █▆▃                                 ▂▃▁         ▃▃         ▃▃ ▁
  ███▆▃█▄▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄▁▁▃▁▁▁▁▃███▅▃▆▇▃▁▁▁▇██▆▅▃▇▇▃▁▁▄██ █
  976 μs        Histogram: log(frequency) by time       3.33 ms <

 Memory estimate: 7.63 MiB, allocs estimate: 2.

PR:
Godbolt: https://godbolt.org/z/8qr8rYv14
Benchmark:

julia> @benchmark f(100,100,100)
BenchmarkTools.Trial: 4292 samples with 1 evaluation.
 Range (min … max):  688.914 μs …   6.848 ms  ┊ GC (min … max):  0.00% … 63.87%
 Time  (median):     723.949 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):     1.159 ms ± 764.938 μs  ┊ GC (mean ± σ):  10.78% ± 17.55%

  █▆▄   ▁                          ▃▂                ▃▂    ▃▃▁  ▁
  ███▆▄▆█▆▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄▄▁▄████▅▃▇█▇▄▃▁▄▁▁▃▁▃▇████▅▆████ █
  689 μs        Histogram: log(frequency) by time       2.86 ms <

 Memory estimate: 7.63 MiB, allocs estimate: 2.

@pchintalapudi
Copy link
Member Author

Since my branches are part of a fork, I've created a PR comparing this one and the array length propagation PR immediately before it here: pchintalapudi#3

@pchintalapudi pchintalapudi added domain:arrays [a, r, r, a, y, s] compiler:codegen Generation of LLVM IR and native code performance Must go faster labels Dec 29, 2021
@pchintalapudi pchintalapudi force-pushed the pc/data-hoist branch 5 times, most recently from 487457a to 00a9b49 Compare January 8, 2022 07:04
Copy link
Sponsor Member

@vchuravy vchuravy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some unit-test specifically for the new llvm-array-opt pass?

src/ccall.cpp Outdated
static_rt);
if (auto array_alloc = dyn_cast<CallInst>(retval.V)) {
array_alloc->addAttribute(AttributeList::ReturnIndex, Attribute::NoAlias);
array_alloc->setMetadata("allocation.array", MDNode::get(jl_LLVMContext, {}));
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer this MD to have a julia prefix.

@@ -301,3 +301,41 @@ void jl_alloc::runEscapeAnalysis(llvm::Instruction *I, EscapeAnalysisRequiredArg
}
}
}

//FIXME: This doesn't actually work on Windows, as Windows inserts a trampoline
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is outdated now? Should instead explain why metadata is used

src/llvm-array-opt.cpp Outdated Show resolved Hide resolved
src/llvm-array-opt.cpp Show resolved Hide resolved
src/llvm-array-opt.cpp Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:codegen Generation of LLVM IR and native code domain:arrays [a, r, r, a, y, s] performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants