@wsmoses (Member) commented Feb 7, 2025

@gbaraldi vendoring this in the interim to unblock kernels

Comment on lines +415 to +418
# From https://github.com/JuliaGPU/GPUCompiler.jl/blob/7b9322faa34685026c4601a5084eecf5a5d7f3fe/src/ptx.jl#L149
function vendored_optimize_module!(@nospecialize(job),
mod::LLVM.Module,
instcombine::Bool=false

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
- # From https://github.com/JuliaGPU/GPUCompiler.jl/blob/7b9322faa34685026c4601a5084eecf5a5d7f3fe/src/ptx.jl#L149
- function vendored_optimize_module!(@nospecialize(job),
- mod::LLVM.Module,
- instcombine::Bool=false
+ function vendored_optimize_module!(
+     @nospecialize(job), mod::LLVM.Module, instcombine::Bool=false
+ )

instcombine::Bool=false
)
tm = GPUCompiler.llvm_machine(job.config.target)
# TODO: Use the registered target passes (JuliaGPU/GPUCompiler.jl#450)

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
# TODO: Use the registered target passes (JuliaGPU/GPUCompiler.jl#450)
LLVM.@dispose pb = LLVM.NewPMPassBuilder() begin

Comment on lines 423 to 427
LLVM.register!(pb, LLVM.NVVMReflectPass())

LLVM.add!(pb, LLVM.NewPMFunctionPassManager()) do fpm
# TODO: need to run this earlier; optimize_module! is called after addOptimizationPasses!
LLVM.add!(fpm, LLVM.NVVMReflectPass())
Review comment from a project member:

Suggested change
- LLVM.register!(pb, LLVM.NVVMReflectPass())
- LLVM.add!(pb, LLVM.NewPMFunctionPassManager()) do fpm
-     # TODO: need to run this earlier; optimize_module! is called after addOptimizationPasses!
-     LLVM.add!(fpm, LLVM.NVVMReflectPass())
+ LLVM.register!(pb, GPUCompiler.NVVMReflectPass())
+ LLVM.add!(pb, LLVM.NewPMFunctionPassManager()) do fpm
+     # TODO: need to run this earlier; optimize_module! is called after addOptimizationPasses!
+     LLVM.add!(fpm, GPUCompiler.NVVMReflectPass())

LLVM.add!(fpm, LLVM.InstSimplifyPass()) # clean-up redundancy
end
LLVM.add!(fpm, LLVM.NewPMLoopPassManager(; use_memory_ssa=true)) do lpm
LLVM.add!(lpm, LLVM.LICMPass()) # the inner runtime check might be

[JuliaFormatter] reported by reviewdog 🐶

Suggested change
LLVM.add!(lpm, LLVM.LICMPass()) # the inner runtime check might be
# outer loop invariant

@wsmoses merged commit 0794edf into main on Feb 8, 2025
33 of 37 checks passed
@wsmoses deleted the vendopt branch on February 8, 2025, 00:59
wsmoses added a commit that referenced this pull request Feb 8, 2025
* vendor optimize

* Update ReactantCUDAExt.jl

* Update ext/ReactantCUDAExt.jl

* Update ext/ReactantCUDAExt.jl

* try forcing random seed for basic test

---------

Co-authored-by: Mosè Giordano <765740+giordano@users.noreply.github.com>