Closed
Labels
A-codegen (Area: Code generation)
O-NVPTX (Target: the NVPTX LLVM backend for running rust on GPUs, https://llvm.org/docs/NVPTXUsage.html)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
Description
If a library references the syncthreads intrinsic using the NVPTX LLVM backend, rustc generates an invalid PTX file. Basic example:
test.rs:
#![feature(lang_items)]
#![feature(no_core, platform_intrinsics)]
#![crate_type = "lib"]
#![no_core]
#[lang = "copy"]
trait Copy {}
#[lang = "freeze"]
trait Freeze {}
#[lang = "sized"]
trait Sized {}
extern "platform-intrinsic" {
    pub fn nvptx_syncthreads();
}
#[no_mangle]
pub unsafe fn foo() {
    nvptx_syncthreads();
}
nvptx64-nvidia-cuda.json:
{
  "arch": "nvptx64",
  "cpu": "sm_20",
  "data-layout": "e-i64:64-v16:16-v32:32-n16:32:64",
  "linker": "ptx-linker",
  "linker-flavor": "ld",
  "linker-is-gnu": true,
  "dll-prefix": "",
  "dll-suffix": ".ptx",
  "dynamic-linking": true,
  "llvm-target": "nvptx64-nvidia-cuda",
  "max-atomic-width": 0,
  "os": "cuda",
  "obj-is-bitcode": true,
  "panic-strategy": "abort",
  "target-endian": "little",
  "target-pointer-width": "64",
  "target-c-int-width": "32"
}
Command:
rustc --target nvptx64-nvidia-cuda test.rs -O --emit=asm
This produces the following (invalid) PTX assembly file:
//
// Generated by LLVM NVPTX Back-End
//
.version 3.2
.target sm_20
.address_size 64
// .globl foo
.extern .func llvm.cuda.syncthreads
()
;
.visible .func foo()
{
    { // callseq 0, 0
    .reg .b32 temp_param_reg;
    call.uni
    llvm.cuda.syncthreads,
    (
    );
    } // callseq 0
    ret;
}
Note that the file declares llvm.cuda.syncthreads as an external function and then calls it, so the intrinsic was never lowered by the backend. I believe it should instead compile to a single instruction (though I'm not sure which one).
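For anyone who wants to check their own output for the same symptom, here is a minimal sketch of a host-side scan over the emitted PTX text. It is only a line-based heuristic, not a PTX parser: the function name unlowered_intrinsics is hypothetical, and it assumes the declaration looks like the one above (".extern .func" followed directly by a name starting with "llvm."), so declarations that carry a return parameter are not matched.

```rust
/// Returns the names of LLVM intrinsics that a PTX listing declares as
/// external functions, which suggests the backend did not lower them.
///
/// Heuristic only: matches lines of the form `.extern .func llvm.<name>`
/// and ignores declarations that put a return parameter before the name.
fn unlowered_intrinsics(ptx: &str) -> Vec<String> {
    ptx.lines()
        .filter_map(|line| {
            // Keep only lines that start an external function declaration.
            let rest = line.trim().strip_prefix(".extern .func")?;
            // The symbol name ends at whitespace or at the parameter list.
            let name = rest
                .trim()
                .split(|c: char| c.is_whitespace() || c == '(')
                .next()?;
            if name.starts_with("llvm.") {
                Some(name.to_string())
            } else {
                None
            }
        })
        .collect()
}

fn main() {
    // Shortened copy of the PTX from this report.
    let ptx = "\
.version 3.2
.target sm_20
.address_size 64
.extern .func llvm.cuda.syncthreads
()
;
.visible .func foo()
{
ret;
}";

    for name in unlowered_intrinsics(ptx) {
        println!("unlowered intrinsic: {}", name);
    }
}
```

In practice the PTX would be read from the file that rustc writes with --emit=asm rather than from an inline string.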
See also denzp/rust-ptx-linker#19