Conversation

@ivanradanov
Collaborator

These options enabled me to cross-compile an aarch64 object file, which I was then able to link and run natively on aarch64.

@ivanradanov ivanradanov requested a review from wsmoses February 3, 2022 02:20
Member

@wsmoses wsmoses left a comment


LGTM. Could you add a test that the LLVM output is the correct arch?
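A test along those lines could be a lit-style FileCheck test on the emitted triple. The RUN line below is a hypothetical sketch based on the flags used in this thread, not an actual Polygeist test:

```cpp
// RUN: mlir-clang %s --function=* -emit-llvm -S -target aarch64-unknown-linux-gnu | FileCheck %s
// CHECK: target triple = "aarch64-unknown-linux-gnu"
int main() { return 0; }
```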

@ivanradanov
Collaborator Author

It actually never seems to be the correct arch when compiling a .cu file, at least on my end. A bug, perhaps?

```console
(cmd)$ cat test-no-kernel.cu; /scr0/ivan/src/Polygeist/build//bin/mlir-clang --function=* --cuda-lower --cpuify="distribute" -resource-dir=/scr0/ivan/src/Polygeist/mlir-build//lib/clang/14.0.0/ --cuda-gpu-arch=sm_60 --cuda-path=/opt/cuda-10.2/ -c test-no-kernel.cu -o test.ll -emit-llvm -S; cat test.ll
int main() {
        return 0;
}
; ModuleID = 'LLVMDialectModule'
source_filename = "LLVMDialectModule"
target datalayout = "e-i64:64-i128:128-v16:16-v32:32-n16:32:64"
target triple = "nvptx64-nvidia-cuda"

declare i8* @malloc(i64)

declare void @free(i8*)

define i32 @main() !dbg !3 {
  ret i32 0
}

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2}

!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "mlir", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
!1 = !DIFile(filename: "LLVMDialectModule", directory: "/")
!2 = !{i32 2, !"Debug Info Version", i32 3}
!3 = distinct !DISubprogram(name: "main", linkageName: "main", scope: null, file: !4, line: 2, type: !5, scopeLine: 2, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !6)
!4 = !DIFile(filename: "test-no-kernel.cu", directory: "/home/ivan/src/rodinia/cuda/bfs")
!5 = !DISubroutineType(types: !6)
!6 = !{}
```

The generated module has target triple = "nvptx64-nvidia-cuda". Or is this intended?

When the file is a .cpp file, it works as expected:

```console
(ins)$ cat test.cpp; /scr0/ivan/src/Polygeist/build//bin/mlir-clang --function=* --cuda-lower --cpuify="distribute" -resource-dir=/scr0/ivan/src/Polygeist/mlir-build//lib/clang/14.0.0/ --cuda-gpu-arch=sm_60 --cuda-path=/opt/cuda-10.2/ -c test.cpp -o test.ll -emit-llvm -S -target aarch64-unknown-linux-gnu -mcpu=a64fx; cat test.ll
int main() {
        return 0;
}
warning: argument unused during compilation: '--cuda-gpu-arch=sm_60'
; ModuleID = 'LLVMDialectModule'
source_filename = "LLVMDialectModule"
target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
target triple = "aarch64-unknown-linux-gnu"

declare i8* @malloc(i64)

declare void @free(i8*)

define i32 @main() !dbg !3 {
  ret i32 0
}
(ins)$ cat test.cpp; /scr0/ivan/src/Polygeist/build//bin/mlir-clang --function=* --cuda-lower --cpuify="distribute" -resource-dir=/scr0/ivan/src/Polygeist/mlir-build//lib/clang/14.0.0/ --cuda-gpu-arch=sm_60 --cuda-path=/opt/cuda-10.2/ -c test.cpp -o test.ll -emit-llvm -S; cat test.ll
int main() {
        return 0;
}
warning: argument unused during compilation: '--cuda-gpu-arch=sm_60'
; ModuleID = 'LLVMDialectModule'
source_filename = "LLVMDialectModule"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

declare i8* @malloc(i64)

declare void @free(i8*)

define i32 @main() !dbg !3 {
  ret i32 0
}

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2}

!0 = distinct !DICompileUnit(language: DW_LANG_C, file: !1, producer: "mlir", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
!1 = !DIFile(filename: "LLVMDialectModule", directory: "/")
!2 = !{i32 2, !"Debug Info Version", i32 3}
!3 = distinct !DISubprogram(name: "main", linkageName: "main", scope: null, file: !4, line: 2, type: !5, scopeLine: 2, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !6)
!4 = !DIFile(filename: "test.cpp", directory: "/home/ivan/src/rodinia/cuda/bfs")
!5 = !DISubroutineType(types: !6)
!6 = !{}
```

@wsmoses
Member

wsmoses commented Feb 3, 2022

Part of that is that when compiling modules of two types, it uses the triple of the one that was compiled first. I'd probably override it if it is set (and in the CUDA-lowering case drop the device triple after the modules are merged).

@ivanradanov
Collaborator Author

ivanradanov commented Feb 8, 2022

I made it so that the triple and data layout do not get overridden when the compilation job's target is nvptx* (which is when the device code is compiled), unless that is the only type of compilation job we have. Does this work?
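For illustration, the selection rule described here can be sketched roughly as follows; the `Job` type and function names are hypothetical, not Polygeist's actual implementation:

```cpp
#include <string>
#include <vector>

// Hypothetical sketch of the rule: a job targeting nvptx* is a device
// compilation, and its triple should not override the module's triple
// unless device jobs are the only jobs present.
struct Job {
  std::string triple;
};

static bool isDeviceJob(const Job &j) {
  return j.triple.rfind("nvptx", 0) == 0; // triple starts with "nvptx"
}

std::string pickModuleTriple(const std::vector<Job> &jobs) {
  for (const Job &j : jobs)
    if (!isDeviceJob(j))
      return j.triple; // a host job's triple wins over any device job's
  // Only device jobs exist: fall back to the device triple.
  return jobs.empty() ? std::string() : jobs.front().triple;
}
```

With this rule, a mixed host/device compilation keeps the host triple regardless of which module was compiled first, which matches the behavior requested above.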

@wsmoses
Member

wsmoses commented Feb 8, 2022

Seems reasonable to me

@wsmoses wsmoses merged commit dce289f into llvm:main Feb 8, 2022