Add the PNaCl/JS targets to the backend. #28355

Merged
bors merged 1 commit into rust-lang:master from DiamondLovesYou:pnacl-librustc-trans on Oct 22, 2015

@DiamondLovesYou
Contributor

commented Sep 11, 2015

@alexcrichton

Member

commented Sep 11, 2015

How close does this get the backend to working? Is this all that's needed or are there still portions in the pipeline? The previous patches up until now have been incremental in the sense that they were just general refactorings we'd want to do anyway, but I'd prefer to have "pnacl not working" to "pnacl working" support land all at once rather than piecemeal.

@DiamondLovesYou

Contributor Author

commented Sep 11, 2015

As far as Rust trans is concerned, this gets PNaCl working. In reality, pnacl-llvm needs at least two more passes to handle:

  • LLVM's auto vectorizer (PNaCl currently disables this because it will generate illegal (in the PNaCl ABI sense) vector lengths).
  • Illegal vector types (i.e., not 128 bits wide, and not [fiu]64)
  • [fiu]64 vector element types.

Previously, I just disabled the auto vectorizer and added wrappers for the [fiu]64 SIMD types in core, but because of the recent SIMD RFCs, it seems I'll have to write a legalization pass anyway. The main vector legalization pass is currently WIP. The second pass to expand [fiu]64 will happen after.

I'd also like to allow users to disable the setjmp/longjmp EH (which isn't free) in the future (via an unstable -C target-options="blah" argument).
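
The two vector constraints named in this comment (a total width of 128 bits, and no 64-bit element types) can be sketched as a small classifier. This is only an illustration under those stated assumptions: `Elem`, `Legality`, and `classify_vector` are hypothetical names, not part of rustc or pnacl-llvm, and the real PNaCl ABI may impose further rules that are not mentioned in this thread.

```rust
/// Element type of a SIMD vector, tagged with its width in bits.
/// (Hypothetical type, for illustration only.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Elem {
    Int(u32),
    Float(u32),
}

impl Elem {
    /// Width of a single element in bits.
    pub fn bits(self) -> u32 {
        match self {
            Elem::Int(b) | Elem::Float(b) => b,
        }
    }
}

/// Which of the two passes described above a vector type would need.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Legality {
    /// 128 bits wide with no 64-bit elements.
    Legal,
    /// Has [fiu]64 elements; left to the second (expansion) pass.
    Needs64BitExpansion,
    /// Wrong total width; left to the main legalization pass.
    NeedsWidthLegalization,
}

/// Classify a vector of `lanes` elements of type `elem`.
/// 64-bit elements are checked first, so a type such as f64x4 is reported
/// as needing expansion even though its width is also wrong.
pub fn classify_vector(elem: Elem, lanes: u32) -> Legality {
    if elem.bits() == 64 {
        Legality::Needs64BitExpansion
    } else if elem.bits().checked_mul(lanes) == Some(128) {
        Legality::Legal
    } else {
        Legality::NeedsWidthLegalization
    }
}
```

Under this model, `classify_vector(Elem::Float(32), 4)` is legal, while an 8-lane `i32` vector (the kind of type an auto vectorizer might produce) would be handed to the width legalization pass.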

write_output_file(cgcx.handler, tm, cpm, llmod, &path, llvm::ObjectFileType);
});
if !cgcx.is_like_pnacl {
with_codegen(tm, llmod, config.no_builtins, |cpm| {

@alexcrichton

alexcrichton Sep 13, 2015

Member

Should this logic go into pnacl itself? Basically when you request an object file to be emitted from pnacl it seems like it should naturally understand that it instead just emits bitcode.

@DiamondLovesYou

DiamondLovesYou Sep 15, 2015

Author Contributor

Except it's LLVM's target machine that does that, and PNaCl/JS doesn't have one. (JS does have a target machine, but it isn't used until post-linking.) PNaCl does use armv7-none-nacl-gnu for optimizing, but using it for codegen would produce an ARM NaCl object file, which isn't compatible with PNaCl.

In fact if you take a look at the toolchain's pnacl-clang, you'll find it's just a python wrapper which invokes clang with arguments like -emit-llvm.

@alexcrichton

alexcrichton Sep 16, 2015

Member

This may just be me being ignorant of the LLVM fork for pnacl, but if you're already forking LLVM couldn't support for this be added? In terms of interoperating with other compilers it seems like it'd be best to put logic into the pnacl target rather than every compiler using LLVM?

@DiamondLovesYou

DiamondLovesYou Sep 17, 2015

Author Contributor

But there doesn't exist any PNaCl LLVM target machine in (pnacl-)LLVM's target registry (ie using the PNaCl LLVM fork doesn't help in any way). So there's no LLVM target to tell LLVM how to write an object file (of any format).

In an ideal world, I'd completely throw out the PNaCl IR simplification passes and restart afresh, and I'd do it with an actual, real LLVM target machine (since LLVM's ISD form and PNaCl's IR are actually pretty similar). Sadly, while LLVM's SelectionDAG does most (if not all) of the heavy lifting w.r.t. legalizing IR to a specific target, the rest of the toolchain would still need modifications to know how to handle the new PNaCl object format (i.e., how to read/link it), which I would have to create specifically for the target machine (another possible, but still non-trivial, thing to implement). The required toolchain modifications are the reason JS also uses bitcode as its object format, despite having an official target machine in LLVM. On top of that, the object format would have to be translated back into LLVM IR for codegen to the real native target.

While I grant that using bitcode as the object format is annoying at times, it is a lot easier because it integrates with the preexisting LLVM gold-plugin-based linker.

I actually agree with you, but the impl cost of what you propose is pretty high vs what emcc and pnacl-clang currently do (ie -emit-llvm (both are just scripts wrapping the real clang underneath; no such luck for Rust)). Plus, the other toolchains would require breaking changes to conform (ie this logic makes Rust conform with the other toolchains).

@alexcrichton

alexcrichton Sep 17, 2015

Member

It sounds like we may want to take a similar strategy to those tools and instead of having the default output type be an object file for insertion into an archive we just have the default output type be bitcode to insert into an archive?

@DiamondLovesYou

DiamondLovesYou Sep 17, 2015

Author Contributor

--emit=obj would still be broken. Plus the PNaCl linker expects bitcode as the main linker input (ie main.o). This change does accomplish exactly what you propose and doesn't require special casing archive object output.
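
The behaviour under discussion in this review thread can be sketched as follows: the object output keeps its usual `.o` path, and only its contents change when the target is PNaCl-like. This is a simplified model, not the actual rustc code. The `is_like_pnacl` field name is taken from the diff above, while `CodegenContext` as written here, `ObjectContents`, `object_contents`, and `object_path` are hypothetical stand-ins.

```rust
use std::path::{Path, PathBuf};

/// What ends up inside the file written for an object output.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ObjectContents {
    /// Native code produced by an LLVM target machine.
    MachineCode,
    /// LLVM bitcode, which the PNaCl linker expects as its input.
    LlvmBitcode,
}

/// Stand-in for the codegen context; only the flag used in the diff.
pub struct CodegenContext {
    pub is_like_pnacl: bool,
}

/// Decide what an object output contains for the given context.
pub fn object_contents(cgcx: &CodegenContext) -> ObjectContents {
    if cgcx.is_like_pnacl {
        // There is no target machine to run codegen with, so bitcode is
        // written instead.
        ObjectContents::LlvmBitcode
    } else {
        ObjectContents::MachineCode
    }
}

/// The output path is the same either way (e.g. `main.o`), so archive
/// creation and the linker invocation need no special casing.
pub fn object_path(stem: &Path) -> PathBuf {
    stem.with_extension("o")
}
```

With this shape, an `--emit=obj` request and the archive path share one decision point, which matches the argument made in the comment above.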

)
}

compatible_ifn!("llvm.assume", noop(llvmcompat_assume(i1) -> void), 6);

// These must be redirected to `libm` here so PNaCl's linker will link in
// these functions from `libm`.
compatible_ifn!("llvm.copysign.f32", copysignf(t_f32, t_f32) -> t_f32);

@alexcrichton

alexcrichton Sep 13, 2015

Member

Shouldn't this sort of handling of intrinsics happen in the pnacl passes, not the compiler itself?

@DiamondLovesYou

DiamondLovesYou Sep 15, 2015

Author Contributor

They can't, because the linker might omit linking to the libm versions (PNaCl is completely static, so the linker can make such decisions, unlike on a regular platform where libc/libm is shared), so it must be done here.

@alexcrichton

alexcrichton Sep 16, 2015

Member

I don't think I understand this, if pnacl passes were the one to lower these intrinsics to the libm functions I'm not sure why doing this here or there would affect linking?

@DiamondLovesYou

DiamondLovesYou Sep 17, 2015

Author Contributor

The PNaCl passes aren't run until after linking. In other words, pnacl-ld first links all the bitcode together and then runs the pre and post optimization PNaCl IR simplification passes, with the optimizations in the middle if requested. Thus the linker can completely omit libm from the output module, not knowing that the PNaCl IR passes will later create references to symbols defined in it.
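
The ordering problem described in this comment can be reproduced with a toy model of a static link. The sketch below assumes a simplified linker that pulls in an archive member only when it defines a symbol that is currently undefined, and that does not treat `llvm.*` intrinsic calls as symbol references. `Module`, `link`, `libm_name`, and `unresolved_after_lowering` are hypothetical names, and the member names used in the usage note are made up.

```rust
use std::collections::BTreeSet;

/// A bitcode module or archive member: the symbols it defines and calls.
pub struct Module {
    pub name: &'static str,
    pub defines: Vec<&'static str>,
    pub calls: Vec<&'static str>,
}

/// Intrinsic calls are invisible to the linker in this model.
fn is_intrinsic(sym: &str) -> bool {
    sym.starts_with("llvm.")
}

/// Illustrative post-link lowering of a math intrinsic to a libm call.
fn libm_name(sym: &str) -> Option<&'static str> {
    match sym {
        "llvm.sin.f32" => Some("sinf"),
        "llvm.copysign.f32" => Some("copysignf"),
        _ => None,
    }
}

/// Returns the names of the archive members the link pulls in.
pub fn link(main: &Module, archive: &[Module]) -> Vec<&'static str> {
    let mut defined: BTreeSet<&'static str> = main.defines.iter().copied().collect();
    let mut wanted: BTreeSet<&'static str> =
        main.calls.iter().copied().filter(|s| !is_intrinsic(s)).collect();
    let mut pulled: Vec<&'static str> = Vec::new();
    let mut progress = true;
    while progress {
        progress = false;
        for member in archive {
            if pulled.contains(&member.name) {
                continue;
            }
            // Pull the member only if it resolves something still undefined.
            let needed = member
                .defines
                .iter()
                .any(|d| wanted.contains(d) && !defined.contains(d));
            if needed {
                pulled.push(member.name);
                defined.extend(member.defines.iter().copied());
                wanted.extend(member.calls.iter().copied().filter(|s| !is_intrinsic(s)));
                progress = true;
            }
        }
    }
    pulled
}

/// Symbols left undefined once intrinsics in `main` are lowered after the
/// link (intrinsics inside archive members are ignored for brevity).
pub fn unresolved_after_lowering(main: &Module, archive: &[Module]) -> Vec<&'static str> {
    let pulled = link(main, archive);
    let mut defined: BTreeSet<&'static str> = main.defines.iter().copied().collect();
    for member in archive.iter().filter(|m| pulled.contains(&m.name)) {
        defined.extend(member.defines.iter().copied());
    }
    main.calls
        .iter()
        .copied()
        .filter_map(libm_name)
        .filter(|s| !defined.contains(s))
        .collect()
}
```

In this model, a main module that calls `puts` and `llvm.sin.f32` against an archive with a `stdio.o` member (defining `puts`) and a `math.o` member (defining `sinf`) links only `stdio.o`, and `sinf` is reported as unresolved after lowering. If the module calls `sinf` directly, `math.o` is pulled in and nothing is left unresolved, which is the effect of redirecting the intrinsics to libm in the frontend.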

@alexcrichton

alexcrichton Sep 17, 2015

Member

In pnacl what does it actually mean to "link against" libm? Is there a pre-built pile of bitcode corresponding to libm? Are symbols recognized and redirected to JS functions? It still sounds to me like this is something for the linker to handle, not necessarily us in the frontend.

@kripken

kripken Sep 28, 2015

> Also, it won't just work: using __builtin_powi for example and compiling with emcc doesn't error, but still leaves an undefined function llvm_powi_f32, which matches what I found when I look through JSBackend. In other words, LLVM's intrinsics are not handled.

Yes, but that's a bug in Emscripten :) That particular intrinsic (llvm.powi.*) was not implemented. Fixed now in emscripten-core/emscripten-fastcomp@fd10012

> These intrinsics are the only math related intrinsics allowed by the PNaCl ABI. All other math-y functions must be linked in via libm (or whatever) during the bitcode link.

As I understand it, the PNaCl ABI is a definition of the PNaCl distribution format, i.e., the thing that ships to browsers. It makes sense to limit that, so that PNaCl executables are as portable as possible. However, there isn't a reason I am aware of to not support all LLVM intrinsics in the PNaCl developer-side compiler. If you don't handle llvm.powi.* and all other math intrinsics there, then you need hacks in every single compiler targeting PNaCl.

@DiamondLovesYou

DiamondLovesYou Oct 1, 2015

Author Contributor

@kripken

> Yes, but that's a bug in Emscripten :) That particular intrinsic (llvm.powi.*) was not implemented. Fixed now in emscripten-core/emscripten-fastcomp@fd10012

Ah, that's fair, but I was speaking generally. The majority of LLVM's math intrinsics are still not handled; only calls to the libm versions are "handled" (and that's just an optimization).

Perhaps there's something I'm missing? Checking with em++ -emit-llvm gives me (using a variation of my previous example, with __builtin_sinf):

; ModuleID = 'test.ll'
target datalayout = "e-p:32:32-i64:64-v128:32:128-n32-S128"
target triple = "asmjs-unknown-emscripten"

@.str = private unnamed_addr constant [7 x i8] c"v == 0\00", align 1
@.str1 = private unnamed_addr constant [7 x i8] c"test.c\00", align 1
@__func__.main = private unnamed_addr constant [5 x i8] c"main\00", align 1

define i32 @main() #0 {
  %1 = alloca i32, align 4
  %v = alloca float, align 4
  store i32 0, i32* %1
  %2 = call float @sinf(float 0x40091EB860000000) #3
  store float %2, float* %v, align 4
  %3 = load float, float* %v, align 4
  %4 = fcmp oeq float %3, 0.000000e+00
  br i1 %4, label %7, label %5

; <label>:5                                       ; preds = %0
  call void @__assert_fail(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i32 0, i32 0), i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str1, i32 0, i32 0), i32 6, i8* getelementptr inbounds ([5 x i8], [5 x i8]* @__func__.main, i32 0, i32 0)) #4
  unreachable
; <label>:6                                       ; No predecessors!
  br label %7

; <label>:7                                       ; preds = %6, %0
  %8 = phi i1 [ true, %0 ], [ false, %6 ]
  ret i32 0
}

; Function Attrs: nounwind readnone
declare float @sinf(float) #1

; Function Attrs: noreturn
declare void @__assert_fail(i8*, i8*, i32, i8*) #2

attributes #0 = { "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind readnone "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #2 = { noreturn "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #3 = { nounwind readnone }
attributes #4 = { noreturn }

!llvm.ident = !{!0}

!0 = !{!"clang version 3.7.0 "}

So first off, it seems even calling __builtin_sinf doesn't use the LLVM sin.f32 intrinsic.

I went ahead and checked by changing the sinf to llvm.sin.f32 in the above IR and fed it to em++: I got "warning: unresolved symbol: llvm_sin_f32". So JS really doesn't handle all of the math-y intrinsics.

> As I understand it, the PNaCl ABI is a definition of the PNaCl distribution format, i.e., the thing that ships to browsers. It makes sense to limit that, so that PNaCl executables are as portable as possible. However, there isn't a reason I am aware of to not support all LLVM intrinsics in the PNaCl developer-side compiler. If you don't handle llvm.powi.* and all other math intrinsics there, then you need hacks in every single compiler targeting PNaCl.

Again, JS codegen doesn't handle the other math intrinsics either; as you've said, it hasn't yet been a problem for the other languages targeting JS, meaning the compilers employed hacks like the one being discussed here or they never used the intrinsics in the first place, opting to always call into libm.

Having said all that, I think I can fix this for both projects. See #28355 (comment).

@kripken

kripken Oct 1, 2015

Yes, in practice, emscripten just supports the intrinsics we've seen emitted from compilers, mostly clang. We should probably add all the others as well, but not all are trivial to implement, so we've just been adding intrinsics when a need appears, i.e. when someone files an issue. As far as I know, there are no outstanding issues on this.

@DiamondLovesYou

DiamondLovesYou Oct 1, 2015

Author Contributor

I'd say JS has it easy: it has the Math.* namespace to lean on and the intrinsics don't respect errno anyway. 😄

@kripken

kripken Oct 1, 2015

Heh, true ;)

@alexcrichton

Member

commented Sep 21, 2015

@brson, what are your feelings on this? This looks to be one of the only points of divergence where "if pnacl { ... } else { ... }" is needed, and while it's good that it's contained this basically all still looks to me like portions which would be best handled in LLVM somewhere instead of rustc itself (e.g. as it does for all other existing targets). From what @DiamondLovesYou is saying, however, it sounds like the way pnacl is integrated into LLVM makes this not easy unfortunately.

I continue to be worried about the lack of automation we have for this kind of functionality where "some subset" of intrinsics isn't defined on pnacl, but any future intrinsics we add may or may not be part of this subset and we have no way of verifying one way or another.

@brson

Contributor

commented Sep 21, 2015

These patches are much easier to digest.

I still don't understand why PNaCl believes it's ok not to handle intrinsics that LLVM supports. The explanation that pnacl is static and can't do it isn't convincing - LLVM itself lowers intrinsics to libm calls and PNaCl can do that too. What happens if we do pass these intrinsic calls to PNaCl? Are they on a blacklist that causes it to fail to compile? Why not just pass the intrinsic through to libm like LLVM?

If we must do our own intrinsic lowering, then let's do it in a generic way - give target specs a mapping from intrinsics to C calls, not have a pnacl special case. Likewise the is_like_pnacl check for emitting llvm bitcode (into a .o file?) should be a generic 'output_bitcode_to_object_file' or something.
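
A minimal sketch of the generic approach proposed here might look like the following. It assumes a target spec that carries a table from intrinsic names to C function names plus a flag for bitcode-in-object output. `TargetSpec`, `pnacl_like_spec`, and `libcall_for` are hypothetical names; the flag name follows the suggestion in this comment, and the two table entries are the pairs that appear elsewhere in this thread.

```rust
use std::collections::HashMap;

/// Hypothetical slice of a target specification.
pub struct TargetSpec {
    /// Write LLVM bitcode into the file produced for object output.
    pub output_bitcode_to_object_file: bool,
    /// Intrinsics the frontend should emit as plain C calls instead.
    pub intrinsic_to_libcall: HashMap<&'static str, &'static str>,
}

/// An illustrative spec for a PNaCl-like target.
pub fn pnacl_like_spec() -> TargetSpec {
    let mut map = HashMap::new();
    map.insert("llvm.copysign.f32", "copysignf");
    map.insert("llvm.sin.f32", "sinf");
    TargetSpec {
        output_bitcode_to_object_file: true,
        intrinsic_to_libcall: map,
    }
}

/// Returns the C function to call in place of `intrinsic`, or `None` if
/// the target lets the intrinsic through unchanged.
pub fn libcall_for(spec: &TargetSpec, intrinsic: &str) -> Option<&'static str> {
    spec.intrinsic_to_libcall.get(intrinsic).copied()
}
```

With a table like this, the frontend asks the spec rather than testing for one particular target, so a target with an empty table behaves exactly as before.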

@brson

Contributor

commented Sep 21, 2015

I guess the issue is that there are no subsequent linkage steps in the pnacl toolchain. The bitcode output by rustc is the final product and there's no opportunity to link libm after the fact.

Edit: but most of these functions are in libm and/or libc, and needed for most software to work. Doesn't PNaCl just always link to libc/libm? (Emscripten supposedly does).

@DiamondLovesYou

Contributor Author

commented Sep 28, 2015

@brson Sorry for the delay.

> but most of these functions are in libm and/or libc, and needed for most software to work. Doesn't PNaCl just always link to libc/libm? (Emscripten supposedly does).

Sure, since libm is included in libc, but that doesn't mean the linker will actually include the definitions (unless the defs are used), thus their availability isn't guaranteed post link.

Also, PNaCl (and thus Emscripten) disables -simplify-libcalls, so calls to libm won't result in calls to an intrinsic. Thus normal, defined calls to libm (those that would happen in C) will still call libm after optimizations.

> If we must do our own intrinsic lowering, then let's do it in a generic way - give target specs a mapping from intrinsics to C calls, not have a pnacl special case. Likewise the is_like_pnacl check for emitting llvm bitcode (into a .o file?) should be a generic 'output_bitcode_to_object_file' or something.

I'm fine with that, if that's what it takes.

@alexcrichton

Member

commented Sep 28, 2015

> Sure, since libm is included in libc, but that doesn't mean the linker will actually include the definitions (unless the defs are used), thus their availability isn't guaranteed post link.

So if I understand things correctly, there's one giant LLVM module which contains the bitcode for libc, libm, and the application in question. This module is then optimized and such and is passed through the pnacl backend. The definitions of libm functions are all optimized away because they're not used, and then after this happens the intrinsics are lowered to calls to those libm functions, resulting in the undefined symbols?

Is that what's going on?

@kripken

commented Sep 28, 2015

> Also, PNaCl (and thus Emscripten) disables -simplify-libcalls

I wasn't aware that PNaCl disables that, and I don't think we use code from PNaCl that would cause Emscripten to do it. But maybe I'm wrong - do you know where PNaCl disables that pass?

LLVM source code seems to show that the entire pass has been removed in upstream anyhow?

http://llvm.org/viewvc/llvm-project?view=revision&revision=184459

Unless PNaCl has modified the places where parts of that pass have been moved to, as mentioned in that commit?

@DiamondLovesYou

Contributor Author

commented Sep 30, 2015

@alexcrichton

> So if I understand things correctly, there's one giant LLVM module which contains the bitcode for libc, libm, and the application in question. This module is then optimized and such and is passed through the pnacl backend. The definitions of libm functions are all optimized away because they're not used, and then after this happens the intrinsics are lowered to calls to those libm functions, resulting in the undefined symbols?
>
> Is that what's going on?

No, the problem is that LLVM's math intrinsics create implicit dependencies on libm, but the linker isn't aware of this fact, and as a result omits libm (or rather, omits the objects that define none of the still-undefined symbols) from the resulting module. Then, post-link, when the PNaCl IR passes and optimizations are run, the llvm.*.* intrinsics have nothing that can be used in their place.

Normally this isn't an issue because codegen will rewrite these to libm as needed, thus allowing the linker to know about references to them. However, PNaCl/JS use bitcode linking, which doesn't have calls to llvm.*.* rewritten to libm.

I apologize if I'm making this confusing, as this area seems to have recurring misconceptions.

Thankfully, I think my first proposal isn't as bad as I had initially feared (I'm dumb): generally, the implicit libm dep should just be made explicit, because this would also be an issue for any other target if libc is statically linked for WPO (in the same way it's an issue for PNaCl). After that's fixed, creating a PNaCl IR pass to rewrite the math intrinsics won't be an issue. Once I get time, I'll send a patch to upstream LLVM.

@kripken No the file was just moved to a different folder in that commit. See: https://github.com/llvm-mirror/llvm/blob/master/lib/Transforms/Utils/SimplifyLibCalls.cpp

It probably should be disabled, though it probably only creates issues in extreme corner cases. See this comment from pnacl-opt.py from the NaCl SDK:

  # We disable the LLVM simplify-libcalls pass by default, since we
  # statically link in libc, and pnacl-opt is typically used for post-link
  # optimizations.  Changing one library call to another can lead
  # to undefined symbol errors since which definitions from libc are linked
  # in is already decided.
@kripken

commented Oct 1, 2015

@DiamondLovesYou : thanks, interesting. I looked into this for emscripten, and it looks like it already isn't run during LTO anyhow (and it's fine to run otherwise). We mostly just do -std-link-opts for LTO, and those don't run simplify-libcalls, perhaps for this very reason. I guess PNaCl runs more optimizations than just -std-link-opts during LTO/their final build stage?

@DiamondLovesYou

Contributor Author

commented Oct 1, 2015

@kripken Yeah, PNaCl runs -On in addition to LTO (for even better optimization opportunities).

W.r.t. the intrinsic/libm thingy: after doing some digging in LLVM, it seems LLVM already ensures libm is available at codegen time by emitting references to libm functions. BUT this is only done if there's a target machine available. Le sigh. I may end up creating a dummy target machine and lowering etc for PNaCl.

@DiamondLovesYou

Contributor Author

commented Oct 1, 2015

@kripken In case you're interested: I've created a proposal for a PNaCl target machine here based on the discussion in this PR.

@bors

Contributor

commented Oct 2, 2015

☔️ The latest upstream changes (presumably #28768) made this pull request unmergeable. Please resolve the merge conflicts.

@kripken

commented Oct 2, 2015

@DiamondLovesYou : thanks for the info.

@alexcrichton

Member

commented Oct 5, 2015

@DiamondLovesYou your proposal on the pnacl mailing list seems promising! Perhaps this should hold off for a resolution on that?

@DiamondLovesYou

Contributor Author

commented Oct 16, 2015

@alexcrichton Sorry, didn't see your reply in my inbox!

There's no need to wait because the requisite changes for pnacl-llvm can/must be done independently of Rust anyway. Not that this PR in its current form should be accepted, merge conflicts notwithstanding. I'll try to make time this weekend to remove the intrinsic and bitcode output logic as well as merge with master.

@kripken FYI, I've sent in a patch for the target machine here: https://codereview.chromium.org/1395453003

DiamondLovesYou force-pushed the DiamondLovesYou:pnacl-librustc-trans branch 2 times, most recently from 4470b17 to 1091e72 Oct 18, 2015

@DiamondLovesYou

Contributor Author

commented Oct 19, 2015

Okay, I've cleaned up/removed the last of the extra logic. This should be ready to merge!

@@ -838,7 +838,9 @@ fn declare_intrinsic(ccx: &CrateContext, key: &str) -> Option<ValueRef> {

ifn!("llvm.trap", fn() -> void);
ifn!("llvm.debugtrap", fn() -> void);
ifn!("llvm.frameaddress", fn(t_i32) -> i8p);
if !ccx.sess().target.target.options.is_like_pnacl {

@alexcrichton

alexcrichton Oct 19, 2015

Member

It looks like this is the only use case of is_like_pnacl, and it also looks like llvm.frameaddress isn't used at all, so perhaps this could just be removed entirely?

@DiamondLovesYou

DiamondLovesYou Oct 22, 2015

Author Contributor

Fine by me; removed.

DiamondLovesYou force-pushed the DiamondLovesYou:pnacl-librustc-trans branch from 1091e72 to e497d4a Oct 22, 2015

@alexcrichton

Member

commented Oct 22, 2015

@bors: r+ e497d4a

Thanks for the patience @DiamondLovesYou!

bors added a commit that referenced this pull request Oct 22, 2015

@bors

Contributor

commented Oct 22, 2015

⌛️ Testing commit e497d4a with merge e7b2052...

bors merged commit e497d4a into rust-lang:master Oct 22, 2015

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
homu Test successful
@DiamondLovesYou

Contributor Author

commented Oct 25, 2015

The first vector type legalizer pass PR is sent: https://codereview.chromium.org/1423873002.

5 participants