Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARM ABI #14194

Merged
merged 3 commits into from
Jan 18, 2016
Merged

ARM ABI #14194

merged 3 commits into from
Jan 18, 2016

Conversation

maleadt
Copy link
Member

@maleadt maleadt commented Nov 30, 2015

Hi all,

I wanted to run Julia on my Raspberry Pi 2, so I have taken a stab at implementing an ARM ABI interface for Julia. This fixes the following issues: #13508, #12355, #12347, #12343, and partially fixes #12344 (the parallel test suite doesn't pass yet, but the segfaults are fixed).

Although I'm happy with how most of Julia now works on my ARM-based computer, I don't think that implementing ABI's in this ad-hoc fashion is really future-proof. I'm sure there's missing features (not caught by the current test suite), and I'm not entirely convinced by the current ABI interface (needPassByRef, use_sret, preferred_llvm_type etc). Maybe we should look into using llvm-abi (announcement thread), a library for generating ABI-compliant LLVM IR?

Some questions/uncertainties:

  • I've added checks on EABI (AAPCS) and VFP procedure calling, currently #erroring out if not found. I'm not sure if this is appropriate, but the ABI currently relies on this.
  • I'm currently casting heterogeneous aggregates to integer arrays for alignment purposes, because this is what Clang does. On x86 however, Clang uses the align attribute. I don't think this is easy to add without messing with ccall, but it might be a neater way of aligning aggregates.

@tkelman tkelman added the system:arm ARMv7 and AArch64 label Nov 30, 2015
// TODO: If the argument is an integral Fundamental Data Type that is
// smaller than a word, then it is zero- or sign-extended to a full word and
// its size is set to 4 bytes. If the argument is a Half-precision Floating
// Point Type it is converted to Single Precision.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can return T_int32 and the callee of this function will zero/sign extend to accommodate

@vtjnash
Copy link
Member

vtjnash commented Dec 4, 2015

Maybe we should look into using llvm-abi (announcement thread), a library for generating ABI-compliant LLVM IR?

i had seen that announcement thread, but that library doesn't actually have a very good or complete ABI implementation and yet contains considerably more code.

On x86 however, Clang uses the align attribute

afaics, this is only meaningful when applied to byval arguments. if not specified, it defaults to the platform-default value, which should be perfectly suitable for our requirements.

I'm currently casting heterogeneous aggregates to integer arrays for alignment purposes, because this is what Clang does.

yep, it looks like clang does the following:

define void @test_4(%struct.struct4* noalias sret %agg.result, [3 x i32] %a.coerce) #1 {
  %a = alloca %struct.struct4, align 4
  %1 = bitcast %struct.struct4* %a to [3 x i32]*
  store [3 x i32] %a.coerce, [3 x i32]* %1, align 1

@maleadt
Copy link
Member Author

maleadt commented Dec 7, 2015

Thanks for the feedback, @vtjnash, I'll be looking at it. Seems like there's some other places I'm dropping the ball (not generating srets where Clang does, and seemingly lucking out). I'm going to have a look at some improved testing, maybe by direct comparison against Clang's function signatures instead of actually executing code and hoping it crashes.

@maleadt maleadt closed this Dec 8, 2015
@maleadt maleadt deleted the arm_abi branch December 8, 2015 12:58
@ViralBShah
Copy link
Member

Would have been a good idea to keep this open, and force push to this branch. That way the discussion around the ARM ABI could have continued here. Anyways, it can always a new PR.

@maleadt
Copy link
Member Author

maleadt commented Dec 8, 2015

Huh, the delete & close was actually an unintended artifact of a git push --mirror from a host which didn't have the arm_abi branch... Let's see if I can fix this up.

EDIT: done. I'll try not to delete this branch in the future :-)

@maleadt maleadt reopened this Dec 8, 2015
@ViralBShah
Copy link
Member

Any updates on this one to get it to a state where we can merge it? I have been salivating on this PR for a few days and hoping we can even backport it as well.

@JeffBezanson
Copy link
Member

+1 This is an awesome piece of work. Would be great to bring it to conclusion.

@maleadt
Copy link
Member Author

maleadt commented Dec 29, 2015

Sorry, haven't had the time to work on this, and I won't for the next few days either. I'll look at it next week, but if anyone is impatient enough to pick this up, be my guest :-)

@maleadt maleadt changed the title ARM ABI RFC: ARM ABI Jan 11, 2016
@maleadt
Copy link
Member Author

maleadt commented Jan 11, 2016

As promised, here is a reworked version of this PR. The current version is much more thorough, follows the docs better, and aligns with @yuyichao's recent AArch64 work.

It successfully passes my extended version of the ccall test suite (see PR #14215), except the Int128 test passing an Int128 to a C function accepting a struct { int64_t, int64_t }. I'm not sure how to make sense of that.

@vtjnash isHomogeneous currently still works on Julia types, will fix tomorrow. I just wanted to get some eyes on this already.

//===----------------------------------------------------------------------===//

// LIMITATIONS
// - doesn't respect register usage, these should be encoded in the ABI state
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is handled by llvm and it should handle it well as long as we emit the correct lowered type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right, I had thought the split register/stack mode would need to be generated by the front-end but it seems part of ARMCallingConv.td.

@yuyichao
Copy link
Contributor

It successfully passes my extended version of the ccall test suite (see PR #14215), except the Int128 test passing an Int128 to a C function accepting a struct { int64_t, int64_t }. I'm not sure how to make sense of that.

I haven't fully read the AArch32 calling convension doc yet but I do remember __int128 is not a machine type on it (on the other hand it is on AArch64) so the frontend (us) shouldn't emit such types for calling c functions. (i.e. you should probably translate i128 into [2 x i64], this is what I do for bitstypes of invalid (no C correspondence) sizes for AArch64).

That test is technically wrong for AArch64 since struct {int64_t, int64_t} and __int128 has different alignment requirement. It is fine for now though since there's no practical differences when passing as the first/only parameter.

@maleadt
Copy link
Member Author

maleadt commented Jan 12, 2016

Decomposing the Int128 into [2 x i64] turned out slightly more complex, as in this case it needed to be sret too and Julia didn't honour the ABI's preferred type in case of sret.

@maleadt
Copy link
Member Author

maleadt commented Jan 12, 2016

This should be pretty feature complete now. I guess the homogeneous aggregate detection could be shared between ABI's, like Clang does it, but I don't really feel like tackling that right now.

@ViralBShah
Copy link
Member

Do the commits need squashing? While @vtjnash should review and merge the PR, I am guessing that tests are not necessary here since the failing tests that this PR will fix are sufficient to catch future regressions.

@maleadt
Copy link
Member Author

maleadt commented Jan 13, 2016

Yeah, those tests + those from #14215 should cover the ABI sufficiently. Maybe there should be a CI bot for ARM too? Anyway, I'll squash/rebase/remove the RFC tag tomorrow morning.

@ViralBShah
Copy link
Member

On Travis, one has to compile with a cross compiling toolchain and then test in qemu, which is probably really slow. There is a plan to at least set up buildbots that can run every night on a scaleway machine. @staticfloat almost has this going, but currently giving armv7 as the option to -march is failing.


void needPassByRef(AbiState *state,jl_value_t *ty, bool *byRef, bool *inReg)
{
// B.1 (TODO)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is irrelevant. It doesn't really exist in C and it roughly correspond to our Any which is indeed passing by pointer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.


base = NULL;
for (size_t i = 0; i < members; ++i) {
Type *T = isLegalHAType((jl_datatype_t*)jl_field_type(dt,i));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I believe the members counting is wrong for recursive types.

@maleadt
Copy link
Member Author

maleadt commented Jan 18, 2016

@yuyichao I addressed your comments.

Other pending reviews? If not, is this good to go?

@yuyichao
Copy link
Contributor

LGTM =)

@vtjnash Any comments? If not I'll merge later today.

yuyichao added a commit that referenced this pull request Jan 18, 2016
@yuyichao yuyichao merged commit 3f6161a into JuliaLang:master Jan 18, 2016
@ViralBShah
Copy link
Member

Bumping again to see if we can backport to 0.4. Looks like it shouldn't be a difficult backport and fairly self-contained.

@ViralBShah
Copy link
Member

@tkelman As you had mentioned to me, the codegen restructure makes it difficult to have a clean backport to 0.4. Perhaps ARM should just wait for 0.5.

@tkelman
Copy link
Contributor

tkelman commented Jan 25, 2016

I thought I phrased it the other way around - anything that depends on the codegen rewrite will obviously be tough to backport on its own, but most of the currently backport-tagged arm-related PR's look more or less standalone? Let's see what ends up being easy to backport for 0.4.4 and how many arm bugs we can fix there without spending too much effort.

@ViralBShah
Copy link
Member

I actually have backported pretty much all of them. There are compile errors in the arm ABI file, because they refer to things that were not in 0.4.

@tkelman
Copy link
Contributor

tkelman commented Jan 25, 2016

You introduced a bunch of CI failures. Could you have gone through a PR?

@ViralBShah
Copy link
Member

I never pushed I thought.

@ViralBShah
Copy link
Member

At least I didn't mean to, but might have done it out of habit, when on a branch - except this was the release branch. Can we get rid of them right away?

@tkelman
Copy link
Contributor

tkelman commented Jan 25, 2016

Revert the commits that are broken. The release branch is protected so force pushing is disabled. Does release-0.4 compile on arm with llvm 3.7.1? If the new ABI doesn't compile on ARM without the codegen rewrite then that's a regression w.r.t. 0.4.3 so that should probably also be reverted.

@ViralBShah
Copy link
Member

I have reverted locally the arm abi patches. I think the rest should be ok, and this will give a fully functioning build, with llvm 3.7.1. Once I do that successfully, I can push the reverts.

@tkelman
Copy link
Contributor

tkelman commented Jan 25, 2016

See my commit comments. d93b819 and dfba26b aren't correct as-is for backporting. And e8c44f6 needs to be adjusted.

edit: should really be testing from a clean build here, since you're picking up uncommitted files

tkelman pushed a commit that referenced this pull request Mar 27, 2016
(cherry picked from commit 0eae5e5)
ref #14194
tkelman pushed a commit that referenced this pull request Mar 27, 2016
(cherry picked from commit 0eae5e5)
ref #14194
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
system:arm ARMv7 and AArch64
Projects
None yet
Development

Successfully merging this pull request may close these issues.

parallel test crashing on arm (perhaps when there is a firewall)
6 participants