Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make all toolchains standalone toolchains #780

Closed
DanAlbert opened this issue Aug 28, 2018 · 11 comments

Comments

@DanAlbert
Copy link
Collaborator

commented Aug 28, 2018

This has been discussed several times, but we haven't had a bug filed for this yet. Rather lengthy since I wanted to explain our thinking on some of these decisions since I suspect there will be many questions. Don't hesitate to chime in if something in here isn't clear and speak up if you have specific concerns. I have a prototype of this that passes all of the NDK's tests so it has at least proven feasible, but I'd be very interested in hearing about any specific issues people think need to be addressed that I've missed.


Overview

Properly compiling and linking code for Android is much harder than it needs to be. Compilers must be invoked with a specific set of flags that may change over time, defaults change, and if not using a build system that supports the NDK you need to use a standalone toolchain, which many people don't seem to discover.

Instead, we can hoist most of this logic into the compiler driver (now that we have only one compiler driver to support). The driver can predict the location of the sysroot relative to its own location, and we can install binutils alongside clang so it is automatically picked up as well.

We can also do the same for the libc++ headers and libraries, though since the compiler will always link the STL after all of the user's specified libraries it is not possible to avoid issues like #379 when the dependencies are broken (if the dependencies are all correct everything works fine, but there are a lot of broken prebuilts out there that ndk-build currently has no problems with, though cmake and standalone toolchains do). There is not much that can be done about this without having the compiler driver (or the linker) re-order the user's inputs, which is not something any compiler/linker I know of does.

The various default build/linker flags Android uses ought to also be part of the clang driver. This includes things like -mfloat-abi=softfp for arm32 (this might actually be the case for armv7-linux-androideabi already, need to check). This also includes things like -Wl,--warn-shared-textrel, since code that would emit this warning will not load on Android.

With all of the above taken care of, we (and more importantly, third-party build system maintainers) can remove a bunch of code the is duplicated throughout the various systems. The make_standalone_toolchain.py script becomes obsolete, since invoking $NDK/toolchain/bin/clang++ -target $TARGET will work out of the box. We'll keep the script around since there's no sense in breaking build scripts people have that use this, but it will be more or less a no-op. We should also look in to installing aarch64-linux-android-clang++ and co to the toolchain directory that default to the appropriate target, since autoconf systems mostly are not aware of the need for -target afaik.

Migration

Making these changes does necessitate making some changes to the layout of the NDK. Clang (and GCC, FWIW, not that we use that any more) already had logic to find binutils/sysroots/etc based on its own install directory. All we need to do to take advantage of it is to:

  • Fix Clang to search Android directories for Android rather than Linux.
  • Teach Clang to search target-version-specific library directories.
  • Install things in the proper locations.

Getting Clang to use the various NDK tool/library paths as-is would have been a much more invasive change to Clang and prevented us from sharing as many code paths with the other Clang targets, making it much more fragile and less likely to be accepted (the NDK's layout is very inconsistent currently; just explaining things like "well the arm binutils is in arm-linux-androideabi-4.9, even though it's not GCC 4.9, and not even GCC for that matter, btw the x86 one is just x86-4.9 rather than being triple-prefixed for some reason" seems like it would make these changes difficult to upstream).

As such we have a few options to consider as to how we approach this problem logistically:

  1. Hard break with the past. Move binutils/libraries to the new paths. No build system can use r19 until it adapts to it.

We're not going to do this. It's an option, but the only benefit is that it doesn't increase the size of the NDK, but it does make for a ton of migration pain.

  1. Install two copies of the toolchain. One being the unmodified old layout, the other being the unified toolchain in $NDK/toolchain (or something).

This has the advantage of having the easier migration story. We'll keep both around until we can remove the old one without causing undue pain (in practice this means "until a gradle plugin supporting the new form is available in the stable channel", at which point most other build systems that try to keep up to date have also been fixed). The disadvantage is that it roughly doubles the size of the NDK. I don't have exact numbers right now, but the new layout doesn't need to have an identical libc.a copied into each API level (itself an affordance for old build systems) or duplicate things like make/python. It's still probably a significant size increase.

  1. Install two copies of the toolchain, but binaries in one would be shell/batch scripts that point to the other.

This saves us some space ($NDK/toolchains is about 1GB in r17), but we still need to duplicate platforms and sysroot (about 500MB in r17) as well as libc++ and co (fairly small). The downside is that it will have some rough edges on Windows (our most important host OS, unfortunately). Clang used to require -fuse-ld=gold.exe (with an extension) on Windows, and anyone that has that would need to change their build files to -fuse-ld=gold.cmd (or -fuse-ld=gold, since that works now, either way, the user needs to make a change, not just the build system maintainer). If the new toolchain is instead the source of truth rather than the old binutils location, any scripts that call any of the binaries in that directory (ar and strip being the most likely; the latter would probably break gradle) would need to switch from calling ar.exe and instead call ar.cmd.

There's an additional downside here in that using scripts to save space comes at a build-time cost. On Linux/Darwin the cost of fork is fairly trivial, but CreateProcess on Windows has proven to be a fairly tight bottleneck in other parts of Android builds.

I personally prefer option 2 since it has the fewest adoption speedbumps, but ~50% install size increase compared to option 3 is not insignificant. I don't think it's worth it, but I could be convinced otherwise if people feel strongly about it (though do remember that the increase is temporary; in a few releases it will be back to its pre-r19 size, plus or minus any unrelated changes).

Implementation

  • Upstream changes to clang to allow us to use the built-in logic for finding binutils/sysroots.
  • Pull that new clang in to Android.
  • Update the NDK to use that Clang. Review
  • Install the new toolchain using either method 2 or 3 above. Review
  • Update CMake toolchain file to use the new layout. Review
  • Update the libc++ test runner to use the new layout. Review
  • Update make_standalone_toolchain.py to use the new layout. Review
  • Update ndk-build to use the new layout. Review
  • Write a guide for third-party build system implementers: Review #125.
  • Install triple-prefixed wrappers for clang that automatically select the right target/API level. Review

Lifting more default flags (Android-wide requirements, not best practices, probably) into the driver is being tracked as #812.

@DanAlbert DanAlbert added this to the r19 milestone Aug 28, 2018
@DanAlbert DanAlbert self-assigned this Aug 28, 2018
@Zingam

This comment has been minimized.

Copy link

commented Aug 29, 2018

Why don't you instead have two distributions the first one with the new layout and the second one with the old + a scary message that that is the deprecated way :)

"Update CMake to use the new layout."

Updated CMake in AS too?

@DanAlbert

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 29, 2018

Why don't you instead have two distributions the first one with the new layout and the second one with the old + a scary message that that is the deprecated way :)

An interesting idea, but in practice I think this is roughly equivalent to option 1. Instead of "do you have r18 or r19?", the question for each build system becomes, "do you have r18, r19 type 1, or r19 type 2?" (and then stretch that question out until the transition period closes and we go back to having only one distribution). It wouldn't decrease the disruption faced by users, but it would exacerbate the disruption for build system maintainers. I don't think that is worth saving 1GB of install size.

It also would require us maintaining duplicate copies of the CMake toolchain and ndk-build throughout the transition (ditto for any third-party build system). That means more bugs, as well as more unaddressed bugs.

Updated CMake in AS too?

Different team, but I believe they're working on it. To be clear, what I actually meant was update our CMake toolchain file, not the builtin cmake implementation. That is a different bug that will be dealt with after this one.

@DanAlbert

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 29, 2018

Drafts for most of the above todo items are up (edited the original post). They all hinge on us getting a clang update that isn't available yet though, so can't submit until then.

@jmgao

This comment has been minimized.

Copy link

commented Aug 29, 2018

If the new toolchain is instead the source of truth rather than the old binutils location, any scripts that call any of the binaries in that directory (ar and strip being the most likely; the latter would probably break gradle) would need to switch from calling ar.exe and instead call ar.cmd.

Why can't we just make stub binaries instead of stub batch scripts? We can have a dozen copies of the same executable that just executes the correct one. There's three architectures to support, but mac and linux are the same, and windows is not that different. It'd be like 5 lines of C++, so it's not like it'd be that bad even if they were completely alien.

@DanAlbert

This comment has been minimized.

Copy link
Collaborator Author

commented Aug 29, 2018

That was something I was considering too. It saves us some space at the cost of additional overhead from CreateProcess on Windows (MSDN says that their exec still uses CreateProcess under the covers). Might be worth it though.

@jmgao

This comment has been minimized.

Copy link

commented Aug 29, 2018

Yeah, there's also a tiny bit of extra complication compared to linux/mac, since we probably also need to wait for our child to finish or else our parent might return early, but that's a whole extra line of code.

@rprichard

This comment has been minimized.

Copy link
Collaborator

commented Aug 29, 2018

The MSVCRT/Mingw emulation of exec isn't very good on Windows:

Shell scripts seem fine on Unix. I think most of the trouble from stubs is on Windows, right?

Issues with batch files I can think of:

  • To run a program with CreateProcess, the .exe or .com extension can be omitted, but the .bat or .cmd extensions are required. The bat/cmd extension can be omitted when using cmd.exe/system() or with exec.
  • Detecting the existence of a program always requires the extension, even with exe.
  • unusual argument parsing (cmd has its own special characters like ^ and %)
  • Command-line length limit of 8KB. CreateProcess can do 32KB.
  • Ctrl-C brings up the "Terminate batch job (Y/N)?" prompt.

FWIW: A strip batch file does break Gradle:

Execution failed for task ':app:transformNativeLibsWithStripDebugSymbolForDebug'.
> A problem occurred starting process 'command 'C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip''

procmon:

3:18:01.9647839 PM  java.exe    10992   IRP_MJ_DIRECTORY_CONTROL    C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip  NO SUCH FILE    Type: QueryDirectory, Filter: arm-linux-androideabi-strip
3:18:01.9650473 PM  java.exe    10992   FASTIO_NETWORK_QUERY_OPEN   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip  FAST IO DISALLOWED  
3:18:01.9653132 PM  java.exe    10992   IRP_MJ_CREATE   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip  NAME NOT FOUND  Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a
3:18:01.9738744 PM  java.exe    10992   FASTIO_NETWORK_QUERY_OPEN   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip  FAST IO DISALLOWED  
3:18:01.9740380 PM  java.exe    10992   IRP_MJ_CREATE   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip  NAME NOT FOUND  Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a
3:18:01.9903201 PM  java.exe    10992   FASTIO_NETWORK_QUERY_OPEN   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip  FAST IO DISALLOWED  
3:18:01.9904707 PM  java.exe    10992   IRP_MJ_CREATE   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip  NAME NOT FOUND  Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a
3:18:01.9906639 PM  java.exe    10992   FASTIO_NETWORK_QUERY_OPEN   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip.exe  FAST IO DISALLOWED  
3:18:01.9908409 PM  java.exe    10992   IRP_MJ_CREATE   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip.exe  NAME NOT FOUND  Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a
3:18:01.9910102 PM  java.exe    10992   FASTIO_NETWORK_QUERY_OPEN   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded FAST IO DISALLOWED  
3:18:01.9911661 PM  java.exe    10992   IRP_MJ_CREATE   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded NAME NOT FOUND  Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a
3:18:01.9913875 PM  java.exe    10992   FASTIO_NETWORK_QUERY_OPEN   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded.exe FAST IO DISALLOWED  
3:18:01.9915391 PM  java.exe    10992   IRP_MJ_CREATE   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded.exe NAME NOT FOUND  Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a
3:18:01.9917387 PM  java.exe    10992   FASTIO_NETWORK_QUERY_OPEN   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded -o  FAST IO DISALLOWED  
3:18:01.9918808 PM  java.exe    10992   IRP_MJ_CREATE   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded -o  NAME NOT FOUND  Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a
3:18:01.9920487 PM  java.exe    10992   FASTIO_NETWORK_QUERY_OPEN   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded -o.exe  FAST IO DISALLOWED  
3:18:01.9921819 PM  java.exe    10992   IRP_MJ_CREATE   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded -o.exe  NAME NOT FOUND  Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a
3:18:01.9923075 PM  java.exe    10992   FASTIO_NETWORK_QUERY_OPEN   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded -o C:\src\StripTest\app\build\intermediates\transforms\stripDebugSymbol\debug\0\lib\armeabi-v7a\libnative-lib.so    FAST IO DISALLOWED  
3:18:01.9923917 PM  java.exe    10992   IRP_MJ_CREATE   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded -o C:\src\StripTest\app\build\intermediates\transforms\stripDebugSymbol\debug\0\lib\armeabi-v7a\libnative-lib.so    NAME INVALID    Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a
3:18:01.9925695 PM  java.exe    10992   FASTIO_NETWORK_QUERY_OPEN   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded -o C:\src\StripTest\app\build\intermediates\transforms\stripDebugSymbol\debug\0\lib\armeabi-v7a\libnative-lib.so.exe    FAST IO DISALLOWED  
3:18:01.9926626 PM  java.exe    10992   IRP_MJ_CREATE   C:\src\ndk-hacked\toolchains\arm-linux-androideabi-4.9\prebuilt\windows-x86_64\bin\arm-linux-androideabi-strip --strip-unneeded -o C:\src\StripTest\app\build\intermediates\transforms\stripDebugSymbol\debug\0\lib\armeabi-v7a\libnative-lib.so.exe    NAME INVALID    Desired Access: Read Attributes, Disposition: Open, Options: Open Reparse Point, Attributes: n/a, ShareMode: Read, Write, Delete, AllocationSize: n/a

I suspect -fuse-ld=lld won't invoke ld.lld.cmd, but a clang stub would run the real clang.exe, so it would probably find ld.lld.exe anyway, I think?

@jmgao

This comment has been minimized.

Copy link

commented Aug 30, 2018

This saves us some space ($NDK/toolchains is about 1GB in r17), but we still need to duplicate platforms and sysroot (about 500MB in r17) as well as libc++ and co (fairly small).

I think the binaries could be linker scripts pointing to the right place, and the headers are probably small enough to not matter. (or they could be "#include "../../../../", if the size of the headers is actually a problem)

halx99 pushed a commit to halx99/android-ndk that referenced this issue Sep 14, 2018
Bug: android/ndk#780
Test: N/A
Change-Id: I9730c2b74183910de456de375137b86f161a24b9
halx99 pushed a commit to halx99/android-ndk that referenced this issue Sep 14, 2018
Install a copy (we can't remove the old paths until gradle has adapted
to the new layout) of the toolchain/sysroot/etc to $NDK/toolchain such
that it can be invoked without the build system needing to juggle
--sysroot/-gcc-toolchain/etc.

This follows the second approach described on
android/ndk#780. That is it installs a
full copy of the toolchain rather than installing scripts to refer to
the old toolchain during the transition period.

Test: ./checkbuild.py && ./run_tests.py
Bug: android/ndk#780
Change-Id: I8d8f10ea6c95378937ac54e440c5e1a9cb3bbeaa
@DanAlbert

This comment has been minimized.

Copy link
Collaborator Author

commented Oct 10, 2018

I've filed #812 to track enumerating and lifting more required flags into the Clang driver, but the parts of this feature that will be included in r19 are done.

@wang-bin

This comment has been minimized.

Copy link

commented Nov 8, 2018

What about symbolic links for windows?

@DanAlbert

This comment has been minimized.

Copy link
Collaborator Author

commented Nov 8, 2018

Nothing to be done about it in the short term since the SDK manager doesn't handle symlinks even on Linux.

yan12125 added a commit to yan12125/python3-android that referenced this issue Dec 8, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.