-
Notifications
You must be signed in to change notification settings - Fork 10.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Port to Android #1442
Port to Android #1442
Conversation
if result != 0 { | ||
preconditionFailure("execve() failed. errno: \(errno)") | ||
} | ||
_exit(126) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this has different semantics to posix_spawn
. posix_spawn
will give you a proper errno
value if the spawn failed, this just exits the child process. I.e. spawning ["/bin/zsh", "-c", "exit 126;"]
will be indistinguishable for the caller, right? It will trigger the preconditionFailure
, no idea what that actually means.
I'd propose to open a separate pipe, called child2parent
or so which will be fcntl
d to FD_CLOEXEC
. That means if execve
succeeds, it will be closed automatically and that means the parent will read EOF. If execve
fails, we just write execve
's errno
to that pipe so the parent can read it and learns what went wrong in the child. So the parent waits until it can read, if it reads EOF (read
returns 0) it knows, the child spawned successfully, if it reads anything else, it knows that this is execve
's errno
value and can return that to the caller. Does that make sense?
Here some pseudo-C code which will hopefully make clear what I mean:
[...]
} else if (pid == 0) {
int err = fcntl(child2parent[1], F_SETFD, FD_CLOEXEC); /* This fd will be closed on exec */
int errno_save = errno;
if (err) {
preconditionFailure("fcntl failed, errno: %d", errno_save);
abort();
}
execve(...);
errno_save = errno;
/* execve returned, this is an error that we need to communicate back to the parent */
ssize_t suc_write = write(child2parent[1], &errno_save, sizeof(typeof(errno)));
errno_save = errno;
if (suc_write > 0 && suc_write < sizeof(typeof(errno)));
/* implement write retry ;) */
} else if (suc_write == 0) {
/* preconditionFailure("EOF?!"); */
abort();
} else if (suc_write < 0) {
preconditionFailure("write failed, errno %d", errno_save);
abort();
}
close(child2parent[1]);
}
[...]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@modocache This great comment by @weissi is still valid, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally getting around to this, sorry for the wait. @weissi, could you explain child2parent
in greater detail? Would the child process file descriptors be duped to write to the parent pipe in some way? I think I'm missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't worry about the wait, great work on this PR! So the reason we need child2parent
is to communicate from child to parent.
There's basically two possible outcomes
- everything went fine
- there was an error in the child (for example file to execute not found, no permission or what not)
In both cases, it's crucial that the child can tell the parent. The easiest way on UNIX is to just open a pipe (child2parent
) in the parent before the fork. Because file descriptors are forked as well, we will have a communication channel from the child to the parent after the fork. child2parent[0]
is the reading end, so that's the file descriptor the parent keeps open. And child2parent[1]
is the writing end, that's the fd the child keeps open. In other words, if the child writes something to child2parent[1]
, the parent will receive it on child2parent[0]
. So far so good, now we have a communication facility.
If there was an error execve
ing in the child, we can just write the errno
value that it got into child2parent[1]
and the parent will know what exactly went wrong. For example if the file to be executed wasn't found, a ENOENT
will be transmitted and the parent can fail appropriately.
The seemingly easier case is actually the harder: What to do if there was no error in the child, i.e. the child executed another process. Then execve
succeeds and we replaced the process with some other binary. The file descriptor child2parent[1]
will now still be open so we could put something in there which tells the parent that everything went fine. However, the program we executed probably doesn't know it's supposed to do that. The standard trick to get around this issue is to set the file descriptor child2parent[1]
as FD_CLOEXEC
. If FD_CLOEXEC
is set, the file descriptor is closed by the OS kernel automatically if execve
succeeds. That's perfect for us: If child2parent[1]
is closed without anything ever being written to the pipe, the parent can now learn that execve
has succeeded which is exactly the missing big.
So the logic in the parent will be
- try to read on
parent2child[0]
2a. if read returns0
(meaning EOF), the parent knows everything went fine, the child successfully executed another program
2b. if read returns some bytes, we know that something went wrong and the value that we read is theerrno
of the child that it got when trying to execute the other program. That's awesome as we now know in the parent what went wrong in the child :).
Hope this makes any sense, if not or if there are any questions left, feel free to ask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @weissi, your explanation was incredibly helpful! Thanks for so clearly explaining the problem, it made writing the code much easier. I have a working implementation that I will amend to this commit soon; I'm running the Android test suite from #1714 to confirm it all still works (looks good so far!).
(For those that are interested, there are seven failing tests when the stdlib is compiled for Android. Most are related to assumptions in the test expectations themselves.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I've amended changes as per your comments, @weissi! Again, thanks a ton!
First off, thank you for this awesome work! My personal rule is: if you make sense to be between those two changes (ie. have one but not the other), put them in separate commits. You should obviously order those commits in a way that ensures you don't use features before adding them. For instance, the last commit should be the one adding the new target to |
@@ -0,0 +1,215 @@ | |||
//===----------------------------------------------------------------------===// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't duplicate these files. Can we find some way of reusing the Glibc overlay here, adding some #if
s into its source?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll take another look at stdlib/public/Glibc/CMakeLists.txt
. I think this should be possible. The difficult part is that since we're building for both Linux and Android at once, I'll need to use variables other than CMAKE_SYSTEM_NAME
to determine which modulemap to copy. I'll try working on it some more, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gribozavr I've been trying to combine stdlib/public/Bionic
and stdlib/public/Glibc
.
stdlib/public/Glibc/CMakeLists.txt
uses the host system name to either use a Linux or FreeBSD modulemap. I figure I can do something similar to use either a Linux or Android modulemap--but instead of using the host system name (which is always Linux when cross-compiling from Linux to Android), I produce a modulemap for each requested SDK and architecture:
# Unabbreviated source code here: https://gist.github.com/modocache/08e7aefd6c220d7750ac
foreach(SDK ${SWIFT_SDKS})
foreach(arch ${SWIFT_SDK_${SDK}_ARCHITECTURES})
string(TOLOWER "${SDK}" sdk)
set(output_dir "${SWIFTLIB_DIR}/${sdk}/${arch}/glibc")
# ...
# Configure the modulemap based on the target. Each platform needs to
# reference different headers, based on what's available in their glibc.
set(modulemap_path "${CMAKE_CURRENT_BINARY_DIR}/${sdk}/${arch}/module.map")
if("${SDK}" STREQUAL "FREEBSD")
configure_file(module.freebsd.map.in "${modulemap_path}" @ONLY)
elseif("${SDK}" STREQUAL "ANDROID")
configure_file(module.android.map.in "${modulemap_path}" @ONLY)
else()
configure_file(module.map.in "${modulemap_path}" @ONLY)
endif()
# Create the lib/swift SDK/arch subdirectory and copy the configured
# modulemap there.
add_custom_command_target(unused_var
COMMAND
"${CMAKE_COMMAND}" "-E" "make_directory" "${output_dir}"
COMMAND
"${CMAKE_COMMAND}" "-E" "copy_if_different"
"${modulemap_path}"
"${output_dir}/module.map"
CUSTOM_TARGET_NAME "copy_${sdk}_${arch}_glibc_module"
OUTPUT "${output_dir}/module.map" "${output_dir}"
DEPENDS "${modulemap_path}"
COMMENT "Copying Glibc module to ${output_dir}")
# Install the modulemap. When compiling Glibc, make sure
# we use the correct module map for the current SDK.
swift_install_in_component(stdlib
FILES "${output_dir}/module.map"
DESTINATION "${output_dir}")
add_swift_library(swiftGlibc IS_SDK_OVERLAY
${sources}
FILE_DEPENDS "copy_${sdk}_${arch}_glibc_module" "${output_dir}"
TARGET_SDKS "${SDK}"
SWIFT_COMPILE_FLAGS "-Xcc" "-fmodule-map-file=${modulemap_path}"
INSTALL_IN_COMPONENT stdlib-experimental)
endforeach()
endforeach()
The problem with this approach is that swiftc
appears to expect that a Glibc modulemap exists at the following path:
/path/to/built/swift-linux-x86_64/lib/swift/glibc/module.map
Whereas the above CMake file places the module maps at two different paths, in order to avoid them overwriting one another:
/path/to/built/swift-linux-x86_64/lib/swift/linux/x86_64/glibc/module.map
/path/to/built/swift-linux-x86_64/lib/swift/android/armv7/glibc/module.map
This causes the following failure to occur when running tests that import Glibc
:
<unknown>:0: error: missing required module 'SwiftGlibc'
Placing either one of these modulemaps at the expected path lib/swift/glibc/module.map
gets the tests to pass for that target. That is, copying the Linux modulemap ensures the Linux build succeeds (and Android fails), and copying the Android modulemap causes Android to succeed and Linux to fail.
I'm not sure what the significance of the /path/to/built/swift-linux-x86_64/lib/swift/glibc/module.map
path is, or how to hook up the compiler to use the SDK/arch subdirectory path lib/swift/android/armv7/glibc/module.map
instead.
Any suggestions? I've been banging my head against this for a while, so any help would be greatly appreciated! 🙇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, managed to figure this one out: #1679 -- I'd definitely appreciate feedback on that approach! 🙇
Thanks for the feedback, @1ace! I agree: this work depends on #1396, #1410, #1426, #1395, #1334, which I've issued separately to make them easier to review. Unfortunately, GitHub falls apart with a dependent stack of commits, such as this pull request would end up being. I think you'll find other ports, such as Cygwin #1108, end up being similarly large. The problem is that all of these changes are directly related to Android. Take the Of course, if the review unearths more pieces I can split out into separate pull requests, I'll definitely do so! 👍 |
@@ -8,6 +8,9 @@ | |||
// <rdar://problem/24357067> The compiler never finishes compiling validation-test/stdlib/CollectionType.swift.gyb in -O | |||
// REQUIRES: swift_test_mode_optimize_none | |||
|
|||
// FIXME: This test takes too long to complete on Android. | |||
// UNSUPPORTED: OS=linux-androideabi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way we could split it up? Maybe something like validation-test/stdlib/Slice
?
@modocache @zhuowei @ephemer Awesome job! I've been watching this since possibly day one. 👍 😄 |
program_name)) | ||
return 1 | ||
|
||
return execute_on_device(executable_path, executable_arguments, True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In our experience, executing tests on device without limiting concurrency will usually oversubscribe the device and will lead to spurious failures (out of memory issues etc.) That's because the number of device cores is typically much smaller than the number of cores on the host where lit is running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, very true! One hacky solution would be to limit the number of parallel lit
tests on Android. I'm sure I could find a way to do this in utils/build-script-impl
or test/lit.cfg
. I may even be able to query the number of device cores. Is this close to what you had in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would unnecessarily limit the parallelism for running host (compiler) tests. What I was thinking about is adding a persistent server that would be responsible for device communication, and have %target-run
be a thin client that just asks the server to run a certain command. The server would throttle (or possibly load balance over multiple devices!) as appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be awesome. I assume we may use it to handle iOS/tvOS host tests as well some day?
One thought: should I make two pull requests? One to get Android building, and another to get its tests running and passing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that'd likely make things easier.
@swift-ci Please smoke test |
@modocache There seem to be build issues on Linux x86_64, please take a look when you have a moment. |
Looks like
|
this is really cool, I've never seen a 52 file change commit though, good lord! |
I'm working on addressing the feedback--thanks so much for the review, @gribozavr! In the meantime, if anyone out there wants to help with this pull request, here's the output from the test suite: https://gist.github.com/modocache/48babfa768d495806116. Six tests are failing. Some of those failures, like |
@modocache This is really awesome to see, thank you for working on it. Per the previous comment, if you can split this up into smaller PR's, that will make it easier to review and merge piecemeal. |
@modocache Excellent stuff! Keep those great contributions coming 👍 |
Excellent work, I'm looking forword to use it o.o |
🎉 Keep up the great work @modocache! :D |
Great work! @modocache |
Anticipated |
great |
Amazing!! |
Whoah! Very nice! |
Congrats! |
I'm out? |
Today's Big Thing !!!
|
This is exciting! |
Unsubscribbling, because spam. Can someone close, to pipe comments into, comfortably silent, reaction emojis? Also, fantastic work! |
OMG, I think I must to learn Swift. |
Exciting! |
Good work @modocache |
Congratulations! |
Exciting ! |
66666666666 |
Awesome |
Great news 👍 |
Good job! |
Nice work! |
Excellent! |
👍 Excellent |
Good job! |
6666 |
😎 Nice |
Great work. |
I just want to put my mark on this😎 |
Wow! Fantastic news. I'm excited to play around with this. Congrats to everyone involved. Your work is appreciated. 👏🏼👏🏼 |
It's one very good new for developers, for many companies, they have IOS and Android version for every application, it will reduce the effort. Xamarin is another option, but Linux based system should support swift well. I love Swift. Wish it will become usable for real case soon. |
Good Job !!! |
What's in this pull request?
This adds an Android target for the stdlib. It is also the first example of cross-compiling outside of Darwin: a Linux host machine builds for an Android target.
Relevant mailing list discussions:
The Android variant of Swift may be built and tested using the following
build-script
invocation:Android builds have the following dependencies, as can be seen in the build script invocation:
build.sh
script in that repository.What's worth discussing about this pull request?
Continuous integration: I'd be thrilled to have this merged into the main Swift repository, but I don't want to dump a bunch of code that no one can maintain. I think CI is a very important part of maintaining a healthy build, but:
Do either of those sound like something CI servers could be configured with? Or can someone help me think of alternatives--perhaps building libicu as part of the Swift build, as we do with
ninja
? #1398 includes some pkg-config logic that may be useful here.FIXMEs: There are quite a few spots labeled "FIXME" that I could use some help with. Feedback welcome!
Thanks a ton to @zhuowei, who deserves the vast majority of the credit. I just tweaked a few things. 😊