Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Polish Bazel build scripts #6424

Merged
merged 6 commits into from Dec 17, 2019
Merged

Polish Bazel build scripts #6424

merged 6 commits into from Dec 17, 2019

Conversation

mehrdadn
Copy link
Contributor

@mehrdadn mehrdadn commented Dec 11, 2019

Why are these changes needed?

Related issue number

#6185

Checks

@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/19416/
Test FAILed.

Copy link
Contributor

@raulchen raulchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I wasn't aware of these built-in conditions.

@mehrdadn
Copy link
Contributor Author

No worries! There might be some more changes coming though -- I realized after I made this PR that I need to add some copts and such for Windows compatibility. I'll update the PR when I get it working.

BUILD.bazel Outdated
"//conditions:default": [
"-Wl,--version-script,$(location //:src/ray/ray_version_script.lds)",
],
}),
},
copts = COPTS + if_linux_x86_64(["-fno-gnu-unique"]),
copts = COPTS + select({
"@bazel_tools//src/conditions:linux_x86_64": ["-fno-gnu-unique"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing the following error on macOS. I think you need to add a default condition here.

/Users/travis/build/ray-project/ray/BUILD.bazel:785:1: Configurable attribute "copts" doesn't match this configuration (would a default condition help?).
Conditions checked:
 @bazel_tools//src/conditions:linux_x86_64

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup I do have that fix locally! I haven't pushed it to avoid clogging the builds for now. Will hopefully push it when I fix some other things as well.

@mehrdadn
Copy link
Contributor Author

@raulchen This is really unfortunate... I've I just started getting a ton of linker errors on Windows as a result of the previous PR, and tracking them down is proving to be somewhat of a nightmare. To give some examples, here's a sample of what I'm seeing:

lld-link: error: undefined symbol: public: __cdecl ray::RayLog::RayLog(char const *, int, enum ray::RayLogLevel)
>>> referenced by C:\users\home\_bazel_home\lkvlq2zo\execroot\com_github_ray_project_ray\src\ray\common\buffer.h:57
>>>               bazel-out/x64_windows-fastbuild/bin/_objs/python/ray/streaming/_streaming.so/_streaming.obj:(public: __cdecl ray::LocalMemoryBuffer::LocalMemoryBuffer(unsigned char *, unsigned __int64, bool))

lld-link: error: undefined symbol: public: virtual __cdecl ray::RayLog::~RayLog(void)
>>> referenced by C:\users\home\_bazel_home\lkvlq2zo\execroot\com_github_ray_project_ray\src\ray\common\buffer.h:57
>>>               bazel-out/x64_windows-fastbuild/bin/_objs/python/ray/streaming/_streaming.so/_streaming.obj:(public: __cdecl ray::LocalMemoryBuffer::LocalMemoryBuffer(unsigned char *, unsigned __int64, bool))

lld-link: error: undefined symbol: unsigned __int64 __cdecl ray::MurmurHash64A(void const *, int, unsigned int)
>>> referenced by C:\users\home\_bazel_home\lkvlq2zo\execroot\com_github_ray_project_ray\src\ray\common\id.h:443
>>>               streaming_lib.lib(data_reader.obj):(public: unsigned __int64 __cdecl ray::BaseID<class ray::ObjectID>::Hash(void) const)
>>> referenced by streaming_lib.lib(data_writer.obj)
>>> referenced by streaming_queue.lib(queue_handler.obj)
>>> referenced by streaming_lib.lib(channel.obj)

lld-link: error: undefined symbol: __imp_InterlockedExchangeAdd
>>> referenced by C:\users\home\_bazel_home\lkvlq2zo\execroot\com_github_ray_project_ray\external\boost\boost\thread\win32\basic_timed_mutex.hpp:244
>>>               thread.lib(thread.obj):(public: void __cdecl boost::detail::basic_timed_mutex::unlock(void))

It seems to me that a lot of these are due to the new use of shared libraries, which we had tried to avoid until this point—getting them to work seamlessly seems to be really painful. In fact, I don't yet even know how to handle them at all! But I'm unsure that the current approach is correct either—if for no other reason than the fact that Bazel should have a mechanism to auto-detect the library names so that you don't have to specify .so manually (given that on Windows you would need .dll instead).

Would it be possible for you to lend me a hand at fixing some issues? (Do you happen to have a Windows machine to test on, or any guesses as to how to fix them?)

Or even better—is there any chance you could switch to static libraries? That might make things much, much easier.

@raulchen
Copy link
Contributor

@mehrdadn sorry about these problems.
@chaokunyang can help dig into this issue. He is more familiar with that.
I think if this issue turns out to be hard to fix, we can at least disable building streaming for Windows and make sure Ray core is okay.

@mehrdadn
Copy link
Contributor Author

Oh that's a great idea if that's possible. Right now it's a dependency of ray_pkg so I assumed it's required, but if they can be separated that could also be helpful for now. (I'm not really aware of what targets are required for what purposes, so it's hard for me to judge these myself unfortunately.)

@chaokunyang
Copy link
Member

@mehrdadn I'll disable building streaming for Windows for now. And looking into whether we can use exported dynamic libraries symbols in windows.

@mehrdadn
Copy link
Contributor Author

@chaokunyang Okay sure, thanks! Feel free to just push a commit here if that's easier.

@chaokunyang
Copy link
Member

@mehrdadn You can try this, I don't have a windows machine for this test, hope it works.

diff --git a/BUILD.bazel b/BUILD.bazel
index ef6db4d88..ca30a0fa9 100644
--- a/BUILD.bazel
+++ b/BUILD.bazel
@@ -951,11 +951,22 @@ py_library(
     visibility = ["__subpackages__"],
 )

+genrule(
+    name = "cp_streaming_lib",
+    srcs = ["python/ray/streaming/_streaming.so"],
+    outs = ["cp_streaming_lib.out"],
+    cmd = """
+        set -x &&
+        WORK_DIR=$$(pwd) &&
+        cp -f $(location python/ray/streaming/_streaming.so) $$WORK_DIR/python/ray/streaming &&
+        echo "$$WORK_DIR" > $@    
+    """,
+)
+
 genrule(
     name = "ray_pkg",
     srcs = [
         "python/ray/_raylet.so",
-        "python/ray/streaming/_streaming.so",
         "//:python_sources",
         "//:all_py_proto",
         "//:redis-server",
@@ -965,13 +976,15 @@ genrule(
         "//:raylet_monitor",
         "@plasma//:plasma_store_server",
         "//streaming:copy_streaming_py_proto",
-    ],
+    ] + select({ 
+        "@bazel_tools//src/conditions:windows": [],  # ignore build streaming for windows
+        "//conditions:default": [":cp_streaming_lib"],
+    }),
     outs = ["ray_pkg.out"],
     cmd = """
         set -x &&
         WORK_DIR=$$(pwd) &&
         cp -f $(location python/ray/_raylet.so) "$$WORK_DIR/python/ray" &&
-        cp -f $(location python/ray/streaming/_streaming.so) "$$WORK_DIR/python/ray/streaming" &&
         mkdir -p "$$WORK_DIR/python/ray/core/src/ray/thirdparty/redis/src/" &&
         cp -f $(location //:redis-server) "$$WORK_DIR/python/ray/core/src/ray/thirdparty/redis/src/" &&
         cp -f $(location //:redis-cli) "$$WORK_DIR/python/ray/core/src/ray/thirdparty/redis/src/" &&

@mehrdadn
Copy link
Contributor Author

@chaokunyang I think it's working! I just pushed a new commit with this change, I think it should be good to merge on my end if it builds fine on other platforms. Thanks a ton! :)

@chaokunyang
Copy link
Member

@mehrdadn Please add local = 1, for cp_streaming_lib to disable sandbox, otherwise it will fail in non-windows platform. Thanks

@mehrdadn
Copy link
Contributor Author

@chaokunyang Done, thanks!

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/19461/
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/19455/
Test PASSed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/19459/
Test PASSed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/19467/
Test FAILed.

@mehrdadn
Copy link
Contributor Author

Actually, @chaokunyang I have a question for you:

As far as I can tell (though I'm not entirely sure), Clang doesn't support -fno-gnu-unique, and hence it prevents building with Clang.
One obvious workaround is to try to special-case this for GCC, but I presume omitting it would break something... is that correct?
If so, do you know if there is a workaround for Clang? If there isn't, then should we block Clang builds?
Or if not, then is it something that can be removed on GCC as well?

@chaokunyang
Copy link
Member

chaokunyang commented Dec 13, 2019

Hi @mehrdadn, -fno-gnu-unique can be removed now.

@mehrdadn
Copy link
Contributor Author

Oh awesome, okay thanks @chaokunyang! I just removed it.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/19573/
Test FAILed.

@pcmoritz pcmoritz merged commit 7a24144 into ray-project:master Dec 17, 2019
@mehrdadn mehrdadn deleted the bazel-fix branch December 17, 2019 10:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants