Build fails on ppc64le #10306

Closed
brunoalr opened this issue May 30, 2017 · 10 comments
Labels
stat:contribution welcome Status - Contributions welcome

@brunoalr

brunoalr commented May 30, 2017

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 17.04
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 359d6f9
  • Bazel version (if compiling from source): 0.4.5-2017-05-25 (@255953740)
  • CUDA/cuDNN version: -
  • GPU model and memory: -
  • Exact command to reproduce: bazel build --verbose_failures --show_package_location //tensorflow/tools/pip_package:build_pip_package

Describe the problem

On a ppc64le machine running Ubuntu 17.04, I am unable to build TensorFlow.

Source code / logs

ERROR: /home/brosa/.cache/bazel/_bazel_brosa/141a2b9f209d04ad1bc4d9433836a54c/external/io_bazel_rules_closure/java/io/bazel/rules/closure/webfiles/server/BUILD:54:1: Generating SOY v2 Java files @io_bazel_rules_closure//java/io/bazel/rules/closure/webfiles/server:listing_files failed: bash failed: error executing command
  (cd /home/brosa/.cache/bazel/_bazel_brosa/141a2b9f209d04ad1bc4d9433836a54c/execroot/org_tensorflow && \
  exec env - \
    PATH=/home/brosa/bazel/output:/home/brosa/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/host/bin/external/com_google_template_soy/SoyParseInfoGenerator --outputDirectory=bazel-out/host/genfiles/external/io_bazel_rules_closure/java/io/bazel/rules/closure/webfiles/server --javaPackage=io.bazel.rules.closure.webfiles.server --javaClassNameSource=filename --allowExternalCalls=1 $(cat bazel-out/host/genfiles/external/io_bazel_rules_closure/java/io/bazel/rules/closure/webfiles/server/listing_files__srcs) $(cat bazel-out/host/genfiles/external/io_bazel_rules_closure/java/io/bazel/rules/closure/webfiles/server/listing_files__deps)'): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
Unrecognized option: -client
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Target //tensorflow/tools/pip_package:build_pip_package failed to build


@asimshankar
Contributor

That error seems to be coming from your JDK installation.

Could you describe your bazel installation process and the JDK you're using (required by bazel)? For example, what does java -version print?

@asimshankar asimshankar added the stat:awaiting response Status - Awaiting response from author label May 31, 2017
@brunoalr
Author

brunoalr commented May 31, 2017

I can compile on 97c6203.

java version:
openjdk version "1.8.0_131"

OpenJDK Runtime Environment (build 1.8.0_131-8u131-b11-0ubuntu1.17.04.1-b11)
OpenJDK 64-Bit Server VM (build 25.131-b11, mixed mode)

This version does not recognize -client:

java -client
Unrecognized option: -client
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
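
The check above can be automated before a build is started. The following is a minimal sketch, not part of Bazel or TensorFlow: the helper names `rejects_option` and `java_accepts` are hypothetical, and the sketch assumes the JVM reports an unsupported flag with the same "Unrecognized option: ..." line shown in the output above.

```python
import subprocess


def rejects_option(jvm_stderr: str, option: str) -> bool:
    """Return True if the JVM's stderr reports `option` as unrecognized.

    Assumes the message format seen in this issue:
    "Unrecognized option: -client".
    """
    expected = f"Unrecognized option: {option}"
    return any(line.strip() == expected for line in jvm_stderr.splitlines())


def java_accepts(option: str, java: str = "java") -> bool:
    """Run `java <option> -version` and report whether the JVM started.

    Raises FileNotFoundError if no `java` executable is found.
    """
    result = subprocess.run(
        [java, option, "-version"], capture_output=True, text=True
    )
    return result.returncode == 0 and not rejects_option(result.stderr, option)
```

Calling `java_accepts("-client")` on a JDK that behaves like the one reported here should return False, which a wrapper script could use to drop the flag instead of failing the build.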

Bazel was built from source following the instructions here: https://github.com/PPC64/tensorflow-ppc64-doc

@brunoalr
Author

I tried running
bazel --host_jvm_args=-XX:+IgnoreUnrecognizedVMOptions build --config=opt --verbose_failures --show_package_location //tensorflow/tools/pip_package:build_pip_package
as a way to ignore the issue with -client, but I still get the same error.

@brunoalr
Author

If I configure my JDK installation to ignore the unrecognized option ( sudo sed -i "\$a-client IGNORE" /usr/lib/jvm/java-1.8.0-openjdk-ppc64el/jre/lib/ppc64le/jvm.cfg ), I get a different error:

ERROR: /home/brosa/tensorflow-upstream/tensorflow/tensorboard/components/tf_color_scale/BUILD:36:1: Executing genrule //tensorflow/tensorboard/components/tf_color_scale:ts failed: bash failed: error executing command
  (cd /home/brosa/.cache/bazel/_bazel_brosa/141a2b9f209d04ad1bc4d9433836a54c/execroot/org_tensorflow && \
  exec env - \
    PATH=/home/brosa/bazel/output:/home/brosa/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games \
    PYTHON_BIN_PATH=/usr/bin/python \
    PYTHON_LIB_PATH=/usr/local/lib/python2.7/dist-packages \
    TF_NEED_CUDA=0 \
    TF_NEED_OPENCL=0 \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/host/genfiles/external/com_microsoft_typescript/tsc.sh --inlineSourceMap --inlineSources --noResolve --declaration --module es6 --outDir bazel-out/local-opt/genfiles/tensorflow/tensorboard/components/tf_color_scale external/com_microsoft_typescript/lib.es6.d.ts external/org_definitelytyped/polymer.d.ts external/org_definitelytyped/webcomponents.js.d.ts bazel-out/local-opt/genfiles/tensorflow/tensorboard/components/tf_imports/d3.d.ts bazel-out/local-opt/genfiles/tensorflow/tensorboard/components/tf_color_scale/bundle.ts'): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 126.
bazel-out/host/genfiles/external/com_microsoft_typescript/tsc.sh: line 6: /home/brosa/.cache/bazel/_bazel_brosa/141a2b9f209d04ad1bc4d9433836a54c/execroot/org_tensorflow/external/org_nodejs/bin/node: cannot execute binary file: Exec format error
bazel-out/host/genfiles/external/com_microsoft_typescript/tsc.sh: line 6: /home/brosa/.cache/bazel/_bazel_brosa/141a2b9f209d04ad1bc4d9433836a54c/execroot/org_tensorflow/external/org_nodejs/bin/node: Success
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 42.819s, Critical Path: 26.22s

Looking closer:

file /home/brosa/.cache/bazel/_bazel_brosa/141a2b9f209d04ad1bc4d9433836a54c/execroot/org_tensorflow/external/org_nodejs/bin/node

/home/brosa/.cache/bazel/_bazel_brosa/141a2b9f209d04ad1bc4d9433836a54c/execroot/org_tensorflow/external/org_nodejs/bin/node: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.9, BuildID[sha1]=fb12043414130d6f6928b657f7fdb80264a784c2, not stripped

It seems that external/org_nodejs/bin/node is an architecture-dependent (x64) binary...
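
The same information that `file` prints can be read directly from the ELF header, which makes it possible to detect a mismatched binary programmatically. This is a minimal sketch under stated assumptions: the function names are hypothetical, only a handful of `e_machine` values are mapped, and it relies on the standard ELF layout (byte 5 of the identification block gives the byte order, and the 16-bit `e_machine` field sits at offset 18).

```python
import struct

# A few e_machine values from the ELF specification; not exhaustive.
ELF_MACHINES = {3: "x86", 20: "ppc", 21: "ppc64", 62: "x86-64", 183: "aarch64"}


def elf_target(header: bytes) -> str:
    """Describe the target architecture from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5): 1 = little endian, 2 = big endian.
    if header[5] == 1:
        order = "<"
    elif header[5] == 2:
        order = ">"
    else:
        raise ValueError("unknown ELF byte order")
    (machine,) = struct.unpack_from(order + "H", header, 18)
    name = ELF_MACHINES.get(machine, f"unknown({machine})")
    # ppc64 and ppc64le share one e_machine value and differ in byte order.
    if name == "ppc64" and order == "<":
        name = "ppc64le"
    return name


def read_elf_target(path: str) -> str:
    """Read the ELF header of the file at `path` and describe its target."""
    with open(path, "rb") as f:
        return elf_target(f.read(20))
```

For the downloaded `node` binary described above, `read_elf_target` would be expected to report an x86-64 target, which does not match a ppc64le host.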

@aselle aselle removed the stat:awaiting response Status - Awaiting response from author label May 31, 2017
@npanpaliya
Contributor

@brunoalr, please refer to this issue. It also discusses the error with tsc.sh, so the fix described there may work for you too.

@brunoalr
Author

brunoalr commented Jun 1, 2017

@npanpaliya thanks for pointing to that issue. Unfortunately I still can't compile it.

@brunoalr brunoalr changed the title Build fails on ppc64 Build fails on ppc64le Jun 1, 2017
@brunoalr
Author

brunoalr commented Jun 1, 2017

I updated the issue's name to make clear it occurs on a ppc64le (little endian) rather than a ppc64 (big endian) machine.

@brunoalr
Author

brunoalr commented Jun 1, 2017

Alright, so I managed to build TensorFlow without TensorBoard (https://stackoverflow.com/questions/43119802/can-i-build-tensorflow-without-android-and-without-tensorboard).

That strongly suggests the issue is caused by distributing architecture-dependent binaries (e.g. x64 binaries for nodejs and protoc) to satisfy some build dependencies.

To summarize, two things are needed: first, check whether the installed JDK supports the "-client" flag before passing it; second, either a) upload binaries compiled for different targets or b) implement a fallback that uses locally installed binaries instead.
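
Option b) could look roughly like the sketch below. It is only an illustration of the fallback idea, not how Bazel or the TensorFlow workspace rules resolve tools: `choose_binary` and `resolve_node` are hypothetical names, and the architecture strings are assumed to follow what Python's `platform.machine()` reports on Linux (for example `x86_64` or `ppc64le`).

```python
import platform
import shutil
from typing import Optional


def choose_binary(
    bundled_path: str,
    bundled_arch: str,
    host_arch: str,
    local_path: Optional[str],
) -> Optional[str]:
    """Prefer the bundled binary when it matches the host architecture.

    Otherwise fall back to a locally installed binary, or None if the
    host has no such binary either.
    """
    if bundled_arch == host_arch:
        return bundled_path
    return local_path


def resolve_node(bundled_path: str, bundled_arch: str = "x86_64") -> Optional[str]:
    """Pick a usable `node`: the bundled one, or whatever is on PATH."""
    return choose_binary(
        bundled_path, bundled_arch, platform.machine(), shutil.which("node")
    )
```

On a ppc64le host with a distribution-provided `node` on PATH, `resolve_node("external/org_nodejs/bin/node")` would return the local binary. A `None` result signals that neither a matching bundled binary nor a local one is available, so the build can fail with a clear message instead of an "Exec format error".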

@gunan
Contributor

gunan commented Jun 1, 2017

As we do not have access to ppc machines, we do not have official support on this platform.
We will be happy to accept a Pull request resolving this issue.
Therefore, I will mark this as Contributions Welcome.

@gunan
Contributor

gunan commented Sep 26, 2018

Looks like the reported initial issue was resolved?
And it seems to me the reported issues may be caused by tensorboard.

@gunan gunan closed this as completed Sep 26, 2018