bazel absence of ZIP64 support cause tensorflow build fail #22390

Closed
fo40225 opened this issue Sep 19, 2018 · 15 comments

@fo40225
Contributor

fo40225 commented Sep 19, 2018

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): no
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10 1803
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): v1.11.0-rc1
  • Python version: 3.6.5
  • Bazel version (if compiling from source): 0.15.2
  • GCC/Compiler version (if compiling from source): vs2017 15.8 / cl.exe 19.15.26726
  • CUDA/cuDNN version: 9.2.148.1/7.2.1
  • GPU model and memory: 1080ti 11GB
  • Exact command to reproduce:
bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
bazel-bin\tensorflow\tools\pip_package\build_pip_package C:/tmp/tensorflow_pkg

Describe the problem

If the build artifact is larger than 4 GB, the Python package cannot be generated.

An easy way to reproduce the issue is to enter many compute capabilities at the configure prompt:

Please note that each additional compute capability significantly increases your build time and binary size. [Default is: 3.5,7.0]: 3.0,3.2,3.5,5.0,5.2,5.3

Related links:
#20332 (comment)
#22382
https://github.com/bazelbuild/bazel/blob/0.17.1/third_party/ijar/zip.cc#L74

Source code / logs

Uncompressed input jar has size ???, which exceeds the maximum supported output size 4294967295.
Assuming that ijar will be smaller and hoping for the best.

Unzipping simple_console_for_windows.zip to create runfiles tree...
[./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip]
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of ./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip or
        ./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip.zip, and cannot find ./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip.ZIP, period.
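
The limit quoted in the first log line, 4294967295, equals 2**32 - 1, the largest value a 32-bit size field can hold. A minimal sketch, assuming only Python's standard library, of how one might check whether a build output crosses that limit. The helper names `exceeds_zip32_limit` and `file_exceeds_zip32_limit` are hypothetical and are not part of Bazel or TensorFlow:

```python
import os

# Largest size a 32-bit field can hold; same value as in the log above.
ZIP32_MAX = 2**32 - 1  # 4294967295


def exceeds_zip32_limit(size_in_bytes: int) -> bool:
    """Return True if the size cannot be stored without ZIP64 extensions."""
    return size_in_bytes > ZIP32_MAX


def file_exceeds_zip32_limit(path: str) -> bool:
    """Apply the same check to a file on disk (hypothetical helper)."""
    return exceeds_zip32_limit(os.path.getsize(path))
```

Running `file_exceeds_zip32_limit` on the generated `simple_console_for_windows.zip` would be one way to confirm that the archive size is what triggers the failure.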

@gunan
Contributor

gunan commented Sep 20, 2018

@meteorcloudy This is an interesting issue.
What would be the correct pacman command to install zip64?

@meteorcloudy self-assigned this Sep 21, 2018
@meteorcloudy
Member

The zip file is created by a custom zipper binary built from source in bazel, which doesn't have zip64 support. So installing zip64 from pacman is not going to help.

The root cause here is that we are zipping redundant files into ./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip, because it depends on some other py_binary targets which are also built as zip files. This results in a zip file containing many other zip files, all of which have //tensorflow:tensorflow_py as a dependency.

I'm trying to figure out a way to exclude redundant files.
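
To see the redundancy described above, one could list the archives nested inside the outer zip. This is a rough sketch, not the approach taken in the eventual fix. It assumes the outer archive is small enough to be readable (for example, from a build with fewer compute capabilities) and that nested archives have names ending in `.zip`. The function name `nested_zip_entries` is hypothetical:

```python
import zipfile


def nested_zip_entries(archive):
    """Return (name, uncompressed_size) for entries that look like zip files.

    `archive` may be a path or a binary file-like object. Entries are
    matched by file name only (assumption: nested archives end in ".zip").
    """
    with zipfile.ZipFile(archive) as outer:
        return [
            (info.filename, info.file_size)
            for info in outer.infolist()
            if info.filename.lower().endswith(".zip")
        ]
```

Summing the reported sizes gives a quick estimate of how much of the outer archive consists of other archives.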

@bstriner
Contributor

Probably good to clean up what is going into that zip, but could also try to fix the zipper. Zlib seems to have zip64 support just fine, so maybe there is an issue with the bazel zlib client or something? I don't see any related issues on bazelbuild.

@meteorcloudy
Member

Yes, ideally bazel's zip tool should support zip64. I filed bazelbuild/bazel#6211

But I prefer to refactor TensorFlow's dependencies to clean up the zip file, because that will also make creating the TF pip package much faster.

@meteorcloudy
Member

@fo40225 @bstriner Can you try #22483 to see if it fixes the problem?

@fo40225
Contributor Author

fo40225 commented Sep 24, 2018

@meteorcloudy

v1.11.0-rc2 + cherry-pick 2a01b6a

It can now build with most consumer compute capabilities.

Thank you for your help.

@meteorcloudy
Member

@gunan You might want to cherry-pick #22483 into v1.11.0?

@gunan
Contributor

gunan commented Sep 25, 2018

@angersson if we are doing another RC, yes.
If not, I would wait until 1.12

@meteorcloudy
Member

Unfortunately, we have to roll back part of 77e2686. If anyone encounters this issue again, please build TensorFlow with --define=no_tensorflow_py_deps=true
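
For anyone scripting the build, the workaround could be wrapped as follows. This is only an illustrative sketch: the flag and the target are taken from this thread, while the helper name `pip_package_build_command` and the choice to include `--config=opt` by default are assumptions made for the example:

```python
# Flag recommended in this thread as a workaround.
WORKAROUND_FLAG = "--define=no_tensorflow_py_deps=true"
# Target used in the reproduction command of the original report.
PIP_TARGET = "//tensorflow/tools/pip_package:build_pip_package"


def pip_package_build_command(extra_flags=()):
    """Assemble the bazel argument list with the workaround flag included.

    `extra_flags` is an optional iterable of additional flags,
    for example ["--config=cuda"].
    """
    return [
        "bazel",
        "build",
        WORKAROUND_FLAG,
        "--config=opt",
        *extra_flags,
        PIP_TARGET,
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the TensorFlow source directory, or joined with spaces and pasted into a shell.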

@aaniin

aaniin commented Oct 28, 2018

Building the pip package succeeded for me after including the --define=no_tensorflow_py_deps=true.

Error without "no dependencies" flag:

(tf-gpu-src) C:\Users\hbeck3\tf-gpu\tensorflow>bazel-bin\tensorflow\tools\pip_package\build_pip_package C:\Users\hbeck3\tf-gpu\pip-package-cuda10-py3.6.6

Sun Oct 28 13:50:53 WEST 2018 : === Preparing sources in dir: /tmp/tmp.cSTaMzHDFx
Unzipping simple_console_for_windows.zip to create runfiles tree...
[./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip]
End-of-central-directory signature not found. Either this file is not
a zipfile, or it constitutes one disk of a multi-part archive. In the
latter case the central directory and zipfile comment will be found on
the last disk(s) of this archive.
unzip: cannot find zipfile directory in one of ./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip or
./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip.zip, and cannot find ./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip.ZIP, period.

(tf-gpu-src) C:\Users\hbeck3\tf-gpu\tensorflow>

Success with "no dependencies" flag:

(tf-gpu-src) C:\Users\hbeck3\tf-gpu\tensorflow>bazel build --define=no_tensorflow_py_deps=true --jobs 1 --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

[...]
INFO: Build completed successfully, 50 total actions

(tf-gpu-src) C:\Users\hbeck3\tf-gpu\tensorflow>bazel-bin\tensorflow\tools\pip_package\build_pip_package C:\Users\hbeck3\tf-gpu\pip-package-cuda10-py3.6.6
[...]
Sun Oct 28 14:53:58 WEST 2018 : === Output wheel file is in: C:\Users\hbeck3\tf-gpu\pip-package-cuda10-py3.6.6

(tf-gpu-src) C:\Users\hbeck3\tf-gpu\tensorflow>

@dannycarrera

@aaniin can you please post your setup as I'm still getting the same error even with --define=no_tensorflow_py_deps=true

Thanks!

@anatoly-khomenko

Hello @gunan,

Would it be possible to add the flag recommendation here? https://www.tensorflow.org/install/source_windows

I believe this would help many people who try to build TensorFlow from source for Windows.

My goal was to get compute capability 3.5 support so that it runs faster on my laptop.

Here is the PR to docs repository: tensorflow/docs#178

Thank you!
Anatoly

@gunan
Contributor

gunan commented Nov 13, 2018

Thank you very much for your PR, it is reviewed, approved and merged!

@Prakash19921206

I have the same issue even after including --define=no_tensorflow_py_deps=true

The build succeeds, but the following command gives an error:
bazel-bin/tensorflow/tools/pip_package/build_pip_package C:/tmp/tensorflow_pkg

I tried in both msys and cmd; both show the same error:

Unzipping simple_console_for_windows.zip to create runfiles tree...
[./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip]
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of ./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip or
        ./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip.zip, and cannot find ./bazel-bin/tensorflow/tools/pip_package/simple_console_for_windows.zip.ZIP, period.
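
The unzip message above says it could not find the end-of-central-directory record. A rough diagnostic sketch, assuming only Python's standard library, that checks whether the tail of a file contains that record's 4-byte signature (`PK\x05\x06` in the ZIP format). The function name `has_eocd_signature` is hypothetical, and the check is a heuristic: it does not validate the rest of the archive, and the same bytes could occur by chance in other data:

```python
import os

EOCD_SIGNATURE = b"PK\x05\x06"  # end-of-central-directory record marker
EOCD_MIN_SIZE = 22              # size of the record without a comment
MAX_COMMENT_SIZE = 0xFFFF       # longest archive comment that may follow it


def has_eocd_signature(stream):
    """Return True if the tail of a binary stream contains the EOCD marker.

    `stream` must be a seekable binary file object, for example
    open(path, "rb"). Only the last 22 + 65535 bytes are read, which is
    the region where the record is allowed to start.
    """
    stream.seek(0, os.SEEK_END)
    size = stream.tell()
    stream.seek(max(0, size - (EOCD_MIN_SIZE + MAX_COMMENT_SIZE)))
    return EOCD_SIGNATURE in stream.read()
```

If this returns False for `simple_console_for_windows.zip`, the archive itself is malformed, which points back at how it was created and away from the unzip tool.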

tensorflow-copybara pushed a commit that referenced this issue Dec 3, 2018
…enet/BUILD

Address #22390#issuecomment-439610881

PiperOrigin-RevId: 223811309
@Sar-thak-3

ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted: Analysis failed
INFO: Elapsed time: 507.631s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (61 packages loaded, 12 targets configured)
    currently loading: tensorflow/lite/python ... (2 packages)

I get this error with both of these commands:

bazel build --define=no_tensorflow_py_deps=true --config=opt //tensorflow/tools/pip_package:build_pip_package
and
bazel build  --config=opt //tensorflow/tools/pip_package:build_pip_package

I am not able to run the build with Bazel.
