Windows CI failure #2977

Closed
kchodorow opened this issue May 9, 2017 · 14 comments
Assignees
Labels
breakage P1 I'll work on this now. (Assignee required) platform: windows

Comments

@kchodorow
Contributor

http://ci.bazel.io/view/Bazel%20bootstrap%20and%20maintenance/job/Bazel/JAVA_VERSION=1.8,PLATFORM_NAME=windows-msvc-x86_64/1482/console

Might be a flake? The same error happened last week.

____[1,533 / 1,653] Compiling Java headers src/main/java/com/google/devtools/build/lib/libsyntax-hjar.jar (76 files)
ERROR: C:/jenkins/workspace/bazel/java_version/1.8/platform_name/windows-msvc-x86_64/src/main/java/com/google/devtools/build/lib/BUILD:415:1: Compiling Java headers src/main/java/com/google/devtools/build/lib/libsyntax-hjar.jar (76 files) failed: java.exe failed: error executing command external/local_jdk/bin/java.exe -Xverify:none -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -Xbootclasspath/p:external/bazel_tools/third_party/java/jdk/langtools/javac-9-dev-r3297-4.jar -jar ... (remaining 2 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
Exception in thread "main" java.nio.file.AccessDeniedException: bazel-out\vc_14_0_x64-fastbuild\bin\src\main\java\com\google\devtools\build\lib\libsyntax-hjar.jar
	at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:83)
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
	at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
	at sun.nio.fs.WindowsFileSystemProvider.newByteChannel(WindowsFileSystemProvider.java:230)
	at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
	at java.nio.file.Files.newOutputStream(Files.java:216)
	at com.google.devtools.build.java.turbine.javac.JavacTurbine.emitClassJar(JavacTurbine.java:275)
	at com.google.devtools.build.java.turbine.javac.JavacTurbine.compile(JavacTurbine.java:224)
	at com.google.devtools.build.java.turbine.javac.JavacTurbine.compile(JavacTurbine.java:84)
	at com.google.devtools.build.java.turbine.Turbine.javacTurbineCompile(Turbine.java:74)
	at com.google.devtools.build.java.turbine.Turbine.compile(Turbine.java:62)
	at com.google.devtools.build.java.turbine.Turbine.main(Turbine.java:33)
@kchodorow kchodorow added breakage platform: windows P1 I'll work on this now. (Assignee required) labels May 9, 2017
@laszlocsomor
Contributor

The same error happened last week.

Indeed, it was http://ci.bazel.io/view/Bazel%20bootstrap%20and%20maintenance/job/Bazel/1449/

The error is AccessDeniedException, suggesting the file could not be (over)written.

(My first hypothesis, which I leave here for posterity.) I think this is the bug where workers hold on to open files, which @philwo fixed a while ago. @dslomov pointed out that this was the Bazel job, so it used the bootstrap Bazel, which is still 0.4.5. Turbine is an embedded binary, so 0.4.5 may still have the pre-bugfix jar embedded in it. If this theory is correct, then releasing 0.5.0 will resolve this issue; otherwise we can rule out that possibility and look for a different root cause.

The only way I could trigger an AccessDeniedException was by creating the output file and making it read-only. If the file exists and is open in another process, we get a plain IOException instead. So somehow the file must have been marked read-only since the last CI run.
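
A minimal Java sketch of that repro, assuming a throwaway temp file stands in for the hjar output. The class and method names are made up for illustration; the only call taken from the stack trace is Files.newOutputStream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.Path;

class ReadOnlyOutputDemo {
  /** Opens {@code output} for writing via Files.newOutputStream and reports the outcome. */
  static String tryOverwrite(Path output) {
    try (OutputStream out = Files.newOutputStream(output)) {
      out.write(0);
      return "written";
    } catch (AccessDeniedException e) {
      return "AccessDeniedException";
    } catch (IOException e) {
      // Any other failure is reported by its class name.
      return e.getClass().getSimpleName();
    }
  }

  /** Creates a temp file, marks it read-only, and tries to overwrite it. */
  static String overwriteReadOnlyTempFile() {
    try {
      Path output = Files.createTempFile("demo-hjar", ".jar");
      try {
        if (!output.toFile().setReadOnly()) {
          return "could not mark file read-only";
        }
        return tryOverwrite(output);
      } finally {
        // Windows refuses to delete read-only files, so clear the flag first.
        output.toFile().setWritable(true);
        Files.deleteIfExists(output);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(overwriteReadOnlyTempFile());
  }
}
```

On a normal user account I'd expect this to report AccessDeniedException. A privileged account on a POSIX system can bypass the permission check, in which case it reports "written".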

@laszlocsomor
Contributor

laszlocsomor commented May 10, 2017

@cushon -- FYI JimFS issues on Windows, scroll down for more info

I did an experiment with Turbine built at HEAD and with an older one.

In both cases, I built a custom bazel binary and ran it once so it'd extract its embedded tools. I then moved the turbine jar aside and created an empty file in its place, setting the mtime to 10 years in the future so bazel wouldn't notice my tampering. Finally I built bazel with this custom bazel and the corrupt turbine jar, using --verbose_failures. This guaranteed that exactly the turbine actions would fail, so I could grab the command as well as the params file. I then edited the params file to use output paths under my control, where I created the output files and set them read-only.
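
The jar swap can be sketched in Java roughly as follows. This is only an illustration of the steps described, not what was actually run: the class name, the helper names, and the temp-directory demo are assumptions, and 3650 days stands in for "10 years".

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Duration;
import java.time.Instant;

class CorruptToolJar {
  /** Moves {@code jar} to {@code backup}, leaving an empty, future-dated file at {@code jar}. */
  static void replaceWithEmptyFile(Path jar, Path backup) throws IOException {
    Files.move(jar, backup);
    Files.createFile(jar);
    // Roughly 10 years ahead, so the empty file looks newer than anything extracted.
    Instant future = Instant.now().plus(Duration.ofDays(3650));
    Files.setLastModifiedTime(jar, FileTime.from(future));
  }

  /** Runs the replacement on a throwaway file in a temp directory and returns its path. */
  static Path demo() {
    try {
      Path dir = Files.createTempDirectory("tamper-demo");
      Path jar = dir.resolve("turbine_deploy.jar");
      Files.write(jar, "not really a jar".getBytes(StandardCharsets.UTF_8));
      replaceWithEmptyFile(jar, dir.resolve("turbine_deploy.jar.bak"));
      return jar;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path jar = demo();
    System.out.println(Files.size(jar) + " bytes, mtime " + Files.getLastModifiedTime(jar));
  }
}
```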

Turbine at HEAD (65b0612):

C:\tmp-cmd\o6Hbs7N0\execroot\bazel>external\local_jdk\bin\java.exe -Xverify:none -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -Xbootclasspath/p:external\bazel_tools\third_party\java\jdk\langtools\javac-9-dev-r4023-1.jar -jar external/bazel_tools/tools/jdk/turbine_deploy.jar @c:\tempdir\turbine.params
java.nio.file.InvalidPathException: Jimfs does not currently support the Windows syntax for an absolute path on the current drive (e.g. "\foo\bar": /classes
        at com.google.common.jimfs.WindowsPathType.parsePath(WindowsPathType.java:79)
        at com.google.common.jimfs.PathService.parsePath(PathService.java:209)
        at com.google.common.jimfs.JimfsFileSystem.getPath(JimfsFileSystem.java:270)
        at com.google.common.jimfs.JimfsFileSystem.getPath(JimfsFileSystem.java:171)
        at com.google.devtools.build.java.turbine.javac.JavacTurbineCompiler.compile(JavacTurbineCompiler.java:71)
        at com.google.devtools.build.java.turbine.javac.JavacTurbine.compile(JavacTurbine.java:197)
        at com.google.devtools.build.java.turbine.javac.JavacTurbine.compile(JavacTurbine.java:79)
        at com.google.devtools.build.java.turbine.Turbine.javacTurbineCompile(Turbine.java:74)
        at com.google.devtools.build.java.turbine.Turbine.compile(Turbine.java:62)
        at com.google.devtools.build.java.turbine.Turbine.main(Turbine.java:33)

Params file:

C:\tmp-cmd\o6Hbs7N0\execroot\bazel>type c:\tempdir\turbine.params
--output
c:\tempdir\turbine-out-hjar.jar
--output_deps
c:\tempdir\turbine-out-hjar.jdeps
--temp_dir
c:\tempdir\turbine-out-temp
--bootclasspath
external\local_jdk\jre\lib\rt.jar
external\local_jdk\jre\lib\resources.jar
external\local_jdk\jre\lib\jsse.jar
external\local_jdk\jre\lib\jce.jar
external\local_jdk\jre\lib\charsets.jar
external\local_jdk\jre\lib\ext\access-bridge-64.jar
external\local_jdk\jre\lib\ext\cldrdata.jar
external\local_jdk\jre\lib\ext\dnsns.jar
external\local_jdk\jre\lib\ext\jaccess.jar
external\local_jdk\jre\lib\ext\localedata.jar
external\local_jdk\jre\lib\ext\nashorn.jar
external\local_jdk\jre\lib\ext\sunec.jar
external\local_jdk\jre\lib\ext\sunjce_provider.jar
external\local_jdk\jre\lib\ext\sunmscapi.jar
external\local_jdk\jre\lib\ext\sunpkcs11.jar
external\local_jdk\jre\lib\ext\zipfs.jar
--sources
src\main\java\com\google\devtools\build\lib\util\OS.java
--javacopts
-source
8
-target
8
-encoding
UTF-8
-XDskipDuplicateBridges=true
--rule_kind
java_library
--target_label
//src/main/java/com/google/devtools/build/lib:os_util
--classpath
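
To make the InvalidPathException above easier to read: on Windows, a path like /classes starts with a separator but has no drive letter, so it is "absolute on the current drive", which is the form the Jimfs message says it does not support. The classifier below is a simplified illustration of those path categories. It is not Jimfs's parser, and the class and enum names are made up.

```java
class WindowsPathKind {
  enum Kind { UNC, DRIVE_ABSOLUTE, DRIVE_RELATIVE, CURRENT_DRIVE_ABSOLUTE, RELATIVE }

  /** Classifies a Windows-style path string by how it begins. */
  static Kind classify(String path) {
    // Treat '/' and '\' as equivalent separators.
    String p = path.replace('/', '\\');
    if (p.startsWith("\\\\")) {
      return Kind.UNC;
    }
    if (p.startsWith("\\")) {
      // Leading separator without a drive letter, e.g. "\foo\bar" or "/classes".
      return Kind.CURRENT_DRIVE_ABSOLUTE;
    }
    boolean hasDrive =
        p.length() >= 2 && Character.isLetter(p.charAt(0)) && p.charAt(1) == ':';
    if (hasDrive) {
      return p.length() >= 3 && p.charAt(2) == '\\' ? Kind.DRIVE_ABSOLUTE : Kind.DRIVE_RELATIVE;
    }
    return Kind.RELATIVE;
  }

  public static void main(String[] args) {
    System.out.println(classify("/classes")); // CURRENT_DRIVE_ABSOLUTE
    System.out.println(classify("c:\\tempdir\\turbine-out-hjar.jar")); // DRIVE_ABSOLUTE
    System.out.println(classify("external\\local_jdk\\jre\\lib\\rt.jar")); // RELATIVE
  }
}
```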

So apparently the new JimFS code isn't working on Windows -- @cushon FYI. That was introduced in 58a615c, so I used its parent commit for the old Turbine:

C:\tmp-cmd2\o6Hbs7N0\execroot\bazel>external\local_jdk\bin\java.exe -Xverify:none -XX:+TieredCompilation -XX:TieredStopAtLevel=1 -Xbootclasspath/p:external/bazel_tools/third_party/java/jdk/langtools/javac-9-dev-r3297-4.jar -jar external/bazel_tools/tools/jdk/turbine_deploy.jar.bak @c:\tempdir\turbine-old.params
Exception in thread "main" java.nio.file.AccessDeniedException: c:\tempdir\turbine-old-out-hjar.jar
        at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:83)
        at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
        at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
        at sun.nio.fs.WindowsFileSystemProvider.newByteChannel(WindowsFileSystemProvider.java:230)
        at java.nio.file.spi.FileSystemProvider.newOutputStream(FileSystemProvider.java:434)
        at java.nio.file.Files.newOutputStream(Files.java:216)
        at com.google.devtools.build.java.turbine.javac.JavacTurbine.emitClassJar(JavacTurbine.java:270)
        at com.google.devtools.build.java.turbine.javac.JavacTurbine.compile(JavacTurbine.java:219)
        at com.google.devtools.build.java.turbine.javac.JavacTurbine.compile(JavacTurbine.java:81)
        at com.google.devtools.build.java.turbine.Turbine.javacTurbineCompile(Turbine.java:74)
        at com.google.devtools.build.java.turbine.Turbine.compile(Turbine.java:62)
        at com.google.devtools.build.java.turbine.Turbine.main(Turbine.java:33)

Params file:

C:\tmp-cmd\o6Hbs7N0\execroot\bazel>type c:\tempdir\turbine-old.params
--output
c:/tempdir/turbine-old-out-hjar.jar
--output_deps
c:/tempdir/turbine-old-out-hjar.jdeps
--temp_dir
bazel-out/msvc_x64-fastbuild/bin/src/java_tools/buildjar/java/com/google/devtools/build/buildjar/_javac/invalid_command_line_exception/libinvalid_command_line_exception-hjar_temp
--bootclasspath
external/local_jdk/jre/lib/rt.jar
external/local_jdk/jre/lib/resources.jar
external/local_jdk/jre/lib/jsse.jar
external/local_jdk/jre/lib/jce.jar
external/local_jdk/jre/lib/charsets.jar
external/local_jdk/jre/lib/ext/access-bridge-64.jar
external/local_jdk/jre/lib/ext/cldrdata.jar
external/local_jdk/jre/lib/ext/dnsns.jar
external/local_jdk/jre/lib/ext/jaccess.jar
external/local_jdk/jre/lib/ext/jfxrt.jar
external/local_jdk/jre/lib/ext/localedata.jar
external/local_jdk/jre/lib/ext/nashorn.jar
external/local_jdk/jre/lib/ext/sunec.jar
external/local_jdk/jre/lib/ext/sunjce_provider.jar
external/local_jdk/jre/lib/ext/sunmscapi.jar
external/local_jdk/jre/lib/ext/sunpkcs11.jar
external/local_jdk/jre/lib/ext/zipfs.jar
--sources
src/java_tools/buildjar/java/com/google/devtools/build/buildjar/InvalidCommandLineException.java
--javacopts
-source
8
-target
8
-encoding
UTF-8
-XDskipDuplicateBridges=true
--rule_kind
java_library
--target_label
//src/java_tools/buildjar/java/com/google/devtools/build/buildjar:invalid_command_line_exception
--classpath

This error looks like the one in @kchodorow's original post.
I don't know whether setting the file to read-only is the only way to trigger the AccessDeniedException; maybe it isn't. But so far this is my most plausible explanation. I'm a bit worried about confirmation bias though.

@laszlocsomor
Contributor

So it's confirmed that @kchodorow saw the error with Bazel 0.4.5 and the older Turbine. My theory is that something set Turbine's previous output files to read-only.

@damienmg : Do you think it's possible that Jenkins set some action's outputs as read-only between CI runs?

@cushon
Contributor

cushon commented May 10, 2017

I mailed a fix for the Windows path issue. I'm not sure about the AccessDeniedException.

@kchodorow
Contributor Author

@cushon
Contributor

cushon commented May 10, 2017

The action responsible for those turbine invocations does a two-tiered spawn where it retries failing invocations with different options:

protected void internalExecute(ActionExecutionContext actionExecutionContext)

I wonder if that could be causing it to try to write the same output twice? I'm not sure why that would be non-deterministic or only affect Windows, though.
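
A rough sketch of what a two-tiered spawn looks like in general, to show why the same output could be opened twice. This is not the Bazel action's code: the class, the method, and the "fast"/"fallback" option names are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

class TwoTierSpawn {
  /**
   * Tries each option set in order against the same output path and returns how many
   * attempts were made before one succeeded.
   */
  static int attemptsUntilSuccess(
      List<String> optionSets, String output, BiPredicate<String, String> spawn) {
    int attempts = 0;
    for (String options : optionSets) {
      attempts++;
      // Every attempt is handed the same output path.
      if (spawn.test(options, output)) {
        return attempts;
      }
    }
    throw new IllegalStateException("all " + attempts + " attempts failed for " + output);
  }

  public static void main(String[] args) {
    // Hypothetical spawn: the first tier fails, the fallback succeeds.
    BiPredicate<String, String> spawn = (options, output) -> options.equals("fallback");
    int attempts =
        attemptsUntilSuccess(Arrays.asList("fast", "fallback"), "libsyntax-hjar.jar", spawn);
    System.out.println(attempts); // prints 2
  }
}
```

If the first attempt left the output behind in some unwritable state, the second attempt would hit it, but that is speculation on top of the question above.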

@laszlocsomor
Contributor

@cushon: Thanks again! We may want to release a Turbine deploy jar as soon as your bugfix is in, and cherry-pick that into bazel 0.5.0.

@laszlocsomor
Contributor

laszlocsomor commented May 11, 2017

(see #2692)

@cushon
Contributor

cushon commented May 11, 2017

@laszlocsomor I don't think you need a deploy jar - the turbine code is split across https://github.com/google/turbine (which is checked in to the Bazel repo as a jar in third_party), and src/java_tools/buildjar/java/com/google/devtools/build/java/turbine which is rebuilt from source with Bazel. This fix was in the latter.

bazel-io pushed a commit that referenced this issue May 11, 2017
@dslomov
Contributor

dslomov commented May 11, 2017 via email

@laszlocsomor
Contributor

Is it possible to switch to 5.0rc bazel.exe on Windows as a bootstrapper?

Done. I logged in to all 4 Windows workers, downloaded the msys-less 0.5.0rc6 binary into c:/bazel_ci/installs/0.5.0rc6-msvc/bazel.exe, and updated the c:/bazel_ci/installs/{bootstrap,latest} junctions to point to 0.5.0rc6-msvc.

The first CI run to use the new binary will be http://ci.bazel.io/view/Bazel%20bootstrap%20and%20maintenance/job/Bazel/1497/

@laszlocsomor
Contributor

bazel-io pushed a commit that referenced this issue Jun 7, 2017
This will help diagnose
#2977,
which I believed to have been fixed by the release
of 0.5.0, however it seems to have failed once
more with 0.5.0, alas without knowing the actual
bootstrap bazel's version I cannot be sure.

Change-Id: I71e100c549b4ef30699efe6363b72eb792ad1c23
PiperOrigin-RevId: 158243584
@laszlocsomor
Contributor

I haven't seen this bug in a while.
