Skip to content
This repository has been archived by the owner on May 15, 2018. It is now read-only.

ERROR: Cannot delete workspace: java.nio.file.FileSystemException #358

Closed
whimboo opened this issue Nov 9, 2013 · 11 comments
Closed

ERROR: Cannot delete workspace: java.nio.file.FileSystemException #358

whimboo opened this issue Nov 9, 2013 · 11 comments

Comments

@whimboo
Copy link
Contributor

whimboo commented Nov 9, 2013

Strangely I haven't found an issue for this. So filing it now.

It happened today again on Windows 8.1 64-1 when it tried to run functional tests for Aurora. It looks like this is the only testrun which is affected by this problem on this node. Other testruns were run fine, and even the aurora one succeeded on other nodes.

Here the output:

04:08:48 Building remotely on mm-win-81-64-1 in workspace c:\jenkins\workspace\mozilla-aurora_functional
04:08:48
04:08:58 Deleting project workspace... Cannot delete workspace: java.nio.file.FileSystemException: c:\jenkins\workspace\mozilla-aurora_functional\mozmill-env\python\Lib\site-packages\mercurial\extensions.py: The process cannot access the file because it is being used by another process.
04:08:58
04:08:58 ERROR: Cannot delete workspace: java.nio.file.FileSystemException: c:\jenkins\workspace\mozilla-aurora_functional\mozmill-env\python\Lib\site-packages\mercurial\extensions.py: The process cannot access the file because it is being used by another process.
04:08:58

I have seen that the file in question has a size of 0.

@whimboo
Copy link
Contributor Author

whimboo commented Nov 9, 2013

I used the Process Explorer to check if something is wrong with that file but it doesn't look like that. Then I rechecked with handle and voila. The only process which has a handle to that file is the running javaw.exe process. No other file has a handle open to it. That means that Java or Jenkins itself keeps a handle open to that file.

Restarting Jenkins here was totally enough. I haven't had to delete the workspace at all. I assume that the same is also the case for all the other instances.

Sadly I wasn't able to check the job which executed that testrun the last time on this box, given that it happened on Nov 2nd. I cannot access this anymore through the Jenkins CI.

@whimboo
Copy link
Contributor Author

whimboo commented Nov 9, 2013

The Jenkins issue for this is https://issues.jenkins-ci.org/browse/JENKINS-19994

@davehunt
Copy link
Member

From looking at the plugin source code, it appears that we're hitting this exception in Jenkins core. I've added a log record for the plugin here: http://mm-ci-master.qa.scl3.mozilla.com:8080/log/Workspace%20Cleanup%20plugin/ so we should hopefully see a full stack trace next time this happens so we can find out where this exception is thrown.

If we reset/restart production the logger may need to be re-added. This can be done by visiting http://mm-ci-master.qa.scl3.mozilla.com:8080/log/ and clicking 'Add new log recorder'. Name: 'Workspace Cleanup plugin', logger: 'hudson.plugins.ws_cleanup.PreBuildCleanup'

@whimboo
Copy link
Contributor Author

whimboo commented Nov 14, 2013

Thanks Dave! If that problem doesn't reoccur until Monday, I will try to force it by restarting a win8 vm at home. I would assume it should be reproducible on other systems (beside production) too.

@whimboo
Copy link
Contributor Author

whimboo commented Nov 25, 2013

This happened again out of sudden during the ondemand update tests today. The previous ondemand_update test from earlier today run fine without any errors or warnings. So I'm not sure what has been caused this. Restarting Jenkins slave on mm-win-8-32-1 fixed the problem again.

@whimboo
Copy link
Contributor Author

whimboo commented Nov 25, 2013

Failure as seen in the Jenkins log for the workspace plugin:

Nov 25, 2013 12:03:45 PM hudson.plugins.ws_cleanup.PreBuildCleanup preCheckout
SEVERE: null
hudson.util.IOException2: remote file operation failed: c:\jenkins\workspace\ondemand_update at hudson.remoting.Channel@84ffe2:mm-win-8-32-1
at hudson.FilePath.act(FilePath.java:906)
at hudson.FilePath.act(FilePath.java:883)
at hudson.FilePath.deleteRecursive(FilePath.java:1042)
at hudson.plugins.ws_cleanup.PreBuildCleanup.preCheckout(PreBuildCleanup.java:84)
at jenkins.scm.SCMCheckoutStrategy.preCheckout(SCMCheckoutStrategy.java:78)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:559)
at hudson.model.Run.execute(Run.java:1592)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:237)
Caused by: java.nio.file.FileSystemException: c:\jenkins\workspace\ondemand_update\mozmill-env-windows\python\Lib\test\decimaltestdata\power.decTest: The process cannot access the file because it is being used by another process.
at sun.nio.fs.WindowsException.translateToIOException(Unknown Source)
at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.WindowsException.rethrowAsIOException(Unknown Source)
at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source)
at sun.nio.fs.AbstractFileSystemProvider.delete(Unknown Source)
at java.nio.file.Files.delete(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at hudson.Util.deleteFile(Util.java:241)
at hudson.Util.deleteRecursive(Util.java:309)
at hudson.Util.deleteContentsRecursive(Util.java:206)
at hudson.Util.deleteRecursive(Util.java:300)
at hudson.Util.deleteContentsRecursive(Util.java:206)
at hudson.Util.deleteRecursive(Util.java:300)
at hudson.Util.deleteContentsRecursive(Util.java:206)
at hudson.Util.deleteRecursive(Util.java:300)
at hudson.Util.deleteContentsRecursive(Util.java:206)
at hudson.Util.deleteRecursive(Util.java:300)
at hudson.Util.deleteContentsRecursive(Util.java:206)
at hudson.Util.deleteRecursive(Util.java:300)
at hudson.Util.deleteContentsRecursive(Util.java:206)
at hudson.Util.deleteRecursive(Util.java:300)
at hudson.FilePath$11.invoke(FilePath.java:1045)
at hudson.FilePath$11.invoke(FilePath.java:1042)
at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2394)
at hudson.remoting.UserRequest.perform(UserRequest.java:118)
at hudson.remoting.UserRequest.perform(UserRequest.java:48)
at hudson.remoting.Request$2.run(Request.java:326)
at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at hudson.remoting.Engine$1$1.run(Engine.java:63)
at java.lang.Thread.run(Unknown Source)

@whimboo
Copy link
Contributor Author

whimboo commented Nov 25, 2013

I have a hard time reading this stack. But if I see it correctly we already are failing before the cleanup_workspace plugin is called. Could this be a bug in Util.java? @davehunt what do you think?

@whimboo
Copy link
Contributor Author

whimboo commented Nov 25, 2013

So in the above case the problematic file is again of size 0 and has been accessed the last time at 1:05am today. By that time we haven't run any job on that machine!! So some background task could have caused this problem. While thinking about that I can remember that we had a couple of rotate log failures. Not sure if that could be related.

@whimboo
Copy link
Contributor Author

whimboo commented Nov 25, 2013

Indeed. The rotate log task is also failing:

SEVERE: Failed to rotate log
java.io.IOException: /data/mozmill-ci/./jenkins-master/jobs/ondemand_update/builds/2013-11-19_11-39-38 is in use
at hudson.model.Run.delete(Run.java:1381)
at hudson.tasks.LogRotator.perform(LogRotator.java:133)
at hudson.model.Job.logRotate(Job.java:407)
at hudson.model.Run.execute(Run.java:1655)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:237)

@whimboo
Copy link
Contributor Author

whimboo commented Nov 25, 2013

@davehunt would there be an easy way to check if one of those two try blocks are causing the 0 Byte size and leaving open handles behind?

https://github.com/jenkinsci/jenkins/blob/1.509/core/src/main/java/hudson/Util.java#L275

@davehunt
Copy link
Member

I'm not sure, it's probably best to escalate this to the Jenkins dev team.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants