Jenkins unable to clean up Windows host workspaces #477

Closed
karianna opened this issue Sep 6, 2018 · 11 comments

@karianna
Contributor

karianna commented Sep 6, 2018

Jenkins cannot clean up workspaces on the Windows hosts because it lacks permission to delete certain files.
https://ci.adoptopenjdk.net/administrativeMonitor/AsyncResourceDisposer/

@karianna karianna added the bug label Sep 6, 2018
@sxa
Member

sxa commented Sep 7, 2018

@karianna I'm not sure who's best placed to look at this. Can you suggest someone? I've not seen that job before and I can't see where its definition is (there's no configure link in the UI, so I'm assuming it's magic).

@mleipe

mleipe commented Sep 7, 2018

Is this something you can help with, @AdamBrousseau? I know you've got some experience in this area.

@AdamBrousseau
Contributor

I looked at this issue briefly yesterday, but the description lacks detail and the link posted is not public.

@karianna
Contributor Author

karianna commented Sep 7, 2018

| Resource | State | Tracked since |
| --- | --- | --- |
| Workspace build-azure-win2008r2-x64-1:C:\Users\jenkins\workspace\openjdk8_build_x86-64_windows_ws-cleanup_1536145216864 | java.io.IOException: Unable to delete 'C:\Users\jenkins\workspace\openjdk8_build_x86-64_windows_ws-cleanup_1536145216864\openjdk\build\windows-x86_64-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts. | 2018-09-05 13:00 |
| Workspace build-azure-win2008r2-x64-1:C:\Users\jenkins\workspace\openjdk8_build_x86-32_windows_ws-cleanup_1536146405464 | java.io.IOException: Unable to delete 'C:\Users\jenkins\workspace\openjdk8_build_x86-32_windows_ws-cleanup_1536146405464\openjdk\build\windows-x86-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts. | 2018-09-05 13:20 |
| Workspace build-softlayer-win2012r2-x64-1:C:\Users\jenkins\jdk8_j9_ws-cleanup_1536147014792 | java.io.IOException: Unable to delete 'C:\Users\jenkins\jdk8_j9_ws-cleanup_1536147014792\openjdk\build\windows-x86-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts. | 2018-09-05 13:30 |
| Workspace build-softlayer-win2012r2-x64-1:C:\Users\jenkins\jdk8_j9_ws-cleanup_1536149256802 | java.io.IOException: Unable to delete 'C:\Users\jenkins\jdk8_j9_ws-cleanup_1536149256802\openjdk\build\windows-x86_64-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts. | 2018-09-05 14:07 |
| Workspace build-azure-win2008r2-x64-1:C:\Users\jenkins\workspace\openjdk8_build_x86-64_windows_ws-cleanup_1536231615052 | java.io.IOException: Unable to delete 'C:\Users\jenkins\workspace\openjdk8_build_x86-64_windows_ws-cleanup_1536231615052\openjdk\build\windows-x86_64-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts. | 2018-09-06 13:00 |
| Workspace build-azure-win2008r2-x64-1:C:\Users\jenkins\workspace\openjdk8_build_x86-32_windows_ws-cleanup_1536232119465 | java.io.IOException: Unable to delete 'C:\Users\jenkins\workspace\openjdk8_build_x86-32_windows_ws-cleanup_1536232119465\openjdk\build\windows-x86-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts. | 2018-09-06 13:08 |
| Workspace build-softlayer-win2012r2-x64-1:C:\Users\jenkins\jdk8_j9_ws-cleanup_1536234234602 | java.io.IOException: Unable to delete 'C:\Users\jenkins\jdk8_j9_ws-cleanup_1536234234602\openjdk\build\windows-x86-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts. | 2018-09-06 13:43 |
| Workspace build-softlayer-win2012r2-x64-1:C:\Users\jenkins\jdk8_j9_ws-cleanup_1536236444707 | java.io.IOException: Unable to delete 'C:\Users\jenkins\jdk8_j9_ws-cleanup_1536236444707\openjdk\build\windows-x86_64-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts. | 2018-09-06 14:20 |
| Workspace build-softlayer-win2012r2-x64-1:C:\Users\jenkins\jdk8_j9_ws-cleanup_1536319812551 | java.io.IOException: Unable to delete 'C:\Users\jenkins\jdk8_j9_ws-cleanup_1536319812551\openjdk\build\windows-x86-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts. | 2018-09-07 13:30 |
| Workspace build-softlayer-win2012r2-x64-1:C:\Users\jenkins\jdk8_j9_ws-cleanup_1536322112399 | java.io.IOException: Unable to delete 'C:\Users\jenkins\jdk8_j9_ws-cleanup_1536322112399\openjdk\build\windows-x86_64-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts. | 2018-09-07 14:08 |
| Workspace master:/home/jenkins/.jenkins/workspace/openjdk_release_tool_ws-cleanup_1536326818125 | To dispose | 2018-09-07 15:26 |

@karianna
Contributor Author

karianna commented Sep 7, 2018

Longer stack trace as an example for the 1st item.

java.nio.file.DirectoryNotEmptyException: C:\Users\jenkins\workspace\openjdk8_build_x86-64_windows_ws-cleanup_1536145216864\openjdk\build\windows-x86_64-normal-server-release\jdk\gensrc\java\nio
	at sun.nio.fs.WindowsFileSystemProvider.implDelete(Unknown Source)
	at sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(Unknown Source)
	at java.nio.file.Files.deleteIfExists(Unknown Source)
	at hudson.Util.tryOnceDeleteFile(Util.java:316)
Caused: java.io.IOException: Unable to delete C:\Users\jenkins\workspace\openjdk8_build_x86-64_windows_ws-cleanup_1536145216864\openjdk\build\windows-x86_64-normal-server-release\jdk\gensrc\java\nio - files in dir: [C:\Users\jenkins\workspace\openjdk8_build_x86-64_windows_ws-cleanup_1536145216864\openjdk\build\windows-x86_64-normal-server-release\jdk\gensrc\java\nio\_the..]
	at hudson.Util.tryOnceDeleteFile(Util.java:322)
	at hudson.Util.deleteFile(Util.java:272)
Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to JNLP4-connect connection from 40.117.211.183/40.117.211.183:55014
		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1741)
		at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
		at hudson.remoting.Channel.call(Channel.java:955)
		at hudson.FilePath.act(FilePath.java:1036)
		at hudson.FilePath.act(FilePath.java:1025)
		at hudson.FilePath.deleteRecursive(FilePath.java:1230)
		at hudson.plugins.ws_cleanup.Wipeout.performDelete(Wipeout.java:78)
		at hudson.plugins.ws_cleanup.Wipeout$DisposableImpl.dispose(Wipeout.java:110)
		at org.jenkinsci.plugins.resourcedisposer.AsyncResourceDisposer$WorkItem.run(AsyncResourceDisposer.java:287)
		at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
		at java.lang.Thread.run(Thread.java:748)
Caused: java.io.IOException: Unable to delete 'C:\Users\jenkins\workspace\openjdk8_build_x86-64_windows_ws-cleanup_1536145216864\openjdk\build\windows-x86_64-normal-server-release\jdk\gensrc\java\nio'. Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts.
	at hudson.Util.deleteFile(Util.java:277)
	at hudson.FilePath.deleteRecursive(FilePath.java:1265)
	at hudson.FilePath.deleteContentsRecursive(FilePath.java:1274)
	at hudson.FilePath.deleteRecursive(FilePath.java:1256)
	at hudson.FilePath.deleteContentsRecursive(FilePath.java:1274)
	at hudson.FilePath.deleteRecursive(FilePath.java:1256)
	at hudson.FilePath.deleteContentsRecursive(FilePath.java:1274)
	at hudson.FilePath.deleteRecursive(FilePath.java:1256)
	at hudson.FilePath.deleteContentsRecursive(FilePath.java:1274)
	at hudson.FilePath.deleteRecursive(FilePath.java:1256)
	at hudson.FilePath.deleteContentsRecursive(FilePath.java:1274)
	at hudson.FilePath.deleteRecursive(FilePath.java:1256)
	at hudson.FilePath.deleteContentsRecursive(FilePath.java:1274)
	at hudson.FilePath.deleteRecursive(FilePath.java:1256)
	at hudson.FilePath.deleteContentsRecursive(FilePath.java:1274)
	at hudson.FilePath.deleteRecursive(FilePath.java:1256)
	at hudson.FilePath.access$1200(FilePath.java:210)
	at hudson.FilePath$DeleteRecursive.invoke(FilePath.java:1236)
	at hudson.FilePath$DeleteRecursive.invoke(FilePath.java:1232)
	at hudson.FilePath$FileCallableWrapper.call(FilePath.java:2917)
	at hudson.remoting.UserRequest.perform(UserRequest.java:212)
	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
	at hudson.remoting.Request$2.run(Request.java:369)
	at hudson.remoting.InterceptingExecutorService$1.call(InterceptingExecutorService.java:72)
	at java.util.concurrent.FutureTask.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at hudson.remoting.Engine$1.lambda$newThread$0(Engine.java:93)
	at java.lang.Thread.run(Unknown Source)
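
The message "Tried 3 times (of a maximum of 3) waiting 0.1 sec between attempts" describes a bounded retry around a single delete call. The Java below is a minimal sketch of that pattern, not the `hudson.Util` source; the class `RetryDelete`, the method `deleteWithRetry`, and its parameters are hypothetical names chosen for illustration. It suggests why retrying the parent directory cannot help while a child entry (here the `_the..` file) stays undeletable: each attempt ends in the same `DirectoryNotEmptyException`.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not Jenkins code: a bounded retry around one delete call.
final class RetryDelete {

    /**
     * Attempts to delete one file or one empty directory, at most maxAttempts times,
     * pausing waitMillis between attempts. Returns true once the path no longer exists.
     */
    static boolean deleteWithRetry(Path path, int maxAttempts, long waitMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Files.deleteIfExists(path); // no error if the path is already gone
                return true;
            } catch (DirectoryNotEmptyException e) {
                // A child entry is still present, so the directory itself cannot be removed.
                System.err.println("Attempt " + attempt + ": directory not empty: " + path);
            } catch (IOException e) {
                // For example a locked file or insufficient permissions.
                System.err.println("Attempt " + attempt + ": " + e);
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(waitMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and give up
                    return false;
                }
            }
        }
        return false;
    }
}
```

With a call such as `RetryDelete.deleteWithRetry(dir, 3, 100)`, a directory that still contains an entry is reported as not deleted after all three attempts, which matches the shape of the errors listed above.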

@ali-ince
Contributor

I would suggest deleting the current leftover directories (they consume 48GB of space) and adding a post-build step that cleans the workspace directory and propagates any cleanup failure to the build itself, so that we have better control over, and feedback on, what's going on.

If that sounds OK to you, I can make the changes.

@AdamBrousseau
Contributor

I'm not too familiar with the plugin you're using as I've not used it before. Basically, what we have done as a workaround at OpenJ9 is to wrap cleanWs() in a try/catch so that a cleanup failure does not fail the build. We also have a cleanup job that runs on a schedule and removes all the workspaces from all nodes in parallel.
The important part of the script is this:

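// Run from the parent directory that holds every job's workspace on this node
dir("${WORKSPACE}/../") {
    // A single pass can fail (possibly only on Windows), so try twice
    retry(2) {
        // Remove every entry except those matching the current job's name
        sh "ls | grep -v ${JOB_NAME} | xargs rm -rf"
    }
}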

I can't recall exactly why the retry is needed but if you only run it once you will get failures, possibly only on Windows.
The grep is there because you will also get a failure if you try and remove the current job's workspace.
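
The `grep -v ${JOB_NAME}` filter can be sketched as a plain name filter. The Java below is a minimal illustration under assumptions: `WorkspaceFilter` and `selectForRemoval` are hypothetical names, the directory names are made up, and it uses a literal substring test, whereas `grep` treats the job name as a regular expression. It also shows a side effect of substring matching: any other directory whose name merely contains the current job name is skipped as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper mirroring `ls | grep -v ${JOB_NAME}` with a literal substring test.
final class WorkspaceFilter {

    /** Returns the directory names that do NOT contain jobName, in their original order. */
    static List<String> selectForRemoval(List<String> names, String jobName) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (!name.contains(jobName)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example directory names are made up for illustration.
        List<String> names = Arrays.asList(
                "cleanup_job", "cleanup_job@tmp", "build_a", "build_b_ws-cleanup_1");
        // Prints [build_a, build_b_ws-cleanup_1]
        System.out.println(selectForRemoval(names, "cleanup_job"));
    }
}
```

Both `cleanup_job` and `cleanup_job@tmp` survive because each contains the job name, so the running job's own directories are left alone.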

@karianna
Contributor Author

@ali-ince - sounds good

@ali-ince
Contributor

ali-ince commented Sep 11, 2018

OK, I've spent some time resolving this issue; my earlier suggestions didn't help at all :)

I've restored the original behaviour of the ws-cleanup plugin, so that the workspace folder is renamed with a ws-cleanup suffix (plus some other random characters), and created a scheduled task on Windows to remove those folders with the following command:

powershell.exe -command "Remove-Item -Force -Recurse \\?\C:\Users\Jenkins\workspace\*ws-cleanup*"

I tried deleting the folder with this same command from within the build, but it didn't help: it failed with the same error as logged above, because the nio folder contains a weirdly named file, _the.., which causes the failure. I also found that the command succeeds if it is executed with administrative privileges.

I'll check whether the other Windows build hosts are suffering from the same problem.

@AdamBrousseau
Contributor

When is the schedule set to run? How will you guarantee there is currently no job executing on the slave?

@ali-ince
Contributor

ali-ince commented Sep 11, 2018

Sorry @AdamBrousseau, I mistyped the command above (and have fixed it now). It only deletes folders that have ws-cleanup in their name, so it shouldn't affect running jobs.

It'll run daily at 9:30 pm.
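
The selection rule described here, only directories with ws-cleanup in their name, can be expressed as a glob. The Java below is an illustrative sketch only, not the scheduled task itself (which uses PowerShell); `LeftoverMatcher` and `isLeftover` are hypothetical names, and `jdk8_j9` is assumed to be the name of an active workspace.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

// Hypothetical helper: decides whether a directory name looks like a ws-cleanup leftover.
final class LeftoverMatcher {

    // Same wildcard as in the scheduled command: *ws-cleanup*
    private static final PathMatcher LEFTOVER =
            FileSystems.getDefault().getPathMatcher("glob:*ws-cleanup*");

    /** True if the single directory name (not a full path) matches *ws-cleanup*. */
    static boolean isLeftover(String directoryName) {
        return LEFTOVER.matches(Paths.get(directoryName));
    }

    public static void main(String[] args) {
        // A renamed workspace from the report above: prints true
        System.out.println(isLeftover("jdk8_j9_ws-cleanup_1536147014792"));
        // Assumed name of an active workspace: prints false
        System.out.println(isLeftover("jdk8_j9"));
    }
}
```

Because an active workspace keeps its original name until the plugin renames it, a name-based filter like this leaves it untouched.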
