System unavailable: test-azure-win2012r2-x64-1: ERROR: Cannot delete workspace #2209
Comments
This may be related to the test cases. The job was cancelled by a timeout and the test thread was left hanging.
This will stop some of the hanging tests from running: adoptium/aqa-tests#2639
Same issue with test-ibmcloud-win2012r2-x64-1, which failed the job https://ci.adoptopenjdk.net/job/Test_openjdk8_hs_sanity.openjdk_x86-64_windows/508/. adoptium/aqa-tests#2639 will stop new hangs from happening. We will need some process to clean all Windows workspaces so that the error does not recur and no job fails because of it.
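As a rough illustration of what such a cleanup process could look like, here is a minimal Python sketch. It is not an existing script in this project; `clean_workspace`, its parameters, and the throwaway directory are all hypothetical. It retries the deletion a few times and reports failure instead of raising, so that a machine whose workspace is still locked by a hanging process can be flagged for manual attention.

```python
import os
import shutil
import tempfile
import time


def clean_workspace(path, retries=3, delay_s=0.5):
    """Delete `path`, retrying a few times; return True once it is gone.

    Hypothetical helper for illustration only.
    """
    for _ in range(retries):
        # ignore_errors=True keeps going past entries that cannot be removed,
        # for example files still held open by a hanging process on Windows.
        shutil.rmtree(path, ignore_errors=True)
        if not os.path.exists(path):
            return True
        time.sleep(delay_s)
    return False


# Demonstrate on a throwaway directory standing in for a job workspace.
workspace = tempfile.mkdtemp(prefix="workspace_")
with open(os.path.join(workspace, "output.log"), "w") as handle:
    handle.write("test output\n")

removed = clean_workspace(workspace)
```

Retrying only helps when the process holding the files is already exiting. If a process is truly hung, it has to be killed first, otherwise the function keeps returning `False`.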
I was hoping that the processCheck (see adoptium/aqa-tests#2059) would cover us for being able to delete the workspace. I wonder whether not all processes are being killed, and that is why the workspace cannot be deleted?
The kill process step is in a try/catch/finally block: https://github.com/adoptium/aqa-tests/blob/master/buildenv/jenkins/JenkinsfileBase#L736. My guess is that if the job terminates due to a timeout, the kill process step does not run at all.
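To make the guess above concrete, here is a minimal Python sketch of the general pattern. It is not the JenkinsfileBase code (which is Groovy), and `run_with_cleanup` is a hypothetical helper. The cleanup in `finally` runs after a timeout only because the timeout surfaces as an exception inside the block. If the runner itself were killed from outside, `finally` would never execute and the child process could be left behind.

```python
import subprocess
import sys


def run_with_cleanup(cmd, timeout_s):
    """Run `cmd`; always terminate the child in `finally` (hypothetical helper)."""
    proc = subprocess.Popen(cmd)
    status = "completed"
    try:
        proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        status = "timed out"
    finally:
        # Cleanup runs on normal exit and on timeout alike, but only
        # because the timeout is raised as an exception inside this block.
        if proc.poll() is None:
            proc.kill()
            proc.wait()
    return status, proc.returncode


# A child that sleeps far longer than the allowed timeout.
hanging_cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
status, returncode = run_with_cleanup(hanging_cmd, timeout_s=1)
```

Whether the Jenkins timeout behaves like the exception case or the external-kill case is exactly the open question in this thread; the sketch only shows why the difference matters.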
Three machines, test-azure-win2012r2-x64-1 and -3, and test-ibmcloud-win2012r2-x64-1, have hanging processes which I can't kill as the Jenkins user. The processes are
respectively. I get a permissions error when trying to kill them as the Jenkins user.
I was able to kill the process running on test-ibmcloud-win2012r2-x64-1. I am now running the failed Jenkins job on the machine.
The hanging processes on the Azure machines were ended too.
Running https://ci.adoptopenjdk.net/job/Test_openjdk8_dragonwell_extended.openjdk_x86-64_windows_testList_0/9/console on test-azure-win2012r2-x64-1 and https://ci.adoptopenjdk.net/job/Test_openjdk11_j9_extended.functional_x86-64_windows/140/console on test-azure-win2012r2-x64-3.
https://ci.adoptopenjdk.net/job/Test_openjdk11_j9_extended.functional_x86-64_windows/140/console failed on test-azure-win2012r2-x64-3 because it couldn't delete the workspace, which is strange.
On test-azure-win2012r2-x64-3, I've deleted the problem workspace, ended the hanging process and kicked off another job.
The job on test-ibmcloud-win2012r2-x64-1 ran without a workspace error or hanging processes. Running another job on the same machine to observe any 'failure to delete workspace' errors: https://ci.adoptopenjdk.net/job/Test_openjdk8_hs_sanity.openjdk_x86-64_windows/513/console
test-ibmcloud-win2012r2-x64-1 looks to be fine. Two sanity jobs ran back to back without a workspace complaint or a hanging process.
Test_openjdk8_dragonwell_extended.openjdk_x86-64_windows_testList_0 hung twice on test-azure-win2012r2-x64-1, this being the most recent. I ran the extended openjdk tests on a dragonwell JDK on the machine directly, and they ran to completion without hanging. I also ran https://ci.adoptopenjdk.net/view/Test_openjdk/job/Test_openjdk8_hs_extended.openjdk_x86-64_windows_testList_0/12/console on the same machine to test. This ran to completion without hanging.
The other aborted Test_openjdk8_dragonwell_extended.openjdk_x86-64_windows_testList_0 job on test-azure-win2012r2-x64-1 was https://ci.adoptopenjdk.net/job/Test_openjdk8_dragonwell_extended.openjdk_x86-64_windows_testList_0/9. Both jobs hung for 4 hours during
hotspot_jre_0 grinder from yesterday on test-azure-win2012r2-x64-1: https://ci.adoptopenjdk.net/job/Grinder/1159/console. Jenkins aborted it after it hung for 10 hours.
Rerunning on the same machine without a dragonwell build.
This ran to completion. It didn't hang, nor did it leave any hanging processes on the machine.
The same grinder ran on test-ibmcloud-win2012r2-x64-1: https://ci.adoptopenjdk.net/job/Grinder/1160/console. The test didn't pass, but the grinder did not hang as it does on test-azure-win2012r2-x64-1.
Reran the job on test-azure-win2012r2-x64-3: https://ci.adoptopenjdk.net/job/Grinder/1162/console. The test doesn't hang, so this seems to be a problem with -1 only.
I think the test class that causes the hang is TestNoMinidumpAtFullGC. It appears as the last test before the job hangs on all 3 occasions. Despite the directory being
Machine currently offline due to low disk space.
There was a
As mentioned above, the machine is quite low on disk space with 1 GB in
Machine is back online with enough space. Closing.
Jobs are failing with:
e.g. https://ci.adoptopenjdk.net/job/Grinder/804/console