Jenkins: laggy file system #2984

Open
markus2330 opened this issue Sep 19, 2019 · 18 comments

@markus2330
Contributor

commented Sep 19, 2019

https://build.libelektra.org/jenkins/blue/organizations/jenkins/libelektra/detail/master/960/pipeline

wrapper script does not seem to be touching the log file in /home/jenkins/workspace/libelektra_master-Q2SIBK3KE2NBEMJ4WVGJXAXCSCB77DUBUULVLZDKHQEV3WNDXBMA/build directory@tmp/durable-baf61d6c

(JENKINS-48300: if on a laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=300)
@sanssecours sanssecours referenced this issue Sep 19, 2019
2 of 11 tasks complete
@sanssecours

Member

commented Sep 19, 2019

This timeout error only happens on i7, by the way, which might not be powerful enough to handle the workload. Maybe we should decrease the number of simultaneous jobs for make on that machine?

@markus2330

Contributor Author

commented Sep 19, 2019

Changing the number of simultaneous jobs on only one machine is not easy anymore. I am also not sure if it will fix the problem. The load is low. (Even though shell_recorder is currently running.)

@markus2330

Contributor Author

commented Sep 19, 2019

Ok, now clang is running and the load is 8.74, 4.78, 4.21

@sanssecours

Member

commented Sep 19, 2019

I now used the Script Console to set

org.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL

to the value 300 via:

System.setProperty("org.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL", "300");

as suggested in the error message:

(JENKINS-48300: if on a laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=300)

Hopefully that update will get rid of the problem.
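
For reference, a minimal Java sketch of what such a `System.setProperty` call does in general: it changes the in-memory system properties of the JVM in which it runs, and nothing else. The class name `HeartbeatProperty` and the helper `currentInterval` are hypothetical names used only for illustration; how the durable-task plugin itself reads the property is not shown in this thread and is not assumed here.

```java
public class HeartbeatProperty {
    // Property name copied from the error message quoted in this thread.
    static final String KEY =
        "org.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL";

    // Hypothetical helper: returns the interval visible to THIS JVM,
    // or the given fallback if the property is unset or not a number.
    static int currentInterval(int fallback) {
        return Integer.getInteger(KEY, fallback);
    }

    public static void main(String[] args) {
        // Value as seen before the runtime change (null if never set).
        System.out.println("before: " + System.getProperty(KEY));

        // Same call as in the Script Console; affects only this process.
        System.setProperty(KEY, "300");

        System.out.println("after:  " + currentInterval(-1));
    }
}
```

Because the value lives only in the memory of that one process, other JVMs (for example, agents) and a restarted process would not see it unless they are given the value separately.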

@markus2330

Contributor Author

commented Sep 19, 2019

Thank you! Is this only for i7 or Jenkins in general now?

Will the setting survive the next restart?

@sanssecours

Member

commented Sep 19, 2019

Is this only for i7 or Jenkins in general now?

I think this is for all machines.

Will the setting survive the next restart?

I do not think so. According to this Stack Overflow article we can also set this property in our Jenkinsfile or various other places.

@markus2330

Contributor Author

commented Sep 20, 2019

Ok, so a permanent solution will be to add it to the Jenkinsfile.

Let us see if this now fixes the problem. So far it looks good, doesn't it?

@sanssecours

Member

commented Sep 20, 2019

Ok, so a permanent solution will be to add it to the Jenkinsfile.

Probably.

Let us see if this now fixes the problem. So far it looks good, doesn't it?

I have not seen the same problem since yesterday. However, the Jenkins build often fails because of various other issues too.

@markus2330

Contributor Author

commented Sep 20, 2019

Ok, I'll leave the issue open so that we do not forget to add this to the Jenkinsfile.

@Mistreated maybe you can take that task?

@sanssecours

Member

commented Sep 22, 2019

Ok, I'll leave the issue open so that we do not forget to add this to the Jenkinsfile.

Seems like setting

org.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL

to a value of 300 in the main Script Console did not solve the problem. I executed

System.getProperty("org.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL");

in the Script Console of i7-debian-buster. The empty result of this command indicates that the value was not set on i7-debian-buster. Consequently I executed

System.setProperty("org.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL", "300");

in the Script Console of i7. Let us hope this update fixes the issue.

@markus2330

Contributor Author

commented Sep 22, 2019

Thank you!

What about adding to the Jenkinsfile? Then it will also survive restarts.

@sanssecours

Member

commented Sep 22, 2019

What about adding to the Jenkinsfile?

That would probably be a good solution, but only if we know that setting HEARTBEAT_CHECK_INTERVAL really gets rid of the problem.

@markus2330

Contributor Author

commented Sep 27, 2019

Is this fixed now? Should we add it to the Jenkinsfile so that it survives the next reboot?

@sanssecours

Member

commented Sep 27, 2019

Is this fixed now?

Nope. One issue seems to be that HEARTBEAT_CHECK_INTERVAL does not stick if I set the value via the Script Console. It would be great if you could add the value to the Java arguments for Jenkins on i7-debian-buster.

@markus2330

Contributor Author

commented Sep 27, 2019

The Java arguments for the agents must be settable somehow via Jenkinsfile or Jenkins Configuration. Only the Java arguments for Jenkins itself (build.libelektra.org) must be set by a script not under the control of the Jenkins configuration.
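
To tell a value passed as a Java argument apart from one set at runtime, something like the following Java sketch could be run inside the JVM in question. It is only an illustration: `JvmArgCheck`, `hasDefine`, and `startedWithDefine` are hypothetical names, and where the agent's Java arguments are actually configured on this Jenkins setup is not covered here. The sketch relies on the standard `RuntimeMXBean.getInputArguments()` call, which lists the arguments the JVM was started with.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmArgCheck {
    // Property name copied from the error message quoted in this thread.
    static final String KEY =
        "org.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL";

    // True if the given JVM argument list contains "-D<key>=<value>".
    static boolean hasDefine(List<String> jvmArgs, String key) {
        String prefix = "-D" + key + "=";
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // True if the CURRENT JVM was started with "-D<key>=<value>".
    // A later System.setProperty call does not change this result.
    static boolean startedWithDefine(String key) {
        List<String> startupArgs =
            ManagementFactory.getRuntimeMXBean().getInputArguments();
        return hasDefine(startupArgs, key);
    }

    public static void main(String[] args) {
        System.out.println("started with -D: " + startedWithDefine(KEY));
        System.out.println("current value:   " + System.getProperty(KEY));
    }
}
```

If the first line prints `false` while the second shows a value, the property was set at runtime only and would be lost when that process restarts.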

@markus2330

Contributor Author

commented Oct 2, 2019

Occurred in https://build.libelektra.org/jenkins/blue/organizations/jenkins/libelektra/detail/cleanup-scripts/1/pipeline running on a7

Is the HEARTBEAT_CHECK_INTERVAL still active there?

@sanssecours

Member

commented Oct 2, 2019

Is the HEARTBEAT_CHECK_INTERVAL still active there?

I do not think so. I am also not sure it is active anywhere, since

System.getProperty("org.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL");

always seems to return an empty result after some time. We probably should find another way to set this property. I do not think adding

script {
    System.setProperty("org.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL", "300");
}

to the Jenkinsfile and allowing “In-process Script approval”, as described here is the right way, though.

@markus2330

Contributor Author

commented Oct 2, 2019

Ok, let us try passing the parameter to the Java process on the new build server.
