New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
qa/vstart_runner: don't let command run after timeout #56888
qa/vstart_runner: don't let command run after timeout #56888
Conversation
Ceph API test failure was related. Link to it - https://jenkins.ceph.com/job/ceph-api/72373/ Relevant part from consolue output -
|
jenkins test ceph api |
jenkins test api |
1 similar comment
jenkins test make check arm64 |
Ceph API too passed - https://jenkins.ceph.com/job/ceph-api/72470/. This means issue on previous run was unrelated. |
@vshankar Let's merge this PR? Testing with teuthology aren't affected by it, so need to run them. And Ceph API tests are affected but all of them passed. |
Parameter "timeout" is accepted by LocalRemote.run() but the method doesn't do anything about it besides accepting it. Thus, this parameter has no effect. In LocalRemote.run(), pass this parameter to LocalRemoteProcess.wait() and from this method pass it to subprocess.Popen.communicate(). Thus, command will be terminated by subprocess module at seconds specified by "timeout" parameter. IOW, "timeout" parameter will have an effect. Fixes: https://tracker.ceph.com/issues/65533 Signed-off-by: Rishabh Dave <ridave@redhat.com>
This issue was exposed by ceph API test failure. Link to this failure - https://jenkins.ceph.com/job/ceph-api/72373/ Copying traceback below from here https://jenkins.ceph.com/job/ceph-api/72373/consoleFull#-2010121386c212b007-e891-4176-9ee7-2f60eca393b7 - 2024-04-15 12:32:34,808.808 INFO:__main__:> ../src/vstart.sh -n --nolockdep Using guessed paths /home/jenkins-build/build/workspace/ceph-api/build/lib/ ['/home/jenkins-build/build/workspace/ceph-api/qa', '/home/jenkins-build/build/workspace/ceph-api/build/lib/cython_modules/lib.3', '/home/jenkins-build/build/workspace/ceph-api/src/pybind'] Traceback (most recent call last): File "/home/jenkins-build/build/workspace/ceph-api/build/../qa/tasks/vstart_runner.py", line 1552, in <module> exec_test() File "/home/jenkins-build/build/workspace/ceph-api/build/../qa/tasks/vstart_runner.py", line 1402, in exec_test remote.run(args=args, env=vstart_env, timeout=(3 * 60)) File "/home/jenkins-build/build/workspace/ceph-api/build/../qa/tasks/vstart_runner.py", line 452, in run return self._do_run(**kwargs) File "/home/jenkins-build/build/workspace/ceph-api/build/../qa/tasks/vstart_runner.py", line 491, in _do_run proc.wait(timeout) File "/home/jenkins-build/build/workspace/ceph-api/build/../qa/tasks/vstart_runner.py", line 252, in wait out, err = self.subproc.communicate(timeout=timeout) File "/usr/lib/python3.10/subprocess.py", line 1152, in communicate stdout, stderr = self._communicate(input, endtime, timeout) File "/usr/lib/python3.10/subprocess.py", line 2004, in _communicate self._check_timeout(endtime, orig_timeout, stdout, stderr) File "/usr/lib/python3.10/subprocess.py", line 1196, in _check_timeout raise TimeoutExpired( subprocess.TimeoutExpired: Command '../src/vstart.sh -n --nolockdep' timed out after 180 seconds Signed-off-by: Rishabh Dave <ridave@redhat.com>
a045f2f
to
03b3329
Compare
No changes to the patch. Added reference to tracker ticket. |
Leaving link to last Ceph API test run since it was successful - https://jenkins.ceph.com/job/ceph-api/72470 |
Yes, I was waiting on clean jenkins test to give a merge ACK. |
Since Ceph API tests passed twice, I am sure that the dashboard tests has been fixed - https://jenkins.ceph.com/job/ceph-api/72470/, https://jenkins.ceph.com/job/ceph-api/72514/. Proceeding to merge. |
NOTE: Context: this issue was spotted and fixed while working on PR #56065 and PR #56066.
Parameter "timeout" is accepted by LocalRemote.run() but the method
doesn't do anything about it besides accepting it. Thus, this parameter
has no effect.
In LocalRemote.run(), pass this parameter to LocalRemoteProcess.wait()
and from this method pass it to subprocess.Popen.communicate(). Thus,
command will be terminated by subprocess module at seconds specified by
"timeout" parameter. IOW, "timeout" parameter will have an effect.
Fixes: https://tracker.ceph.com/issues/65533
Contribution Guidelines
To sign and title your commits, please refer to Submitting Patches to Ceph.
If you are submitting a fix for a stable branch (e.g. "quincy"), please refer to Submitting Patches to Ceph - Backports for the proper workflow.
When filling out the below checklist, you may click boxes directly in the GitHub web UI. When entering or editing the entire PR message in the GitHub web UI editor, you may also select a checklist item by adding an
x
between the brackets:[x]
. Spaces and capitalization matter when checking off items this way.Checklist
Show available Jenkins commands
jenkins retest this please
jenkins test classic perf
jenkins test crimson perf
jenkins test signed
jenkins test make check
jenkins test make check arm64
jenkins test submodules
jenkins test dashboard
jenkins test dashboard cephadm
jenkins test api
jenkins test docs
jenkins render docs
jenkins test ceph-volume all
jenkins test ceph-volume tox
jenkins test windows
jenkins test rook e2e