CI job passed and test script exit 0, but failed by timeout #186

Open
yih-redhat opened this issue May 13, 2024 · 3 comments

Comments

@yih-redhat

Type of issue

None

Description

This bug is the same as #166. That issue was closed and I cannot reopen it, so I created a new one to track this.

Description:

  1. I have a pull request, "Test testing-farm v2" (yih-redhat/tmt-demo#42), that runs all test cases in testing-farm with v2.
  2. In this pull request, the sub job "Testing Farm - edge-9to9-9.4" behaves strangely: the test script passes and exits with 0, but the testing-farm plugin always reports a timeout error. Job link: https://artifacts.osci.redhat.com/testing-farm/befe8230-0cca-4417-816c-af13e20f564f/
  3. The sub job "Testing Farm - edge-8to9-9.4" has the same issue. In this job, I checked for leftover processes in the VM that might cause the timeout and printed them to the log. Job link: https://artifacts.osci.redhat.com/testing-farm/24f28bf8-1c7d-47d5-9779-63723ecfb222/
  4. All sub jobs in this pull request have the same configuration, but only "Testing Farm - edge-9to9-9.4" and "Testing Farm - edge-8to9-9.4" have this timeout issue. This suggests that the cause is something in the test scripts rather than the configuration. The test scripts for these two sub jobs are https://github.com/yih-redhat/tmt-demo/blob/main/ostree-9-to-9.sh and https://github.com/yih-redhat/tmt-demo/blob/main/ostree-8-to-9.sh, but I cannot see anything special in them; they are normal shell scripts, like the other test scripts in my repo.
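
For reference, the leftover-process check mentioned in item 3 could look roughly like the sketch below. It is not the code used in the linked job: the helper name `collect_leftovers` and the `LEFTOVER_PIDS` variable are hypothetical, and `sleep 30` is only a stand-in for a process the test forgot to stop.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the linked test scripts.
# Lists background jobs of this shell that are still alive and records
# their PIDs in LEFTOVER_PIDS.
collect_leftovers() {
    list=$(mktemp)
    jobs -p > "$list"   # redirect to a file so `jobs` runs in this shell
    LEFTOVER_PIDS=""
    while read -r pid; do
        # kill -0 sends no signal; it only checks that the process exists
        if kill -0 "$pid" 2>/dev/null; then
            LEFTOVER_PIDS="${LEFTOVER_PIDS:+$LEFTOVER_PIDS }$pid"
            # Print PID and command line; fall back to the bare PID without ps
            ps -o pid=,args= -p "$pid" 2>/dev/null || echo "$pid"
        fi
    done < "$list"
    rm -f "$list"
}

sleep 30 >/dev/null 2>&1 &   # stand-in for a process the test forgot to stop
leftover_pid=$!

echo "Leftover processes before exit:"
collect_leftovers

kill "$leftover_pid"         # stop it so nothing outlives the script
```

This only sees background jobs started by the same shell. A daemon or service started some other way would not appear in `jobs -p` and would need a broader process listing.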

Reproducer

No response

@jamacku
Member

jamacku commented May 13, 2024

I would suggest increasing the timeout. The test ran for 9000s (150 min).

Maximum test time '150m' exceeded.
Adjust the test 'duration' attribute if necessary.
https://tmt.readthedocs.io/en/stable/spec/tests.html#duration

@yih-redhat
Author

If you look at the log, you can see that the test script actually passed and exited with 0, but it looks like some child process kept the job from completing until the timeout.
I have tried setting the timeout to a very long value and still hit this issue. With the same timeout value, other sub jobs that take much longer than this script pass.
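
A minimal sketch of how this symptom can arise in general, not a confirmed diagnosis of these jobs: a background child that inherits the script's stdout keeps the pipe open, so anything reading that pipe until end-of-file keeps waiting after the script has exited with 0. Here `cat` stands in for a harness that reads the output through a pipe (an assumption about the runner), and `reader_wait_seconds` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: shows how a script can exit 0 while its reader keeps waiting.
# Assumption: the reader (here `cat`) stands in for a harness that reads the
# script's output through a pipe until end-of-file.

# Run a one-line "test script" and print how many seconds the reader waited.
reader_wait_seconds() {
    start=$(date +%s)
    sh -c "$1" | cat >/dev/null
    end=$(date +%s)
    echo $(( end - start ))
}

# The child inherits the script's stdout, so the pipe stays open for 3 s
# after the script itself has exited with 0.
leaky_wait=$(reader_wait_seconds 'sleep 3 & exit 0')

# The child's stdio is redirected away, so the pipe closes as soon as
# the script exits.
detached_wait=$(reader_wait_seconds 'sleep 3 >/dev/null 2>&1 </dev/null & exit 0')

echo "inherited stdout: reader waited ${leaky_wait}s"
echo "redirected stdio: reader waited ${detached_wait}s"
```

If something like this is the cause, redirecting the stdio of any long-lived background process in the test script, or stopping it before the script exits, would let the reader see end-of-file right away.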

@yih-redhat
Author

@jamacku Could you please take a look at this bug?
Because of this bug, I cannot get a green CI job and have to check manually whether it passed.
This bug only happens on these two sub jobs; no matter how long I set the timeout, the script always exits 0 and then the job times out.
