-
Notifications
You must be signed in to change notification settings - Fork 502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-3281. Add timeouts to all robot tests #723
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @hanishakoneru for working on this. We can try adding these timeouts, but I'm not sure it will really help.
I performed a small experiment: added "10 seconds" timeout and sleep 30
. Based on Robot logs, it seems the test was marked as failed after 10 seconds, but the command was not interrupted and still took 30 seconds to execute:
<msg timestamp="20200326 11:52:18.496" level="INFO">Running command 'sleep 30 2>&1'.</msg>
<msg timestamp="20200326 11:52:28.347" level="FAIL">Test timeout 10 seconds exceeded.</msg>
<status status="FAIL" endtime="20200326 11:52:28.347" starttime="20200326 11:52:18.495"></status>
</kw>
<status status="FAIL" endtime="20200326 11:52:48.466" starttime="20200326 11:52:18.495"></status>
</kw>
@adoroszlai, I think even with that limitation the timeout will help us isolate the problem. Let's say the acceptance suit is cancelled, we could still get to know which test contributed to the time out. |
323260a
to
a7341e1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There are two timeouts:
As far as I understood @adoroszlai warned us that even if we have a test level timeout it doesn't help at all, if 2nd is not in place. If one But I agree even without 2nd, it's good to have this patch. On the other hand, I tested it with sleep, and it seems to be working for me...
As you see my sleep command was killed after 20 seconds. Note: originally I suggested to put the |
Yes. Learned that robot framework does not allow "global timeout" by design. @adoroszlai, @elek are we good to merge this patch? |
Yes, thanks. |
Thank you all for the reviews. Will merge this PR. |
@elek @hanishakoneru a 6-hours run despite timeout settings: https://github.com/apache/hadoop-ozone/runs/548358523 |
This branch was not rebased to include the timeout change. Can we rebase and retry. |
This run is a push build for 94413cd on |
Bot not in |
Good catch. But can you please explain why
Compare this to a normal run:
|
@adoroszlai , @elek how about we print a start and end time (or duration) for each test suit. |
Just saw that there is an option to list the timestamp of each message.
|
You are right it seems to be the freon. But in this case:
|
What changes were proposed in this pull request?
We have seen in some CI runs that the acceptance test suit is getting cancelled as it runs for more than 6 hours. Because of this, the test results and logs are also not saved.
This Jira aims to add a 5 minute timeout to all robot tests. In case some tests require more time, we can update the timeout. This would help to isolate the test which could be causing the whole acceptance test suit to time out.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-3281
How was this patch tested?
CI acceptance test suit can test this change as it only adds a timeout to robot tests.