HDDS-4153. Increase default timeout in kubernetes tests #1357
Codecov Report
@@             Coverage Diff              @@
##             master    #1357      +/-   ##
============================================
+ Coverage     74.35%   74.40%   +0.04%
- Complexity    10319    10335      +16
============================================
  Files           985      985
  Lines         50534    50556      +22
  Branches       4934     4939       +5
============================================
+ Hits          37577    37617      +40
+ Misses        10591    10573      -18
  Partials       2366     2366
Thanks @elek, I think this will help mitigate intermittent failures. The check currently takes around 20 minutes for a complete build and 4 tests. With this change, in the worst case it would take 14 minutes longer than before (4 tests × 210 extra seconds each), which is still far from the longest check (around 1 hour).
* master: (26 commits)
  HDDS-4167. Acceptance test logs missing if fails during cluster startup (apache#1366)
  HDDS-4121. Implement OmMetadataMangerImpl#getExpiredOpenKeys. (apache#1351)
  HDDS-3867. Extend the chunkinfo tool to display information from all nodes in the pipeline. (apache#1154)
  HDDS-4077. Incomplete OzoneFileSystem statistics (apache#1329)
  HDDS-3903. OzoneRpcClient support batch rename keys. (apache#1150)
  HDDS-4151. Skip the inputstream while offset larger than zero in s3g (apache#1354)
  HDDS-4147. Add OFS to FileSystem META-INF (apache#1352)
  HDDS-4137. Turn on the verbose mode of safe mode check on testlib (apache#1343)
  HDDS-4146. Show the ScmId and ClusterId in the scm web ui. (apache#1350)
  HDDS-4145. Bump version to 1.1.0-SNAPSHOT on master (apache#1349)
  HDDS-4109. Tests in TestOzoneFileSystem should use the existing MiniOzoneCluster (apache#1316)
  HDDS-4149. Implement OzoneFileStatus#toString (apache#1356)
  HDDS-4153. Increase default timeout in kubernetes tests (apache#1357)
  HDDS-2411. add a datanode chunk validator fo datanode chunk generator (apache#1312)
  HDDS-4140. Auto-close /pending pull requests after 21 days of inactivity (apache#1344)
  HDDS-4152. Archive container logs for kubernetes check (apache#1355)
  HDDS-4056. Convert OzoneAdmin to pluggable model (apache#1285)
  HDDS-3972. Add option to limit number of items displaying through ldb tool. (apache#1206)
  HDDS-4068. Client should not retry same OM on network connection failure (apache#1324)
  HDDS-4062. Non rack aware pipelines should not be created if multiple racks are alive. (apache#1291)
  ...
What changes were proposed in this pull request?
Kubernetes tests sometimes time out (e.g. here: https://github.com/elek/ozone-build-results/tree/master/2020/08/26/2562/kubernetes).
Based on the log, SCM couldn't leave safe mode. Either this is a real issue, or the GitHub environment is sometimes slow.
To make clear which of the two it is, I propose increasing the default timeout from 90 sec to 300 sec (5 min).
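As a rough illustration of why a longer default helps separate the two cases, here is a minimal Python sketch of a poll-until-ready helper with a configurable timeout. It is not the project's actual test code: the name `wait_until`, the polling interval, and the injectable `clock`/`sleep` parameters are all assumptions made for this example. Only the 90 sec and 300 sec values come from the proposal above.

```python
import time
from typing import Callable


def wait_until(
    condition: Callable[[], bool],
    timeout: float = 300.0,   # proposed default (the previous default was 90 sec)
    interval: float = 5.0,    # hypothetical polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met in time, False on timeout.
    `clock` and `sleep` are injectable so the helper can be tested
    without actually waiting.
    """
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

With a helper like this, a condition that becomes true after roughly 100 seconds fails under a 90 sec timeout but passes under 300 sec, while a condition that never becomes true fails under both. That is the distinction the larger timeout is meant to expose.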
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-4153
How was this patch tested?
CI (I made the same change two days ago in https://github.com/elek/ozone-perf-env)