HDDS-4153. Increase default timeout in kubernetes tests #1357

Merged
merged 1 commit into from Aug 27, 2020

elek
Member

@elek elek commented Aug 26, 2020

What changes were proposed in this pull request?

Kubernetes tests sometimes time out (e.g. here: https://github.com/elek/ozone-build-results/tree/master/2020/08/26/2562/kubernetes).

Based on the log, SCM could not leave safe mode. Either this is a real issue or the GitHub environment is sometimes slow.

To make clear which of the two is the problem, I propose increasing the default timeout from 90 sec to 300 sec (5 min).
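For illustration only, a timeout like this usually bounds a polling loop that waits for a readiness condition. The sketch below is not the project's test code: `wait_until`, its parameters, and the 5-second polling interval are hypothetical names and values, and only the 90 sec and 300 sec figures come from the description above.

```python
import time
from typing import Callable

OLD_TIMEOUT_S = 90   # previous default, from the description above
NEW_TIMEOUT_S = 300  # proposed default, from the description above


def wait_until(condition: Callable[[], bool],
               timeout_s: float = NEW_TIMEOUT_S,
               interval_s: float = 5.0) -> bool:
    """Poll `condition` until it returns True or `timeout_s` seconds elapse.

    Returns True if the condition was met in time, False on timeout.
    The condition is always checked at least once.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

With a longer timeout, a run that still returns `False` points toward a real problem rather than a slow environment, which is the distinction the change is meant to expose.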

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-4153

How was this patch tested?

CI (I made the same change two days ago in https://github.com/elek/ozone-perf-env)

@codecov-commenter

Codecov Report

Merging #1357 into master will increase coverage by 0.04%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##             master    #1357      +/-   ##
============================================
+ Coverage     74.35%   74.40%   +0.04%     
- Complexity    10319    10335      +16     
============================================
  Files           985      985              
  Lines         50534    50556      +22     
  Branches       4934     4939       +5     
============================================
+ Hits          37577    37617      +40     
+ Misses        10591    10573      -18     
  Partials       2366     2366              
| Impacted Files | Coverage Δ | Complexity Δ |
| --- | --- | --- |
| ...e/commandhandler/CreatePipelineCommandHandler.java | 81.25% <0.00%> (-10.42%) | 8.00% <0.00%> (ø%) |
| ...apache/hadoop/hdds/scm/block/BlockManagerImpl.java | 70.17% <0.00%> (-5.27%) | 20.00% <0.00%> (ø%) |
| ...iner/common/statemachine/SCMConnectionManager.java | 77.38% <0.00%> (-3.58%) | 13.00% <0.00%> (-1.00%) |
| ...op/ozone/container/common/impl/HddsDispatcher.java | 76.29% <0.00%> (-2.23%) | 76.00% <0.00%> (-2.00%) |
| .../common/states/endpoint/HeartbeatEndpointTask.java | 69.81% <0.00%> (-1.26%) | 25.00% <0.00%> (-1.00%) |
| ...doop/ozone/container/keyvalue/KeyValueHandler.java | 67.25% <0.00%> (-0.89%) | 69.00% <0.00%> (ø%) |
| ...hadoop/ozone/om/ratis/OzoneManagerRatisServer.java | 79.37% <0.00%> (-0.78%) | 35.00% <0.00%> (-1.00%) |
| ...ne/container/common/statemachine/StateContext.java | 85.71% <0.00%> (-0.55%) | 57.00% <0.00%> (-1.00%) |
| .../apache/hadoop/ozone/om/OmMetadataManagerImpl.java | 81.70% <0.00%> (-0.52%) | 96.00% <0.00%> (-2.00%) |
| .../org/apache/hadoop/hdds/scm/pipeline/Pipeline.java | 85.84% <0.00%> (-0.46%) | 48.00% <0.00%> (ø%) |
... and 18 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c656feb...2cb903d.

Contributor

@adoroszlai adoroszlai left a comment

Thanks @elek, I think this will help mitigate intermittent failures. The check currently takes around 20 minutes for a complete build and 4 tests. With this change, in the worst case it would take 14 minutes longer than before, which is still far shorter than the longest check (around 1 hour).
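The 14-minute figure can be reproduced with a small calculation. This is a sketch under one assumption that the comment does not state explicitly: each of the 4 tests waits for the full timeout exactly once. The function and constant names are illustrative.

```python
OLD_TIMEOUT_S = 90   # previous default timeout, in seconds
NEW_TIMEOUT_S = 300  # new default timeout, in seconds
TEST_COUNT = 4       # number of tests in the check, per the review comment


def worst_case_extra_minutes(old_s: int, new_s: int, tests: int) -> float:
    """Extra wall-clock minutes if every test waits for the full timeout once."""
    return (new_s - old_s) * tests / 60


extra = worst_case_extra_minutes(OLD_TIMEOUT_S, NEW_TIMEOUT_S, TEST_COUNT)
print(extra)  # 14.0
```

Each test can wait 210 seconds longer than before, so four tests add at most 840 seconds, or 14 minutes.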

@adoroszlai adoroszlai merged commit 2f3edd9 into apache:master Aug 27, 2020
rakeshadr pushed a commit to rakeshadr/hadoop-ozone that referenced this pull request Sep 3, 2020
errose28 added a commit to errose28/ozone that referenced this pull request Sep 11, 2020
* master: (26 commits)
  HDDS-4167. Acceptance test logs missing if fails during cluster startup (apache#1366)
  HDDS-4121. Implement OmMetadataMangerImpl#getExpiredOpenKeys. (apache#1351)
  HDDS-3867. Extend the chunkinfo tool to display information from all nodes in the pipeline. (apache#1154)
  HDDS-4077. Incomplete OzoneFileSystem statistics (apache#1329)
  HDDS-3903. OzoneRpcClient support batch rename keys. (apache#1150)
  HDDS-4151. Skip the inputstream while offset larger than zero in s3g (apache#1354)
  HDDS-4147. Add OFS to FileSystem META-INF (apache#1352)
  HDDS-4137. Turn on the verbose mode of safe mode check on testlib (apache#1343)
  HDDS-4146. Show the ScmId and ClusterId in the scm web ui. (apache#1350)
  HDDS-4145. Bump version to 1.1.0-SNAPSHOT on master (apache#1349)
  HDDS-4109. Tests in TestOzoneFileSystem should use the existing MiniOzoneCluster (apache#1316)
  HDDS-4149. Implement OzoneFileStatus#toString (apache#1356)
  HDDS-4153. Increase default timeout in kubernetes tests (apache#1357)
  HDDS-2411. add a datanode chunk validator fo datanode chunk generator (apache#1312)
  HDDS-4140. Auto-close /pending pull requests after 21 days of inactivity (apache#1344)
  HDDS-4152. Archive container logs for kubernetes check (apache#1355)
  HDDS-4056. Convert OzoneAdmin to pluggable model (apache#1285)
  HDDS-3972. Add option to limit number of items displaying through ldb tool. (apache#1206)
  HDDS-4068. Client should not retry same OM on network connection failure (apache#1324)
  HDDS-4062. Non rack aware pipelines should not be created if multiple racks are alive. (apache#1291)
  ...