HDDS-5723. Increase time limit of Ozone acceptance tests. #2620

Merged
merged 2 commits into apache:master on Sep 9, 2021

errose28
Contributor

@errose28 errose28 commented Sep 7, 2021

What changes were proposed in this pull request?

Increase acceptance test timeout from 120 minutes to 150 minutes, since new upgrade/downgrade acceptance tests sometimes push the acceptance test run time over 2 hours.

What is the link to the Apache JIRA

HDDS-5723

How was this patch tested?

  • Existing acceptance test run.

Contributor

@avijayanhwx avijayanhwx left a comment

@errose28 Can we remove the TODO at line #264 as well? @sodonnel is already working on reducing CI time for integration tests (ITs). We can also close HDDS-5270 as a duplicate and create a new one for acceptance tests.

@errose28
Contributor Author

errose28 commented Sep 8, 2021

I had initially moved the TODO comment since it corresponded to integration tests, not acceptance tests. I will remove it instead. HDDS-5270 has been resolved as a duplicate and we can track speeding up acceptance tests in HDDS-5730.

@avijayanhwx
Contributor

@errose28 Can you confirm if the test failure is unrelated?

@errose28
Contributor Author

errose28 commented Sep 8, 2021

Test failure is unrelated. Looks like CI failed to start a mini ozone cluster.

From org.apache.hadoop.ozone.scm.TestSCMInstallSnapshotWithHA.txt in the log bundle, the timeout happened here:

	at org.apache.ozone.test.GenericTestUtils.waitFor(GenericTestUtils.java:225)
	at org.apache.hadoop.ozone.MiniOzoneClusterImpl.waitForClusterToBeReady(MiniOzoneClusterImpl.java:228)
	at org.apache.hadoop.ozone.scm.TestSCMInstallSnapshotWithHA.init(TestSCMInstallSnapshotWithHA.java:104)

Logs show the cause of the timeout was a networking issue:
Caused by: java.net.ConnectException: Connection refused
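
For readers unfamiliar with this kind of failure: the stack trace above ends in a polling wait that gave up. Below is a minimal sketch of how such a wait generally behaves. It is not the Ozone or Hadoop `GenericTestUtils` code; the class `WaitSketch`, its `waitFor` signature, and the interval and timeout values are all hypothetical and chosen only for illustration.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not the Ozone/Hadoop implementation.
final class WaitSketch {

  /**
   * Polls {@code check} every {@code intervalMillis} until it returns true.
   * Throws TimeoutException once {@code timeoutMillis} has elapsed without success.
   */
  static void waitFor(BooleanSupplier check, long intervalMillis, long timeoutMillis)
      throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (true) {
      if (check.getAsBoolean()) {
        return; // condition met, e.g. the cluster reported ready
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException(
            "Condition not met within " + timeoutMillis + " ms");
      }
      Thread.sleep(intervalMillis);
    }
  }

  public static void main(String[] args) throws Exception {
    // A condition that becomes true after a short delay: waitFor returns normally.
    long start = System.currentTimeMillis();
    waitFor(() -> System.currentTimeMillis() - start >= 50, 10, 1000);
    System.out.println("condition met");

    // A condition that never becomes true, standing in for a cluster whose
    // services cannot be reached: waitFor gives up with a TimeoutException.
    try {
      waitFor(() -> false, 10, 100);
    } catch (TimeoutException e) {
      System.out.println("timed out: " + e.getMessage());
    }
  }
}
```

If the readiness check can never succeed (for example, because connections are refused), the wait only surfaces as a timeout, which is why the underlying `ConnectException` has to be found in the logs rather than in the test's own stack trace.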

Contributor

@bharatviswa504 bharatviswa504 left a comment

+1 LGTM

@bharatviswa504
Contributor

Test failure is not related to this PR. Going ahead with the commit, as the failed test is totally unrelated. This way we save the resources of one CI run.

@bharatviswa504 bharatviswa504 merged commit ee993d0 into apache:master Sep 9, 2021
@bharatviswa504
Contributor

Thank you @errose28 for the fix, and everyone for the review.

errose28 added a commit to errose28/ozone that referenced this pull request Sep 10, 2021
* master: (21 commits)
  HDDS-5502. [OFS] URI parser throws URISyntaxException when path contains space (apache#2500)
  HDDS-5715. Make XceiverServerRatis#raftGids a thread-safe set. (apache#2613)
  HDDS-5699. Added Log to show why a container was marked UNHEALTHY. (apache#2627)
  HDDS-5723. Increase time limit of Ozone acceptance tests. (apache#2620)
  HDDS-5718. Refactor TestXceiverClientManager to reuse mini-clusters (apache#2616)
  HDDS-5724. Add RaftpeerId when getting scm roles (apache#2622)
  HDDS-5711. support -1 for running balancer infinitely (apache#2621)
  HDDS-5670. ContainerBalancer should get OzoneConfiguration from ContainerBalancerConfiguration. (apache#2577)
  HDDS-5638. Fix docker-compose to make Recon come up. (apache#2563)
  HDDS-5726. Skip remove for already removed pipeline. (apache#2624)
  HDDS-5719. Reduce number of mini-clusters needed for decommission tests (apache#2617)
  HDDS-5716. Fix create key failure error log print (apache#2614)
  HDDS-5678. Handle unsecure SCM HA converted to secure SCM HA. (apache#2596)
  HDDS-5432. Enable downgrade testing after 1.1.0 release. (apache#2484)
  HDDS-5709. do not call removeTransactionsFromDB if nothing to remove (apache#2608)
  HDDS-5700. Improve LOG message of decommission progress. (apache#2598)
  HDDS-5690. Speed up TestContainerReplication by removing testSkipDemmissionAndMaintenanceNode (apache#2591)
  HDDS-5706. Fix ReplicationManager zero metrics for inflight actions. (apache#2605)
  HDDS-5667. documentation page layout (apache#2604)
  HDDS-5644. Speed up decommission tests using a background Mini Cluster provider (apache#2554)
  ...