HDDS-3423. Enabling TestContainerReplicationEndToEnd and addressing f… #1260

Merged
merged 3 commits into from Jul 30, 2020

prashantpogde
Contributor

What changes were proposed in this pull request?

Enable TestContainerReplicationEndToEnd test cases and address failures

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-3423

How was this patch tested?

Running TestContainerReplicationEndToEnd multiple times.

@swagle swagle requested a review from adoroszlai July 27, 2020 06:52
@prashantpogde
Contributor Author

The acceptance test failures are unrelated to the changes here.

Contributor

@adoroszlai adoroszlai left a comment

Thanks @prashantpogde for working on this.

I ran 50 iterations: the cluster failed to exit safe mode in one of them, and the other 49 were OK. I think the failure may be because the container report interval and the stale node interval are both set to the same 2 seconds.

2020-07-28 11:40:27,365 [IPC Server handler 0 on default port 35195] INFO  node.SCMNodeManager (SCMNodeManager.java:register(273)) - Registered Data node : 63530f9c-d650-4274-a064-303d95f9c0a1{ip: 172.17.0.2, host: 9c7301d26f17, networkLocation: /default-rack, certSerialId: null}
...
2020-07-28 11:40:29,375 [EventQueue-StaleNodeForStaleNodeHandler] INFO  node.StaleNodeHandler (StaleNodeHandler.java:onMessage(58)) - Datanode 63530f9c-d650-4274-a064-303d95f9c0a1{ip: 172.17.0.2, host: 9c7301d26f17, networkLocation: /default-rack, certSerialId: null} moved to stale state. Finalizing its pipelines [PipelineID=2f3f44d7-dff4-4ff3-9573-6e8f31223305, PipelineID=78a8416c-931a-4824-98be-73699afbe24d]

https://github.com/adoroszlai/hadoop-ozone/runs/918556614
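
The gap between the two log lines quoted above can be checked with a few lines of Java. This is only a sketch for reading the timestamps: the class and method names are hypothetical and are not part of Ozone or of this patch.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of the patch: measures the time between two log lines.
class StaleGapSketch {
  // Matches the timestamp prefix of the log lines, e.g. "2020-07-28 11:40:27,365".
  private static final DateTimeFormatter LOG_TIME =
      DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss,SSS");

  static long gapMillis(String earlier, String later) {
    LocalDateTime start = LocalDateTime.parse(earlier, LOG_TIME);
    LocalDateTime end = LocalDateTime.parse(later, LOG_TIME);
    return Duration.between(start, end).toMillis();
  }

  public static void main(String[] args) {
    // Timestamps copied from the "Registered Data node" and "moved to stale state" lines.
    long gap = gapMillis("2020-07-28 11:40:27,365", "2020-07-28 11:40:29,375");
    System.out.println("registered -> stale: " + gap + " ms");
  }
}
```

The datanode was marked stale about 2.01 seconds after it registered, which is consistent with the 2-second stale node interval mentioned above.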

Comment on lines 172 to 173
LoggerFactory.getLogger(TestContainerReplicationEndToEnd.class).info(
"Current Container State is " + containerState);
Contributor

Please use a placeholder in the log message instead of + concatenation, to avoid a new-code warning from Sonar.
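
With SLF4J, which the snippet under review obtains through LoggerFactory, the placeholder form would be `LOG.info("Current Container State is {}", containerState)`. The sketch below illustrates the same idea with java.util.logging, chosen only so that it runs without extra dependencies; note that its placeholder syntax is `{0}` rather than `{}`. The class, handler, and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

// Hypothetical demo class: shows that the template and its argument stay separate
// until a formatter renders them, unlike string concatenation with +.
class PlaceholderLoggingSketch {
  /** Collects records so the example can inspect what the logger received. */
  static final class CapturingHandler extends Handler {
    final List<LogRecord> records = new ArrayList<>();
    @Override public void publish(LogRecord record) { records.add(record); }
    @Override public void flush() { }
    @Override public void close() { }
  }

  static LogRecord logContainerState(Object containerState) {
    Logger log = Logger.getLogger("PlaceholderLoggingSketch");
    log.setUseParentHandlers(false); // keep the demo off the console handler
    CapturingHandler handler = new CapturingHandler();
    log.addHandler(handler);
    try {
      // Parameterized call: no string is built with + at the call site.
      log.log(Level.INFO, "Current Container State is {0}", containerState);
    } finally {
      log.removeHandler(handler);
    }
    return handler.records.get(0);
  }

  /** Renders the record the way a formatter would when the message is actually written. */
  static String render(LogRecord record) {
    return new SimpleFormatter().formatMessage(record);
  }

  public static void main(String[] args) {
    LogRecord record = logContainerState("CLOSED");
    System.out.println(record.getMessage()); // the unformatted template
    System.out.println(render(record));      // the template with the argument filled in
  }
}
```

Because the argument is passed separately, the message string is only assembled when a handler formats the record, which is the behavior the static-analysis rule is steering toward.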

Contributor Author

Done. I also changed the intervals you mentioned so that container-report-interval : stalenode-interval : deadnode-interval keep the same ratio as their default values.
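
One way to shrink a set of intervals for a test while keeping their ratio is to scale all of them by the same factor. The sketch below is an assumption-laden illustration: the baseline values of 60 s, 90 s, and 300 s are made up for the example and are not claimed to be Ozone's defaults, and the class and method names are hypothetical.

```java
import java.time.Duration;

// Hypothetical helper, not part of the patch: scales intervals by a common factor.
class IntervalRatioSketch {
  /**
   * Scales the stale and dead intervals by the same factor that maps
   * baseReport to testReport, so report : stale : dead stays unchanged.
   * Returns {report, stale, dead} for the test configuration.
   */
  static Duration[] scale(Duration baseReport, Duration baseStale,
                          Duration baseDead, Duration testReport) {
    long base = baseReport.toMillis();
    long target = testReport.toMillis();
    Duration stale = Duration.ofMillis(baseStale.toMillis() * target / base);
    Duration dead = Duration.ofMillis(baseDead.toMillis() * target / base);
    return new Duration[] {testReport, stale, dead};
  }

  /** True when report < stale < dead, so a node cannot go stale before its next report is due. */
  static boolean strictlyOrdered(Duration report, Duration stale, Duration dead) {
    return report.compareTo(stale) < 0 && stale.compareTo(dead) < 0;
  }

  public static void main(String[] args) {
    // Illustrative baseline only; these are NOT asserted to be Ozone's defaults.
    Duration[] scaled = scale(Duration.ofSeconds(60), Duration.ofSeconds(90),
        Duration.ofSeconds(300), Duration.ofSeconds(1));
    System.out.println(scaled[0] + " : " + scaled[1] + " : " + scaled[2]);
    System.out.println("ordered: " + strictlyOrdered(scaled[0], scaled[1], scaled[2]));
  }
}
```

With equal report and stale intervals, as in the earlier 2-second setup, `strictlyOrdered` would return false, which is the situation the review comment flagged.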

@prashantpogde
Contributor Author

prashantpogde commented Jul 29, 2020

The failing test TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBufferWithMixOfTransactions is unrelated to the changes here.

Contributor

@adoroszlai adoroszlai left a comment

Thanks @prashantpogde for updating the patch. 50/50 tests passed.

@adoroszlai adoroszlai merged commit 93ac9ac into apache:master Jul 30, 2020
errose28 pushed a commit to errose28/ozone that referenced this pull request Jul 31, 2020
* master: (55 commits)
  HDDS-4052. Remove master/slave terminology from Ozone (apache#1281)
  HDDS-4047. OzoneManager met NPE exception while getServiceList (apache#1277)
  HDDS-3990. Test Kubernetes examples with acceptance tests (apache#1223)
  HDDS-4045. Add more ignore rules to the RAT ignore list (apache#1273)
  HDDS-3970. Enabling TestStorageContainerManager with all failures addressed (apache#1257)
  HDDS-4033. Make the acceptance test reports hierarchical (apache#1263)
  HDDS-3423. Enabling TestContainerReplicationEndToEnd and addressing failures (apache#1260)
  HDDS-4027. Suppress ERROR message when SCM attempt to create additional pipelines. (apache#1265)
  HDDS-4024. Avoid while loop too soon when exception happen (apache#1253)
  HDDS-3809. Make number of open containers on a datanode a function of no of volumes reported by it. (apache#1081)
  HDDS-4019. Show the storageDir while need init om or scm (apache#1248)
  HDDS-3511. Fix javadoc comment in OmMetadataManager (apache#1247)
  HDDS-4041. Ozone /conf endpoint triggers kerberos replay error when SPNEGO is enabled. (apache#1267)
  HDDS-4031. Run shell tests in CI (apache#1261)
  HDDS-4038. Eliminate GitHub check warnings (apache#1268)
  HDDS-4011. Update S3 related documentation. (apache#1245)
  HDDS-4030. Remember the selected columns and make the X-axis scrollable in recon datanodes UI (apache#1259)
  HDDS-4032. Run author check without docker (apache#1262)
  HDDS-4026. Dir rename failed when sets 'ozone.om.enable.filesystem.paths' to true (apache#1256)
  HDDS-4017. Acceptance check may run against wrong commit (apache#1249)
  ...
rakeshadr pushed a commit to rakeshadr/hadoop-ozone that referenced this pull request Sep 3, 2020
2 participants