This repository has been archived by the owner on Apr 4, 2023. It is now read-only.

WIP: Use hostnames rather than IP addresses for cassandra nodes #330

Closed
wants to merge 19 commits into from

Conversation

wallrj
Member

@wallrj wallrj commented Apr 12, 2018

Fixes: #319

Release note:

NONE

@jetstack-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: munnerz

Assign the PR to them by writing /assign @munnerz in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@wallrj
Member Author

wallrj commented Apr 12, 2018

Ok. This looks promising:

INFO  [main] 2018-04-12 12:59:15,418 MessagingService.java:753 - Starting Messaging Service on cass-test-np-region-1-zone-a-0/10.192.2.5:7000 (eth0)  
WARN  [main] 2018-04-12 12:59:15,423 SystemKeyspace.java:1089 - No host ID found, created 26dc77f9-9057-4151-a391-c8192f412fe4 (Note: This should happen exactly once per node).
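
For context, the hostname in the log line above is the pod name. A minimal Go sketch of how a fully qualified name for such a pod could be assembled, assuming the pods belong to a StatefulSet behind a headless service and follow the usual Kubernetes `<pod>.<service>.<namespace>.svc.<cluster-domain>` convention. The helper `podFQDN`, the service name `cass-test-nodes`, and the `cluster.local` domain are hypothetical and are not taken from this branch.

```go
package main

import "fmt"

// podFQDN builds the DNS name that Kubernetes conventionally assigns to a
// StatefulSet pod behind a headless service.
// Assumption: service and clusterDomain are illustrative values only.
func podFQDN(pod, service, namespace, clusterDomain string) string {
	return fmt.Sprintf("%s.%s.%s.svc.%s", pod, service, namespace, clusterDomain)
}

func main() {
	// Pod name and namespace are taken from the logs in this thread;
	// the service name is hypothetical.
	fmt.Println(podFQDN(
		"cass-test-np-region-1-zone-a-0",
		"cass-test-nodes",
		"test-cassandra-1523564296-10880",
		"cluster.local",
	))
}
```

Such a name stays the same when the pod is rescheduled, which is the property this PR is trying to rely on.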

@wallrj
Member Author

wallrj commented Apr 12, 2018

Using fully qualified domain names for nodes does work, up to a point.
But if a node's IP address changes, the new IP address doesn't get gossiped around the cluster, as I'd hoped.

  • Created a cluster of 5 nodes
cass-test-np-region-1-zone-a-0: 10.192.2.44
cass-test-np-region-1-zone-a-1: 10.192.3.14
cass-test-np-region-1-zone-a-2: 10.192.3.15
cass-test-np-region-1-zone-a-3: 10.192.2.45
cass-test-np-region-1-zone-a-4: 10.192.3.16
  • Deleted the 3rd node
cass-test-np-region-1-zone-a-2   0/1       Terminating   0          8m
  • Waited for it to return
cass-test-np-region-1-zone-a-2   0/1       Init:0/1   0          0s
  • It now has a new IP address
cass-test-np-region-1-zone-a-0: 10.192.2.44
cass-test-np-region-1-zone-a-1: 10.192.3.14
cass-test-np-region-1-zone-a-2: 10.192.2.46
cass-test-np-region-1-zone-a-3: 10.192.2.45
cass-test-np-region-1-zone-a-4: 10.192.3.16
  • But it fails with
INFO  [main] 2018-04-12 20:43:50,310 StorageService.java:1442 - JOINING: Starting to bootstrap...
Exception (java.lang.RuntimeException) encountered during startup: A node required to move the data consistently is down (/10.192.3.15). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false
java.lang.RuntimeException: A node required to move the data consistently is down (/10.192.3.15). If you wish to move the data from a potentially inconsistent replica, restart the node with -Dcassandra.consistent.rangemovement=false
        at org.apache.cassandra.dht.RangeStreamer.getAllRangesWithStrictSourcesFor(RangeStreamer.java:294)
        at org.apache.cassandra.dht.RangeStreamer.addRanges(RangeStreamer.java:177)
        at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:84)
        at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:1491)
        at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:966)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:681)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:612)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:393)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689)
  • nodetool status shows
richard@pet-instance-1:~/go/src/github.com/jetstack/navigator$ kubectl -n test-cassandra-1523564296-10880 exec cass-test-np-region-1-zone-a-0 -- /bin/sh -c 'JVM_OPTS="" nodetool status'
Datacenter: region-1
====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load       Tokens       Owns (effective)  Host ID                               Rack
UN  10.192.3.16  98.17 KiB  256          41.0%             9982acb2-84c1-48b9-ad9b-192559a71525  zone-a
UN  10.192.2.44  108.59 KiB  256          38.2%             9e5fc55f-15fc-4083-955b-3e5257cbc806  zone-a
UN  10.192.2.45  89.5 KiB   256          40.4%             be98aa64-3777-4018-bee0-7fc26fd9109b  zone-a
UJ  10.192.2.46  115.16 KiB  256          ?                 1bcd0ce9-9c2e-4cee-8b07-179c2ea63323  zone-a
UN  10.192.3.14  108.36 KiB  256          41.8%             0069445d-ad81-4513-a040-1982cdd6a279  zone-a
DN  10.192.3.15  127.33 KiB  256          38.6%             7e0dc352-eddc-416c-a773-df5cf940f089  zone-a

@jetstack-bot
Collaborator

@wallrj: The following tests failed, say /retest to rerun them all:

Test name            Commit    Details   Rerun command
navigator-e2e-v1-8   246f25c   link      /test e2e v1.8
navigator-e2e-v1-9   246f25c   link      /test e2e v1.9
navigator-e2e-v1-7   246f25c   link      /test e2e v1.7

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@munnerz
Contributor

munnerz commented Apr 17, 2018

Can this be closed in favour of #334?

@jetstack-bot
Collaborator

@wallrj: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@wallrj
Member Author

wallrj commented May 8, 2018

This branch has some useful code for the linked issues above, but closing the PR in favour of #334 which implements the minimum changes.

@wallrj wallrj closed this May 8, 2018