
"collie node list" cannot list the other server #40

Closed
turbo73 opened this issue May 13, 2013 · 13 comments

@turbo73

turbo73 commented May 13, 2013

As the title says, I have two servers in my LAN (both with sheepdog installed), but each one cannot see the other in "collie node list". Like this:

[root@kvm-1 ~]# collie node list
M   Id   Host:Port            V-Nodes   Zone
    0    172.18.11.192:7000   64        -1073016148

[root@kvm-2 ~]# collie node list
M   Id   Host:Port            V-Nodes   Zone
    0    172.18.11.198:7000   64        -972352852

I have stopped iptables and disabled SELinux, but each node still only sees itself. Please help me, thanks a lot!

@mitake
Contributor

mitake commented May 14, 2013

Can I see the output of "collie cluster info"?

BTW, we use the mailing list sheepdog-users@lists.wpkg.org for discussions about problems with using sheepdog. I suggest posting the problem to the list.
You can subscribe to the list via this page: http://lists.wpkg.org/mailman/listinfo/sheepdog-users

Thanks,
Hitoshi

@turbo73
Author

turbo73 commented May 14, 2013

[root@kvm-1 /]# collie cluster info
Cluster status: running
Cluster created at Tue May 14 16:27:22 2013
Epoch Time Version
2013-05-14 16:27:22 1 [172.18.11.192:7000]

[root@kvm-2 ~]# collie cluster info
Cluster status: running
Cluster created at Tue May 14 16:27:58 2013
Epoch Time Version
2013-05-14 16:27:58 1 [172.18.11.198:7000]

Okay, as above. What problem am I running into? Thanks!

@mitake
Contributor

mitake commented May 14, 2013

It seems that you did "collie cluster format" on kvm-1 before the sheep on kvm-2 joined your cluster.

The correct operation sequence is like this (a rough sketch follows the list):

  1. execute sheep on kvm-1
  2. execute sheep on kvm-2
  3. collie cluster format (you can run this on either kvm-1 or kvm-2, whichever you prefer)
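
For reference, the whole sequence might look roughly like this (a sketch only, reusing the store path and the --copies value that appear elsewhere in this thread):

# on kvm-1
service corosync start
sheep /var/lib/sheepdog/

# on kvm-2
service corosync start
sheep /var/lib/sheepdog/

# on either node, exactly once, after "collie node list" shows both nodes
collie node list
collie cluster format --copies=2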

Thanks,
Hitoshi

@turbo73
Author

turbo73 commented May 14, 2013

[root@kvm-1 /]# service corosync start
[root@kvm-1 /]# sheep /var/lib/sheepdog/

[root@kvm-2 /]# service corosync start
[root@kvm-2 /]# sheep /var/lib/sheepdog/
[root@kvm-1 /]# collie cluster format

[root@kvm-2 /]# collie cluster format

Thank you, I did it exactly like this, but the problem remains... :(

@mitake
Contributor

mitake commented May 14, 2013

You shouldn't format twice. A single "collie cluster format" on any one node is enough.

  1. Do your corosync instances share the same multicast address? (A sketch of how to check this follows the list.)
  2. Can I see the output of "collie node list" right after starting the sheep daemons?
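
For point 1, something like the following should work (a sketch; corosync-objctl is the corosync 1.x tool, later versions use corosync-cmapctl instead):

# compare the totem settings on both hosts
grep -E 'bindnetaddr|mcastaddr|mcastport' /etc/corosync/corosync.conf

# check whether corosync itself already sees both members
corosync-cfgtool -s
corosync-objctl | grep member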

@turbo73
Author

turbo73 commented May 14, 2013

[root@kvm-1 /]# grep mcast /etc/corosync/corosync.conf
mcastaddr: 226.94.1.1
mcastport: 5405
[root@kvm-1 /]# service corosync restart
Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
Waiting for corosync services to unload:. [ OK ]
Starting Corosync Cluster Engine (corosync): [ OK ]
[root@kvm-1 /]# service sheepdog restart
Starting Sheepdog QEMU/KVM Block Storage (sheep): [ OK ]
[root@kvm-1 /]# collie cluster format --copies=2
using backend farm store
[root@kvm-1 /]# collie node list
M   Id   Host:Port            V-Nodes   Zone
    0    172.18.11.192:7000   64        -1073016148

[root@kvm-2 /]# grep mcast /etc/corosync/corosync.conf
mcastaddr: 226.94.1.1
mcastport: 5405
[root@kvm-2 /]# service corosync restart
Signaling Corosync Cluster Engine (corosync) to terminate: [ OK ]
Waiting for corosync services to unload:. [ OK ]
Starting Corosync Cluster Engine (corosync): [ OK ]
[root@kvm-2 /]# service sheepdog restart
Starting Sheepdog QEMU/KVM Block Storage (sheep): [ OK ]
[root@kvm-2 /]# collie node list
M   Id   Host:Port            V-Nodes   Zone
    0    172.18.11.198:7000   64        -972352852

As you can see, I followed your suggestion and ran the format on one node only.
The "bindnetaddr" in /etc/corosync/corosync.conf is "172.18.11.0" on both nodes.

@mitake
Contributor

mitake commented May 20, 2013

Sorry for my late reply. I'm busy with other work right now, so I'll get back to this topic later (maybe in a few days).

@turbo73
Author

turbo73 commented May 22, 2013

Okay, thanks

@mitake
Contributor

mitake commented May 27, 2013

Can I see the output of "collie cluster info" from both kvm-1 and kvm-2?
There is a possibility that you formatted the sheeps twice.

@turbo73
Author

turbo73 commented May 29, 2013

[root@kvm-2 ~]# collie cluster info
Cluster status: IO has halted as there are too few living nodes
Cluster created at Wed May 29 19:08:10 2013
Epoch Time Version
2013-05-29 19:08:11 1 [172.18.11.195:7000]

Now I start the corosync and sheepdog processes on kvm-2 first, and then start corosync on kvm-1, but I cannot start sheepdog on kvm-1.
When I start sheepdog there, the sheepdog log on kvm-2 shows:
[root@kvm-2 ~]# tailf /var/lib/sheepdog/sheep.log
May 29 19:04:20 [main] cluster_sanity_check(517) joining node ctime doesn't match: 5878591506144743224 vs 5878583848436663456
May 29 19:04:20 [main] sd_check_join_cb(1042) 172.18.11.194:7000: ret = 0x1, cluster_status = 0x1

It looks like the problem is "joining node ctime doesn't match". What does that mean?
Thanks.

@mitake
Contributor

mitake commented Jul 9, 2013

Very sorry for my late reply :(

A ctime mismatch means that the two sheeps belong to different clusters.
If you formatted the cluster correctly, these values would be identical.

Could you format your cluster again after checking that "collie node list" outputs something like the following?
M   Id   Host:Port        V-Nodes   Zone
    0    127.0.0.1:7000   64        16777343
    1    127.0.0.1:7001   64        16777343
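
If the two sheeps have already ended up with different ctimes, one way back to a clean state might be the following (a sketch only; note that wiping the store directory destroys any objects already written to it, and the exact paths and --copies value are taken from earlier in this thread):

# on both nodes: stop sheep and clear the mismatched store
service sheepdog stop
rm -rf /var/lib/sheepdog/*

# on both nodes: start sheep again
service sheepdog start

# on one node only: confirm both nodes are listed, then format exactly once
collie node list
collie cluster format --copies=2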

@vtolstov
Contributor

vtolstov commented Aug 7, 2015

@mitake In my testing (I'm using corosync 2.x) this does not happen. Maybe we should close this outdated issue?

@mitake
Contributor

mitake commented Aug 8, 2015

Thanks for checking, I'm closing it.

@mitake mitake closed this as completed Aug 8, 2015