gossip registry #404

Closed
vtolstov opened this issue Jan 29, 2019 · 20 comments
Comments

@vtolstov
Contributor

I'm experimenting with the gossip registry and may have found an issue. I start service1 first with gossip.Address("172.16.1.254:4223"), and then on another server I start service2 with registry.Addrs("172.16.1.254:4223") and gossip.Address("172.16.1.1:0").

I see that the member count equals 2 on both sides, but the second service is not registered in the registry.
But if I stop service1 and start it again with registry.Addrs("172.16.1.254:4223") and gossip.Address("172.16.1.1:xxx"), where xxx is the port provided by service2, everything works fine.
So the issue appears only on service1, when it starts first without any other members.

@vtolstov
Contributor Author

Also, I don't see any broadcast update messages in the first case.

@vtolstov
Contributor Author

Also, when a service connects to an already running registry service, it receives updates from it, but I don't see any updates sent to the first service.

@vtolstov
Contributor Author

Root cause of the issue: when service2 connects, a sync event is broadcast and service2 receives all data from service1, but service1 doesn't receive any service info from service2.

@vtolstov
Contributor Author

And the LocalState func for the connecting service does not have any service data in the channel, because the registry is not yet created when the gossip join happens.

@vtolstov
Contributor Author

next investigation:

g.queue.QueueBroadcast(&broadcast{
  update: up,
  notify: nil,
})

This does not send the service data when Register is called in gossip.
I checked this by passing a channel as notify and checking when a read from it returns.

On service1, which starts first, the read returns after the broadcast, but on service2 it does not.

@vtolstov
Contributor Author

@asim, gentle ping

@asim
Member

asim commented Jan 30, 2019

I do not have time to investigate this right now. Feel free to PR a fix.

@vtolstov
Contributor Author

Nice. I think an enterprise SLA helps in such cases. Can you describe your test system in the enterprise repo, so that others can understand the risks and what you autotest for each commit?

@vtolstov
Contributor Author

For now I have only one workaround: remove the check for join in LocalState and in MergeRemoteState,
so that after the first push/pull the service data is updated on both sides. But this is very ugly. As far as I can see, broadcast did not work for me in any case.
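
To illustrate why dropping the join check makes the periodic push/pull propagate services, here is a small in-memory sketch. The `node` type, the `onlyOnJoin` flag and `pushPull` are hypothetical and exist only for this example. The two methods only mirror the shape of memberlist's `Delegate.LocalState(join bool)` and `Delegate.MergeRemoteState(buf []byte, join bool)`, and the gating on `join` is an assumption based on the description above, not a copy of the registry code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// node is a hypothetical in-memory registry node; it is not the go-micro type.
type node struct {
	services   map[string]bool
	onlyOnJoin bool // when true, state is exchanged only during a join
}

// LocalState returns the known service names, unless gated by the join check.
func (n *node) LocalState(join bool) []byte {
	if n.onlyOnJoin && !join {
		return nil
	}
	names := make([]string, 0, len(n.services))
	for name := range n.services {
		names = append(names, name)
	}
	buf, _ := json.Marshal(names)
	return buf
}

// MergeRemoteState adds the remote service names, unless gated by the join check.
func (n *node) MergeRemoteState(buf []byte, join bool) {
	if len(buf) == 0 || (n.onlyOnJoin && !join) {
		return
	}
	var names []string
	if err := json.Unmarshal(buf, &names); err != nil {
		return
	}
	for _, name := range names {
		n.services[name] = true
	}
}

// pushPull simulates one periodic (non-join) state exchange in both directions.
func pushPull(a, b *node) {
	a.MergeRemoteState(b.LocalState(false), false)
	b.MergeRemoteState(a.LocalState(false), false)
}

func main() {
	for _, gated := range []bool{true, false} {
		n1 := &node{services: map[string]bool{"org.unistack.sshkey": true}, onlyOnJoin: gated}
		n2 := &node{services: map[string]bool{"org.unistack.libvirt": true}, onlyOnJoin: gated}
		pushPull(n1, n2)
		fmt.Printf("join check=%v: node1 knows %d, node2 knows %d services\n",
			gated, len(n1.services), len(n2.services))
	}
}
```

With the check in place, a periodic exchange carries no state, so each node keeps seeing only its own service. Without it, both nodes converge after one exchange, at the cost of shipping the full state on every sync.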

@vtolstov
Contributor Author

I wrote a test case for the gossip registry, and it works fine. I also tried running two micro services with the same registry params, and the service info is not propagated between them. Is it possible that there is an issue in the micro/server code?

@vtolstov
Contributor Author

I added a failing test case to https://github.com/unistack-org/go-micro/blob/gossip/registry/gossip/gossip_test.go.
@asim, can you take a look and say what's wrong in TestServerRegistry?

@vtolstov
Contributor Author

I found it!
Is it possible to add some option to the server so that it does not return anything on the channel until it has fully started?
The main problem is that sometimes the server starts too quickly and does not register everything in the registry.

I checked service_test.go in the go-micro repo, and you use a WaitGroup in AfterStart to wait until the server is fully started.
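
The WaitGroup pattern mentioned above can be sketched with the standard library alone. The `server` type, its `afterStart` hooks and `startAndWait` are hypothetical stand-ins made up for this example. They only imitate the idea of an after-start hook and are not the go-micro API.

```go
package main

import (
	"fmt"
	"sync"
)

// server is a hypothetical stand-in for a service with after-start hooks.
type server struct {
	afterStart []func() error
	registered bool
}

// Run registers the service and then fires the after-start hooks in order.
func (s *server) Run() error {
	s.registered = true // stands in for registering with the registry
	for _, hook := range s.afterStart {
		if err := hook(); err != nil {
			return err
		}
	}
	return nil
}

// startAndWait starts the server in a goroutine and blocks until its
// after-start hook has fired, so callers never observe a half-started server.
func startAndWait(s *server) {
	var wg sync.WaitGroup
	wg.Add(1)
	s.afterStart = append(s.afterStart, func() error {
		wg.Done()
		return nil
	})
	go s.Run()
	wg.Wait()
}

func main() {
	s := &server{}
	startAndWait(s)
	fmt.Println("registered:", s.registered)
}
```

The hook fires only after registration, so anything that runs after `startAndWait` returns can rely on the service being present in the registry. The sketch assumes no earlier hook fails. A real test would also want a timeout around the wait.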

@vtolstov
Contributor Author

And this does not work in a real-world example.
node1:

./tests --registry_address 172.16.1.254:4223 --broker_endpoint 172.16.1.254:4222   --dns_address 0.0.0.0:5353
2019/01/31 16:55:26 Registry Listening on 172.16.1.254:4223
2019/01/31 16:55:26 run org.unistack.sshkey
2019/01/31 16:55:26 Transport [http] Listening on [::]:38785
2019/01/31 16:55:26 Broker [stan] Listening on nats://172.16.1.254:4222
2019/01/31 16:55:26 Registering node: org.unistack.sshkey-23fb4462-ca7e-4910-a5f1-a48cd475c7bc
2019/01/31 16:55:31 total svcs 1
2019/01/31 16:55:31 svc: org.unistack.sshkey
2019/01/31 16:55:36 total svcs 1
2019/01/31 16:55:36 svc: org.unistack.sshkey
2019/01/31 16:55:38 [DEBUG] memberlist: Stream connection from=172.16.1.254:42218
2019/01/31 16:55:41 total svcs 1
2019/01/31 16:55:41 svc: org.unistack.sshkey
2019/01/31 16:55:46 total svcs 1
2019/01/31 16:55:46 svc: org.unistack.sshkey
2019/01/31 16:55:48 [DEBUG] memberlist: Stream connection from=172.16.1.1:60186
2019/01/31 16:55:51 total svcs 1
2019/01/31 16:55:51 svc: org.unistack.sshkey
2019/01/31 16:55:56 total svcs 1
2019/01/31 16:55:56 svc: org.unistack.sshkey
2019/01/31 16:56:01 total svcs 1
2019/01/31 16:56:01 svc: org.unistack.sshkey
2019/01/31 16:56:06 total svcs 1
2019/01/31 16:56:06 svc: org.unistack.sshkey
2019/01/31 16:56:09 [DEBUG] memberlist: Initiating push/pull sync with: 172.16.1.254:40007
2019/01/31 16:56:11 total svcs 1
2019/01/31 16:56:11 svc: org.unistack.sshkey

node2:

2019/01/31 16:56:18 svc: org.unistack.sshkey
2019/01/31 16:56:22 [DEBUG] memberlist: Initiating push/pull sync with: 172.16.1.1:39237
2019/01/31 16:56:23 total svcs 2
2019/01/31 16:56:23 svc: org.unistack.libvirt
2019/01/31 16:56:23 svc: org.unistack.sshkey
2019/01/31 16:56:28 total svcs 2
2019/01/31 16:56:28 svc: org.unistack.libvirt
2019/01/31 16:56:28 svc: org.unistack.sshkey
2019/01/31 16:56:33 total svcs 2
2019/01/31 16:56:33 svc: org.unistack.libvirt
2019/01/31 16:56:33 svc: org.unistack.sshkey
2019/01/31 16:56:38 total svcs 2
2019/01/31 16:56:38 svc: org.unistack.libvirt
2019/01/31 16:56:38 svc: org.unistack.sshkey
2019/01/31 16:56:39 [DEBUG] memberlist: Stream connection from=172.16.1.254:45240
2019/01/31 16:56:39 [DEBUG] memberlist: Stream connection from=172.16.1.1:35506
2019/01/31 16:56:43 total svcs 2
2019/01/31 16:56:43 svc: org.unistack.libvirt
2019/01/31 16:56:43 svc: org.unistack.sshkey
2019/01/31 16:56:48 total svcs 2
2019/01/31 16:56:48 svc: org.unistack.libvirt
2019/01/31 16:56:48 svc: org.unistack.sshkey

@vtolstov
Contributor Author

@asim I think the gossip registry must die.
Do you know that broadcast is done over UDP, so the packet size is limited to something like 1440 bytes? Most of my services, when marshaled to JSON as you do inside the gossip registry, take from 6000 to 15000 or even 23000 bytes.

In the case of the mdns registry, you don't expose all the data like in gossip.
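
To make the size problem concrete, here is a minimal sketch that marshals a service description to JSON and compares it against a packet budget. The `service` and `endpoint` structs are simplified, hypothetical versions of the registry types. `udpBudget` is just the rough 1440-byte figure from above, and the 40 generated endpoints are made-up sample data.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// udpBudget is an assumed per-packet payload budget (the ~1440 bytes mentioned above).
const udpBudget = 1440

// endpoint and service are simplified, hypothetical versions of the registry types.
type endpoint struct {
	Name     string            `json:"name"`
	Metadata map[string]string `json:"metadata"`
}

type service struct {
	Name      string     `json:"name"`
	Version   string     `json:"version"`
	Endpoints []endpoint `json:"endpoints"`
}

// fitsInPacket marshals v to JSON and reports its size and whether it fits the budget.
func fitsInPacket(v interface{}, budget int) (int, bool, error) {
	buf, err := json.Marshal(v)
	if err != nil {
		return 0, false, err
	}
	return len(buf), len(buf) <= budget, nil
}

func main() {
	svc := service{Name: "org.unistack.sshkey", Version: "0.0.0.1"}
	for i := 0; i < 40; i++ { // made-up endpoints, only to grow the payload
		svc.Endpoints = append(svc.Endpoints, endpoint{
			Name:     fmt.Sprintf("SshkeyService.Method%d", i),
			Metadata: map[string]string{"stream": "false"},
		})
	}
	size, ok, err := fitsInPacket(svc, udpBudget)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Printf("with endpoints: %d bytes, fits=%v\n", size, ok)

	svc.Endpoints = nil // drop the endpoint descriptions
	size, ok, err = fitsInPacket(svc, udpBudget)
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Printf("without endpoints: %d bytes, fits=%v\n", size, ok)
}
```

In this sketch the endpoint list dominates the payload, so a check like this before broadcasting would at least make an oversized update visible instead of letting it be lost silently.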

@vtolstov
Contributor Author

I tried to minimize the data sent, but most of the time the endpoint data is too big.
Example:

{"name":"org.unistack.sshkey","version":"0.0.0.1","metadata":null,"endpoints":[{"name":"SshkeyService.Create","request":{"name":"Ssh
keyCreateReq","type":"SshkeyCreateReq","values":[{"name":"name","type":"string","values":null},{"name":"data","type":"string","values":n
ull},{"name":"project","type":"string","values":null},{"name":"account","type":"string","values":null},{"name":"-","type":"","values":nu
ll},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"
response":{"name":"Sshkey","type":"Sshkey","values":[{"name":"uuid","type":"string","values":null},{"name":"account","type":"string","va
lues":null},{"name":"project","type":"string","values":null},{"name":"data","type":"string","values":null},{"name":"name","type":"string
","values":null},{"name":"fprint_md5","type":"string","values":null},{"name":"fprint_sha256","type":"string","values":null},{"name":"cre
ated_at","type":"int64","values":null},{"name":"updated_at","type":"int64","values":null},{"name":"enabled","type":"uint32","values":nul
l},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"
-","type":"int32","values":null}]},"metadata":{"stream":"false"}},{"name":"SshkeyService.Delete","request":{"name":"SshkeyDeleteReq","ty
pe":"SshkeyDeleteReq","values":[{"name":"uuid","type":"string","values":null},{"name":"project","type":"string","values":null},{"name":"
account","type":"string","values":null},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","typ
e":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"response":{"name":"Empty","type":"Empty","values":[{"name":"-",
"type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int3
2","values":null}]},"metadata":{"stream":"false"}},{"name":"SshkeyService.List","request":{"name":"SshkeyListReq","type":"SshkeyListReq"
,"values":[{"name":"project","type":"string","values":null},{"name":"account","type":"string","values":null},{"name":"fields","type":"[]
SshkeyListReq_Fields","values":[{"name":"SshkeyListReq_Fields","type":"SshkeyListReq_Fields","values":null}]},{"name":"meta","type":"Ssh
keyListReq_Meta","values":[{"name":"limit","type":"uint32","values":null},{"name":"offset","type":"uint32","values":null},{"name":"sort"
,"type":"string","values":null},{"name":"order","type":"string","values":null},{"name":"-","type":"","values":null},{"name":"-","type":"
[]uint8","values":null},{"name":"-","type":"int32","values":null}]},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","v
alues":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"response":{"name":"SshkeyListRsp","
type":"SshkeyListRsp","values":[{"name":"sshkeys","type":"[]Sshkey","values":[{"name":"Sshkey","type":"Sshkey","values":null}]},{"name":
"meta","type":"SshkeyListRsp_Meta","values":[{"name":"total","type":"int64","values":null},{"name":"-","type":"","values":null},{"name":
"-","type":"[]uint8","values":null},{"name":"-","type":"int32","values":null}]},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"metadata":{"stream":"false"}},{"name":"SshkeyService.Lookup","request":{"name":"SshkeyLookupReq","type":"SshkeyLookupReq","values":[{"name":"uuid","type":"string","values":null},{"name":"project","type":"string","values":null},{"name":"account","type":"string","values":null},{"name":"fields","type":"[]SshkeyLookupReq_Fields","values":[{"name":"SshkeyLookupReq_Fields","type":"SshkeyLookupReq_Fields","values":null}]},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"response":{"name":"Sshkey","type":"Sshkey","values":[{"name":"uuid","type":"string","values":null},{"name":"account","type":"string","values":null},{"name":"project","type":"string","values":null},{"name":"data","type":"string","values":null},{"name":"name","type":"string","values":null},{"name":"fprint_md5","type":"string","values":null},{"name":"fprint_sha256","type":"string","values":null},{"name":"created_at","type":"int64","values":null},{"name":"updated_at","type":"int64","values":null},{"name":"enabled","type":"uint32","values":null},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","valus":null},{"name":"created_at","type":"int64","values":null},{"name":"updated_at","type":"int64","values":null},{"name":"enabled","type":"uint32","values":null},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"metadata":{"stream":"false"}},{"name":"SshkeyService.Search","request":{"name":"SshkeySearchReq","type":"SshkeySearchReq","values":[{"name":"project","type":"string","values":
null},{"name":"account","type":"string","values":null},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"response":{"name":"SshkeyListRsp","type":"SshkeyListRsp","values":[{"name":"sshkeys","type":"[]Sshkey","values":[{"name":"Sshkey","type":"Sshkey","values":null}]},{"name":"meta","type":"SshkeyListRsp_Meta","values":[{"name":"total","type":"int64","values":null},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":null},{"name":"-","type":"int32","values":null}]},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"metadata":{"stream":"false"}},{"name":"SshkeyService.Update","request":{"name":"SshkeyUpdateReq","type":"SshkeyUpdateReq","values":[{"name":"uuid","type":"string","values":null},{"name":"name","type":"string","values":null},{"name":"project","type":"string","values":null},{"name":"account","type":"string","values":null},{"name":"fields","type":"FieldMask","values":[{"name":"paths","type":"[]string","values":null},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":null},{"name":"-","type":"int32","values":null}]},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"response":{"name":"Sshkey","type":"Sshkey","values":[{"name":"uuid","type":"string","values":null},{"name":"account","type":"string","values":null},{"name":"project","type":"string","values":null},{"name":"data","type":"string","values":null},{"name":"name","type":"string","values":null},{"name":"fprint_md5","type":"string","values":null},{"name":"fprint_sha256","type":"string","values":null},{"name":"created_at","type":"int64","values":null},{"name":"updated_at","type":"int64","values":null},{"name":"en
abled","type":"uint32","values":null},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"metadata":{"stream":"false"}},{"name":"Func","request":{"name":"Account","type":"Account","values":[{"name":"uuid","type":"string","values":null},{"name":"owner","type":"string","values":null},{"name":"zone","type":"string","values":null},{"name":"status","type":"string","values":null},{"name":"login","type":"string","values":null},{"name":"passw","type":"string","values":null},{"name":"perms","type":"string","values":null},{"name":"type","type":"string","values":null},{"name":"created_at","type":"int64","values":null},{"name":"updated_at","type":"int64","values":null},{"name":"settings","type":"string","values":null},{"name":"-","type":"","values":null},{"name":"-","type":"[]uint8","values":[{"name":"uint8","type":"uint8","values":null}]},{"name":"-","type":"int32","values":null}]},"response":null,"metadata":{"subscriber":"true","topic":"org.unistack.account"}}],"nodes":null}

@vtolstov
Contributor Author

Also, your mdns registry is broken too, because you pass the endpoints in a TXT record, which has a limit of 255 bytes per RFC 4408.

@vtolstov
Contributor Author

vtolstov commented Feb 1, 2019

Yes, TXT records can be concatenated, but this is also limited by the UDP packet size.
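
A rough sketch of what concatenation means in practice: the payload is split into strings of at most 255 bytes, and each string costs one extra length byte on the wire. `splitTXT` and `maxTXTString` are hypothetical names for this example, and the 6000-byte payload is the smallest service size mentioned earlier in the thread.

```go
package main

import "fmt"

// maxTXTString is the maximum length of a single string inside a TXT record.
const maxTXTString = 255

// splitTXT splits data into consecutive chunks of at most maxTXTString bytes.
func splitTXT(data []byte) [][]byte {
	var chunks [][]byte
	for len(data) > maxTXTString {
		chunks = append(chunks, data[:maxTXTString])
		data = data[maxTXTString:]
	}
	if len(data) > 0 {
		chunks = append(chunks, data)
	}
	return chunks
}

func main() {
	payload := make([]byte, 6000) // placeholder bytes standing in for marshaled service data
	chunks := splitTXT(payload)
	total := 0
	for _, c := range chunks {
		total += len(c) + 1 // each string carries a 1-byte length prefix
	}
	fmt.Printf("%d chunks, %d bytes of TXT data\n", len(chunks), total)
}
```

Splitting removes the per-string limit but slightly grows the total, so the record still has to fit into whatever the transport allows for a single response.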

@asim
Member

asim commented Feb 1, 2019

If you have a good solution please propose or PR it. Otherwise you can disable adding endpoints when you register handlers https://godoc.org/github.com/micro/go-micro/server#InternalHandler

@vtolstov
Contributor Author

vtolstov commented Feb 2, 2019

Closing, as #411 is merged.

@vtolstov vtolstov closed this as completed Feb 2, 2019