Etcd service watcher #58

Open
wants to merge 6 commits into
from

@bobtfish
Contributor

And the etcd watcher part to go with the reporter :)

@calebbrown calebbrown referenced this pull request Aug 21, 2014
Open

etcd service watcher #35

@calebbrown calebbrown commented on an outdated diff Aug 21, 2014
lib/synapse/service_watcher/etcd.rb
@@ -0,0 +1,173 @@
+require "synapse/service_watcher/base"
+
+require 'etcd'
+
+# Monkeypatch till 91f9e72d6d57ae3760e9266835f404d986072590 gets to rubygems..
@calebbrown
calebbrown Aug 21, 2014

0.2.4 was just released and is available on RubyGems, so this could probably be cleaned up.

@SEJeff
SEJeff commented Aug 21, 2014

@bobtfish mind rebasing this to merge cleanly with master?

@tcolgate

Is this likely to be merged soon? The Nerve part already seems to be merged. etcd does seem like a better choice (nicer CLI tools, and less of a total collapse if quorum is lost).

@gebrits
gebrits commented Nov 9, 2014

any progress on this?

@bobtfish
Contributor

AFAIK this is good to merge, and it's waiting on someone from airbnb to actually merge it, rather than me to do anything to make it mergeable. (If not, I've missed something - please point it out?)

@gebrits
gebrits commented Nov 10, 2014

@bobtfish Cheers. Hope this will be merged soon.

@bobtfish
Contributor

On Nov 18, 2014, at 8:54 AM, Zane notifications@github.com wrote:

Greetings -

I've implemented this watcher, but I am running into an error:

E, [2014-11-18T08:51:02.005658 #7449] ERROR -- Synapse::EtcdWatcher: synapse: invalid data in etcd node # at /service: 757: unexpected token at 'running' DATA running

For all entries in my etcd instance I get a return from synapse noted above.

Can you also show us what’s in etcd? (The key layout, and JSON contents)

I’m guessing that there is somehow some bad data that’s crashing things - if you can show me what it is, I’ll fix it and add a test :)

Cheers
Tom

@tcolgate

For what it is worth, I've done some additional work on the etcd watcher,
to include the failover host support, and add some retry logic. I've not
tidied it up enough for a PR yet but code is on master here:

https://github.com/we7/synapse


@sepulworld

Thanks Tom. Yes, I wasn't using Nerve to populate host information in etcd, so the key values didn't match up with what the etcd watcher in Synapse was looking for. I am trying to switch over to using Nerve, but etcd support doesn't seem to be part of the current gem release, and doing a gem specific_install against the GitHub repo doesn't work either. Do we know when the etcd functionality for Nerve will be stable?
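
For anyone hitting the same error: the log line quoted earlier is consistent with the watcher running a JSON parse over a node value that is the bare string running, which is not valid JSON. Below is a minimal sketch of a tolerant parse. The helper name `parse_backend` and the `host`/`port` fields are assumptions for illustration, not code taken from this PR.

```ruby
require "json"

# Hypothetical helper (not from this PR): turn an etcd node value into a
# backend hash, or return nil when the value is not usable.
# Assumes the value is a JSON object with "host" and "port" fields.
def parse_backend(value)
  data = JSON.parse(value)
  return nil unless data.is_a?(Hash) && data["host"] && data["port"]

  { "host" => data["host"], "port" => Integer(data["port"]) }
rescue JSON::ParserError, ArgumentError, TypeError
  # A bare word such as running is not valid JSON, so it ends up here.
  nil
end

p parse_backend("running")
p parse_backend('{"host":"10.0.0.1","port":3000}')
```

Returning nil for a bad node lets the caller skip that entry and keep the remaining backends, rather than failing the whole discovery pass.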

@sepulworld

Hi Tom,

Here is my etcd key layout:

curl http://192.168.183.171:4001/v2/keys/service

{"action":"get","node":{"key":"/service","dir":true,"nodes":[{"key":"/service/192.168.186.158:49234","value":"running","expiration":"2014-11-22T21:17:37.738052203Z","ttl":18,"modifiedIndex":73811,"createdIndex":73811},{"key":"/service/192.168.186.158:49235","value":"running","expiration":"2014-11-22T21:17:38.321944617Z","ttl":18,"modifiedIndex":73812,"createdIndex":73812},{"key":"/service/192.168.186.158:49231","value":"running","expiration":"2014-11-22T21:17:36.733670014Z","ttl":17,"modifiedIndex":73809,"createdIndex":73809},{"key":"/service/192.168.186.158:49233","value":"running","expiration":"2014-11-22T21:17:36.878094294Z","ttl":17,"modifiedIndex":73810,"createdIndex":73810},{"key":"/service/192.168.186.158:49232","value":"running","expiration":"2014-11-22T21:17:36.517818371Z","ttl":17,"modifiedIndex":73808,"createdIndex":73808}],"modifiedIndex":3

@konsti
konsti commented Dec 22, 2014

@igor47 Any chance this gets merged into master?

@Zolmeister

status?

@SEJeff
SEJeff commented Feb 6, 2015

Well, this has merge conflicts, so at a minimum it needs to be rebased before it can be merged.

@WooDzu
WooDzu commented Feb 19, 2015

Any update on this?

@tcolgate

I've dropped my version of this PR. We've migrated to consul.


@clizzin
Contributor
clizzin commented Feb 19, 2015

Sorry folks, the project maintainer @igor47 is really busy with other projects for Airbnb these days. We're doing our best to find time to respond to pull requests, and we're sorry contributors are waiting so long. Please bear with us in the meantime.

@tdooner tdooner referenced this pull request in airbnb/smartstack-cookbook May 1, 2015
Open

Allow configurable path to `service` #13

@jolynch
Contributor
jolynch commented Sep 7, 2015

@bobtfish can you get a clean diff together and I can review/merge it?

@bobtfish
Contributor

@jolynch - how do you feel about this? Pulled in #86 also :)

@jolynch jolynch commented on the diff Sep 24, 2015
lib/synapse/service_watcher/etcd.rb
@@ -0,0 +1,166 @@
+require "synapse/service_watcher/base"
+
+require 'etcd'
+
+module Synapse
@jolynch
jolynch Sep 24, 2015 Contributor

This should be class Synapse::ServiceWatcher

Also can you add this watcher to the auto creation test?

@jolynch jolynch commented on the diff Sep 24, 2015
Gemfile.lock
@@ -21,6 +22,8 @@ GEM
archive-tar-minitar
excon (>= 0.28)
json
+ etcd (0.2.4)
@jolynch
jolynch Sep 24, 2015 Contributor

So while testing this I came across quite a pickle: the etcd Ruby gem < 0.3.0 cannot handle etcd 2.0+ (it fails all over the place with 404 errors).

ranjib/etcd-ruby#51 has been merged and my local testing indicates that version 0.3.0 seems to fix things.

@bobtfish do you think we should depend on an old etcd or the new one? I sorta prefer the new one but I'm not sure what the community progress on moving to etcd 2.0 is like?

@jolynch
jolynch Oct 6, 2015 Contributor

Now that we recommend installing via gem, can we just do ~> 0.2 ?
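
As a quick check of what `~> 0.2` would allow, here is a small sketch using RubyGems' own requirement matching. The version list is only illustrative: 0.2.4 and 0.3.0 come from this thread, and 1.0.0 is a made-up upper bound.

```ruby
require "rubygems"

# The pessimistic constraint "~> 0.2" means ">= 0.2, < 1.0".
requirement = Gem::Requirement.new("~> 0.2")

# 0.2.4 and 0.3.0 are the versions discussed in this thread;
# 1.0.0 is a hypothetical future release used to show the upper bound.
results = {}
%w[0.2.4 0.3.0 1.0.0].each do |version|
  results[version] = requirement.satisfied_by?(Gem::Version.new(version))
  puts "#{version}: #{results[version]}"
end
```

So `~> 0.2` accepts both 0.2.4 and 0.3.0. If the older gem really cannot talk to etcd 2.0+, a tighter constraint such as `~> 0.3` would be one way to rule it out.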

@jolynch jolynch commented on the diff Sep 24, 2015
lib/synapse/service_watcher/etcd.rb
+ else
+ if @backends != new_backends
+ log.info "synapse: discovered #{new_backends.length} backends (including new) for service #{@name}"
+ @backends = new_backends
+ true
+ else
+ log.info "synapse: discovered #{new_backends.length} backends for service #{@name}"
+ false
+ end
+ end
+ end
+
+ def watch
+ while !@should_exit
+ begin
+ @etcd.watch(@discovery['path'], :timeout => 60, :recursive => true)
@jolynch
jolynch Sep 24, 2015 Contributor

Alright so while testing I came across this gem. If my understanding is correct etcd makes us not only do our own heartbeats from nerve (hella writes), but we also end up basically constantly polling in this function because there is no capability to filter watch events?

Basically, without coreos/etcd#633 or coreos/etcd#174 closed, SmartStack scalability on etcd will be limited to ... not very much.

If that understanding is correct, can we implement get-children watches internally, where we do a lightweight "see if the list of children changed" operation and, only if that is different, actually go through loading all the data (aka run discover)? This way we're at least not constantly pulling all of the etcd state.
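
A rough sketch of that idea, to make the suggestion concrete. The class name `ChildListPoller` and both callables are hypothetical stand-ins: how the child list is actually fetched from etcd is left out, and nothing here is taken from the PR's code.

```ruby
require "set"

# Hypothetical sketch: re-run the expensive discovery only when the set of
# child keys under the service path changes.
class ChildListPoller
  # list_children: callable returning the child key names (the cheap call)
  # discover:      callable that reloads and parses every node (the expensive call)
  def initialize(list_children, discover)
    @list_children = list_children
    @discover = discover
    @known_children = nil # nothing seen yet, so the first poll always discovers
  end

  # Returns true when discovery ran, false when the child list was unchanged.
  def poll
    current = Set.new(@list_children.call)
    return false if current == @known_children

    @known_children = current
    @discover.call
    true
  end
end

# Usage with in-memory stand-ins for the etcd calls.
children = ["/service/10.0.0.1:3000"]
discover_runs = 0
poller = ChildListPoller.new(-> { children }, -> { discover_runs += 1 })

poller.poll                           # first poll: discovery runs
poller.poll                           # same children: discovery skipped
children << "/service/10.0.0.2:3000"
poller.poll                           # new child: discovery runs again
puts discover_runs                    # prints 2
```

One trade-off of comparing only key names: a change to the value stored under an existing key would go unnoticed until the child list itself changes.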

@jolynch
Contributor
jolynch commented Nov 3, 2015

@bobtfish ping?

@jolynch
Contributor
jolynch commented Oct 23, 2016

Looks like the v3 etcd API may have solved some of the fundamental issues. I'll take another look at this in the next few days.

@philips
philips commented Oct 23, 2016

@jolynch if you have any problems/questions feel free to ping https://groups.google.com/forum/#!forum/etcd-dev or @heyitsanthony
