-
Notifications
You must be signed in to change notification settings - Fork 214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
re #149 support event streaming #151
Conversation
|
||
resp, err := client.Do(req) | ||
if err != nil { | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to be logged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How to handle stopped Marathon? Should I use exponential backoff (like this one: https://github.com/cenkalti/backoff) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that much complexity is needed, something as simple as giving every gorouting having a time.Ticker(time.Second)
, and then waiting for it before creating a new connection would be enough to ensure that Bamboo isn't spamming Marathon too hard, while still allowing the odd disconnection without error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just log error and sleep for 1 second then? Usually backoff is used in such sutiations, but if you fill it complex then simple delay here should be fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ticker is redundant in case of just closed connection because closed connection doesn't mean to be a disaster, it could mean just "reconnect to me"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me describe whole scheme:
- try to open connection
- if failed, wait 1 sec and go to 1)
- if not, read data from connection
- if failed, log and go to 1 without delay
- if not, read event and go to 3)
Which means that:
If failure happened while reading event on open connection, then whole cycle will start from 1)
If failure happened while connection opening then each attempt will happen not more than once per second.
So:
It's ok to wait 1 second if it's impossible to open connection to marathon.
If error happened during "read" phase, then we should retry connection immediately because it could be just dropped connection, but Marathon is still able to serve requests. If it's not, then it will loop on 1)-2).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, it's roughly the same in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No :) EOF (as well as other read errors) will not happen if your server is down because connection will not be open :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://en.wikipedia.org/wiki/Byzantine_fault_tolerance Treating particular errors specially isn't reliable. There are many things that can go wrong other than a server just being up or down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
meh...... ok, I will add Ticker. Still not sure that it's required here :)
Looks good for the most part, I've added a couple of comments. On a broader note, I'd love to see some of this broken out into smaller functions; a function to parse a single message and return an event would be great, as it'd allow for very easy testing. At the moment, this is a rather had function to test. |
@bluepeppers I've made some changes, including:
Currently |
I like this. I'd like to split it up more, and break the whole thing out behind an interface, but this is a move in the right direction. Seeing as this is opt in, I'd be happy merging it now, but will probably wait till Monday, and confer with @activars about the best way to test this mode with some decent traffic. |
@bluepeppers ok, lets wait what @activars will say :) |
registerMarathonEvent(conf) | ||
if conf.Marathon.UseEventStream { | ||
// Listen events stream from Marathon | ||
listenToEventStream(conf, eventSubAPI) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor consistency issue. Do you think i's better to align with the callback model function name? e.g. renaming listenToEventSteam
to registerMarathonEventStream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, you're right. Fixing...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, stop :) listenTo<...>
was used to be consistent with listenToZookeeper
. registerMarathonEvent
will register Bamboo in Marathon, but listenToEventStream
will listen stream and will not register anything. So I will go with listenToEventStream
if you don't mind
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. Maybe this is more accurate?
listenToMarathonEventStream
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, no objections :) Will rename it now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done :)
Looks good 👍 |
don't worry about the abstraction too much, we will refactor some common features in next release. |
docker build with this was broken for me-- use of the http.Client timeout requires go 1.4, the dockerfile uses go 1.2 |
You also may want to wait until the fix for mesosphere/marathon#1926 is included in a marathon release (probably 0.11). Otherwise there could in some cases be a delay for bamboo to receive events. |
@bsideup hey, sorry we haven't done anything on this. I've finally automated some bamboo integration testing stuff, and so I'll give this a shot on the staging cluster in the next week. |
Hi @bluepeppers ! Sounds great :) Ping me if anything will not work for you :) |
One minor thing; we have a username and password for marathon configurable: https://github.com/QubitProducts/bamboo/blob/master/configuration/marathon.go#L10-L15. The event stream request should probably support HTTP basic auth also |
@bluepeppers it was introduced after this PR :) But I can support it if you want |
@drewrobb good catch. Looks like ubuntu base image comes with Go 1.2 by default. Any idea how to fix it? |
766d294
to
af5a9df
Compare
@bluepeppers rebased on top of last master |
@bluepeppers implemented basic auth |
The go vet message at https://travis-ci.org/QubitProducts/bamboo/builds/80229954 was very useful. Without that change, the goroutine was listening to the same (last) marathon always, which causes the problem @drewrobb linked with marathon to be much more of an issue. I was having problems getting event streaming to work with this patch, but I think that go vet may have found the problem. |
@bluepeppers yes, already fixed :) |
@bluepeppers any news? |
+1 to getting this in, I have been using it for a while now successfully |
+1 |
Appologize we haven't got chance to test this out because there are a lot of other tasks we need to work on internally. We will deploy this on an internal test cluster to test it out in November. |
Since this is an optional mode I think it's pretty safe to merge, I know at
|
I'll merge this today (absent any issues). Sorry it took so long |
Awesome, thanks!
|
Yeah, thanks guys! |
@bluepeppers was it released btw? |
@bluepeppers ok, keep me in touch then :) Thanks! |
I'm happy to contribute this small, but very helpful improvement - event stream listener.
Starting from Marathon 0.9.0 it provides /v2/events endpoint, which is implementation of Server-sent Events. It's much better compared to event callbacks since there is no need to register Bamboo as callback on Marathon.
Implementation is backward compatible with 0.7.0 because this mode is disabled by default. It could be enabled in config like that:
NB: if "UseEventStream" is enabled then callbacks are not auto-registered.
When Bamboo will drop support for Marathon < 0.9.0 then this property could be
true
by default.Implementation was tested with Marathon 0.9.0, covered cases: