
Concurrent metadata refresh #172

Closed
wants to merge 8 commits

Conversation

crxpandion

Changes broker metadata to refresh concurrently rather than sequentially. If all brokers error out, the error returned by the last call to Broker.GetMetadata(...) is exposed and the client enters the resurrection retry loop.

This also makes users of sarama.Client (e.g. consumer/producer) solely responsible for disconnecting brokers that time out or misbehave.

Related to #15
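
For illustration only, here is a minimal sketch of the concurrent-fanout idea described above. The fanoutResult type, the fanOutMetadata helper, and the exact GetMetadata signature are simplified assumptions for the sketch (taken to live inside the sarama package), not the code in this PR.

```go
// Hypothetical sketch of concurrent metadata fanout; not the actual patch.
type fanoutResult struct {
	metadata *MetadataResponse
	err      error
}

// fanOutMetadata queries every broker at once and returns the first
// successful response; if every broker fails, the last error is returned.
// Assumes at least one broker and a simplified Broker.GetMetadata signature.
func fanOutMetadata(brokers []*Broker, req *MetadataRequest) (*MetadataResponse, error) {
	results := make(chan fanoutResult, len(brokers))
	for _, broker := range brokers {
		b := broker // capture the loop variable for the goroutine
		go func() {
			meta, err := b.GetMetadata(req)
			results <- fanoutResult{metadata: meta, err: err}
		}()
	}

	var lastErr error
	for range brokers {
		res := <-results
		if res.err == nil {
			return res.metadata, nil // first success wins
		}
		lastErr = res.err
	}
	return nil, lastErr // all brokers errored; expose the last error
}
```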

@crxpandion
Author

Seems to have an issue in the Go 1.1 build. Investigating...

@crxpandion
Author

Unable to replicate this reliably locally; are you aware of any issues with functional_test.go?

@wvanbergen
Contributor

Yes, it seems like you got caught by our functional-test heisenbug, which strikes every now and then :(
I triggered a rebuild; let's see if that fixes it.

@wvanbergen
Contributor

The tests are passing now. If you have any idea what could cause this bug (most likely due to concurrency), please let us know.

```go
// Public channels
Get chan *fanoutResult
// optional channel to notify outside when clean up is complete
Done chan struct{}
```
Contributor

Why do these channels have to be public? This seems like an internal thing.

Contributor

Can you add some godoc to explain what these struct members are for? The name Get is not very enlightening to me.

@wvanbergen
Contributor

@eapache thoughts?

@eapache
Contributor

eapache commented Nov 5, 2014

I'm not sure I like this approach in general. To quote old me in #15, "connecting to all the brokers just to pick one and drop the others is gross." It's also inefficient, and could cause issues on clusters with a large number of brokers. Originally in that issue I'd suggested that we should wait a small amount of time before trying the next broker, but I now think that's basically over-complicated. The much better solution to achieve that behaviour is just to reduce the connect timeout on the brokers to something reasonable for LANs (if it takes more than 2 seconds to connect over a LAN, something is probably wrong).
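
For context, with the present-day sarama configuration API a shorter connect timeout looks roughly like the following; the configuration API at the time of this PR was shaped differently, so treat this as a sketch rather than the exact knob being discussed (broker addresses are placeholders).

```go
package main

import (
	"log"
	"time"

	"github.com/Shopify/sarama"
)

func main() {
	// Fail fast on dead brokers so a metadata refresh can move on quickly.
	config := sarama.NewConfig()
	config.Net.DialTimeout = 2 * time.Second // the shipped default is 30s

	client, err := sarama.NewClient([]string{"kafka1:9092", "kafka2:9092"}, config)
	if err != nil {
		log.Fatalln(err)
	}
	defer client.Close()
}
```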

@crxpandion why did you decide to implement this? What problem precisely were you trying to solve?

@crxpandion
Author

@eapache - I started on this path when testing sarama's ability to weather broker failures. I noticed that even with smaller timeouts, as you suggest, the metadata refresh can take a long time, particularly in large clusters. That delay is not only frustrating when running lots of integration-type tests, but could be detrimental in a production system.

As for the inefficiency: I can see how it's not computationally efficient, but it seems we should mainly be concerned with getting the cluster metadata quickly, so that the user of this library can get on with their computation. The problems with large clusters (20-100s of brokers) could be solved by throttling the number of outbound attempts.

Overall I think that abstracting the approach to fetching metadata (regardless of strategy) out of the refreshMetadata function, which mainly contains retry logic, goes a long way toward making this section of code less gross.

Anyways I appreciate the feedback.

@wvanbergen re the functional_test heisenbug: I think there are at least two failure modes that I see locally. The one that is easy to diagnose is the result of the OOM killer in the VM (I am not sure how Travis CI works, so this might not be the issue we saw above), and judging from recent commits it seems you are already aware of that one. An alternate approach that might make the tests' memory usage more manageable would be to reduce the number of brokers to 3.

Judging from stack traces taken when metadata is refreshed, I think there is at least one other failure mode besides the OOM killer, but I haven't been able to isolate it yet. I might have time later this week to look into it again if you want.

@eapache
Contributor

eapache commented Nov 6, 2014

OK, so I'm not opposed to the general concept of trying multiple brokers concurrently for metadata, but:

  • we want to avoid overkill in the common case (where the first broker responds quickly)
  • Sarama is used in very large deployments (thousands of clients connecting to a reasonably large cluster); the current approach is likely to DDoS the cluster (and/or ZooKeeper) in that case by generating (#clients * #brokers) metadata requests simultaneously when a broker fails over
  • there are probably smaller deployments where "try all the brokers right away" is the best approach

Given the above, here are a few suggestions:

  • the fanout timing and growth pattern should be configurable so that users can tune it appropriately for their use case
  • the default should not be too aggressive; perhaps linearly try another broker every 500ms? (a rough sketch of this staggered approach follows below)
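
A rough sketch of that staggered fanout, reusing the hypothetical types from the earlier sketch; the identifiers and the interval handling are illustrative assumptions, not part of this patch.

```go
// staggeredFanout tries the first broker immediately and launches one more
// attempt every staggerInterval (e.g. 500ms) until a response arrives or
// every broker has failed. Assumes len(brokers) > 0 and that "time" is imported.
func staggeredFanout(brokers []*Broker, req *MetadataRequest, staggerInterval time.Duration) (*MetadataResponse, error) {
	results := make(chan fanoutResult, len(brokers))
	launched, failed := 0, 0

	launch := func() {
		b := brokers[launched]
		launched++
		go func() {
			meta, err := b.GetMetadata(req)
			results <- fanoutResult{metadata: meta, err: err}
		}()
	}

	launch() // the common case: the first broker answers quickly
	ticker := time.NewTicker(staggerInterval)
	defer ticker.Stop()

	var lastErr error
	for failed < len(brokers) {
		select {
		case res := <-results:
			if res.err == nil {
				return res.metadata, nil
			}
			lastErr = res.err
			failed++
			if launched < len(brokers) {
				launch() // a broker failed outright; move on to the next
			}
		case <-ticker.C:
			if launched < len(brokers) {
				launch() // the outstanding attempt is slow; fan out to one more
			}
		}
	}
	return nil, lastErr
}
```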

@crxpandion
Author

Sure, both of those suggestions make sense; I'll work on amending my patch. I'm coming down with a cold, so it might take me a few days. :(

I feel that the "thundering herd" problem is often solved by adding random jitter to the delay before attempting a refresh. What are your thoughts on this strategy? Obviously we'd want to use it in conjunction with fanout throttling.
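
For what it's worth, the jitter being discussed is usually just a small random addition to the fanout delay; a fragment of what that might look like (staggerInterval is an assumed name, and math/rand and time are assumed imports):

```go
// Add up to one extra interval of random jitter before the next fanout
// attempt, so that many clients do not all retry at the same instant.
// Assumes staggerInterval > 0.
delay := staggerInterval + time.Duration(rand.Int63n(int64(staggerInterval)))
time.Sleep(delay)
```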

@eapache
Contributor

eapache commented Nov 7, 2014

I don't have a problem with jitter to avoid a thundering herd. I'm not sure it's necessary for the first connection of a refresh (clusters should be OK as long as each client makes only one request, not n requests), but it may be worth adding as the fanout occurs.

Let us know when you've got a new version of the patch ready.

@crxpandion
Author

@eapache I added a throttle into the fanout. The default is 1 connection attempt at a time. The logic is a tad more complicated.
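
Roughly, the throttle bounds how many connection attempts may be in flight at once. A minimal sketch of that shape, again with assumed names and types rather than the patch itself:

```go
// throttledFanout allows at most maxInFlight concurrent metadata attempts
// (a default of 1 keeps the refresh effectively sequential). Returns the
// first successful response, or the last error if every broker fails.
func throttledFanout(brokers []*Broker, req *MetadataRequest, maxInFlight int) (*MetadataResponse, error) {
	sem := make(chan struct{}, maxInFlight) // counting semaphore
	results := make(chan fanoutResult, len(brokers))

	go func() {
		for _, broker := range brokers {
			b := broker
			sem <- struct{}{} // blocks while maxInFlight attempts are outstanding
			go func() {
				defer func() { <-sem }()
				meta, err := b.GetMetadata(req)
				results <- fanoutResult{metadata: meta, err: err}
			}()
		}
	}()

	var lastErr error
	for range brokers {
		res := <-results
		if res.err == nil {
			return res.metadata, nil
		}
		lastErr = res.err
	}
	return nil, lastErr
}
```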

```go
} else {
	for _, broker := range client.brokers {
		Logger.Printf("Fanning out %v\n", broker)
		b := broker
```
Contributor

this intermediate variable seems unnecessary?

@crxpandion
Author

I rebased on shopify/sarama to accommodate the seedBroker change, so apologies for the messed-up history of this branch.

The way seedBrokers are handled definitely complicates the code. Is there a reason for the logical separation between user-supplied broker addresses and brokers found via metadata requests?

@eapache
Contributor

eapache commented Nov 13, 2014

Per the comment in client.go:

// the broker addresses given to us through the constructor are not guaranteed to be returned in
// the cluster metadata (I *think* it only returns brokers who are currently leading partitions?)
// so we store them separately

There are also some bits documented in my initial post on #15.

@eapache
Contributor

eapache commented Dec 22, 2014

@crxpandion do you plan to revisit this?

@crxpandion
Author

The PR as it stands fulfills what was requested; however, I feel that it does not help alleviate the root issues with this section of the code. Therefore I'm inclined to stash it until the client gets sorted out further.

Thoughts?

@eapache
Contributor

eapache commented Dec 24, 2014

I'm not sure how much more change there is really going to be to the client at this point. The current design certainly has issues, but as per the fairly extensive conversation in the other tickets there doesn't seem to be a magic bullet that doesn't introduce other issues. I wouldn't hold your breath waiting for that to happen.

@eapache
Contributor

eapache commented Feb 24, 2015

The client does need to be worked on before this can be tackled properly, but once that is done this will probably have to be rewritten anyway, so there's not much point in leaving the PR open. It can/should be mined for ideas at some point.
