Circuit Breaker support? #2846
To expand a bit on what I mean by Circuit Breakers, in the context of my team at work: What we have right now is an in-process library that observes how long some block of code (usually representing an external network request) takes to finish, and aborts quickly ("trips the breaker") when the weighted average latency rises above a configurable threshold. While the breaker is tripped, it records aborted requests as having 0 latency in order to bring the weighted average back down until it's below the abort threshold, at which point the breaker is un-tripped and the external requests can resume. This works decently, except that every process in a many-worker app without shared memory (e.g. Python gunicorn) has to discover upstream outages independently, since the workers share no state. In some cases worker processes are restarted quite frequently, and all circuit breaker status is lost with each restart. So, we are hoping to either (a) concoct a shared-state implementation of this and keep it in the application processes, or (b) rely on an external proxy implementation like linkerd to do it. |
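For readers unfamiliar with this pattern, here is a minimal Python sketch of the in-process, latency-based breaker described above. It is not the team's actual library: the class name `LatencyCircuitBreaker`, the exception `CircuitOpenError`, the parameters `threshold_s` and `alpha`, and the use of an exponentially weighted moving average as the "weighted average" are all assumptions made for illustration.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is aborted because the breaker is tripped."""


class LatencyCircuitBreaker:
    """Hypothetical per-process breaker driven by a weighted average latency."""

    def __init__(self, threshold_s, alpha=0.2, clock=time.monotonic):
        self.threshold_s = threshold_s  # abort threshold, in seconds (assumed name)
        self.alpha = alpha              # weight given to the newest sample (assumed EWMA)
        self.clock = clock              # injectable so the breaker can be tested
        self.average_s = 0.0
        self.tripped = False

    def _record(self, latency_s):
        # Fold the new sample into the weighted average, then re-evaluate state.
        self.average_s = self.alpha * latency_s + (1 - self.alpha) * self.average_s
        self.tripped = self.average_s > self.threshold_s

    def call(self, func, *args, **kwargs):
        if self.tripped:
            # Aborted requests count as 0 latency, pulling the average back down
            # until it falls below the threshold and the breaker un-trips.
            self._record(0.0)
            raise CircuitOpenError("breaker tripped; request aborted")
        start = self.clock()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the observed latency whether the call succeeded or raised.
            self._record(self.clock() - start)
```

Because `average_s` and `tripped` live on the instance, each worker process holds its own copy and loses it on restart, which is exactly the limitation described above.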
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions. |
Keeping this ticket open. For those watching, we've done some preliminary design work on this feature and learned some good things. |
@wmorgan would you be able to expand on the design work/investigation? Echoing @benley's statement above: a distributed implementation of what hystrix/resilience4j provide (at least in the Java world) is very intriguing. |
@tomsanbear @adleong looked into some of the details previously and can probably give you a data dump there. If you're interested in contributing, we've got a lightweight process to go through. Not everything is documented yet as we're still getting it set up. Happy to walk you through what's required if you're interested! Jump into #contributors on Slack and we can start going through the details =) |
Are we planning to prioritize the circuit-breaking functionality? Is there any option in linkerd to limit the number of requests and connections at the proxy level, or what is the right way to go about this? |
@jensoncs Richer client-side policies are planned for stable-2.12.0 |
Glad this topic was discussed today. It looks like there are some write-ups and diagrams available here; I'm not sure what the current state is, but it seems like a good topic for a design doc or blog post. |
Are there any plans for working on that topic? I see that it has been almost 6 months since there has been an update. |
@sherifkayad there was a blog post at the turn of the year that mentioned it in the upcoming roadmap so I'd expect to see this implemented in the future |
@andrew-waters amazing! keeping an eye for that |
are there any updates on this @adleong? |
Hi @jon-depop. The groundwork to support client policy (such as circuit breakers) in the proxy is currently in progress and you can follow along at https://github.com/linkerd/linkerd2-proxy/pulls. |
@adleong any updates on this? |
We're still working towards support for this in 2.13. Unfortunately we are not referencing this issue too much in related PRs in linkerd2 and linkerd2-proxy, but you can follow along with PRs in those repositories if you are interested. |
Hi folks, I'm very excited to let you all know that yesterday, we released edge-23.4.1, a release candidate for Linkerd 2.13, which features initial support for request-level HTTP circuit breaking. This circuit breaking is configured by adding annotations to Services that describe the failure accrual policy clients should use when communicating with that Service. We're still working on documentation for how to configure circuit breaking in Linkerd 2.13, but in the meantime, if you're interested in trying it out on the edge release, you can find the annotations in the source code here: linkerd2/policy-controller/k8s/index/src/outbound/index.rs Lines 455 to 503 in 546df1b
|
Feature Request
Linkerd 1.x and Istio (and various other service meshes) have documented methods of configuring Circuit Breakers:
It looks like linkerd 2 currently doesn't quite do the same thing, or at least it isn't documented clearly.
I found another issue inquiring about circuit breaking in this repo that's since been closed: #1255
@olix0r explained on slack:
So: might linkerd2 get some sort of Circuit Breaker functionality soon?