Load shedding #40142

Merged
merged 1 commit into from
Jun 6, 2024

Ladicek (Contributor) commented Apr 18, 2024

Related to #36543

@Ladicek Ladicek requested a review from cescoffier April 18, 2024 15:06
@quarkus-bot quarkus-bot bot added area/dependencies area/vertx labels Apr 18, 2024
Ladicek (Contributor, Author) commented Apr 18, 2024

Draft because this is very much work in progress. Sharing to maybe get some initial feedback.

If this is considered to be too niche for core Quarkus, I'd be fine with moving to Quarkiverse.

CC @ahus1 @vmuzikar @tmonney

cescoffier (Member) left a comment

Looks great!

First, I like the overload algorithm you are using. Any reference on it?
Then, we have a connection limiter in Quarkus already, should we deprecate this in favor of the load shedding?

Also, I think it needs to be tested with long-running connection (gRPC streams, SSE, web sockets).

I'm also thinking that on overload, we may need to adjust the readiness Kube probe. WDYT?

Ladicek (Contributor, Author) commented Apr 22, 2024

The algorithm is a simplified version of https://github.com/Netflix/concurrency-limits/blob/master/concurrency-limits-core/src/main/java/com/netflix/concurrency/limits/limit/VegasLimit.java. I think there's a link in the javadoc already.
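
For readers unfamiliar with the approach, here is a minimal sketch of a Vegas-style concurrency limit. It is an illustration only, not the code in this PR or in Netflix's `VegasLimit`: the class name `VegasLimitSketch`, the constant `alpha`/`beta` thresholds, and the step size of 1 are all assumptions. The idea is to compare each observed round-trip time with the lowest one seen so far, estimate how many requests must be queueing, and grow or shrink the limit accordingly.

```java
// Hypothetical sketch of a Vegas-style concurrency limit.
// Not the code from this PR or from Netflix concurrency-limits.
final class VegasLimitSketch {
    private final int maxLimit;
    private final int alpha; // grow the limit when estimated queue size <= alpha (assumed constant)
    private final int beta;  // shrink the limit when estimated queue size >= beta (assumed constant)
    private int limit;
    private long minRttNanos = Long.MAX_VALUE; // best-case latency observed so far

    VegasLimitSketch(int initialLimit, int maxLimit, int alpha, int beta) {
        if (initialLimit < 1 || maxLimit < initialLimit || alpha < 0 || beta <= alpha) {
            throw new IllegalArgumentException("invalid limit configuration");
        }
        this.limit = initialLimit;
        this.maxLimit = maxLimit;
        this.alpha = alpha;
        this.beta = beta;
    }

    /** Records the round-trip time of one completed request and adjusts the limit. */
    synchronized void onSample(long rttNanos) {
        if (rttNanos <= 0) {
            return; // ignore unusable samples
        }
        minRttNanos = Math.min(minRttNanos, rttNanos);
        // Share of the observed latency attributed to queueing: 0 when rtt == minRtt.
        double queueing = 1.0 - (double) minRttNanos / rttNanos;
        int queueSize = (int) Math.ceil(limit * queueing);
        if (queueSize <= alpha) {
            limit = Math.min(maxLimit, limit + 1); // little queueing: allow more concurrency
        } else if (queueSize >= beta) {
            limit = Math.max(1, limit - 1);        // heavy queueing: back off
        }
        // Between alpha and beta the limit is left unchanged.
    }

    /** True when the number of in-flight requests has reached the current limit. */
    synchronized boolean isOverloaded(int inFlight) {
        return inFlight >= limit;
    }

    synchronized int limit() {
        return limit;
    }
}
```

A caller would invoke `onSample` when a request completes and consult `isOverloaded` before admitting a new one.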

> Then, we have a connection limiter in Quarkus already, should we deprecate this in favor of the load shedding?

Ah do we? I had no idea. Where can I learn more?

> I think it needs to be tested with long-running connection (gRPC streams, SSE, web sockets).

Good point, I didn't try that at all. I'll check, but I doubt I'll see anything meaningful. The present implementation is heavily oriented toward the request/response style of interaction. An occasional stream likely won't do anything, and a streaming-heavy application would require a more involved implementation.

> I'm also thinking that on overload, we may need to adjust the readiness Kube probe. WDYT?

Hmm, I'm not sure about that. That would be super coarse-grained.

@Ladicek Ladicek force-pushed the load-shedding branch 2 times, most recently from 7a4c509 to c48e093 Compare May 21, 2024 14:18
@quarkus-bot quarkus-bot bot added this to To do in Quarkus Documentation May 21, 2024
@Ladicek Ladicek marked this pull request as ready for review May 21, 2024 14:18
Ladicek (Contributor, Author) commented May 21, 2024

Just marked as ready for review. I fixed a couple of bugs in the implementation and added some rudimentary documentation.

The feature is not very heavily tested. @franz1981 do you think it would be possible to test this in a perf lab? I don't really know how that works, but I guess we have a test that stresses a Quarkus application above its capacity, where this would help?

github-actions bot commented May 21, 2024

🙈 The PR is closed and the preview is expired.

@quarkus-bot quarkus-bot bot added the area/devtools label May 22, 2024
Ladicek (Contributor, Author) commented Jun 3, 2024

Rebased and fixed the conflict.

Ladicek (Contributor, Author) commented Jun 5, 2024

Rebased and fixed a few tiny issues. I believe this is ready now.

cescoffier (Member) left a comment

I think we should merge it and iterate.

Quarkus Documentation automation moved this from To do to Reviewer approved Jun 5, 2024
gsmet (Member) left a comment

I added two small comments. I don't mind you merging and addressing them later so that we avoid some more conflicts.

The overload detector uses a TCP Vegas-based algorithm,
as implemented by Netflix Concurrency Limits.

Priority load shedding uses 5 priority levels and 128 cohorts.
A simple cubic function is used to determine the threshold that
current CPU load has to reach to reject the current request.
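
One way to picture the priority scheme is the sketch below. It is a hedged illustration, not the formula merged in this PR: the class name `PriorityShedSketch`, the ordering of priorities (0 is the most important), the hash-based cohort assignment, and the exact cubic `1 - load^3` are assumptions. Only the 5 priority levels and 128 cohorts come from the commit message. Each request gets a rank from its priority and cohort, and the fraction of ranks still admitted shrinks with the cube of the CPU load. Put differently, a request with rank `r` is rejected once the load reaches the cube root of `1 - r / 640`.

```java
// Hypothetical sketch of priority/cohort load shedding with a cubic threshold.
// Not the formula used by the merged Quarkus code.
final class PriorityShedSketch {
    static final int PRIORITIES = 5; // from the commit message
    static final int COHORTS = 128;  // from the commit message
    static final int TOTAL = PRIORITIES * COHORTS;

    private PriorityShedSketch() {
    }

    /** Maps an arbitrary client identifier to a cohort in [0, COHORTS). Assumed scheme. */
    static int cohortOf(String clientId) {
        return Math.floorMod(clientId.hashCode(), COHORTS);
    }

    /** Rank in [0, TOTAL): priority 0 is assumed most important, so a higher rank is shed sooner. */
    static int rank(int priority, int cohort) {
        if (priority < 0 || priority >= PRIORITIES || cohort < 0 || cohort >= COHORTS) {
            throw new IllegalArgumentException("priority or cohort out of range");
        }
        return priority * COHORTS + cohort;
    }

    /**
     * Rejects the request when its rank falls outside the admitted share of ranks.
     * The admitted share is 1 - load^3: nothing is shed at zero load, everything at full load.
     */
    static boolean shouldReject(int priority, int cohort, double cpuLoad) {
        double load = Math.max(0.0, Math.min(1.0, cpuLoad)); // clamp to [0, 1]
        double admitted = TOTAL * (1.0 - load * load * load);
        return rank(priority, cohort) >= admitted;
    }
}
```

With these assumptions, a CPU load of 0.5 sheds only the lowest 12.5% of ranks, while a load of 0.8 already sheds roughly half, so less important traffic is dropped first and rejection ramps up sharply as the load approaches saturation.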
quarkus-bot (bot) commented Jun 6, 2024

Status for workflow Quarkus Documentation CI

This is the status report for running Quarkus Documentation CI on commit 3b9c6ac.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

⚠️ There are other workflow runs running, you probably need to wait for their status before merging.

@gsmet gsmet added the triage/waiting-for-ci label Jun 6, 2024
quarkus-bot (bot) commented Jun 6, 2024

Status for workflow Quarkus CI

This is the status report for running Quarkus CI on commit 3b9c6ac.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

You can consult the Develocity build scans.

@Ladicek Ladicek merged commit b20c79c into quarkusio:main Jun 6, 2024
55 checks passed
Quarkus Documentation automation moved this from Reviewer approved to Done Jun 6, 2024
@quarkus-bot quarkus-bot bot removed the triage/waiting-for-ci label Jun 6, 2024
@Ladicek Ladicek deleted the load-shedding branch June 6, 2024 15:13
@quarkus-bot quarkus-bot bot added this to the 3.12 - main milestone Jun 6, 2024
Labels: area/dependencies, area/devtools, area/documentation, area/vertx, release/noteworthy-feature, triage/flaky-test
3 participants