Adds documentation on concurrent workers #24
Merged

docs/serverless/workers/handlers/handler-concurrency.md (112 additions, 0 deletions)

---
title: Concurrent Handlers
---

RunPod supports asynchronous functions for request handling, enabling a single worker to manage multiple tasks concurrently through non-blocking operations. This capability allows for efficient task switching and resource utilization.

Serverless architectures allow each worker to process multiple requests simultaneously; the achievable level of concurrency depends on the runtime's capacity and the resources available to it.

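To see why non-blocking operations matter, the self-contained sketch below (independent of the RunPod SDK; the handler and job shape are illustrative) runs five simulated requests concurrently on one event loop, so total wall time stays close to a single request's duration rather than the sum:

```python
import asyncio
import time

async def simulated_handler(job):
    # Non-blocking wait stands in for I/O-bound work (e.g., a model API call)
    await asyncio.sleep(0.1)
    return f"Processed: {job['input']}"

async def main():
    jobs = [{"input": i} for i in range(5)]
    start = time.perf_counter()
    # All five requests run concurrently on a single event loop
    results = await asyncio.gather(*(simulated_handler(j) for j in jobs))
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
print(results[0])  # Processed: 0
```

With blocking (synchronous) handlers, the same five requests would take roughly five times as long, since each would occupy the worker until it finished.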
## Configure concurrency modifier

The `concurrency_modifier` is a configuration option within `runpod.serverless.start` that dynamically adjusts a worker's concurrency level. Regulating the number of tasks a worker handles at once lets you balance resource consumption against throughput.

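In its simplest form, a `concurrency_modifier` can ignore the current level and return a fixed ceiling. The one-liner below is a minimal sketch (the value 4 is illustrative, not a recommendation) that pins each worker at four concurrent tasks:

```python
def static_concurrency_modifier(current_concurrency):
    # Ignore the reported level and always allow up to 4 concurrent jobs.
    return 4
```

The steps that follow build a dynamic version that scales this number up and down with the observed request rate.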
### Step 1: Define an asynchronous handler function

Create an asynchronous function dedicated to processing incoming requests. This function should yield results efficiently, ideally in batches, to enhance throughput.

```python
import asyncio

async def process_request(job):
    await asyncio.sleep(1)  # Simulate processing delay
    return f"Processed: {job['input']}"
```

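The handler above returns a single result. When results can be produced incrementally, the batching idea mentioned earlier can be sketched as an async generator; this is an illustrative pattern rather than verbatim SDK usage, and the batch size and input shape are assumptions:

```python
import asyncio

async def process_request_stream(job):
    items = job["input"]           # Assume the input is a list of items
    batch_size = 2                 # Illustrative batch size
    for i in range(0, len(items), batch_size):
        batch = items[i:i + batch_size]
        await asyncio.sleep(0.01)  # Simulate per-batch work
        yield [f"Processed: {item}" for item in batch]

async def collect():
    results = []
    async for batch in process_request_stream({"input": [1, 2, 3]}):
        results.extend(batch)
    return results

print(asyncio.run(collect()))  # ['Processed: 1', 'Processed: 2', 'Processed: 3']
```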
### Step 2: Set up the `concurrency_modifier` function

Implement a function that adjusts the worker's concurrency level based on the current request load. This function should respect the maximum and minimum concurrency levels, adjusting as needed in response to fluctuations in request volume.

```python
def adjust_concurrency(current_concurrency):
    """
    Dynamically adjusts the concurrency level based on the observed request rate.
    """
    global request_rate
    update_request_rate()  # Placeholder for request-rate updates

    max_concurrency = 10  # Maximum allowable concurrency
    min_concurrency = 1  # Minimum concurrency to maintain
    high_request_rate_threshold = 50  # Threshold for high request volume

    # Increase concurrency if under the max limit and the request rate is high
    if request_rate > high_request_rate_threshold and current_concurrency < max_concurrency:
        return current_concurrency + 1
    # Decrease concurrency if above the min limit and the request rate is low
    elif request_rate <= high_request_rate_threshold and current_concurrency > min_concurrency:
        return current_concurrency - 1

    return current_concurrency
```

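To check the scaling logic in isolation, the snippet below is a self-contained restatement of `adjust_concurrency` with `update_request_rate` stubbed to a no-op, so the request rate can be set by hand (the rate values are illustrative):

```python
request_rate = 0

def update_request_rate():
    # Stub: in a real worker this would sample live metrics
    pass

def adjust_concurrency(current_concurrency):
    global request_rate
    update_request_rate()
    max_concurrency = 10
    min_concurrency = 1
    high_request_rate_threshold = 50
    if request_rate > high_request_rate_threshold and current_concurrency < max_concurrency:
        return current_concurrency + 1
    elif request_rate <= high_request_rate_threshold and current_concurrency > min_concurrency:
        return current_concurrency - 1
    return current_concurrency

request_rate = 80  # Simulate heavy load
print(adjust_concurrency(5))  # 6: scales up toward max_concurrency
request_rate = 10  # Simulate light load
print(adjust_concurrency(5))  # 4: scales back down
print(adjust_concurrency(1))  # 1: never drops below min_concurrency
```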
### Step 3: Initialize the serverless function

Start the serverless function with the defined handler and `concurrency_modifier` to enable dynamic concurrency adjustment.

```python
runpod.serverless.start({
    "handler": process_request,
    "concurrency_modifier": adjust_concurrency,
})
```

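Conceptually, the worker consults `concurrency_modifier` and caps the number of in-flight jobs accordingly. The toy loop below is a hypothetical illustration of that contract only, not the actual SDK implementation; every name in it is invented for the sketch:

```python
import asyncio

async def echo_handler(job):
    await asyncio.sleep(0.01)  # Stand-in for real work
    return job["input"] * 2

async def toy_worker_loop(handler, concurrency_modifier, jobs):
    # Hypothetical sketch: dispatch at most `concurrency` jobs at a time,
    # re-consulting the modifier before each dispatch round.
    concurrency = 1
    results = []
    while jobs:
        concurrency = concurrency_modifier(concurrency)
        batch, jobs = jobs[:concurrency], jobs[concurrency:]
        results.extend(await asyncio.gather(*(handler(j) for j in batch)))
    return results

out = asyncio.run(
    toy_worker_loop(echo_handler, lambda c: 3, [{"input": i} for i in range(7)])
)
print(out)  # [0, 2, 4, 6, 8, 10, 12]
```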
---

## Example code

The following example demonstrates the full setup for a RunPod serverless function capable of handling multiple concurrent requests.

```python
import runpod
import asyncio
import random

# Simulated metric
request_rate = 0

async def process_request(job):
    await asyncio.sleep(1)  # Simulate processing time
    return f"Processed: {job['input']}"

def adjust_concurrency(current_concurrency):
    """
    Adjusts the concurrency level based on the current request rate.
    """
    global request_rate
    update_request_rate()  # Simulate changes in request rate

    max_concurrency = 10
    min_concurrency = 1
    high_request_rate_threshold = 50

    if request_rate > high_request_rate_threshold and current_concurrency < max_concurrency:
        return current_concurrency + 1
    elif request_rate <= high_request_rate_threshold and current_concurrency > min_concurrency:
        return current_concurrency - 1
    return current_concurrency

def update_request_rate():
    """
    Simulates changes in the request rate to mimic real-world scenarios.
    """
    global request_rate
    request_rate = random.randint(20, 100)

# Start the serverless function with the handler and concurrency modifier
runpod.serverless.start({
    "handler": process_request,
    "concurrency_modifier": adjust_concurrency,
})
```

Using the `concurrency_modifier` in RunPod, serverless functions can efficiently handle multiple requests concurrently, optimizing resource usage and improving performance. This approach allows for scalable and responsive serverless applications.