|
1 | 1 | # Understanding Scaling Behavior<a name="scaling"></a> |
2 | 2 |
|
3 | | -Concurrent executions refers to the number of executions of your function code that are happening at any given time\. You can estimate the concurrent execution count, but the concurrent execution count will differ depending on whether or not your Lambda function is processing events from a stream\-based event source\. |
| 3 | +Concurrent executions refers to the number of executions of your function code that are happening at any given time\. You can estimate the concurrent execution count, but the estimate differs depending on whether your Lambda function processes events from a poll\-based event source\. If you create a Lambda function to process events from event sources that aren't poll\-based \(for example, Amazon S3 or API Gateway\), each published event is a unit of work that is processed in parallel, up to your account limits\. Therefore, the number of events \(or requests\) these event sources publish influences the concurrency\. You can use this formula to estimate your concurrent Lambda function invocations: |
| 4 | + |
| 5 | +``` |
| 6 | +events (or requests) per second * function duration |
| 7 | +``` |
| 8 | + |
| 9 | + For example, consider a Lambda function that processes Amazon S3 events\. Suppose that the Lambda function takes on average three seconds and Amazon S3 publishes 10 events per second\. Then, you will have 30 concurrent executions of your Lambda function\. |
| 10 | + |
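The estimate above can be sketched in a few lines of Python. This is only an illustration of the arithmetic for event sources that aren't poll\-based; the function name `estimate_concurrency` and its parameter names are hypothetical and are not part of any AWS SDK.

```python
def estimate_concurrency(events_per_second: float, avg_duration_seconds: float) -> float:
    """Estimate concurrent executions for an event source that isn't poll-based.

    Hypothetical helper: it applies the formula
    events (or requests) per second * function duration.
    """
    return events_per_second * avg_duration_seconds


# The Amazon S3 example from the text: 10 events per second,
# with the function taking three seconds on average.
s3_estimate = estimate_concurrency(events_per_second=10, avg_duration_seconds=3)
print(s3_estimate)
```

The result is an estimate only; actual concurrency is still bounded by your account limits.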
| 11 | +The number of concurrent executions for poll\-based event sources also depends on additional factors, as follows: |
4 | 12 | + **Poll\-based event sources that are stream\-based** |
5 | 13 | + Amazon Kinesis Data Streams |
6 | 14 | + Amazon DynamoDB |
7 | 15 |
|
8 | 16 | For Lambda functions that process Kinesis or DynamoDB streams, the number of shards is the unit of concurrency\. If your stream has 100 active shards, there will be at most 100 Lambda function invocations running concurrently\. This is because Lambda processes each shard’s events in sequence\. **Poll\-based event sources that are not stream\-based**: For Lambda functions that process Amazon SQS queues, AWS Lambda will automatically scale the polling on the queue until the maximum concurrency level is reached, where each message batch can be considered a single concurrent unit\. AWS Lambda's automatic scaling behavior is designed to keep polling costs low when a queue is empty while simultaneously enabling you to achieve high throughput when the queue is being used heavily\. |
9 | 17 |
|
10 | | - Here is how it works: |
11 | | - + When an Amazon SQS event source mapping is initially enabled, Lambda begins long\-polling the Amazon SQS queue\. Long polling helps reduce the cost of polling Amazon Simple Queue Service by reducing the number of empty responses, while proving optimal processing latency when messages arrive\. As the influx of messages to a queue increases, AWS Lambda automatically scales up polling activity until the number of concurrent function executions reaches 1000, the account concurrency limit, or the \(optional\) function concurrency limit, whichever is lower\. SQS Event Sources support an initial burst of 5 concurrent function invocations and increase concurrency by 60 concurrent invocations per minute\. |
12 | | - + Lambda monitors the number of [inflight messages](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-visibility-timeout.html#inflight-messages), and when it detects that this number is increasing, it will increase the polling frequency by 20 [ReceiveMessage](https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_ReceiveMessage.html) requests per minute and the function concurrency by 60 calls per minute\. As long as the queue remains busy, scale up continues until at least one of the following occurs: |
13 | | - + Polling frequency reaches 100 simultaneous ReceiveMessage requests and function invocation concurrency reaches 1,000\. |
14 | | - + The account concurrency maximum has been reached\. |
15 | | - + The per\-function concurrency limit of the function attached to the SQS queue \(if any\) has been reached\. |
| 18 | + When an Amazon SQS event source mapping is initially enabled, Lambda begins long\-polling the Amazon SQS queue\. Long polling helps reduce the cost of polling Amazon Simple Queue Service by reducing the number of empty responses, while providing optimal processing latency when messages arrive\. |
16 | 19 |
|
17 | | - When AWS Lambda detects that the number of inflight messages is decreasing, it will decrease the polling frequency by 10 ReceiveMessage requests per minute and decrease the concurrency used to invoke your function by 30 calls per minute\. |
| 20 | + As the influx of messages to a queue increases, AWS Lambda automatically scales up polling activity until the number of concurrent function executions reaches 1000, the account concurrency limit, or the \(optional\) function concurrency limit, whichever is lower\. Amazon Simple Queue Service supports an initial burst of 5 concurrent function invocations and increases concurrency by 60 concurrent invocations per minute\. |
18 | 21 | **Note** |
19 | 22 | [Account\-level limits](http://docs.aws.amazon.com/lambda/latest/dg/limits.html) are impacted by other functions in the account, and per\-function concurrency applies to all events sent to a function\. For more information, see [Managing Concurrency](concurrent-executions.md)\. |
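The Amazon SQS scale\-up numbers quoted above (an initial burst of 5, an increase of 60 per minute, and a ceiling of 1000 or the account or function limit, whichever is lower) can be turned into a rough model. The sketch below assumes the queue stays busy the whole time and that the ramp is perfectly linear, which the text does not guarantee; all names are hypothetical and nothing here calls an AWS API.

```python
from typing import Optional

# Figures taken from the text above.
SQS_INITIAL_BURST = 5        # initial concurrent function invocations
SQS_RAMP_PER_MINUTE = 60     # additional concurrent invocations per minute
SQS_MAX_CONCURRENCY = 1000   # upper bound before account/function limits apply


def sqs_concurrency_ceiling(account_limit: int, function_limit: Optional[int] = None) -> int:
    """Return the lowest of 1000, the account limit, and the optional function limit."""
    limits = [SQS_MAX_CONCURRENCY, account_limit]
    if function_limit is not None:
        limits.append(function_limit)
    return min(limits)


def sqs_concurrency_after(minutes: int, account_limit: int,
                          function_limit: Optional[int] = None) -> int:
    """Approximate concurrency after `minutes` of sustained load (simplified linear model)."""
    ramped = SQS_INITIAL_BURST + SQS_RAMP_PER_MINUTE * minutes
    return min(ramped, sqs_concurrency_ceiling(account_limit, function_limit))


# Assumed account limit of 1000, no per-function limit.
for minute in (0, 5, 10, 20):
    print(minute, sqs_concurrency_after(minute, account_limit=1000))
```

Because account\-level limits are shared with other functions in the account, the concurrency actually available to one function can be lower than this model suggests.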
20 | | -+ **Event sources that aren't stream\-based** – If you create a Lambda function to process events from event sources that aren't stream\-based \(for example, Lambda can process every event from other sources, like Amazon S3 or API Gateway\), each published event is a unit of work, in parallel, up to your account limits\. Therefore, the number of events \(or requests\) these event sources publish influences the concurrency\. You can use the this formula to estimate your concurrent Lambda function invocations: |
21 | | - |
22 | | - ``` |
23 | | - events (or requests) per second * function duration |
24 | | - ``` |
25 | | - |
26 | | - For example, consider a Lambda function that processes Amazon S3 events\. Suppose that the Lambda function takes on average three seconds and Amazon S3 publishes 10 events per second\. Then, you will have 30 concurrent executions of your Lambda function\. |
27 | 23 |
|
28 | 24 | ## Request Rate<a name="concurrent-executions-request-rate"></a> |
29 | 25 |
|
30 | | -Request rate refers to the rate at which your Lambda function is invoked\. For all services except the stream\-based services, the request rate is the rate at which the event sources generate the events\. For stream\-based services, AWS Lambda calculates the request rate as follow: |
| 26 | +Request rate refers to the rate at which your Lambda function is invoked\. For all services except the stream\-based services, the request rate is the rate at which the event sources generate the events\. For stream\-based services, AWS Lambda calculates the request rate as follows: |
31 | 27 |
|
32 | 28 | ``` |
33 | 29 | request rate = number of concurrent executions / function duration |
|