Closed
Description
Currently the AsyncAPI workers poll for a single message from the SQS queue and invoke the predict function once per message. Expose configuration in the AsyncAPI spec that allows workers to retrieve multiple messages in a single SQS poll request.
Proposed solution
Update the AsyncAPI configuration to allow specifying server-side batching keys. The predictor's response should be validated to be a list. We can validate that the returned list has the same number of elements as the batch and, assuming the list elements are in the same order as the inputs, match each request id with its response.
```yaml
# api.yaml
- name: my-api
  kind: AsyncAPI
  predictor:
    ...
    server_side_batching:  # (optional)
      max_batch_size: <int>  # the maximum number of requests to aggregate before running inference
      batch_interval: <duration>  # the maximum amount of time to spend waiting for additional requests before running inference on the batch of requests
```
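A worker loop honoring these two keys could look like the sketch below. This is an illustration, not the actual implementation: `poll_batch` is a hypothetical helper, and it assumes a boto3-style SQS client whose `receive_message` accepts `MaxNumberOfMessages` (capped at 10 by SQS) and `WaitTimeSeconds` (long-poll cap of 20 seconds).

```python
import time

def poll_batch(sqs_client, queue_url, max_batch_size, batch_interval):
    """Aggregate up to max_batch_size messages, waiting at most batch_interval seconds."""
    messages = []
    deadline = time.monotonic() + batch_interval
    while len(messages) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # batch_interval elapsed; run inference on what we have
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=min(max_batch_size - len(messages), 10),  # SQS per-call cap
            WaitTimeSeconds=min(int(remaining), 20),  # SQS long-poll cap
        )
        messages.extend(response.get("Messages", []))
    return messages
```

Because each `receive_message` call can return fewer messages than requested, the loop keeps polling until either the batch is full or the interval expires.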
The predictor needs to respond with a list of predictions, one per request in the batch.
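The validation described above could be sketched as follows; `match_predictions` is a hypothetical helper that pairs request ids with results, relying on the assumption that the predictor preserves input order.

```python
def match_predictions(request_ids, predictions):
    """Pair each request id with its prediction, assuming order is preserved."""
    if not isinstance(predictions, list):
        raise TypeError(f"predictor must return a list, got {type(predictions).__name__}")
    if len(predictions) != len(request_ids):
        raise ValueError(
            f"predictor returned {len(predictions)} results for {len(request_ids)} requests"
        )
    return dict(zip(request_ids, predictions))
```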
Questions
- Verify that `sqs_client.receive_message` waits for the specified duration if there aren't enough messages in the queue to satisfy `max_batch_size`