Support message sizes > 256KB by using a third-party backend... #279
Option 1 seems much better than option 2, because option 2 adds another dependency and another place where things can break.
I'm not sure how option 1 would work, as there is no safe place to cut the message. If we divide an object, then to recover it at the other end all of the pieces must be received by the same process. What happens if one or more of the pieces are lost, or if the process dies while handling them? There may be more places where things can break in option 1. Option 2 would have to be optional, and a user can always choose to store big objects somewhere else and pass URLs manually.
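To make the fragility of option 1 (chunking) concrete, here is a minimal sketch of splitting an oversized body into queue-sized parts and reassembling them. All names here are hypothetical, not Kombu API; the point is that reassembly stalls if even one part is lost or routed to a different consumer.

```python
import math
import uuid

SQS_LIMIT = 256 * 1024  # SQS hard cap per message, in bytes


def split_message(body: str, limit: int = SQS_LIMIT):
    """Option 1 sketch: split an oversized body into SQS-sized parts.
    Every part must reach the same consumer before the original can be
    rebuilt; a single lost part strands the rest."""
    group = str(uuid.uuid4())
    total = math.ceil(len(body) / limit)
    return [
        {"group": group, "index": i, "total": total,
         "data": body[i * limit:(i + 1) * limit]}
        for i in range(total)
    ]


def try_reassemble(parts):
    """Return the original body once all parts of a group have arrived,
    otherwise None (the consumer must buffer and wait indefinitely)."""
    if not parts or len(parts) < parts[0]["total"]:
        return None
    ordered = sorted(parts, key=lambda p: p["index"])
    return "".join(p["data"] for p in ordered)
```

Even this toy version has to answer the questions raised above: where partial groups are buffered, when they expire, and what happens when the buffering process dies.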
Will you be accepting pull requests for option 2?
We will review the PR, of course. Please come with proper unit and integration tests :) and mention me.
I don't know if this helps anyone who is thinking of adding this, but there is a Python implementation of the option 2 pattern using S3 for storage, influenced by the AWS Java Extended Client for SQS: https://github.com/archetype-digital/aws-sqs-extended. It extends boto3 with extra calls that could be used as the basis for a transport (I think). I haven't reviewed the code in detail, but it is tested under Python 3.7, 3.8, and 3.9, has 99% test coverage, and is MIT licensed. I'm a long-time user of Celery+SQS but don't know my way around the internals; I'd really be interested in a solution to this issue and would be happy to help out where I can.
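The option 2 pattern those libraries implement can be sketched in a few lines: bodies over the SQS limit go to an external store (S3 in practice), and the queue message carries only a pointer. This is a hedged illustration with an in-memory dict standing in for S3; the class and key names are made up, not any library's API.

```python
import json
import uuid

MAX_SQS_BYTES = 256 * 1024  # SQS per-message limit


class PayloadOffloader:
    """Sketch of the extended-client pattern: oversized bodies are
    stored externally and replaced with a small pointer message."""

    def __init__(self, store):
        self.store = store  # dict-like here; an S3 bucket in production

    def wrap(self, body: str) -> str:
        """Offload the body if it exceeds the SQS limit."""
        if len(body.encode("utf-8")) <= MAX_SQS_BYTES:
            return body
        key = str(uuid.uuid4())
        self.store[key] = body
        return json.dumps({"s3_pointer": key})

    def unwrap(self, message: str) -> str:
        """Resolve a pointer message back to the original body."""
        try:
            envelope = json.loads(message)
        except ValueError:
            return message
        if isinstance(envelope, dict) and "s3_pointer" in envelope:
            return self.store[envelope["s3_pointer"]]
        return message
```

A real implementation also has to decide who deletes the stored object and when, which is exactly the lifecycle question raised later in this thread.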
I found this article very useful: https://walid.dev/blog/saving-costs-asking-for-forgiveness-in-python/
This could be useful for some use cases.
Based on the new library, can we close this, or do we need to integrate it and ensure it is supported in Kombu?
Ah, wait, sorry: the new lib addresses the same problem for SNS, not SQS, so it isn't helpful here. Apologies.
AWS have now released the extended client library for Python, allowing up to 2GB messages on SQS via S3: |
This adds support for handling large payloads in SQS. The 'sqs_extended_client' is imported and used for fetching the file from S3 as the payload when necessary. Because Kombu asynchronously fetches new messages from the queue without using the standard boto3 APIs, we have to fetch the S3 file manually rather than rely on the sqs_extended_client to perform that action. Relates to: celery#279
I've made a start on attempting to integrate the sqs_extended_client. You can view it here: https://github.com/celery/kombu/compare/main...Amwam:kombu:add-sqs-large-payload-support?expand=1 While this appears to work, I'm not sure whether there are issues in the implementation. There are also features missing, such as automatically deleting the payload after the task has been completed. The core issue I've run into is that Kombu fetches messages from SQS via the HTTP API rather than through boto3, so the extended client is only used when publishing, not when retrieving messages. Another PR references a desire to convert these calls to boto3, but a bigger refactoring seems to be required to make that happen in a performant way.
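The manual consumer-side fetch described above might look roughly like the following. The pointer format shown (a two-element JSON array with an S3 bucket and key) is an assumption based on the Java extended client's convention and may differ in other implementations; `resolve_large_payload` is a hypothetical helper, not Kombu code. The only real API used is boto3's `get_object`.

```python
import json


def resolve_large_payload(body: str, s3_client) -> str:
    """If the message body is an extended-client-style S3 pointer,
    fetch the real payload from S3; otherwise return the body as-is."""
    try:
        parsed = json.loads(body)
    except ValueError:
        return body  # not JSON, so it cannot be a pointer
    if (isinstance(parsed, list) and len(parsed) == 2
            and isinstance(parsed[1], dict)
            and "s3BucketName" in parsed[1]
            and "s3Key" in parsed[1]):
        obj = s3_client.get_object(
            Bucket=parsed[1]["s3BucketName"],
            Key=parsed[1]["s3Key"],
        )
        return obj["Body"].read().decode("utf-8")
    return body
```

A hook like this would have to run on every received message before deserialization, which is why it belongs in the transport rather than in user code.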
good job on starting work on it. |
SQS only supports messages up to 256KB. Given that limitation, it's very easy to hit the limit and fail your task submission. Here's a simple example:
There are two ways to fix this that I see.