
Need an mechanism in awssqssource to control parallelism #1343

Open
mikhno-s opened this issue Mar 1, 2023 · 10 comments
Labels
feature request Request for a new feature

Comments

@mikhno-s

mikhno-s commented Mar 1, 2023

We have long-running jobs (10–15 min) in knative-serving that are triggered by messages from SQS.
I see a problem here:

  • awssqssource reads all messages from SQS and removes them, but I want the ability to read one message at a time and remove it only after a successful HTTP response code from the ksvc, even when I use a memory broker with buffer=1.

I need the ability to control the parallelism of the processing mechanism. I don't need to process all messages quickly; I want to consume one message, send it to the ksvc (or broker), wait until it finishes, only then remove the message from SQS, and repeat.
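The serialized flow described above could be sketched roughly like this. This is an illustrative in-memory model, not TriggerMesh code: a deque stands in for the SQS queue and a plain callable stands in for the ksvc endpoint, and the `max_attempts` retry cap is a hypothetical addition.

```python
from collections import deque

def process_serially(messages, send_to_sink, max_attempts=3):
    """Consume one message at a time; remove it from the queue only after
    the sink returns a 2xx status, otherwise make it visible again."""
    processed = []
    attempts = {}
    while messages:
        msg = messages[0]                         # peek: message is now "in flight"
        status = send_to_sink(msg)                # synchronous call, waits until finished
        if 200 <= status < 300:
            processed.append(messages.popleft())  # ACK: delete from the queue
        else:
            attempts[msg] = attempts.get(msg, 0) + 1
            if attempts[msg] >= max_attempts:
                messages.popleft()                # give up (a dead-letter queue in real SQS)
            else:
                messages.rotate(-1)               # NACK: message becomes visible again later
    return processed
```

With real SQS, the "becomes visible again" step happens automatically when the message's visibility timeout expires; nothing has to re-enqueue it explicitly.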

Is it possible to do this with TriggerMesh?

I found a lot of issues describing problems with this kind of scheme, so it looks like people need a mechanism for long-running jobs triggered over HTTP.

@odacremolbap
Member

Thanks for reaching out @mikhno-s

Currently our SQS Source works like this:

  • The number of concurrent receivers is determined by the CPUs available; this is not configurable at the moment.
  • Each receiver reads a batch from SQS, with a maximum of 10 items per batch.
  • For each element in the batch we generate a CloudEvent to the Broker/Sink, and only delete it from SQS once we have the ACK.
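The three steps above can be sketched as a single receiver loop. This is an in-memory illustration of the described semantics, not the actual source code; `fetch_batch`, `to_cloudevent`, `send`, and `delete` are hypothetical stand-ins for the SQS and CloudEvents plumbing.

```python
MAX_BATCH = 10  # SQS ReceiveMessage returns at most 10 messages per call

def receiver_loop(fetch_batch, to_cloudevent, send, delete):
    """One concurrent receiver: read a batch, emit one CloudEvent per item,
    and delete an item from SQS only after the Broker/Sink ACKs it."""
    while True:
        batch = fetch_batch(MAX_BATCH)
        if not batch:
            return                  # queue drained (a real receiver keeps polling)
        for item in batch:
            event = to_cloudevent(item)
            if send(event):         # ACK received from the Broker/Sink
                delete(item)        # only now removed from SQS
            # on NACK the message stays in SQS and becomes visible
            # again after its visibility timeout
```

Note that within one receiver the batch items are delivered sequentially here, but several receivers running this loop concurrently is what produces the parallelism the issue is about.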

You seem to be asking for serialization control, so that there is a maximum of N items in flight, with N = 1 in your case, is that right?

Can you add some details about your scenario? It sounds like you have a processor that should only process one element at a time, which is uncommon in eventing scenarios.

@mikhno-s
Author

mikhno-s commented Mar 1, 2023

Thanks for the quick response.

The scheme is simple:
I have a worker that can process one request at a time, and processing takes around 10 minutes.

I have an SQS queue as the input channel, and there are around 1k messages to process.

I want to control exactly how many messages are in flight, because otherwise I hit the activator's (a knative-serving component) timeout when multiple messages are sent to the sink at once.

It's also a typical flow for ML-like jobs.
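One generic way to express the in-flight cap being asked for here is a counting semaphore around the delivery call, with a cap of 1 giving fully serialized processing. This is just a sketch of the desired semantics, not a TriggerMesh feature; `dispatch_with_limit` and `max_in_flight` are hypothetical names.

```python
import threading

def dispatch_with_limit(messages, send, max_in_flight=1):
    """Deliver messages concurrently, but never allow more than
    max_in_flight simultaneous sends; 1 serializes processing."""
    sem = threading.Semaphore(max_in_flight)
    lock = threading.Lock()
    results = []

    def worker(msg):
        with sem:                   # blocks while the in-flight cap is reached
            status = send(msg)      # long-running call (e.g. a 10-minute job)
        with lock:
            results.append((msg, status))

    threads = [threading.Thread(target=worker, args=(m,)) for m in messages]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

In a queue-backed source the same effect would also need the message to stay invisible (not deleted) in SQS until its send completes, so a crash mid-processing redelivers it.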

@jmcx jmcx added the feature request Request for a new feature label Mar 1, 2023
@jmcx
Contributor

jmcx commented Mar 6, 2023

Hi @mikhno-s thanks for the details on this requirement. Any chance you can jump on a call sometime this week to discuss it with someone from the engineering team (probably Pablo) and myself? I can set up a zoom anytime that suits you. Cheers !

@mikhno-s
Author

mikhno-s commented Mar 6, 2023

Sure, what is the preferred way to communicate with you?
Usually I am free for a call from 7 AM to 3 PM (UTC), but we need to discuss the details.

@jmcx
Contributor

jmcx commented Mar 6, 2023

Great! We can host a Zoom meeting if that works for you. Some timeslots that could work: Tuesday 2pm UTC, Wednesday 2pm UTC, Thursday 10am or 2pm UTC.

@mikhno-s
Author

mikhno-s commented Mar 6, 2023

Tuesday 2pm UTC works for me.

@jmcx
Contributor

jmcx commented Mar 6, 2023

Great, is there a particular email I should use? Feel free to contact me at jonathan@triggermesh.com

@jmcx
Contributor

jmcx commented Mar 7, 2023

Hi @mikhno-s , I've scheduled a meeting for today and I can just share the Zoom link here in case that works for you: https://triggermesh.zoom.us/j/81980530997?pwd=OWFqMUppc2lxSHRtZUxqcTEvN3lLQT09

Cheers

@jmcx
Contributor

jmcx commented Mar 10, 2023

After talking with the team, I logged this issue to add concurrency control on broker triggers: triggermesh/brokers#127. Feedback welcome.

@jmcx jmcx changed the title Need an mechanism in awssqssource to control parallelism Mar 30, 2023
@shabemdadi

> Thanks for reaching out @mikhno-s
>
> Currently our SQS Source works like this:
>
>   • The number of concurrent receivers is determined by the CPUs available; this is not configurable at the moment.
>   • Each receiver reads a batch from SQS, with a maximum of 10 items per batch.
>   • For each element in the batch we generate a CloudEvent to the Broker/Sink, and only delete it from SQS once we have the ACK.

How does this flow change when using a Kafka broker? More specifically, when do we delete the event from the SQS queue: upon publishing to the Kafka topic, or upon that record receiving a successful response from the sink?

For context, I am interested in the concurrency control outlined in triggermesh/brokers#127, but I saw some activity suggesting that using a Kafka Broker could mimic this control in lieu of that feature being implemented. I am trying to think through the failure scenarios when using a Kafka Broker with an SQS source (e.g., could the Kafka broker go down after messages have been deleted from the SQS queue but before they are processed by the sink?).
