feat(dlq): InvalidMessage exception refactored to handle multiple invalid messages #50
Conversation
I am not convinced that we should be trying to solve this problem. The processing step interface is very much designed to operate on a single message at a time. The scenario that you've described seems to be a processing strategy where each …
@lynnagara There are already examples in prod that are doing this type of batching: https://github.com/getsentry/sentry/blob/master/src/sentry/sentry_metrics/multiprocess.py#L225 Was this style of processing ever meant to be supported?
Since there are strategies that work with a batch of messages and we'd like to pass each broken message to the DLQ, I think a single exception with multiple messages might be the easiest way to do so.
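For illustration, a minimal sketch of what a single exception carrying multiple messages could look like; the class name, field, and types here are assumptions for this discussion, not Arroyo's actual API:

```python
from typing import Sequence

class InvalidMessages(Exception):
    """Hypothetical sketch: carries every broken message from a batch
    so a single raise can report them all to the DLQ."""

    def __init__(self, messages: Sequence[object]) -> None:
        super().__init__(f"{len(messages)} invalid message(s)")
        # Keep the broken messages so a handler can dead-letter each one.
        self.messages = messages
```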
Does this mean the batching strategies would be modified to send each message in a batch to a "next step" for processing, so individual messages could throw an exception?
@rahul-kumar-saini Can you share which other strategies you're referring to? In the example @evanh linked, https://github.com/getsentry/sentry/blob/master/src/sentry/sentry_metrics/multiprocess.py#L302 … So why couldn't we have the strategy that is calling this one just call …
Okay, so I think we might be talking about two different things here. I do not want to do anything like: …

The DLQ is meant to be the first processing step and will only receive one message at a time, and it just forwards each one to the next processing step. The point is that at any processing step along the way after the DLQ, we can throw an exception and it should bubble back up to the main try/catch in the DLQ code.

The issue is that some processing steps/strategies (like the BatchingStrategy) will start collecting these messages and process a batch at once in parallel, so how do we handle the case where multiple messages in the batch are broken? I agree that we should only submit one message at a time to the DLQ; however, it should be able to handle an exception thrown with more than one message, in order to handle all of the bad messages in the batch properly.
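As a rough illustration of that flow, here is a hedged sketch assuming the hypothetical `InvalidMessages` exception above; the step and producer names are made up:

```python
class DeadLetterStep:
    """Hypothetical first step: forwards messages one at a time and
    dead-letters any that a downstream step reports as broken."""

    def __init__(self, next_step, dead_letter_producer):
        self.next_step = next_step
        self.producer = dead_letter_producer

    def submit(self, message) -> None:
        try:
            # Messages enter the pipeline one at a time...
            self.next_step.submit(message)
        except InvalidMessages as exc:
            # ...but a batching step further down may fail several at
            # once; only those get dead-lettered, not the whole batch.
            for bad in exc.messages:
                self.producer.produce(bad)
```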
@rahul-kumar-saini I don't think we are talking about different things. I know that you wouldn't explicitly submit to a DLQ strategy; my comment was only talking about what happens in the …

Now, I am not opposed to this strategy that handles a group of messages if we really need it, but I would like us to clarify the real use cases where this is needed first. Is there only that one example, or do you have others in mind?
Right, ok, I misunderstood what you were trying to say. I think the real use case here is to avoid failing an entire batch if a single message is broken. If at any point in the processing of a message (in a batch) we throw an exception, we still want the successful messages to be processed and committed, with only the bad ones being passed to the DLQ. I think if we don't collect the bad ones and throw an exception all at once, the good ones would also fail? Here's another example linked in the original ticket for this task: …
That's the same example that Evan linked above, and the one my comments were already referring to, though. Is the problem specific to this one strategy? If so, should we consider either a) putting this strategy next to it in the Sentry codebase instead of in Arroyo, or b) just updating that strategy so it processes and rejects one message at a time instead of a batch?
I will try to provide some more context, as the actual problematic step is being missed. The relevant use case in Sentry is this one, not the produce step: https://github.com/getsentry/sentry/blob/master/src/sentry/sentry_metrics/multiprocess.py#L481-L488

Now let's take a step back and talk about requirements. I think the approach in this PR may have a flaw: the type of message attached to the exception is not the same type of message the …
This is the part that feels uncanny about this to me. The interface of the strategy exclusively deals with individual messages (that just happen to get executed in a batch... but IMO this detail should stay internal to the strategy). Just a wild idea I'll put out there: could we maybe throw one at a time and have the DLQ strategy make subsequent calls to poll() until it has collected all the …
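A rough sketch of that idea (entirely hypothetical, and assuming the existing single-message `InvalidMessage` exception exposes the failed message):

```python
def drain_failures(next_step):
    """Collect buffered failures one at a time by repeatedly polling
    the next step until it stops raising."""
    collected = []
    while True:
        try:
            next_step.poll()
            return collected  # no buffered failures left
        except InvalidMessage as exc:
            collected.append(exc.message)
```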
Not always, see the examples above. We need to cover both cases: when a step receives a batch, and when it receives individual messages and batches them. This case https://github.com/getsentry/sentry/blob/master/src/sentry/sentry_metrics/multiprocess.py#L481-L488 could potentially be turned into a case where the processing step also takes care of batching, though that seems like a workaround. The streaming pipeline is designed to be able to change the message type from one step to the next, so we should support that in the dead letter queue.
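For the raising side, a hedged sketch of how a batch-processing step might report only its broken messages while letting the good ones through; `flush_batch` and `process_one` are made-up names, not part of either codebase:

```python
def flush_batch(batch, process_one):
    """Process every message in the batch, then raise once with all the
    broken ones so the good ones are still processed and committed."""
    bad = []
    for message in batch:
        try:
            process_one(message)
        except Exception:
            bad.append(message)
    if bad:
        raise InvalidMessages(bad)
```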
My point is just that the concept of a "batch" exists only because a message can be anything you like, including a grouping of messages. But it feels to me quite specific to those processing strategies, and I am not sure if it's reusable and generic enough to live in Arroyo rather than next to the code that is actually creating those batches of that type. Anyway, I am not going to block this PR on this issue, just my $0.02.
Would the processing of the batch not stop entirely as soon as one exception is thrown that goes over, let's say, a count limit?
I've changed the type for …
Overview

… a submit() or poll() call …

Changes

Refactored the InvalidMessage exception to InvalidMessages, with the ability to throw an exception with a sequence of messages instead of just one.

Testing