Event Hub-triggered function *always* checkpoints -- I want to control checkpointing #947
Comments
Bump. Can anyone respond, please? Is this something that's on your roadmap? @alrod @alexkarcher-msft @brettsam Thank you!
Same problem here!
Being tracked here
FYI we have a design proposal out now if you want to review: https://github.com/jeffhollan/retry-design
@jeffhollan the design proposal seems to solve a different problem, not this one. The feature I need is the ability to re-read the latest 10 minutes of messages over and over again, so I don't need checkpointing at all. This is possible with the legacy Microsoft.Azure.EventHubs
Very interesting. Likely worth opening another issue for that. I know we have issues around being able to replay (e.g. move checkpoints to some point in time), but as for doing it over and over again, I'm not sure I've heard that request before, or whether the way we are thinking about moving checkpoints would cover it.
Jeff, I think this is a good solution. Thanks! Really appreciate the effort.
Jeff, would it not be worth adding a circuit-breaker strategy to your proposal? It would be optional and would work like your scenario 3, with one difference. Circuit-breaker scenario: the execution occurs, an exception is thrown, and a retry policy is defined in host.json. The execution is marked as failed. The stream WILL NOT continue on with the checkpoint. The retry policy is honored. After the final retry has happened (if an upper limit was given and the circuit-breaker strategy is defined), the function is stopped and an alert is generated. I'm not sure whether this can be achieved today. The alert could probably be handled outside the function's responsibility.
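To make the scenario concrete, here is a minimal sketch of that flow as a plain Python simulation. It does not use the Functions runtime or any Event Hubs SDK; `CircuitOpen`, `process_with_breaker`, `run`, and the in-memory event list are all hypothetical names made up for illustration, and the delay between retries is left out.

```python
from typing import Callable, List, Tuple


class CircuitOpen(Exception):
    """Hypothetical signal: retries are exhausted, stop processing."""


def process_with_breaker(batch: List[str],
                         handler: Callable[[List[str]], None],
                         max_retries: int) -> None:
    """Run the handler once, then retry up to max_retries times."""
    attempts = 0
    while True:
        try:
            handler(batch)
            return
        except Exception as exc:
            attempts += 1
            if attempts > max_retries:
                raise CircuitOpen(f"gave up after {attempts} attempts") from exc


def run(events: List[str],
        handler: Callable[[List[str]], None],
        batch_size: int,
        max_retries: int) -> Tuple[int, List[str]]:
    """Return (checkpointed offset, alerts) for a simulated partition."""
    checkpoint = 0
    alerts: List[str] = []
    while checkpoint < len(events):
        batch = events[checkpoint:checkpoint + batch_size]
        try:
            process_with_breaker(batch, handler, max_retries)
        except CircuitOpen as exc:
            # The circuit is open: do NOT checkpoint, raise an alert, stop.
            alerts.append(str(exc))
            break
        # Checkpoint only after the batch was processed successfully.
        checkpoint += len(batch)
    return checkpoint, alerts


calls: List[List[str]] = []


def handler(batch: List[str]) -> None:
    calls.append(batch)
    if "poison" in batch:
        raise RuntimeError("simulated processing failure")


checkpoint, alerts = run(["a", "b", "poison", "c"], handler,
                         batch_size=2, max_retries=2)
print(checkpoint, alerts, len(calls))
```

In this run the first batch succeeds and is checkpointed, the second batch is attempted three times (one initial try plus two retries), and the checkpoint stays at offset 2 so the failed batch would be read again after a restart.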
I've already posted this on Azure Advisors but got no response from Microsoft there, so I'm retrying here.
This document (https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-event-hubs) states plainly that at the end of execution of an Event Hub-triggered function, the function will checkpoint whether there was an error or not. (I believe WebJobs have the same behavior.) Unfortunately this doesn't give us enough control. Perhaps there was a throttling error, or some other condition that means the messages can't be processed successfully. In such cases I would like to be able to tell the EventProcessorHost not to checkpoint.
I consider being able to control checkpointing as a must-have feature if you're processing events from Event Hubs (or IoT Hub). Without this control, Event Hub-triggered functions (and WebJobs) are not sufficiently reliable, because it's obvious that you can lose messages in error scenarios. Cloud-native apps are supposed to handle failures gracefully, but that's not the case here.
Just to be completely sure about this, I wrote a function to test this out. It reads in batches of messages and throws an exception at the end. Upon launching the function, it happily reads through all of the messages in the Event Hub, checkpointing every batch. So I'm quite confident that the EventProcessorHost DOES checkpoint even if there is an exception.
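The experiment can be modelled roughly as follows. This is a simplified Python simulation of the behavior I observed next to the behavior I would like, not the actual function or the EventProcessorHost API; `run_partition`, `always_fails`, and the `checkpoint_on_error` switch are hypothetical names used only for illustration.

```python
from typing import Callable, List


def run_partition(events: List[str],
                  handler: Callable[[List[str]], None],
                  batch_size: int,
                  checkpoint_on_error: bool) -> int:
    """Return the checkpointed offset after one pass over a simulated partition."""
    checkpoint = 0
    offset = 0
    while offset < len(events):
        batch = events[offset:offset + batch_size]
        try:
            handler(batch)
            failed = False
        except Exception:
            failed = True
        if failed and not checkpoint_on_error:
            # Desired behavior: leave the checkpoint where it is and stop,
            # so the failed batch can be read again later.
            break
        # Observed behavior: the checkpoint advances even after a failure.
        offset += len(batch)
        checkpoint = offset
    return checkpoint


def always_fails(batch: List[str]) -> None:
    raise RuntimeError("simulated throttling error")


events = [f"event-{i}" for i in range(10)]
observed = run_partition(events, always_fails, batch_size=4,
                         checkpoint_on_error=True)
desired = run_partition(events, always_fails, batch_size=4,
                        checkpoint_on_error=False)
print(observed, desired)
```

With `checkpoint_on_error=True` the checkpoint ends at offset 10 although no batch was processed successfully, which matches what my test function showed. With the switch off, the checkpoint stays at 0 and nothing is lost.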
So is there a way to control checkpointing currently? (And I don't mean messing with checkpoint blobs.) I don't think there is, and if that's truly the case, then is it possible that you could add a feature to tell the EventProcessorHost whether or not to checkpoint?
Thanks