-
Notifications
You must be signed in to change notification settings - Fork 263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CallActivityWithRetryAsync ability to impact logging behavior #586
Comments
Thanks for the suggestion. The idea seems reasonable to me. The challenge is that while we can control logging done by the Durable extension, we don't have control over the logging behavior of the Functions runtime. This means someone would have to change both the extension and the host to make this scenario work the way you want. It's probably doable, but I think we'd need to get a good number of votes for this capability or rely on an externally contributed PR since I expect it will be difficult to prioritize. Let me know if you think there is a compromise that could result in a simpler fix. |
Thanks for the fast answer! At which point is the exception being debounced and loggued? Within the activity? Within the proxy layer of extensions? Within functions sdk? Webjob sdk? A mix? |
The layers of a durable function execution look something like this: [Storage] --> [DTFx] --> [WebJobs] --> [Durable Extension] --> [User Function] When there is a failure, both [WebJobs] (which is what I meant by "functions runtime") and [Durable Extension] will write logs to Application Insights. The failure propagates all the way back to [DTFx] (Durable Task Framework) which manages the durable retry - so each retry goes all the way through the entire stack. |
Thanks for the precision. The highest level of user code where the exception surfaces is the body of the run method in the activity. Who gets to call that method? The durable extension right? |
The problem is that the exception needs to flow all the way down the stack to [DTFx] in order for retry to work correctly. The other problem we have is that it is the orchestrator which manages retries, not the activity function, so activity functions actually don't know anything about whether they are being retried. BTW: Someone just pointed out to me that there is actually a simple bug in our function logging where we are logging replayed exceptions to Application Insights (i.e. the orchestrator will re-log the same exception multiple times, once for each replay of a failed activity). Could this be the cause of the log-spam you're seeing? |
Hum I get it, you don't handle the state of the activity in any other way than the exception showing up at the DTFx level to know whether it succeeds or fails and whether or not your should retry. So it's not possible to intercept the exception without breaking the retry mechanisms today. This bug could explain some of the behavior, but even just in the console log, all the activity attempts that are failing are logging errors. When arguably until the last retry it (the orchestration gives up) it should be written at most as a warning (and then again, having the ability to handle that myself as the consumer would definitively solve the issue and avoid any opiniated decisions) |
I came across the very same idea for a bit of a different use case. I sent a request to an external API and then need to poll a status URL every couple of seconds until the result is ready to be fetched.
If there is no way to change the behavior of the exception logging, is there a similar easy way to achieve this with durable functions? I had it before using a while loop etc, but this here would be much cleaner. |
I'm in the same position. I'd like to use an Activity function with exponential backoff to keep trying to send callbacks until some considerable time has passed. Works very well but for the exceptions cluttering up the Azure logging :( |
I am firing a load of requests to a CosmosDB instance and absolutely expect some of those calls to time out. In addition to Which would be the minimal "logLevel" settings for getting rid of those log entries? |
I like this feature request |
I'm having the exact same issue. We're using the Azure DevOps REST API to create a new Azure DevOps project, and we use the CallActivityWithRetryAsync method to poll for the status of the operation. Would be nice if we could prevent our logs from being cluttered with error logs for each call that results in a "pending" status. |
I have a Durable Function that does some renaming on images and then uploads these images to my companies SharePoint Online. The only thing that's not so nice about this solution, is the flood of error logs that is written to ApplicationInsights, so it would be really nice to have a feature like this! |
I have the same question here: https://stackoverflow.com/questions/75790044/retry-in-azure-durable-functions-without-throwing-exceptions and was guided towards this thread. Idea: enrich the log message logged as part of this error with information about the retry attempt count. This will make it possible for us to ignore these errors until the max number of retries have been attempted. |
Adding @lilyjma for visibility |
@cgillum are you working this issue or is there somewhere this is being tracked? Would like to see this enhancement make it through. |
Our team is also very interested in this enhancement and hoping it gets implemented soon. |
We haven't been able to prioritize this, unfortunately. One thing people can try in the meantime is to add additional levels of retry at the activity function level. This unfortunately requires code changes, but I suspect each language will have some libraries available to help with this. It might be possible to leverage the Azure Functions built-in retry policies, but I'm not aware of whether it applies to all trigger types or if (like the documentation seems to suggest) that it only works with specific trigger types. |
+1 on having retry status. |
where do I vote for this change to be prioritised? I'd like to just log the error as information instead of exception during retry, and only see it as an exception after retry timed out. |
Let's suppose that I have an acivity that I expect to fail pretty often because it's relying on a "not so great" API. Now calling that activity I leverage the
CallActivityWithRetryAsync
method to make sure my orchestration is more likely to execute properly.However each failed attempt is logging everything as errors which clutters my telemetry with "exceptions". In that scenario I'd like to be able to provide a callback that handles logging, it'd be up to my code to determine whether or not to log the failed attempts and if yes, at which level.
That callback definition could be on the RetryOptions object, a property like
public Action<Exception, int, bool, ILogger> HandleLogging {get;set;}
(the exception happening, the number of attempts, is it the last attempt or not based on the options, logger to leverage)The text was updated successfully, but these errors were encountered: