Add webhook to rerun failed (or terminated) orchestrations #1243

Closed
thdotnet opened this issue Feb 28, 2020 · 7 comments · Fixed by #1545
Labels: Enhancement (Feature requests.)

Comments

@thdotnet

At the moment, Durable Functions exposes a few operations / endpoints (IDurableOrchestrationContext):

  • statusQueryGetUri

  • sendEventPostUri

  • terminatePostUri

  • purgeHistoryDeleteUri

I would like to be able to re-execute a workflow. I'm not the caller of the Starter function, so I cannot supply the same input parameters myself. I think this feature would be especially useful when an activity exceeds its retry count but the underlying problem later goes away and I need to reprocess.

@ghost ghost added the Needs: Triage 🔍 label Feb 28, 2020
@ConnorMcMahon
Contributor

ConnorMcMahon commented Feb 28, 2020

It sounds like you want a retryOrchestration endpoint. I could potentially see the use in that.

@cgillum, thoughts?

EDIT: removed my workaround, as Chris had a far better idea for how to get this behavior today.

@ConnorMcMahon ConnorMcMahon changed the title rerun workflow Add webhook to rerun failed (or terminated) orchestrations Feb 28, 2020
@cgillum
Collaborator

cgillum commented Feb 29, 2020

It's an interesting feature idea for sure. The closest thing we have today is the Rewind API, which is designed to re-run only the most recent logic after a failure occurs. However, this is still in preview because there are a lot of edge cases where it doesn't work.

A restart API is interesting because it's conceptually very simple and would probably be easy to implement. Basically, we just need to query the input from the existing orchestration and then create a start message to restart it.

But given this, could you implement it yourself as well? For example:

[FunctionName("RestartOrchestration")]
public static async Task<HttpResponseMessage> RestartOrchestration(
    [HttpTrigger(AuthorizationLevel.Function, methods: "post", Route = "orchestrations/{instanceId}/restart")] HttpRequestMessage req,
    [DurableClient] IDurableClient client,
    string instanceId)
{
    DurableOrchestrationStatus status = await client.GetStatusAsync(
        instanceId,
        showHistory: false,
        showHistoryOutput: false,
        showInput: true);

    // TODO: Check the runtime status to make sure it's in a restartable state

    await client.StartNewAsync(
        orchestratorFunctionName: status.Name,
        instanceId: status.InstanceId,
        status.Input);

    return client.CreateCheckStatusResponse(req, instanceId);
}
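The TODO in the snippet above could be filled in roughly as follows. This is only a sketch, not the built-in behavior: it assumes the Microsoft.Azure.WebJobs.Extensions.DurableTask 2.x package, the class name RestartFunctions is made up for illustration, and treating only Failed and Terminated as restartable is an assumption taken from the issue title. It also guards against GetStatusAsync returning null, which, as far as I know, is what happens when no instance with that ID exists in the client's task hub.

```csharp
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Azure.WebJobs.Extensions.Http;

// Hypothetical container class; the name is not from the thread.
public static class RestartFunctions
{
    [FunctionName("RestartOrchestration")]
    public static async Task<HttpResponseMessage> RestartOrchestration(
        [HttpTrigger(AuthorizationLevel.Function, "post", Route = "orchestrations/{instanceId}/restart")] HttpRequestMessage req,
        [DurableClient] IDurableClient client,
        string instanceId)
    {
        // Fetch the status including the original input, but skip the history.
        DurableOrchestrationStatus status = await client.GetStatusAsync(
            instanceId,
            showHistory: false,
            showHistoryOutput: false,
            showInput: true);

        // Nothing to restart if the instance is unknown to this task hub.
        if (status == null)
        {
            return new HttpResponseMessage(HttpStatusCode.NotFound)
            {
                Content = new StringContent($"No orchestration instance '{instanceId}' was found.")
            };
        }

        // Assumption: only failed or terminated instances are restartable.
        bool restartable =
            status.RuntimeStatus == OrchestrationRuntimeStatus.Failed ||
            status.RuntimeStatus == OrchestrationRuntimeStatus.Terminated;

        if (!restartable)
        {
            return new HttpResponseMessage(HttpStatusCode.Conflict)
            {
                Content = new StringContent($"Instance '{instanceId}' is in state {status.RuntimeStatus}.")
            };
        }

        // Reuse the original orchestrator name, instance ID, and input.
        await client.StartNewAsync(status.Name, status.InstanceId, status.Input);

        return client.CreateCheckStatusResponse(req, instanceId);
    }
}
```

Returning 404 and 409 here is a design choice for the sketch; a caller could just as well get a 400 or a custom payload.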

@cgillum
Collaborator

cgillum commented Feb 29, 2020

The more I think about this, the more I think we should have this built in. It would be a great tool to help folks recover from problems, including stuck orchestrations, without necessarily needing to wait on support tickets. It would be interesting to see whether or how this might work for sub-orchestrations too.

/cc @anthonychu

@thdotnet
Author

Sorry, I'm late to the discussion... but I can provide more details if needed.

The Rewind API is almost the same idea; however, I would like to be able to reprocess any execution.

@anthonychu
Member

This is a good idea. What's the advantage of restarting the same orchestration, vs starting a new one with the same input copied from the original orchestration? I'm not opposed to restarting but it does feel weird from an event sourcing perspective to delete history and start over.

@thdotnet
Author

thdotnet commented Mar 1, 2020

Both would work. In fact, I think it should start a new one with some kind of traceability.
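A rough sketch of the "start a new one" variant, again assuming the Durable Functions 2.x client API. The function name, class name, route, and the "{original}-restart-{guid}" naming scheme are all hypothetical; the derived ID is just one possible way to keep the new instance traceable to the original without touching the original's history.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Azure.WebJobs.Extensions.Http;

// Hypothetical container class; the name is not from the thread.
public static class RerunFunctions
{
    [FunctionName("RerunAsNewOrchestration")]
    public static async Task<HttpResponseMessage> RerunAsNewOrchestration(
        [HttpTrigger(AuthorizationLevel.Function, "post", Route = "orchestrations/{instanceId}/rerun")] HttpRequestMessage req,
        [DurableClient] IDurableClient client,
        string instanceId)
    {
        // Read the original instance, including its input.
        DurableOrchestrationStatus original = await client.GetStatusAsync(
            instanceId,
            showHistory: false,
            showHistoryOutput: false,
            showInput: true);

        if (original == null)
        {
            return new HttpResponseMessage(HttpStatusCode.NotFound);
        }

        // Assumed naming scheme: embed the original ID in the new one so the
        // two instances can be correlated later.
        string newInstanceId = $"{original.InstanceId}-restart-{Guid.NewGuid():N}";

        // Start a separate instance with the same orchestrator and input;
        // the original instance and its history are left as they are.
        await client.StartNewAsync(original.Name, newInstanceId, original.Input);

        // The response points at the new instance, not the original.
        return client.CreateCheckStatusResponse(req, newInstanceId);
    }
}
```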

@raffi1965

With the RestartOrchestration example above, client.GetStatusAsync always returns null.

6 participants