[BUG] Duplicate executions when using the default scheduling provider in single-node environment #5600

Closed
MariusVuscanNx opened this issue Jun 13, 2024 · 2 comments

@MariusVuscanNx
Contributor

Description

Configuring the scheduling activities with the default provider (neither Quartz nor Hangfire) in a single-node environment has a few issues: duplicate executions when an existing workflow is updated, and errors thrown when a workflow is unpublished.

Steps to Reproduce

Scenario 1:
Unpublishing a workflow and publishing it again causes duplicate executions. The triggers from the first publication are not removed.

Scenario 2:
Changing the value of the cron expression has the same effect as scenario 1.

Scenario 3:
Unpublishing a workflow stops the timer, but only by accident. The application keeps trying to execute the workflow, and because the workflow is unpublished, it throws this error:

fail: Elsa.Mediator.HostedServices.BackgroundCommandSenderHostedService[0]
      An unhandled exception occured while processing the queue
      System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
       ---> System.AggregateException: One or more errors occurred. (Specified workflow definition and version does not exist)
       ---> System.Exception: Specified workflow definition and version does not exist
         at Elsa.Workflows.Runtime.Services.DefaultWorkflowRuntime.CreateWorkflowHostAsync(String definitionId, StartWorkflowRuntimeParams options, CancellationToken cancellationToken)
         at Elsa.Workflows.Runtime.Services.DefaultWorkflowRuntime.StartWorkflowAsync(String definitionId, StartWorkflowRuntimeParams options)
         at Elsa.Workflows.Runtime.Services.DefaultWorkflowRuntime.TryStartWorkflowAsync(String definitionId, StartWorkflowRuntimeParams options)
         at Elsa.Workflows.Runtime.Handlers.DispatchWorkflowCommandHandler.HandleAsync(DispatchWorkflowDefinitionCommand command, CancellationToken cancellationToken)
         at Elsa.Mediator.CommandStrategies.DefaultStrategy.ExecuteAsync[TResult](CommandStrategyContext context)
         --- End of inner exception stack trace ---
         at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
         at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
         at InvokeStub_Task`1.get_Result(Object, Object, IntPtr*)
         at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr)
         --- End of inner exception stack trace ---
         at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr)
         at System.Reflection.PropertyInfo.GetValue(Object obj)
         at Elsa.Mediator.Middleware.Command.Components.CommandHandlerInvokerMiddleware.InvokeAsync(CommandContext context)
         at Elsa.Mediator.Middleware.Command.CommandPipeline.InvokeAsync(CommandContext context)
         at Elsa.Mediator.Services.DefaultMediator.SendAsync(ICommand command, ICommandStrategy strategy, CancellationToken cancellationToken)
         at Elsa.Mediator.HostedServices.BackgroundCommandSenderHostedService.ReadOutputAsync(Channel`1 output, CancellationToken cancellationToken)

The issues can be reproduced every time.

The startup configuration:
.UseScheduling()

Cron expressions used:

  • */20 * * * * *
  • */10 * * * * *
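
Both expressions have six fields. Assuming the first field is seconds (the report does not state the cron format, so this is an assumption), they should fire every 20 and every 10 seconds. A minimal sketch of how a `*/N` step in that field expands; `CronStepSketch.ExpandStep` is a hypothetical helper written for illustration, not part of Elsa or of any cron library:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not an Elsa or cron-library API.
public static class CronStepSketch
{
    // Expands a "*/N" step over the seconds range 0..59.
    public static IReadOnlyList<int> ExpandStep(string field)
    {
        if (!field.StartsWith("*/", StringComparison.Ordinal))
            throw new ArgumentException("Only the */N form is handled in this sketch.", nameof(field));

        var step = int.Parse(field.Substring(2));
        if (step <= 0)
            throw new ArgumentException("Step must be positive.", nameof(field));

        // Keep every second that is a multiple of the step.
        return Enumerable.Range(0, 60).Where(second => second % step == 0).ToList();
    }
}
```

Under that assumption, `CronStepSketch.ExpandStep("*/20")` gives seconds 0, 20 and 40 of each minute, and `CronStepSketch.ExpandStep("*/10")` gives 0, 10, 20, 30, 40 and 50. A trigger that was not removed would show up as a second execution at those same seconds.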

Expected Behavior

The expected behaviors are:

  • Republishing a workflow does not cause duplicate executions. The previous triggers are removed.
  • When the cron value is updated, the old triggers are removed or overridden by the new ones.
  • When a cron workflow is unpublished, no more instances are created and no errors are thrown.
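
These expectations amount to keeping at most one active timer per workflow definition. A minimal sketch of that bookkeeping, assuming a hypothetical `KeyedScheduler` class keyed by definition ID; it is not Elsa's implementation and says nothing about how the default provider is actually written:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch, not Elsa's scheduler: at most one timer per key.
public sealed class KeyedScheduler : IDisposable
{
    private readonly Dictionary<string, Timer> _timers = new();
    private readonly object _gate = new();

    // Number of keys that currently have a timer.
    public int Count { get { lock (_gate) return _timers.Count; } }

    // Scheduling an existing key replaces the old timer instead of adding a second one.
    public void Schedule(string key, TimeSpan interval, Action callback)
    {
        lock (_gate)
        {
            if (_timers.Remove(key, out var existing))
                existing.Dispose();

            _timers[key] = new Timer(_ => callback(), null, interval, interval);
        }
    }

    // Unscheduling disposes the timer. A callback that was already queued
    // on the thread pool may still run once after this returns.
    public bool Unschedule(string key)
    {
        lock (_gate)
        {
            if (!_timers.Remove(key, out var existing))
                return false;

            existing.Dispose();
            return true;
        }
    }

    public void Dispose()
    {
        lock (_gate)
        {
            foreach (var timer in _timers.Values)
                timer.Dispose();
            _timers.Clear();
        }
    }
}
```

With this shape, calling `Schedule` twice for the same key (a republish or a changed cron value) leaves `Count` at 1, and `Unschedule` on unpublish removes the only timer, so nothing is left to dispatch the missing definition.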

Workaround

The Quartz provider works as expected, without the issues described above.
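
For reference, a rough sketch of what switching providers might look like in `Program.cs`. The `UseQuartz` and `UseQuartzScheduler` names are assumed from the Elsa Quartz integration package and are not taken from this report, so check them against the Elsa version in use:

```csharp
using Elsa.Extensions;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddElsa(elsa =>
{
    // Default provider, as in the report:
    // elsa.UseScheduling();

    // Workaround: Quartz-backed scheduling (method names are assumptions, see above).
    elsa.UseQuartz();
    elsa.UseScheduling(scheduling => scheduling.UseQuartzScheduler());
});

var app = builder.Build();
app.Run();
```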

Environment

  • Elsa version used: 3.2.0-preview.1616
  • MassTransit was disabled.
  • Default runtime
@MariusVuscanNx added the bug (Something isn't working) label Jun 13, 2024

glime-ai bot commented Jun 13, 2024

Based on the provided context, it seems that you are encountering issues with the default scheduling provider in Elsa 3.2.0-preview.1616. Here are the steps you can take to address the issues you've described:

  1. Duplicate Executions on Republishing: When you republish a workflow, the existing triggers should be removed to prevent duplicate executions. This might be a bug in the current version you are using. You should check if there are any open issues on the Elsa GitHub repository or consider reporting this as a new issue.

  2. Updating Cron Value: Similar to the first issue, when you update the cron value, the old triggers should be removed or overridden by the new ones. If this is not happening, it could also be a bug. Again, checking the GitHub issues or reporting a new one would be the recommended course of action.

  3. Errors on Unpublishing a Workflow: If unpublishing a workflow is causing errors due to attempts to execute the workflow, this is likely a bug. Ensure that your workflow cancellation and cleanup logic is correctly implemented. If the issue persists, report it on GitHub.

  4. Workaround: You mentioned that using the Quartz provider works as expected. If the default provider is not meeting your needs due to these issues, consider switching to Quartz.NET or Hangfire as a more stable alternative until the issues with the default provider are resolved.

  5. Environment: Since you are using a preview version of Elsa, it is possible that you may encounter bugs that have not been identified or fixed yet. Always ensure you are using the latest stable version, or if you are using a preview version, keep an eye on the release notes for bug fixes related to your issues.

  6. Mass Transit Disabled: If you have disabled Mass Transit, make sure that any related configuration or dependencies are also correctly adjusted to avoid conflicts or issues with the workflow execution.

  7. Default Runtime: If you are using the default runtime and encountering these issues, consider exploring the distributed runtime options that Elsa offers, such as using Proto.Actor for a distributed, lock-free workflow execution across multiple nodes.

If you need further assistance, you can reach out to the Elsa community or maintainers through their GitHub repository or official communication channels.

This comment was generated by Glime.

@sfmskywalker moved this to Todo in ELSA 3 Jun 17, 2024
@sfmskywalker added this to the Elsa 3.2 milestone Jun 17, 2024
@sfmskywalker moved this from Todo to In Progress in ELSA 3 Jul 18, 2024
@sfmskywalker
Member

Fixed via e5ba9a8

@github-project-automation bot moved this from In Progress to Done in ELSA 3 Jul 18, 2024