Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #21802 - Make recurring tasks more resilient #311

Merged
merged 13 commits into from
Mar 27, 2018

Conversation

adamruzicka
Copy link
Contributor

@adamruzicka adamruzicka commented Jan 17, 2018

@@ -29,6 +29,15 @@ def cancellable?
end

def cancel
if execution_plan && execution_plan.state == :scheduled
Copy link
Member

@iNecas iNecas Jan 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen if, for some reason the cancelling was not successful? Let's consider that the currently planned task would be actually finishing the run and until we would get to the execution_plan!.cancel.any? method, it would already finish, including planning the next round of the action? It might be a corner case, but we should probably count on the worst case scenario here.

It seems that what we are missing in dynflow is ability to hook into the lifecycle of the action to perform some additional logic. I'm thinking of something like

class MyAction < Dynflow::Action
  execution_plan_hooks.on_stop do
     # do something when the execution plan stops
  end

  execution_plan_hooks.on_pause do
     # do something when the execution plan got paused (regardless of what step caused the pause)
  end

  execution_plan_hooks.on_success do
     # do something when the execution plan finished successfully
  end

  execution_plan_hooks.on_fail do
     # do something when the execution plan got stopped with failed state (:error, :warning or :cancel)
  end
end

In the first phase, we would probably not need to make this extendable by action, but at least a way to hook into the lifecycle of the execution plan to things like this would be quite beneficial and would allow to do things like this easier that they are right now, where we need to use things like finalize for it, which is not always the best place to put the logic, especially when dealing with handling exceptional state.

@@ -0,0 +1,34 @@
module Hooks
class TriggerRepeat < ::Dynflow::ExecutionPlan::Hooks::Abstract
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this hook registered?

Copy link
Contributor Author

@adamruzicka adamruzicka Feb 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# Hook to be called when a repetition needs to be triggered. This either happens when the plan goes into planned state
# or when it fails.
def trigger_repeat(kind, execution_plan)
if (kind == :planned || execution_plan.result == :error) && execution_plan.delay_record && recurring_logic_task_group
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the situation that (kind == :planned || execution_plan.result == :error) would not be true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, now that I'm thinking about it it should not happen, unless someone explicitly adds another execution_plan_hooks.use :trigger_repeat line to the action

@iNecas
Copy link
Member

iNecas commented Mar 15, 2018

Tests are failing now. Also, I guess the the bump on dynflow version requirement to v0.8.36 (if I'm looking properly on what version added the hooks) is needed.

No other comments code-wise. @adamruzicka could you work with @jhutar testing/validation of the behavior?

self.started_at = string_to_time(utc_zone, data[:started_at]) unless data[:started_at].nil?
self.ended_at = string_to_time(utc_zone, data[:ended_at]) unless data[:ended_at].nil?
self.start_at = string_to_time(utc_zone, data[:start_at]) if data[:start_at]
self.start_before = string_to_time(utc_zone, data[:start_before]) if data[:start_before]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These things are needed since the dynflow schema normalization went it, maybe it would warrant a separate PR?

@adamruzicka
Copy link
Contributor Author

@jhutar feel free to ping me whenever you're ready to take a look at this

@iNecas
Copy link
Member

iNecas commented Mar 27, 2018

While testing, I've found one issue:

When cancelling a job, the next job is scheduled, but the 'next occurrence' field still stays at the previous value, compare:

rex-delayed-resilience-1
rex-delayed-resilience-2

Otherwise looks good

@iNecas
Copy link
Member

iNecas commented Mar 27, 2018

@iNecas
Copy link
Member

iNecas commented Mar 27, 2018

Tested and works well, waiting for tests to finish

@iNecas
Copy link
Member

iNecas commented Mar 27, 2018

Tests are passing. No need to wait more. Thanks @adamruzicka

@iNecas iNecas merged commit 6d0d961 into theforeman:master Mar 27, 2018
@adamruzicka adamruzicka deleted the fix/21802 branch March 28, 2018 07:58
@jhutar
Copy link

jhutar commented Jun 28, 2018

Direct link to the issue: https://projects.theforeman.org/issues/21802

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants