
Allow Job Execution to be pinned to a cluster node #344

Closed
zemian opened this issue Feb 13, 2019 · 14 comments
Labels
stale Inactive items that will be automatically closed if not resurrected

Comments

@zemian
Contributor

zemian commented Feb 13, 2019

It would be nice to be able to configure specific jobs to run only on a specific host when running in cluster mode.

Currently, users have to write this logic into their job implementation, skipping the work unless the host name matches some known list.

Another workaround is to avoid the clustered environment altogether and place such jobs into individual standalone schedulers.

@HiranChaudhuri

HiranChaudhuri commented Feb 13, 2019

Thank you for opening this feature request. I'd implement it slightly differently, but this is basically the functionality I need, so +1.

What I'd do is:

  1. assign labels to scheduler instances (every instance can have multiple labels)
  2. specify required, preferred, or not-wanted labels on the triggers/jobs (every trigger or job can have multiple labels)
  3. have Quartz match the labels when choosing a trigger to fire:
  • if the instance has a label that is not-wanted by a trigger/job, ignore that trigger/job
  • if the instance does not have a label that is required by the trigger/job, ignore that trigger/job
  • if the instance does not have a label that is preferred by the trigger/job, wait X seconds to allow one of the preferred nodes to pick up the trigger/job; if that does not happen, still run it

This would allow declaring that a job should run on one host, on a group of hosts, preferably on some hosts, or never on specific hosts. In other words, you can express affinity, anti-affinity, or a bias.
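
A minimal sketch of the matching rules above, assuming hypothetical label sets on the scheduler instance and on the trigger/job; none of these types exist in Quartz, this only illustrates the proposed decision logic:

```java
import java.util.Set;

// Hypothetical label-matching rules from the comment above.
public final class LabelMatcher {

    public enum Decision { RUN, SKIP, DEFER }

    public static Decision decide(Set<String> instanceLabels,
                                  Set<String> required,
                                  Set<String> preferred,
                                  Set<String> notWanted) {
        // Rule 1: the instance carries a not-wanted label -> ignore the trigger/job.
        for (String label : notWanted) {
            if (instanceLabels.contains(label)) {
                return Decision.SKIP;
            }
        }
        // Rule 2: the instance is missing a required label -> ignore the trigger/job.
        if (!instanceLabels.containsAll(required)) {
            return Decision.SKIP;
        }
        // Rule 3: the instance is missing a preferred label -> wait X seconds to
        // give a preferred node a chance to pick it up, then run it anyway.
        if (!instanceLabels.containsAll(preferred)) {
            return Decision.DEFER;
        }
        return Decision.RUN;
    }
}
```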

@HiranChaudhuri

HiranChaudhuri commented Feb 14, 2019

Even better might be to

  1. extend the scheduler with a strategy that decides which trigger/job to pick next (configurable through quartz.properties; see the sketch below), and
  2. implement the above algorithm as one of the strategies.
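
A sketch of what such a pluggable strategy could look like; the interface, method, and quartz.properties key below are all hypothetical and not part of Quartz:

```java
import java.util.List;
import org.quartz.spi.OperableTrigger;

// Hypothetical SPI: a clustered job store could consult this when acquiring
// the next triggers to fire on a given instance. It might be wired up via a
// (hypothetical) quartz.properties entry such as:
//   org.quartz.scheduler.triggerAcquisitionStrategy.class = com.example.LabelMatchingStrategy
public interface TriggerAcquisitionStrategy {

    /** Filter and/or reorder the candidate triggers this instance may acquire. */
    List<OperableTrigger> selectTriggers(String schedulerInstanceId,
                                         List<OperableTrigger> candidates);
}
```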

@zemian
Contributor Author

zemian commented Feb 15, 2019

Hi @HiranChaudhuri, jobs and triggers already have keys, and those keys could serve as the labels. I think what we need is the ability to specify keys that are pinned/bound to a thread pool for processing. That would solve this problem and require minimal changes.
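
A minimal sketch of the key-based pinning idea, assuming a per-node set of pinned job keys loaded from local configuration; the setting itself is hypothetical:

```java
import java.util.Set;
import org.quartz.JobKey;

// Hypothetical per-node configuration: the set of job keys pinned to this
// instance's thread pool. Quartz has no such setting today, and this sketch
// deliberately ignores keys that are pinned to other nodes.
public final class PinnedJobKeys {

    private final Set<JobKey> pinnedHere;

    public PinnedJobKeys(Set<JobKey> pinnedHere) {
        this.pinnedHere = pinnedHere;
    }

    /** A job may run here if nothing is pinned, or if its key is pinned to this node. */
    public boolean mayRunHere(JobKey key) {
        return pinnedHere.isEmpty() || pinnedHere.contains(key);
    }
}
```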

@HiranChaudhuri

> Hi @HiranChaudhuri, jobs and triggers already have keys, and those keys could serve as the labels. I think what we need is the ability to specify keys that are pinned/bound to a thread pool for processing. That would solve this problem and require minimal changes.

Sounds good to me. So I am looking forward to the next release... :-)

@mederly

mederly commented Feb 21, 2019

Hello @HiranChaudhuri. As @zemian says, we have had similar requirements, described in #175; maybe a bit stronger, as our customers also require setting thread limits per job type. The implementation I made seems to have worked quite well in production for almost two years: https://github.com/Evolveum/quartz/tree/quartz-2.3.0.e2 (except for the occasional https://jira.evolveum.com/browse/MID-4558, which looks harmless but should eventually be resolved anyway).

@sajinieKavindya

Hi @zemian. We use the Quartz scheduler and we need the feature requested in #344. Could you let me know the current status of this feature request? If it is going to be included in the next release, I would appreciate a hint about the next release date. =)

@mattiacirioloWS

Hi all, we are interested in this feature too.
We think it might be interesting to add "startHere()" and "startAlwaysOnNode(nodeId)" methods to the "Trigger" interface, to start the execution either on the node on which the job was added to the Scheduler or on the specified node (see the sketch below).
What do you think?

Thanks and best regards.
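
A sketch of the proposed extension; startHere() and startAlwaysOnNode(nodeId) are the hypothetical methods from the comment above and do not exist in Quartz:

```java
import org.quartz.Trigger;

// Hypothetical extension of the Trigger interface, as proposed above.
public interface PinnableTrigger extends Trigger {

    /** Run the job on the node on which it was added to the Scheduler. */
    void startHere();

    /** Always run the job on the given cluster node. */
    void startAlwaysOnNode(String nodeId);
}
```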

@mederly

mederly commented Sep 10, 2019

@mattiacirioloWS I am not quite sure. The reasons are:

  1. The method should be written in the language of execution groups, not nodes -- at least if it has to match the current implementation of the pinning feature. So the method should probably be something like setExecutionGroup(group), as sketched below.
  2. Methods that provide similar functionality currently live in the Scheduler interface, so I think this method belongs there.
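
A sketch of what a Scheduler-level, execution-group-based API could look like; the interface and method names are hypothetical and only loosely inspired by the Evolveum fork mentioned earlier:

```java
import java.util.Set;
import org.quartz.JobKey;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;

// Hypothetical Scheduler-level API phrased in the language of execution groups.
public interface ExecutionGroupAwareScheduler extends Scheduler {

    /** Restrict the given job to nodes belonging to this execution group. */
    void setExecutionGroup(JobKey jobKey, String executionGroup)
            throws SchedulerException;

    /** Declare which execution groups this scheduler instance serves. */
    void setLocalExecutionGroups(Set<String> groups);
}
```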

@HiranChaudhuri

With so many good ideas (some of which contradict each other), would it not make sense to implement a strategy pattern (https://en.wikipedia.org/wiki/Strategy_pattern) and provide some default algorithms? That way everyone could use what is provided or override it for further customization.

@mederly

mederly commented Sep 11, 2019

The Strategy pattern is generally a good idea, although I am not quite sure it is applicable here. The method of "pinning" a job to a given node depends heavily on an appropriate DB representation. The current representation (in the Evolveum-provided work) uses an EXECUTION_GROUP column. So there is not much room for specifying the behavior via the Strategy pattern here.

@mattiacirioloWS

> @mattiacirioloWS I am not quite sure. The reasons are:
>
>   1. The method should be written in the language of execution groups, not nodes -- at least if it has to match the current implementation of the pinning feature. So the method should probably be something like setExecutionGroup(group).
>   2. Methods that provide similar functionality currently live in the Scheduler interface, so I think this method belongs there.

Thanks for your reply, @mederly.
I agree with you on point 1.
As for point 2, the Scheduler is common to all jobs. Setting the execution group on the Scheduler would mean defining a separate scheduler even when we want only a single job to run on one node (or group of nodes) rather than on the whole cluster, wouldn't it?
In my opinion, a good place to define exceptions to the Scheduler's behavior could be the Trigger, which defines the execution strategy for a single job (see the sketch after this comment). What do you think?

Thank you so much for the support 👍
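
One way to express the trigger-level idea with today's API, as a purely application-level convention: store the desired group in the trigger's JobDataMap. The "executionGroup" key and its value below are invented; nothing in stock Quartz reads them, so a custom listener or acquisition strategy would have to:

```java
import org.quartz.CronScheduleBuilder;
import org.quartz.Trigger;
import org.quartz.TriggerBuilder;

public class TriggerGroupExample {

    // Standard Quartz builder API; only the "executionGroup" JobDataMap entry
    // is an invented convention for custom code to interpret.
    static Trigger pinnedTrigger() {
        return TriggerBuilder.newTrigger()
                .withIdentity("nightlyReport", "reports")
                .usingJobData("executionGroup", "node-group-a")
                .withSchedule(CronScheduleBuilder.dailyAtHourAndMinute(2, 0))
                .build();
    }
}
```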

@HiranChaudhuri

> The Strategy pattern is generally a good idea, although I am not quite sure it is applicable here. The method of "pinning" a job to a given node depends heavily on an appropriate DB representation. The current representation (in the Evolveum-provided work) uses an EXECUTION_GROUP column. So there is not much room for specifying the behavior via the Strategy pattern here.

Hmm, you mention the DB representation, which maps directly to the available columns.
What if my algorithm decides based on execution time? What if the existing columns are already sufficient for my logic (see my comment in this thread from Feb 13 about the use of labels)? What if I choose the execution node based purely on data outside the scheduler?

I do not believe every strategy has to have a representation in the scheduler's data model. Leave that to the implementation of the strategy.

@HiranChaudhuri

> As for point 2, the Scheduler is common to all jobs. Setting the execution group on the Scheduler would mean defining a separate scheduler even when we want only a single job to run on one node (or group of nodes) rather than on the whole cluster, wouldn't it?
> In my opinion, a good place to define exceptions to the Scheduler's behavior could be the Trigger, which defines the execution strategy for a single job. What do you think?

I do not mind much whether the logic goes into the Scheduler or the Trigger class. However, I never really understood the difference between JobListener.jobExecutionVetoed and TriggerListener.vetoJobExecution. So far I have assumed that once a job is vetoed, the trigger firing is lost; it will not be picked up by another Scheduler in the same cluster.
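
For reference, TriggerListener.vetoJobExecution is the decision hook: it runs on the node that has already acquired the fired trigger, and returning true cancels just that execution. JobListener.jobExecutionVetoed is only the notification afterwards, and a vetoed firing is indeed consumed rather than handed to another cluster node. A minimal sketch of the host-name workaround described at the top of this issue (the pinned host name is an assumed config value):

```java
import org.quartz.JobExecutionContext;
import org.quartz.Trigger;
import org.quartz.listeners.TriggerListenerSupport;

// Skips execution on any node other than the configured one. Register with:
//   scheduler.getListenerManager().addTriggerListener(new HostPinningTriggerListener("node-1"));
public class HostPinningTriggerListener extends TriggerListenerSupport {

    private final String allowedHost; // assumed to come from local configuration

    public HostPinningTriggerListener(String allowedHost) {
        this.allowedHost = allowedHost;
    }

    @Override
    public String getName() {
        return "HostPinningTriggerListener";
    }

    @Override
    public boolean vetoJobExecution(Trigger trigger, JobExecutionContext context) {
        String localHost;
        try {
            localHost = java.net.InetAddress.getLocalHost().getHostName();
        } catch (java.net.UnknownHostException e) {
            return false; // cannot determine the host; let the job run
        }
        // Veto (skip) the execution when this node is not the pinned host.
        // Note: the firing is consumed; it will not run on another node.
        return !localHost.equals(allowedHost);
    }
}
```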

@stale

stale bot commented Aug 3, 2021

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale bot added the stale label on Aug 3, 2021
stale bot closed this as completed on Aug 10, 2021