This repository has been archived by the owner on Jan 25, 2024. It is now read-only.

Schedulers documents are not cleared #136

Open
gmiejski opened this issue Dec 10, 2016 · 6 comments

Comments

@gmiejski

I've come across a bug (or a not-yet-implemented feature?) where scheduler documents are not cleared from Mongo, leading to new entries with each deployment. It seems trivial, but besides the growing collection it makes some things harder to do properly - like monitoring how many active schedulers there are.

I propose two solutions: either make other cluster instances scan the schedulers collection and periodically clean out old entries (or make it configurable), or simply change/add a field "lastCheckingDate" in the scheduler documents (with a Date type) - then one can add a TTL index in Mongo and everything would be fine.

Please tell me if I haven't noticed something important, or whether creating a TTL index for old, inactive cluster instances would break something I'm not aware of (a TTL index of, say, 1 hour would probably not break such things - still, I'm not sure about locks and that kind of stuff).
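
For illustration, a TTL index on such a field could be created with the MongoDB Java driver roughly like this (the field name lastCheckingDate and the one-hour expiry are just the values floated in this proposal, not anything the library defines today):

```java
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.IndexOptions;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

import java.util.concurrent.TimeUnit;

// Sketch: let Mongo expire scheduler documents one hour after their last check-in.
// Assumes lastCheckingDate is stored as a BSON Date (TTL indexes only work on Date fields).
void createSchedulerTtlIndex(MongoCollection<Document> schedulers) {
    schedulers.createIndex(
            Indexes.ascending("lastCheckingDate"),
            new IndexOptions().expireAfter(1L, TimeUnit.HOURS));
}
```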

Please share your thoughts!

@michaelklishin
Owner

A field with last scheduler activity sounds good to me. When would it be set, however?

@gmiejski
Author

I was thinking that this lastCheckingDate could be derived from lastCheckinTime and simply set in SchedulerDao.createUpdateClause().
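
Roughly what I had in mind, as a sketch only - I'm assuming the update clause can be expressed with the driver's Updates helpers, which may not match the actual createUpdateClause() implementation:

```java
import com.mongodb.client.model.Updates;
import org.bson.conversions.Bson;

import java.util.Date;

// Sketch: keep storing the check-in time as a raw timestamp, and additionally
// store it as a BSON Date so a TTL index can expire stale scheduler documents.
Bson createUpdateClause(long lastCheckinTime, long checkinInterval) {
    return Updates.combine(
            Updates.set("lastCheckinTime", lastCheckinTime),
            Updates.set("checkinInterval", checkinInterval),
            Updates.set("lastCheckingDate", new Date(lastCheckinTime)));
}
```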

However, there are two points I have just found out:

  1. When you recover state, you always recover it by the same instanceId - which won't work with AUTO-generated instanceIds in CLUSTERED mode. We should also recover old clusterInstances and their locks, and clear those too.
  2. There seems to be some kind of bug, because I can see a growing number of trigger locks without corresponding triggers - have you come across such a thing?

But I have studied a bit how this is implemented in the original Quartz: it clears old records during check-in, when it also recovers old triggers.

Considering the clustered mode, the simplest solution no longer seems like the best option.
I would go with recovering old cluster states during check-in, as is done in Quartz's JobStoreSupport - what do you think? How about clearing all locks acquired by non-active instances, together with the _schedulers document of each instance that is no longer checking in?
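
A rough sketch of what that check-in cleanup might look like with the MongoDB Java driver (the collection layout and field names here are my assumptions based on this discussion, not necessarily the library's actual schema):

```java
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;

// Sketch: during check-in, remove schedulers that have not checked in for a
// while, along with any locks those defunct instances still hold.
void clearDefunctSchedulers(MongoCollection<Document> schedulers,
                            MongoCollection<Document> locks,
                            long checkinIntervalMillis) {
    long cutoff = System.currentTimeMillis() - 2 * checkinIntervalMillis;
    for (Document scheduler : schedulers.find(Filters.lt("lastCheckinTime", cutoff))) {
        String instanceId = scheduler.getString("instanceId");
        // Release the locks acquired by the defunct instance first...
        locks.deleteMany(Filters.eq("instanceId", instanceId));
        // ...then drop its document from the _schedulers collection.
        schedulers.deleteOne(Filters.eq("_id", scheduler.getObjectId("_id")));
    }
}
```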

@otlg

otlg commented Jan 26, 2021

Hi,

Seems the _schedulers collection is never cleaned up. Might be problematic in k8s for stateless services.
Any plan to solve it?

Thanks

@michaelklishin
Owner


This is open source software. You are welcome to contribute a fix.

@otlg

otlg commented Feb 1, 2021

If there is no plan and there is no other workaround, I can contribute a fix.
Seems strange that the issue has been open for 6 years and there is still no fix for it.

@gmiejski
Author

gmiejski commented Feb 1, 2021

As far as I remember, the issue seemed easy to fix, but during implementation I came across more complicated details, which I've described above. I'm not using this cool library anymore, so I won't be able to help, but I'm keeping my fingers crossed 🤞
