Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revisit and embrace concurrency control, scheduled jobs, and other extensions of ActiveJob #255

Closed
bensheldon opened this issue May 14, 2021 · 7 comments
Projects

Comments

@bensheldon
Copy link
Owner

bensheldon commented May 14, 2021

I attended RailsConf 2021 and in the discussions I had with other ActiveJob adapter maintainers and users, I'm re-considering my prior objections to extending ActiveJob.

  • I now perceive ActiveJob core development to be more conservative than previously/wishfully, and that ActiveJob's interface/capabilities will continue to trail (rather than lead) that of the adapter ecosystem.
  • Other adapters seem reluctant to add differentiating features (like concurrency control) for technical reasons that don't apply to GoodJob.

I also am inspired by Stephen Baker's History of RSpec:

And finally, I regret putting distance between myself and the thing I’d created. I think I could have built more useful tools in the space, and helped more people in bigger ways if I was a bit more accepting of the fact that people were using a thing I created for purposes other than what I intended it for. I am sorry for not helping more people.

I still believe that opinionated product-management is core to successful open-source work (technically and humanely), but I also have heard from people that there are differentiating ActiveJob extensions that would help them if incorporated into GoodJob. I am still fearful that by extending ActiveJob's interface we will make mistakes that will be painful to correct... but optimistically, those mistakes can be ours to make together.

Two clear requested capabilities are:

  1. Concurrency controls: Custom advisory locks to prevent certain jobs from being worked on concurrently? #206, Question about locking internals #212.
  2. Scheduled Jobs: Add the ability to schedule repeating / recurring / cron-like jobs #53, Add the ability to schedule recurring jobs #75, support cronjob style for jobs also #129.

I will re-open a representative issue for further discussion and design of those capabilities.

Please use this Issue/thread if you have thoughts about this decision, or if there are other capabilities/features that you think would be helpful additions to GoodJob in light of this change in approach.

Thank you for being on this journey with me and GoodJob!

@reczy
Copy link
Contributor

reczy commented May 14, 2021

@bensheldon - probably not much of a surprise, but I am 100% in favor of this revised approach, particularly in light of your recent learnings at RailsConf. I have been consistently impressed with the way you're managing this gem (responsiveness to issues, thoughtfulness of approaching suggestions, regularity of updates, etc) - thank you for your continued hard work!

Coincidentally, scheduled jobs is probably my second-most desired feature from this gem (after concurrency control, which was my own request), so I'm extremely happy to see these two features specifically mentioned here. If you asked me to mention a third, it would probably be related to the web ui functionality. I know it has come up before and that you opined on various suggestions (#220, #50), but is this something that you would reconsider in light of this new approach? For clarity, I am specifically referring to the web ui providing administrative functionality (manually retry a job, delete a job that is queued for first run or retry, etc).

Separately, as part of this broader discussion, I wanted to bring up something you mentioned in an earlier issue #220 (comment):

I'm of mind that any part of GoodJob I'm developing be part of a monolithic gem. Obviously I can't control if someone wants to ship their own add-on gem, but I'd encourage folks to contribute improvements to core.

In light of your new approach, is this still the case? Monolith or not, though, perhaps it's worth thinking about developing a set of job and worker hooks that GoodJob differentiating features (ie, non-core ActiveJob) can use. See https://github.com/resque/resque/blob/master/docs/HOOKS.md, for example.

Potential pros of implementing differentiating features through hooks:

  • It would be easier to modularize those features and keep them self-contained (for example, each differentiating feature could be in its own file or folder in a GoodJob plugins directory). I suppose some differentiating features might still need to be intermixed with core GoodJob code
  • It might be easier to adapt certain differentiating features to ActiveJob updates
  • It might make deprecation a bit easier if ActiveJob implements an identical or competing feature
  • It would likely make the process of creating plugins easier and more approachable for the community. I tend to think this might be the most convincing pro
  • It might allow easier "ownership" of certain plugins by community contributors (not saying that you give them commit rights, but rather that it makes it easier for someone to continue making siloed contributions to a particular plugin that they developed or that's important to them)
  • It would make it easier for developers to continue adding functionality on top of GoodJob even during times when you lack capacity to review and merge PRs (you're on vacation, you're too busy, etc.) without having to monkey patch GoodJob

Other considerations:

  • Developers may be more likely to develop plugins as separate gems. That's certainly a con if you generally want to maintain GoodJob as a monolith. I'm not sure what the right approach would be here, but I suppose you could offset the addition of hooks with a statement from the outset noting your preference???
  • Companies may be more likely to develop proprietary / closed source plugins on top of GoodJob. You may view this as positive or negative, but either way, I think the ability to do so could only increase GoodJob adoption

I think in the course of writing this I've convinced myself to be pro-hooks, but this is not my gem :). What are your thoughts?

One additional thought I had is what's the best way to save data that's necessary for the implementation of various plugins? For example, let's say for concurrency GoodJob needed to save the text to be md5 hashed for postgres locks. My initial proposal is that you create one additional jsonb column dedicated to any and all plugin functionality. From there, each plugin could potentially informally claim a particular key prefix (if you want to keep the jsonb object flat) or a particular first-level key (if you think nested objects are a better approach).

So, for example, the concurrency plugin might lay claim to the key prefix concurrency_ or the first-level key concurrency as in {concurrency_hashable_text: "account_1"} or {concurrency: {hashable_text: "account_1"}}. With either scheme, you could reserve a particular namespace for yourself, such as gj.

While you lose certain benefits from using separate columns, with this approach, existing users only need to perform a single migration to enable any number of plugins going forward.

@reczy
Copy link
Contributor

reczy commented May 14, 2021

Since you asked, one other feature I will throw out there is the ability to create and manage a pool of GoodJob processes - ie, add processes instead of threads. Previously with resque I used https://github.com/nevans/resque-pool

As a potentially helpful frame of reference, resque-pool looks like it's a top 5 downloaded resque plugin outside of scheduling and retrying.

This would make GoodJob more flexible with applications where thread-safety is a concern.

@bensheldon
Copy link
Owner Author

@reczy thank you for the kind words! And the suggestions are great:

I agree about the hooks. GoodJob is already instrumented with ActiveSupport Notifications, but I'm not confident it's done well, nor how to document it, nor what kind of SemVer contracts are reasonable. Thank you for sharing the Resque docs. I feel a little within a chicken-and-egg spot right now: I want GoodJob to be an appealing target for someone writing an extension AND I don't want to slow down GoodJob's development with a speculative interface. Which is why, right now, I think being a friendly and responsive maintainer is the appropriate interface 😄

I like your idea about providing an extendable data-model. I wonder if it would be possible to hook into the existing jsonb serialized_params column. I do wonder about indexes (for example, concurrency control will need indexes). GoodJob hasn't made any database-migration-requiring updates yet and I am somewhat challenged (good challenged) about how to effectively communicate and create tooling for that.

Overall, I do still lean into the idea of a monolithic gem. I think that's the best way to achieve a top-level goal of Rails compatibility. And colored by my own not great experience of managing DelayedJob and Que plugins and their dependencies.

I appreciate you contributing to GoodJob! Thank you.

@reczy
Copy link
Contributor

reczy commented May 15, 2021

I like your idea about providing an extendable data-model. I wonder if it would be possible to hook into the existing jsonb serialized_params column. I do wonder about indexes (for example, concurrency control will need indexes).

I'm glad you wrote that. Having slept on it, I separately came to the same conclusion this morning. I would guess that using something within a jsonb column for concurrency (even with a good GIN index) would likely be slower than having a separate column dedicated to the lock key. Worse, it would probably make for some really hard-to-understand code. We can bring the conversation over to the concurrency issue, but briefly, saving the key in an indexed bigint column, for example, at job creation could be a potential solution and would mean no more md5 stuff in any of the sql in Lockable. I guess my point is that there's probably not a one-size-fits-all solution for extension data storage like I had proposed above.

GoodJob hasn't made any database-migration-requiring updates yet and I am somewhat challenged (good challenged) about how to effectively communicate and create tooling for that.

I don't have anything specific to suggest with respect to this, but perhaps looking at the way devise handles migrations and the user table can provide some comfort? They have their core columns that they use by default, then if you want to add say Confirmable, you either uncomment the relevant lines on your first devise migration or you can add those columns in at a later time with a later migration. Since devise is so ubiquitous within the rails ecosystem, I think the vast majority of rails developers would at least be familiar (if not comfortable) with that practice. You could always perfect the tooling over time, in reaction to the issues folks are having with migrations.

Overall, I do still lean into the idea of a monolithic gem. I think that's the best way to achieve a top-level goal of Rails compatibility. And colored by my own not great experience of managing DelayedJob and Que plugins and their dependencies.

I excluded my opinion from my original response above, but I agree with you.

@leehericks
Copy link

I just found GoodJob and am reading up on it. Don't you think it's worth pushing the Rails team to improve ActiveJob rather than implement outside of the standard? Did someone at RailsConf say they won't change it or is it just that no one has stepped up to do the work?

@bensheldon
Copy link
Owner Author

bensheldon commented Jun 8, 2021

@leehericks Thanks for the interest!! This is where my head is still at:

  • I now perceive ActiveJob core development to be more conservative than previously/wishfully, and that ActiveJob's interface/capabilities will continue to trail (rather than lead) that of the adapter ecosystem.
  • Other adapters seem reluctant to add differentiating features (like concurrency control) for technical reasons that don't apply to GoodJob.

That's still my perception of the situation; no one has explicitly said "Hell no, strong block".

I'm focused on GoodJob, supporting proposals on Rails (like rails/rails#40337), and trying to drive discussion where I can (like https://discuss.rubyonrails.org/t/future-of-first-party-first-generation-activejob-adapters/75931).

If you're looking for somewhere to make contributions, I think any of these areas could use support and cultivation.

@bensheldon
Copy link
Owner Author

Closing this, but please be welcome to continue discussing this.

Backlog automation moved this from Inbox to Done Aug 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Backlog
  
Done
Development

No branches or pull requests

3 participants