Implement a third-party library for queuing tasks #4622
Comments
I am adding this as preliminary work on Queues (and some support for Artisan): https://github.com/defstat/pkp-lib/tree/laravel-queue-integration
Thanks @defstat! Looks like a lot of tough integration work here. The queuing looks solid and will be a big improvement for us.

It seems like the last big challenge here is figuring out how to process jobs in the queue. If I understand correctly, Laravel's workers will work through the jobs as quickly as possible whenever they detect jobs in the queue. I'm having trouble thinking about how we can do this in a way that works for the existing cron and acron setups.

For those running a cron setup, it should be possible to burn through the queue whenever the cron job is fired. However, our own documentation recommends a cron job to be run daily. For the kind of tasks that we want to queue, we really need the jobs to be executed within a minute or ideally within seconds. Jobs may be delayed when the stack grows large. But we don't want things like emails, XML conversions, etc, to be run once a day. And if a cron job tries to burn through all of the jobs once a day, we're likely to run into the performance problems that the queue is meant to solve.

I guess the solution is to have the cron job initialize Laravel's workers and let them burn through the queue. But I'm still not clear on how this works. If we run the cron job regularly, are we going to end up with a lot of long-running workers that never get shut off? Laravel itself suggests that for the workers to run well, a separate process is needed to monitor the workers and restart them if they stop.

The acron setup seems to have the same difficulties. Either we end up hammering one request with the need to process all jobs, or we use the request to start a worker and end up with lots of workers running.

Maybe all of this is a solved problem. I'm definitely out of my depth on this one. But just trying to figure out how we get there.

cc @asmecher
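One way to picture a middle ground between "one request processes everything" and "workers that never shut off" is a worker run that is started by cron (or a request), processes jobs in order, and stops on its own when the queue is empty or a budget is used up. The following is a minimal, language-neutral sketch in Python, not Laravel or pkp-lib code; the function name `drain_queue` and its parameters are hypothetical. Laravel's own worker command has options in this spirit (for example `--stop-when-empty`), but nothing below depends on them.

```python
import time
from collections import deque


def drain_queue(queue, max_seconds, max_jobs, clock=time.monotonic):
    """Run queued jobs in FIFO order until the queue is empty or a budget is hit.

    `queue` is a deque of zero-argument callables (hypothetical job shape).
    Returns the number of jobs that were run in this invocation.
    """
    started = clock()
    done = 0
    # Stop when there is nothing left, or when the job or time budget is spent.
    while queue and done < max_jobs and clock() - started < max_seconds:
        job = queue.popleft()
        job()
        done += 1
    return done


# Example: five queued jobs, but the first run may process at most two.
results = []
jobs = deque((lambda i=i: results.append(i)) for i in range(5))
first_run = drain_queue(jobs, max_seconds=30, max_jobs=2)
second_run = drain_queue(jobs, max_seconds=30, max_jobs=10)
```

Because each run exits by itself, a frequent cron entry cannot pile up long-running workers, and whatever is left over is picked up by the next run. The trade-off is that latency is bounded by the cron interval.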
Wouldn't this be a case for a single frequent cron job, combined with Scheduling for the large tasks that can run infrequently? That way, the frequency settings for these major tasks would move out of cron and into OJS.
We can recommend people change their cron to be more frequent. However, in practice we are likely to find that a lot of instances are upgraded without making this change.
In that case, something similar to the new version notice could be useful: when OJS detects that more time than recommended has passed since cron last ran, it alerts its users. This would catch both a forgotten change during the upgrade and any other problem with cron.
- Configure global queue manager
- Require minimal Composer packages
- Move __(...) earlier in bootstrap to get ahead of Laravel equivalent
While running an audit on a WordPress site, I learned about this library: https://github.com/woocommerce/action-scheduler/. I don't think we'd opt for it in place of Laravel's task queue, but it would be interesting to study it. It looks like it can handle really large queues and if it is running in WooCommerce, that means it is running in lots of the kinds of limited hosting environments that will make processing the queue tricky for us too (when workers can't be run).
I found this post about how to run the Queue on shared hosts. It's pretty similar to our edge case, where someone doesn't have access to the crontab. https://orobogenius.medium.com/laravel-queue-processing-on-shared-hosting-dedd82d0267a
@henriqueramos, that's useful for hosts that can't run worker threads, but it does still depend on cron access.
It's still a WIP, but the PR for pkp-lib is located here: https://github.com/pkp/pkp-lib/pull/7049/files
Thanks, @henriqueramos, I've had a quick look. A couple of pieces of feedback...
…runScheduledTasks to use the SchedulerBag from ServiceProvider
@asmecher Currently the With this migration to the Also, I've implemented a new Service Provider to be extended, giving us more flexibility to handle new Scheduled Tasks.
I'll defer to @asmecher on this. But in my opinion it's too late in 3.4's dev cycle to be introducing new architectural patterns. At the very least, a new architectural pattern should be accompanied by a clear strategy to refactor a whole module or subsystem. Even if we can't refactor the whole thing in one version, adopting a new architectural pattern needs to be more than a one-off.

If these patterns are good to use, I'd prefer to see a comprehensive proposal made for how we convert all of our CLI tools to the new pattern. Then we can discuss and schedule this work against 3.5 or another version where we have the time to fully consider and adopt the new pattern.

Otherwise, we end up with a codebase full of one-off architectural ideas that were never rolled out across the system. It becomes very hard to understand and maintain them. Each new architectural pattern has its own learning curve, so if we're going to adopt a new one, it needs to replace an old pattern so that we aren't burdening devs with multiple patterns for the same task.
@henriqueramos, I agree with @NateWr. Every new pattern needs to be learned and that represents a cognitive load for the dev team; when a new pattern coexists alongside an older pattern, developers need to understand both patterns, the way they relate to each other, and what the plans are to replace the older one. There are a lot of cases like this in the codebase and we need to be very careful about adding more. So it's incumbent on the developer introducing new patterns to do the homework rather than letting it accumulate as technical debt. In my opinion tying tasks into the I'd recommend keeping your scope focused on the problem at hand -- replacing the scheduler toolset with a 3rd-party implementation -- rather than adding rearchitecting to it. You'll find there are enough complications to consider, e.g. the way plugins register tasks, and keeping it focused will be friendlier to the other devs who are working on major forks at the moment.
@asmecher Having a This is the implementation on pkp-lib and this is the implementation on OJS. And in a plugin, on the plugin's
Also, we could remove the legacy ScheduledTask stuff (like XML parsing) from the entire codebase.
Out of curiosity, what's missing to close this issue? By skimming the code, I'd say it's missing:
Yeah, this should probably be closed in favor of smaller issues for the remaining tasks.
Anyone willing to file the other issues?
It looks like #7171 added a UI to view jobs, but it hasn't yet captured the use case described here: #7171 (comment)
Ok, I've created a couple of new issues: I decided not to create an issue for progress tracking, because we don't yet have a need for it. I'm going to close this issue as done and we can continue working on the other issues.
We could use a way to queue tasks, run them in order, and track their status. This is most needed for the document conversion toolset. But we could also use it wherever we have one or more lengthy tasks and want to queue a follow-up when they're done.
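To make these three requirements concrete (queue tasks, run them in order, track their status, and queue a follow-up on completion), here is a minimal, language-neutral sketch in Python. It is not pkp-lib or Laravel code; the `Job` and `JobQueue` names, the status strings, and the job names are all hypothetical and exist only for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Job:
    name: str
    run: Callable[[], None]
    follow_up: Optional["Job"] = None  # queued only if this job succeeds
    status: str = "pending"            # pending -> running -> done | failed


class JobQueue:
    def __init__(self):
        self._pending = deque()
        self.history = []  # every job ever queued, kept for status reporting

    def push(self, job):
        self._pending.append(job)
        self.history.append(job)

    def work(self):
        """Run jobs in FIFO order until none are left."""
        while self._pending:
            job = self._pending.popleft()
            job.status = "running"
            try:
                job.run()
            except Exception:
                job.status = "failed"  # a failed job never queues its follow-up
                continue
            job.status = "done"
            if job.follow_up is not None:
                self.push(job.follow_up)


log = []


def fail():
    raise RuntimeError("conversion failed")


queue = JobQueue()
notify = Job("notify-author", lambda: log.append("notify"))
queue.push(Job("convert-docx", lambda: log.append("convert"), follow_up=notify))
queue.push(Job("broken", fail, follow_up=Job("never-runs", lambda: log.append("never"))))
queue.work()
statuses = {job.name: job.status for job in queue.history}
```

A real implementation would persist jobs and statuses (for example in the database) so that they survive across requests; the in-memory deque here only shows the ordering and follow-up behaviour.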
Some use cases:
Our ScheduledTask approach does some of this already for regularly scheduled tasks. It seems that most libraries separate queuing from scheduling, so we'll have to do some thinking about how to refactor those tasks. "Queuing" means putting a job into a queue, where it runs once all previous jobs are done. "Scheduling" is about when to add a job to the queue.
Our priorities:
I did a quick search and didn't find many options. I may be searching with the wrong terminology. However, Laravel Queues v5.5 is compatible with PHP 7.0. It handles job queuing but not scheduling. Here is a demo of how to use it stand-alone. Laravel's scheduling component does not appear to have a stand-alone example, so we may have to roll that part ourselves.
On the plus side, it supports a lot of third-party queue systems (Beanstalk, Amazon SQS, Redis) in addition to our db, so if high-volume users really want to offload some of these tasks they can do so.
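The split between queuing and scheduling described above can be sketched as a scheduler that only decides when a job is added to the queue and never runs anything itself. This is a rough Python illustration of a hand-rolled scheduling part, not Laravel's scheduling component; the `Scheduler` class, its methods, and the task names are hypothetical.

```python
from collections import deque


class Scheduler:
    """Decides *when* a job is added to the queue; it never runs jobs itself."""

    def __init__(self, queue):
        self.queue = queue
        self._tasks = []  # each entry: [interval_seconds, next_due, job]

    def every(self, interval_seconds, job, first_due=0):
        self._tasks.append([interval_seconds, first_due, job])

    def tick(self, now):
        """Enqueue every task that is due at time `now`; return how many were added."""
        added = 0
        for task in self._tasks:
            interval, next_due, job = task
            if now >= next_due:
                self.queue.append(job)
                task[1] = now + interval  # next run is one interval from now
                added += 1
        return added


queue = deque()
scheduler = Scheduler(queue)
scheduler.every(60, "send-reminders")        # hypothetical task names
scheduler.every(3600, "update-statistics")

added_at_0 = scheduler.tick(now=0)    # both tasks are due on the first tick
added_at_60 = scheduler.tick(now=60)  # only the 60-second task is due again
queued = list(queue)
```

With this separation, the queue backend can be swapped (db, Redis, and so on) without touching the scheduling logic, since the scheduler only ever appends to the queue.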
Related issues: #2550, #4517.