Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove BackgroundService dependency and find a better way to kickstart #18

Closed
linkdotnet opened this issue Apr 14, 2024 · 5 comments
Closed
Labels
help wanted Extra attention is needed Needs Design

Comments

@linkdotnet
Copy link
Member

linkdotnet commented Apr 14, 2024

In its current shape, the NCronJob library uses a BackgroundService to kickstart all the moving parts.
But it is just that—once all CRON jobs are kickstarted, BackgroundService will no longer be needed.

Before v1, a PeriodicTimer was in place to check every now and then what has to be scheduled and executed, but since v1, this has been replaced by a different approach that works better.

The way forward could be to sit on top of IApplicationBuilder (or any other suitable interface), so folks can run the Scheduler like this:

app.RunNCronJob();

The major question would be if one would need a "blocking" version so that RunNCronJob runs and blocks the current thread.
The use case would be a standalone background service whose only purpose is to run CRON jobs.

We could offer two versions:

app.RunNCronJob(); // Non-Blocking for environments like ASP.NET
await app.RunNCronJobAsync(); // Blocking when awaited

Of course other approaches are very welcome.

@linkdotnet linkdotnet added help wanted Extra attention is needed Needs Design labels Apr 14, 2024
@linkdotnet linkdotnet modified the milestone: v2 Apr 14, 2024
@falvarez1
Copy link
Member

falvarez1 commented Apr 16, 2024

I believe that BackgroundService still offers significant benefits, particularly for gracefully terminating jobs when you propagate the cancellation token throughout. Additionally, using IHostedService ensures that jobs are initialized and started early in the application lifecycle. With the introduction of HostOptions.ServicesStartConcurrently, initialization can occur concurrently, which enhances the system's overall efficiency. In this context, it might even be beneficial to consider a dedicated BackgroundService for each schedule.

I also noticed a potential issue in the current implementation where ScheduleJob recursively calls RunAndRescheduleJob, which then calls ScheduleJob again. This pattern might risk a stack overflow or resource leak, potentially leading to scaling challenges due to recursive stack growth. Moreover, because the job execution is not awaited, it complicates error handling and cancellation, hindering potential future retry capabilities.

I have some experience creating orchestration software within distributed systems, I would be happy to contribute further to refining this implementation. If it's agreeable to you, I can prepare a pull request for your review to address these concerns.

@linkdotnet
Copy link
Member Author

My argument against the BackgroundService is that it limits the usage of the library to some extent.
I could also expose the CronScheduler.ExecuteAsync(CancellationToken); method and give the control to the user itself.
Basically, BackgroundService is known for ASP.NET and (Windows) Services (or linux daemons, ...).

I am not strictly in favor of doing this (that is why I dropped my internal urgency to tackle that now), but I wanted to open up the discussion around the topic. To sum it up: What does feel naturally? And that differs if you are using WPF vs ASP.NET Core for example.

I do agree that this approach also has many upsides and that is why, initially, I took that approach (and still stick to it).
So thank you very much for weighing in a very educated opinion.

I also noticed a potential issue in the current implementation where ScheduleJob recursively calls RunAndRescheduleJob, which then calls ScheduleJob again.

That is somewhat by design. There should be no stack overflow, as I only schedule a continuation.
With that, the original function can return, and the stackframe is cleaned up. The new continuation receives its own stack.

It might need some polishing, and await might probably be the better approach here in the long term (as it can lead to GC issues as you mentioned).
The "beauty" of the current approach is that, thanks to the continuation, there isn't much of an overlap between the Scheduler/executor and the job itself. And yes, that comes with drawbacks - one, as you mentioned, is retry logic, which I ruled out as a feature initially, shifting the responsibility to the implementor of the job itself.

I have some experience creating orchestration software within distributed systems, I would be happy to contribute further to refining this implementation. If it's agreeable to you, I can prepare a pull request for your review to address these concerns.

Absolutely - that would be awesome. I do like to discuss the current design and I am very much open to ideas.

@linkdotnet
Copy link
Member Author

@falvarez1 I did some benchmarks and metrics (having a CRON job executing once per second over the course of minutes) and there is a tendency of increased memory (at least in the beginning).

I more leaning towards async/await as you said:

protected override Task ExecuteAsync(CancellationToken stoppingToken)
{
    var tasks = registry.GetAllCronJobs().Select(c => ScheduleJobAsync(c, stoppingToken));
    return Task.WhenAll(tasks);
}

private async Task ScheduleJobAsync(RegistryEntry entry, CancellationToken stoppingToken)
{
    while (!stoppingToken.IsCancellationRequested)
    {
        var now = timeProvider.GetUtcNow().DateTime;
        var runDate = entry.CrontabSchedule!.GetNextOccurrence(now);
        LogNextJobRun(entry.Type, runDate);

        var delay = runDate - now;
        await Task.Delay(delay, timeProvider, stoppingToken).ConfigureAwait(false);
        jobExecutor.RunJob(entry, stoppingToken);
    }
}

Something like this. That should align more with graceful handling when the server shuts down and should be less taxing on the heap.

@falvarez1
Copy link
Member

falvarez1 commented Apr 16, 2024

I see you made some new changes... Unfortunately I made a few changes myself to the same code, please review
https://github.com/linkdotnet/NCronJob/pull/20/files

This makes sure everything is awaited properly, cancellation is working and it supports parallel runs. I've added a new Job called TestCancellationJob to the NCronJobSample. Right now the easiest way to trigger a cancellation is to click Control+C on the console window. This will gracefully cancel each running instance.

@linkdotnet
Copy link
Member Author

Is obsolete with #21

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed Needs Design
Projects
None yet
Development

No branches or pull requests

2 participants