Hangfire Job starts multiple times--Same issue is experienced as #590 #1025
@YukaAn, what storage are you using? Could you include the full details, e.g. the
@odinserj I'm using SQL Server 2008; the Hangfire version is 1.6.6. Below is a screenshot of a job's state info:
Could you show me your configuration logic and the recurring method's signature?
@odinserj Thank you for the quick response! The configuration logic looks like this:

```csharp
public void Configuration(IAppBuilder app)
{
    GlobalConfiguration.Configuration.UseSqlServerStorage("HangfireDb");
    app.UseHangfireDashboard();
    app.UseHangfireServer();
}
```

and the recurring method like this:

```csharp
public void CreateRecurringJob(int hour, int minute, int Id, string Name, string occurence)
{
    try
    {
        if (!MinuteCheck(minute) || !HourCheck(hour) || !CronCheck(occurence))
        {
            return;
        }
        string cron = BuildCron(hour, minute, occurence);
        if (IsExistingOrNewMethod(Id, Name))
        {
            ScheduledJobHandler handler = new ScheduledJobHandler();
            RecurringJob.AddOrUpdate(
                JobNameBuilder(Id, Name),
                () => handler.SendRequest(Id, Name),
                cron,
                TimeZoneInfo.FindSystemTimeZoneById("Eastern Standard Time")
            );
        }
    }
    catch (Exception ex)
    {
        throw new ApiException(ex);
    }
}
```
@odinserj I have around 70 recurring jobs scheduled each day, and this issue keeps happening a couple of times every day (randomly, on different jobs). I'm still waiting for your reply and I appreciate your help. Thanks!
@YukaAn, sorry for the delay. Try upgrading to the latest version. At least Hangfire.Core 1.6.12 has a fix related to a problem like yours:

Looks like there's a transient exception that occurs when your job is completed, and only logging can help investigate the issue in detail. Please see this article to learn how to enable it, and feel free to post your log messages into this thread for further investigation.
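For anyone else who lands here: a minimal sketch of enabling Hangfire's built-in logging. The `UseColouredConsoleLogProvider` extension ships with Hangfire.Core; the rest of the wiring mirrors the configuration already posted above, so treat it as an illustration rather than the exact setup the article describes:

```csharp
public void Configuration(IAppBuilder app)
{
    // Route Hangfire's internal log messages to the console so that
    // state transitions (Enqueued -> Processing -> Succeeded) become visible.
    GlobalConfiguration.Configuration
        .UseColouredConsoleLogProvider()
        .UseSqlServerStorage("HangfireDb");

    app.UseHangfireDashboard();
    app.UseHangfireServer();
}
```

If you use NLog, log4net, or Serilog, the corresponding `UseNLogLogProvider`, `UseLog4NetLogProvider`, or `UseSerilogLogProvider` methods plug Hangfire's messages into your existing logging pipeline instead.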
I am having this same issue on Hangfire 1.6.20 using LiteDBStorage. I have seen several other reports of this issue but no resolution. Are you still using a workaround?
Same issue here. Hangfire 1.6.20 and Hangfire.SQLite 1.4.2.
Hello. I have the same issue. The job starts once for every worker: if I set the server worker count to 2 then it starts 2 times; if I set it to 50 then it starts 50 times. I use the latest Hangfire version (1.6.20) and SQLite for storage. The job is enqueued from the web application; the server is started from another application (a Windows service). Any ideas?
I also tried with the LiteDB storage, same problem. I then tried with the in-memory storage and it works as expected. So it seems it's related to the storage.
Hangfire 1.6.17.0. Tasks are created as:

```csharp
IState state = new EnqueuedState(QueueName.PRIORITY);
```

Sometimes jobs run multiple times. Log from within the task:

```
2018-09-12 11:45:43.7292|INFO|44fa0359-e13c-4356-bdb9-9690df16eda0|Export calculation unit
2018-09-12 12:16:14.7399|INFO|44fa0359-e13c-4356-bdb9-9690df16eda0|Export calculation unit
```

The job accesses the database, so it could get stuck for some time if all DB connections from the pool are taken by other jobs. That is what happened at the beginning, I assume. Then the task was run again, but there were no reports of a retry or an error. After that, the task reported twice about successful completion (as well as about intermediate steps). It feels like the job was retried after some waiting period without the previous execution being canceled.
Although your messages should be idempotent, this should definitely be fixed in Hangfire. Is this issue in Hangfire itself, or in the storage providers?
Hi, I think I found the issue in the Sqlite provider: mobydi/Hangfire.Sqlite#2 (comment). Maybe this issue is something similar for the SQL Server provider as well?
Experiencing the same issue, with LiteDb as storage. As a temporary solution I set WorkerCount to 1.
In my case, a workaround was to set extended intervals for MemoryStorage:

```csharp
MemoryStorageOptions storageOpts = new MemoryStorageOptions()
{
    JobExpirationCheckInterval = TimeSpan.FromMinutes(120),
    FetchNextJobTimeout = TimeSpan.FromMinutes(120)
};
GlobalConfiguration.Configuration.UseMemoryStorage(storageOpts);
```

But that only works as long as a task can be completed within the 2-hour time span. In my case I can be sure that at least most of the tasks will be accomplished.
Hey, it seems as if #1197 is about the same issue. Everybody who runs into this issue might want to check it out.
I'm prototyping with Hangfire and MemoryStorage and seeing my job being executed multiple times. Something as simple as the following:

```csharp
_jobId = _jobClient.Enqueue<MyJobPerformer>(mjp => mjp.Perform(request));

public async Task Perform(RequestBase request)
{
    await Task.Delay(TimeSpan.FromSeconds(5));
    await Task.Delay(TimeSpan.FromSeconds(5));
    await Task.Delay(TimeSpan.FromSeconds(5));
}
```
This is a very big issue. I've seen this with the SQLite and Postgres storage. I haven't seen it with the in-memory provider, likely because it has distributed locking implemented properly. @odinserj, there likely needs to be clearer documentation for storage authors about how distributed locks should be implemented to prevent the same task from being executed multiple times.
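For reference, Hangfire's core already exposes the primitive that storage providers are expected to implement exclusively across servers: `IStorageConnection.AcquireDistributedLock(resource, timeout)` returns an `IDisposable` lock handle. A rough sketch of how application code can lean on it as a guard (the resource name and method are invented for illustration):

```csharp
using System;
using Hangfire;
using Hangfire.Storage;

public class ExclusiveWork
{
    public void Run()
    {
        using (IStorageConnection connection = JobStorage.Current.GetConnection())
        // Throws DistributedLockTimeoutException if another worker holds
        // "my-job-lock" for longer than the timeout.
        using (connection.AcquireDistributedLock("my-job-lock", TimeSpan.FromSeconds(30)))
        {
            // Only one worker across all servers executes this section at a time,
            // provided the storage implements the lock atomically. Storages that
            // implement it per-process (or not at all) are exactly the ones that
            // exhibit the duplicate executions described in this thread.
        }
    }
}
```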
@pauldotknopf |
Hmm, that seems like an easy repro. Considering this issue has been open since 2017, someone (not the maintainers) will likely have to debug/fix/contribute a PR.
```csharp
public class MyJobPerformer
{
    private readonly string _performerId;

    public MyJobPerformer()
    {
        _performerId = Guid.NewGuid().ToString("N");
    }

    public async Task Perform(RequestBase request)
    {
        Console.WriteLine($"{_performerId}: {DateTime.UtcNow}");
        await Task.Delay(TimeSpan.FromSeconds(5));
        Console.WriteLine($"{_performerId}: {DateTime.UtcNow}");
        await Task.Delay(TimeSpan.FromSeconds(5));
        Console.WriteLine($"{_performerId}: {DateTime.UtcNow}");
        await Task.Delay(TimeSpan.FromSeconds(5));
        Console.WriteLine($"{_performerId}: {DateTime.UtcNow}");
    }
}
```

Results: one job enqueue results in multiple worker executions. It's even worse if
```csharp
RecurringJob.AddOrUpdate(recurringJobId, () => EmailReceiveService.SendMail(parametes), cronExpression);
```
Hey @odinserj, any update on this issue?
+1. |
+1. |
Having same issue. OMG!!!!!! |
I use ASP.NET Core 3.1. OMG! I am having this same issue. I observe:

So, putting 2 and 2 together, I felt that the reason the job executed twice was that there are 2 servers. That means I needed to get rid of the second server with default options. I downloaded the HF code and placed breakpoints to analyze the flow. Here are my findings:

After this, I still had 2 servers, but both using my options. After debugging a few times, I found this: the servers are created by both registration calls. Then I checked the documentation, and it did not specify to use both. I removed one of them, and then tested the execution of a job. This time, it executed only once. Lessons learnt:
If you want this issue fixed, you will have to do it yourself. |
Well, I have now placed some checks in my JobWorker to avoid multiple execution requests for the same job ID. This hack helps.
Indeed. As for the job being processed multiple times, it clearly is an IIS misconfiguration. From the screenshot you can see that all the "Processing" states have different server process IDs associated with them, so it appears the application is stopped and restarted periodically. IIS can do this if the site is not configured as "always running". Aside from that, jobs are supposed to be reentrant, so if some code is supposed to be executed only once, it is up to you to track that. Or maybe introduce checkpoints by splitting your job into multiple jobs executed in sequence. Also consider using cancellation tokens, so the job can be terminated gracefully when the server is stopped.
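The cancellation-token suggestion can be sketched like this. `IJobCancellationToken` and its `ThrowIfCancellationRequested()` method are part of the Hangfire 1.x API (the token is injected by Hangfire when the job is performed); the job body and chunk helper are invented for illustration:

```csharp
using Hangfire;

public class LongRunningJob
{
    // Hangfire supplies the token automatically at perform time.
    public void Run(IJobCancellationToken cancellationToken)
    {
        for (var i = 0; i < 1000; i++)
        {
            // Throws JobAbortedException when the server is shutting down,
            // so the job exits gracefully at a safe point instead of being
            // killed mid-step and re-run from the beginning later.
            cancellationToken.ThrowIfCancellationRequested();
            ProcessChunk(i); // hypothetical unit of work
        }
    }

    private void ProcessChunk(int i) { /* ... */ }
}
```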
@pieceofsummer My problem with the Hangfire documentation is that there are no clear sections focused on OWIN vs ASP.NET Core. You may disagree, but that is how I see and read it. Probably I am spoiled by the MSDN docs. So your precise advice is of great help to me.

After reading "Making ASP.NET Core application always running on IIS" and understanding the screenshots, I've configured IIS accordingly. No code changes have been made to the ASP.NET Core app.

I have a simple function to execute, so it cannot be split; that point is probably not for me.

Good point. I execute a console app via the job, and I am happy that it does not get killed. But your point makes sense – abort the job when requested. Thanks again!
I started implementing HangFire with SQLite storage a couple of days ago and ran into the same problem: enqueued jobs were executed as many times as I had workers initialized (20 by default). I found a solution by replacing the SQLite storage with HangFire.LiteDB storage. In the release notes they specifically mention "Fix Hangfire Job starts multiple times", so I thought I'd give it a try. It turns out that they indeed solved the problem; my jobs are finally getting executed only once, so I don't need hacky workarounds anymore. So, unless you really need SQLite as storage, I'd suggest switching to HangFire.LiteDb. Example code:
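The original code sample did not survive the page export. A minimal sketch of what a Hangfire.LiteDB registration typically looks like – the extension method follows the Hangfire.LiteDB package, and the connection string is a placeholder, so treat the details as assumptions:

```csharp
// Assumes the Hangfire.LiteDB NuGet package; "Hangfire.db" is a placeholder file name.
GlobalConfiguration.Configuration.UseLiteDbStorage("Hangfire.db");

// Start a background job server against the LiteDB storage as usual.
using (var server = new BackgroundJobServer())
{
    Console.ReadLine();
}
```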
Cheers!
Same problem for me. I think this depends on server resets: when I recycle the IIS app pool manually, the same jobs are created again when the app pool starts, and every time the app pool gets restarted the duplicated jobs increase more and more.
What storage are you using? If it's a community-based storage, then it's possible that the FetchNextJob method wasn't implemented in an atomic way, making it possible for multiple workers to pick up the same job. Please check the repository of the concrete storage implementation and report the issue there.
I am using Redis Storage like:

```csharp
GlobalConfiguration.Configuration.UseRedisStorage("localhost",
    new Hangfire.Pro.Redis.RedisStorageOptions()
    {
        InvisibilityTimeout = TimeSpan.MaxValue,
        Database = 1,
        Prefix = "hangfire:reclaim:",
    }).UseConsole();
WebApp.Start<MARCO.Reclaim.Core.Startup>(address);
```

and the Startup Configuration method:

```csharp
appBuilder.UseWebApi(config);
appBuilder.UseHangfireDashboard("", new DashboardOptions());
appBuilder.UseHangfireServer(new BackgroundJobServerOptions
{
    ServerName = $"sendbulk",
    WorkerCount = 100,
    Queues = new[] { "sendbulk" }
});
```

The signature of the method and its override is like:

```csharp
public virtual void Do(string title, T order, PerformContext context,
    IJobCancellationToken cancellationToken)
```

override in derived class:

```csharp
[DisplayName("{0}")]
[Queue("sendbulk")]
[AutomaticRetry(Attempts = 5, DelaysInSeconds = new int[] { 60, 60 * 3, 60 * 3 * 3, 60 * 3 * 3 * 3, 60 * 3 * 3 * 3 * 3 })]
public override void Do(string title, SendBulkOrder order, PerformContext context, IJobCancellationToken cancellationToken)
```

and Startup will be called in a WCF single-instance service under an IIS app pool.

Ah, I didn't realize that those jobs are totally different because they have different identifiers. So there's something that creates them, and this something is triggered once the application is restarted, causing duplicates.

Kind regards, Sergey Odinokov,
Founder/developer @ https://www.hangfire.io
Because almost every problem ends up either as "Enqueued jobs stuck" or "Job starts multiple times", and different problems with different storages were reported into the same issue on GitHub. I'm really sorry you're having so much trouble. Please try to run everything with Hangfire.SqlServer, Hangfire.Pro.Redis or Hangfire.InMemory – these storages are supported in this repository, while other storages are supported by the community in their own repositories.
Sorry for my shitty reply. I was having a bad day at work, but I realize that I shouldn't bring my personal problems into these spaces. I'll just delete the old comment; it wasn't appropriate of me. Sorry again.
No solution yet?
I'm experiencing the same issue. Hangfire 1.7.9 with Hangfire.InMemory. A recurring task configured with Cron.Daily(0, 30); we specify our local timezone when invoking RecurringJob.AddOrUpdate. This job is a long-running task which runs until 5 am. It was started as expected at 00:30, and then started a second time at 01:00 the same night while the first instance was still running. I have checked that the process didn't restart between 00:30 and 01:00. Update 16.11.2021: 2021-11-14 23:30:06
This can happen if you have multiple servers (apps) using the same storage. Configure Hangfire to use a different database schema for each application.
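For the SQL Server storage, the schema is set through the `SchemaName` option of `SqlServerStorageOptions`; the schema names below are placeholders:

```csharp
// App 1 keeps its Hangfire tables in their own schema...
GlobalConfiguration.Configuration.UseSqlServerStorage(
    "HangfireDb",
    new SqlServerStorageOptions { SchemaName = "HangfireApp1" });

// ...and App 2 uses a separate one, so the two servers
// never fetch each other's jobs from the shared database.
GlobalConfiguration.Configuration.UseSqlServerStorage(
    "HangfireDb",
    new SqlServerStorageOptions { SchemaName = "HangfireApp2" });
```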
We have reproduced this problem on a single server configured to use the default memory storage.
Do you flush the storage each time you deploy an app change?
Flush the memory storage when the application changes? MemoryStorage is in-process and is cleared when we restart the process after binary updates.
Try using this filter to avoid scheduling a new recurring job execution when the previous one is still running – https://gist.github.com/odinserj/a6ad7ba6686076c9b9b2e03fcf6bf74e.
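A related built-in option is the `DisableConcurrentExecution` filter that ships with Hangfire.Core. It takes a distributed lock for the duration of the job; note that, unlike the skip-if-running filter in the gist above, an overlapping execution waits for the lock (and eventually fails) rather than being skipped. The method and timeout below are illustrative:

```csharp
using Hangfire;

public class DigestJobs
{
    // A second execution waits up to 10 minutes for the distributed lock
    // held by a running execution, then throws instead of running concurrently.
    [DisableConcurrentExecution(timeoutInSeconds: 10 * 60)]
    public void SendDailyDigest()
    {
        // job body
    }
}
```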
@dchrno: Did you fix this issue? How? We are facing a very similar (if not identical) issue with a recurring background job...
Unfortunately, no. We ended up adding a mutex per task as a workaround. Edit: we have a base class containing this code; "key" is the type name of the recurring task.
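The base class code did not survive the page export. A sketch of one way such a per-task mutex guard can look – pure .NET, with all names invented, so this is not the poster's actual code:

```csharp
using System;
using System.Threading;

public abstract class SingleInstanceTask
{
    public void Execute()
    {
        // A named Mutex is machine-wide, so two executions on the same host
        // cannot overlap; "key" is the task's type name, as described above.
        // Note it does NOT protect across multiple servers sharing one storage.
        var key = GetType().Name;
        using (var mutex = new Mutex(initiallyOwned: false, name: $"Global\\recurring-task-{key}"))
        {
            if (!mutex.WaitOne(TimeSpan.Zero))
            {
                return; // previous execution still running – skip this one
            }
            try
            {
                Run();
            }
            finally
            {
                mutex.ReleaseMutex();
            }
        }
    }

    protected abstract void Run(); // the actual task body
}
```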
@dchrno: Thanks for the feedback!
This issue still happens in version 1.7.25 with memory storage 1.4. Edit: I was using Hangfire.MemoryStorage, which is NOT the ideal package (its creators also said it is not good for production purposes). I updated my main package to Hangfire.Core 1.8.5 and started using the Hangfire.InMemory package from @odinserj. But I still had to implement a mutex in C# to handle this multiple-starts scenario.
This is still an issue in version 1.7.28 using MS SQL Server as storage.
This is still an issue with the latest version as of today, paired with PostgreSQL. In our case we have a machine-learning task that can run for days. Those long-running jobs start running again after 48 hours, always at midnight. Even if we fire the job at, let's say, noon two days ago, after two midnights pass it starts again and runs concurrently with its previous incarnation. We tried removing the schedule and firing the jobs from the dashboard by hand; it still behaves the same.
I have gone through all the open issues here and found that the issue I'm experiencing was supposed to be solved in v1.5.8. But I'm running v1.6.6 and still seeing a similar issue: the same job is processed multiple times, randomly. I also saw issue #842 describing the same thing. Can someone help me fix it?

I'm using Hangfire.SqlServer v1.6.6