Maximum hangfire job duration for long running jobs #1197

Open
rybama opened this issue Jun 5, 2018 · 19 comments

@rybama

rybama commented Jun 5, 2018

Hello,

In our project we have a few Hangfire jobs that take a long time (possibly more than an hour). We noticed strange behaviour for those jobs: only part of the execution completes (for example, only some of the data is removed instead of all of it). I was wondering whether there is a default timeout on how long a job may execute. If so, could you direct me to the setting where this can be changed (either globally or for a specific job)? I couldn't find it in the docs.

Thanks,
Marcin

@armandombi

What version of Hangfire are you running and what storage option do you use? I keep having a similar issue with version 1.6.19 using MongoDB: my long-running background job keeps getting canceled after approximately 30 minutes. It might be the same issue.

@BrightSoul

BrightSoul commented Aug 7, 2018

I also encountered a similar issue. After exactly 30 minutes, a second thread comes in and starts executing the same job. If I just ignore the cancellation token, the first thread keeps going. I'm using the In-memory storage.

Here's an example. I have a job that's supposed to run for 2 hours. It appends a line of text to a file every 3 seconds.

public void Run(PerformContext context, IJobCancellationToken cancellationToken) {
  var startDate = DateTime.Now;
  var endDate = startDate.AddHours(2);
  while(DateTime.Now < endDate) {
      Thread.Sleep(3000);
      File.AppendAllText("output.txt", $"\r\n{DateTime.Now}\t{Thread.CurrentThread.ManagedThreadId}\t{GetHashCode()}");
  }
}

When it's run, these lines of text will appear in the text file. First column is the timestamp, second is the thread id, third is the hash code of the job instance.

07/08/2018 19:58:25	15	61515057
07/08/2018 19:58:28	15	61515057
07/08/2018 19:58:31	15	61515057
07/08/2018 19:58:34	15	61515057
07/08/2018 19:58:37	15	61515057
07/08/2018 19:58:40	15	61515057
07/08/2018 19:58:43	15	61515057

As I said, after exactly 30 minutes, a second thread executes the same job. I can tell since I see this in the text file.

07/08/2018 20:28:26	15	61515057
07/08/2018 20:28:26	16	49538252
07/08/2018 20:28:29	15	61515057
07/08/2018 20:28:29	16	49538252
07/08/2018 20:28:32	15	61515057
07/08/2018 20:28:32	16	49538252

As you can see, there are now two threads executing two instances of my job (and this is a problem in my case).
How can I solve this?
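
One job-side mitigation, sketched here under assumptions, is to check the cancellation token on every iteration instead of ignoring it, so the first instance stops once Hangfire signals that the job has been aborted. The class name `FileAppendJob` is hypothetical, and whether the token is actually signalled for the first instance may depend on the storage provider and Hangfire version.

```csharp
using System;
using System.IO;
using System.Threading;
using Hangfire;
using Hangfire.Server;

public class FileAppendJob // hypothetical class name, for illustration only
{
    public void Run(PerformContext context, IJobCancellationToken cancellationToken)
    {
        var endDate = DateTime.Now.AddHours(2);
        while (DateTime.Now < endDate)
        {
            // Throws when Hangfire has signalled shutdown or abortion of this job,
            // so this instance exits instead of running next to a second one.
            cancellationToken.ThrowIfCancellationRequested();

            Thread.Sleep(3000);
            File.AppendAllText("output.txt",
                $"\r\n{DateTime.Now}\t{Thread.CurrentThread.ManagedThreadId}\t{GetHashCode()}");
        }
    }
}
```

This only avoids two instances running side by side. The work still starts over on the other worker, so a job that needs more than the timeout would presumably never finish this way.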

@fraser-lowndes

I am experiencing the same issue, but struggling to reproduce it locally. I will try @BrightSoul's reproduction. Here is a screenshot of what I am experiencing (server name removed for confidentiality). You can see sometimes it is not exactly 30 minutes. I believe this is because Hangfire tries to recruit another worker thread but they are all busy processing other jobs and it must wait for one to become free.

[Screenshot: reset-job]

@BrightSoul

I managed to solve my problem with this code (it's an ASP.NET Core application).

services.AddHangfire(config => {
    config.UseMemoryStorage(new MemoryStorageOptions { FetchNextJobTimeout = TimeSpan.FromHours(24) });
});

By setting FetchNextJobTimeout, I can control how long a job can run for before Hangfire starts executing it again on another thread.
I wish this behaviour was configurable with its own Hangfire option instead of relying on the storage provider. I mean, why is the job executed again in the first place? It's obviously running, it doesn't need to be run again.

@Safirion

Same issue with MySqlStorage. Thanks for this post and the information.
This default behaviour is very annoying and just kills the RAM of my server... (I have a task that runs for 3 days... and Hangfire duplicates this task every 30 min until I kill the process... 2 days later... X_x )

@LucasFarley

I have the same problem in Hangfire 1.6.21 with MySqlStorage

@dhnnjy

dhnnjy commented May 2, 2019

I have the same issue in Hangfire 1.7.1 with Hangfire.PostgreSql 1.5.0.
While jobs with a short duration work OK, jobs with a longer duration, say 1 hour or more, start again every 30 minutes.
I'm using 'BackgroundJob.Enqueue(expression)' to 'fire and forget' the jobs. These jobs are not expected to run multiple times.

@Safirion

Safirion commented May 2, 2019

I think the implementations of MySqlStorage and Hangfire.PostgreSql are wrong, because on SqlServer I don't have this issue, without configuring anything.

With MySqlStorage, I finally used the 'InvisibilityTimeout'... even if it's deprecated... it's working :(

@stevendesrochers

stevendesrochers commented May 31, 2019

@dhnnjy In Hangfire.PostgreSql, you have to raise the InvisibilityTimeout (which is 30 minutes by default) if you don't want your job to time out and get picked up by another worker.

services.AddHangfire((isp, config) =>
{
   config.UsePostgreSqlStorage(configuration.HangfirePostgresConnectionString, 
       new PostgreSqlStorageOptions()
       {
           //change this
           InvisibilityTimeout = TimeSpan.FromHours(3) 
       });
});

@engilas

engilas commented Jul 30, 2019

I'm using Hangfire.Mongo, this works for me too

@natalie-o-perret

@stevendesrochers even now? InvisibilityTimeout has supposedly been deprecated for quite a while =/

@stevendesrochers

@ehouarn-perret InvisibilityTimeout has been deprecated in Hangfire.SqlServer because of the new SlidingInvisibilityTimeout property.

From what I can tell, SlidingInvisibilityTimeout hasn't yet been ported to Hangfire.PostgreSql, so the old InvisibilityTimeout is not deprecated there.

Maybe we'll see this in a future release of Hangfire.PostgreSql
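
For comparison, a minimal sketch of what the Hangfire.SqlServer setup might look like with SlidingInvisibilityTimeout. The helper class `HangfireSetup`, the method name, and the 5-minute value are assumptions for illustration. As far as I understand, with the sliding variant the worker keeps renewing its claim while the job runs, so the value does not have to cover the whole job duration.

```csharp
using System;
using Hangfire;
using Hangfire.SqlServer;
using Microsoft.Extensions.DependencyInjection;

public static class HangfireSetup // hypothetical helper class
{
    public static void AddHangfireWithSqlServer(IServiceCollection services, string connectionString)
    {
        services.AddHangfire(config =>
        {
            config.UseSqlServerStorage(connectionString, new SqlServerStorageOptions
            {
                // Illustrative value, not a recommendation from this thread.
                SlidingInvisibilityTimeout = TimeSpan.FromMinutes(5)
            });
        });
    }
}
```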

@akhansari

akhansari commented Sep 5, 2019

@stevendesrochers I'm a little confused.
Why can't we use DisableConcurrentExecution instead of InvisibilityTimeout for long-running jobs?!

@markalanevans

So what about Hangfire Pro with Redis? How can I configure specific jobs so that they are not automatically restarted? Some jobs just take 2 hours to finish.

@zakayohaule

@Safirion wrote:
Same issue with MySqlStorage. Thanks for this post and the information.
This default behaviour is very annoying and just kills the RAM of my server... (I have a task that runs for 3 days... and Hangfire duplicates this task every 30 min until I kill the process... 2 days later... X_x )

Hey man, how did you get MySqlStorage to work in your ASP.NET Core app? Mine fails to create some tables, such as hangfire_state and hangfire_set. Can you help a brother out, please?

@megasuperlexa

megasuperlexa commented Sep 7, 2020

@akhansari wrote:
@stevendesrochers I'm a little confused.
Why can't we use DisableConcurrentExecution instead of InvisibilityTimeout for long-running jobs?!

This is different. InvisibilityTimeout causes Hangfire to cancel and then restart the job. DisableConcurrentExecution just keeps a single job instance running despite attempts to queue more (but it will still be cancelled after InvisibilityTimeout).
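
To make the distinction concrete, here is a minimal sketch of how the attribute is applied. The class name `CleanupJob` and the 3-hour value are hypothetical. The attribute takes a lock around the method, and its argument is how long another invocation waits for that lock. It is not a limit on how long the job may run, and it does not change the storage's InvisibilityTimeout.

```csharp
using Hangfire;

public class CleanupJob // hypothetical job class
{
    // A second invocation of this method waits up to the given number of
    // seconds to acquire the lock held by the one already running.
    [DisableConcurrentExecution(timeoutInSeconds: 3 * 60 * 60)]
    public void Run()
    {
        // Long-running work goes here.
    }
}
```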

@billpeace

Please try using the SkipWhenPreviousJobIsRunningAttribute job filter from the referenced gist:
https://gist.github.com/odinserj/a6ad7ba6686076c9b9b2e03fcf6bf74e

@aradalvand

aradalvand commented Aug 23, 2023

This was an absolute nightmare to debug; I had no idea why some of my jobs were behaving strangely and being re-spawned for no apparent reason.
This makes no sense as a default behavior.

@billpeace

I found that it may be related to the timeout settings of IIS:

[Screenshot]
