[BUG] - Panic when scheduler is about to reschedule a daily job #696

mhyllander · 2024-03-19T10:32:55Z

Describe the bug

Two times now a job that runs once daily has caused a panic. When this happens it seems that the job has not been executed. It should update a database and print some logs, but this does not happen. However, the panic occurs on a line that indicates the job has returned from the executor and is about to be rescheduled.

To Reproduce

I don't have a reliable way of reproducing the problem. Most of the time the job runs as expected, but two times a panic has occurred in the gocron code. The panics have occurred two times with 9 days in between, so they happen only sporadically.

Basically, a job is scheduled by this function:

func scheduleTimerJob(jobName, jobsetGen string, power uint, t codec.Time) (gocron.Job, error) {
	j, err := scheduler.NewJob(
		gocron.DailyJob(1, gocron.NewAtTimes(gocron.NewAtTime(t.Hour(), t.Minute(), 0))),
		gocron.NewTask(
			RunTimerJob,
			power,
			jobName,
		),
		gocron.WithName(jobName),
		gocron.WithTags(timerJobCategory, jobsetGen),
	)
	if err != nil {
		return nil, err
	}
	return j, nil
}

The scheduler is started like this:

	// create a scheduler
	scheduler, err = gocron.NewScheduler(
		gocron.WithLogger(slog.Default()),
		gocron.WithLimitConcurrentJobs(1, gocron.LimitModeWait),
	)
	if err != nil {
		return err
	}

	// start the scheduler
	scheduler.Start()

Version

2.2.5

I haven't tried the new 2.2.6, but it can take some time to see if there's any improvement.

Expected behavior

No panic.

Additional context

Stack traces:

Mar 09 23:00:00 piir paninv_controller[3149]: panic: runtime error: invalid memory address or nil pointer dereference
Mar 09 23:00:00 piir paninv_controller[3149]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x6af984]
Mar 09 23:00:00 piir paninv_controller[3149]: goroutine 23 [running]:
Mar 09 23:00:00 piir paninv_controller[3149]: github.com/go-co-op/gocron/v2.(*scheduler).selectExecJobIDsOut(0x400032c000, {0xf5, 0x78, 0xd9, 0x4d, 0x5f, 0xbf, 0x42, 0xe2, 0x99, ...})
Mar 09 23:00:00 piir paninv_controller[3149]:         /home/mhy/go/pkg/mod/github.com/go-co-op/gocron/v2@v2.2.5/scheduler.go:311 +0x154
Mar 09 23:00:00 piir paninv_controller[3149]: github.com/go-co-op/gocron/v2.NewScheduler.func1()
Mar 09 23:00:00 piir paninv_controller[3149]:         /home/mhy/go/pkg/mod/github.com/go-co-op/gocron/v2@v2.2.5/scheduler.go:151 +0x250
Mar 09 23:00:00 piir paninv_controller[3149]: created by github.com/go-co-op/gocron/v2.NewScheduler in goroutine 1
Mar 09 23:00:00 piir paninv_controller[3149]:         /home/mhy/go/pkg/mod/github.com/go-co-op/gocron/v2@v2.2.5/scheduler.go:146 +0x48c

Mar 18 22:00:00 piir paninv_controller[11965]: panic: runtime error: invalid memory address or nil pointer dereference
Mar 18 22:00:00 piir paninv_controller[11965]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x6afcb4]
Mar 18 22:00:00 piir paninv_controller[11965]: goroutine 23 [running]:
Mar 18 22:00:00 piir paninv_controller[11965]: github.com/go-co-op/gocron/v2.(*scheduler).selectExecJobIDsOut(0x40001c0160, {0xbe, 0xe4, 0x35, 0xb3, 0x4e, 0x60, 0x48, 0x89, 0xa4, ...})
Mar 18 22:00:00 piir paninv_controller[11965]:         /home/mhy/go/pkg/mod/github.com/go-co-op/gocron/v2@v2.2.5/scheduler.go:311 +0x154
Mar 18 22:00:00 piir paninv_controller[11965]: github.com/go-co-op/gocron/v2.NewScheduler.func1()
Mar 18 22:00:00 piir paninv_controller[11965]:         /home/mhy/go/pkg/mod/github.com/go-co-op/gocron/v2@v2.2.5/scheduler.go:151 +0x250
Mar 18 22:00:00 piir paninv_controller[11965]: created by github.com/go-co-op/gocron/v2.NewScheduler in goroutine 1
Mar 18 22:00:00 piir paninv_controller[11965]:         /home/mhy/go/pkg/mod/github.com/go-co-op/gocron/v2@v2.2.5/scheduler.go:146 +0x48c

The text was updated successfully, but these errors were encountered:

chih-ming-lin · 2024-03-22T04:08:10Z

I have the same problem in 2.2.6.

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xc6ecae]

goroutine 814426 [running]:
github.com/go-co-op/gocron/v2.(*scheduler).selectExecJobIDsOut(0xc00f528580, {0xb, 0x2f, 0x84, 0x8, 0xf6, 0xdb, 0x42, 0x9d, 0x82, ...})
	github.com/go-co-op/gocron/v2@v2.2.6/scheduler.go:311 +0x16e
github.com/go-co-op/gocron/v2.NewScheduler.func1()
	github.com/go-co-op/gocron/v2@v2.2.6/scheduler.go:151 +0x3a5
created by github.com/go-co-op/gocron/v2.NewScheduler in goroutine 814419
	github.com/go-co-op/gocron/v2@v2.2.6/scheduler.go:146 +0x598

I wrote a case to reproduce it.

func TestScheduler_RemoveJobs(t *testing.T) {
	s := newTestScheduler(t)
	s.Start()

	s.NewJob(
		DurationJob(500*time.Millisecond),
		NewTask(func() {}),
		WithEventListeners(
			BeforeJobRuns(
				func(jobID uuid.UUID, jobName string) {
					s.RemoveByTags("tag1")
				},
			),
		),
		WithTags("tag1"),
	)

	// wait job running
	time.Sleep(10 * time.Second)
}

JohnRoesler · 2024-03-22T18:57:15Z

Hey @mhyllander thanks for reporting. Are you doing any job removal like the example @chih-ming-lin posted? The text does reliably reproduce the issue and I can patch the bug it highlights.

I'm just trying to understand if that is the same issue you are having or not? The issue there is that the job is being removed during the job run itself and i had a map lookup that didn't check if the value was present first.

mhyllander · 2024-03-24T08:53:24Z

@JohnRoesler Thanks for looking into this. The timer job does not remove any jobs itself, but I have other jobs that can remove and add timer jobs. Existing timer jobs are removed using RemoveByTags. However, that happens a few hours before the timer job runs, so when the timer job runs, no other jobs are running and no jobs are being removed.

mhyllander · 2024-03-24T09:13:37Z

You can see my code to reschedule timer jobs here. Due to the way jobs are removed asynchronously, I generate a new unique tag for each "generation" of the timer jobs, so that I can remove the old generation and schedule a new generation. Before using these unique tags, I had problems where the new timer jobs would be removed right after they were created.

mhyllander · 2024-03-25T08:28:31Z

@JohnRoesler It happened again yesterday, using 2.2.6:

Mar 24 20:00:00 piir paninv_controller[22377]: {"time":"2024-03-24T20:00:00.000632435+01:00","level":"INFO","msg":"running settings job","jobName":"arbetsveckor_184 0 20 * * 0-4"}
Mar 24 20:00:00 piir paninv_controller[22377]: {"time":"2024-03-24T20:00:00.002759507+01:00","level":"INFO","msg":"sending config","power":"on","mode":"heat","powerful":"off","quiet":"off","temp":"22","fan":"high","vert":"high","horiz":"auto","ton":"on","tont":"06:20","toff":"on","tofft":"22:00>
Mar 24 20:00:00 piir paninv_controller[22377]: {"time":"2024-03-24T20:00:00.041236965+01:00","level":"INFO","msg":"scheduled timer_on_1 job","jobsetGen":"//timer/timer##3","at":"05:35"}
Mar 24 20:00:00 piir paninv_controller[22377]: {"time":"2024-03-24T20:00:00.0424205+01:00","level":"INFO","msg":"scheduled timer_on_2 job","jobsetGen":"//timer/timer##3","at":"06:20"}
Mar 24 20:00:00 piir paninv_controller[22377]: {"time":"2024-03-24T20:00:00.042693624+01:00","level":"INFO","msg":"scheduled timer_off job","jobsetGen":"//timer/timer##3","at":"22:00"}
Mar 24 22:00:00 piir paninv_controller[22377]: {"time":"2024-03-24T22:00:00.114151786+01:00","level":"INFO","msg":"running timer job","jobName":"timer_off","power":0}
Mar 24 23:00:00 piir paninv_controller[22377]: panic: runtime error: invalid memory address or nil pointer dereference
Mar 24 23:00:00 piir paninv_controller[22377]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x6b0394]
Mar 24 23:00:00 piir paninv_controller[22377]: goroutine 23 [running]:
Mar 24 23:00:00 piir paninv_controller[22377]: github.com/go-co-op/gocron/v2.(*scheduler).selectExecJobIDsOut(0x40001c8000, {0x64, 0x9e, 0xe1, 0xcd, 0xd, 0xb9, 0x41, 0x63, 0xb5, ...})
Mar 24 23:00:00 piir paninv_controller[22377]:         /home/mhy/go/pkg/mod/github.com/go-co-op/gocron/v2@v2.2.6/scheduler.go:311 +0x154
Mar 24 23:00:00 piir paninv_controller[22377]: github.com/go-co-op/gocron/v2.NewScheduler.func1()
Mar 24 23:00:00 piir paninv_controller[22377]:         /home/mhy/go/pkg/mod/github.com/go-co-op/gocron/v2@v2.2.6/scheduler.go:151 +0x250
Mar 24 23:00:00 piir paninv_controller[22377]: created by github.com/go-co-op/gocron/v2.NewScheduler in goroutine 1
Mar 24 23:00:00 piir paninv_controller[22377]:         /home/mhy/go/pkg/mod/github.com/go-co-op/gocron/v2@v2.2.6/scheduler.go:146 +0x48c
Mar 24 23:00:00 piir systemd[1]: paninv_controller.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 24 23:00:00 piir systemd[1]: paninv_controller.service: Failed with result 'exit-code'.

The interesting thing is that at 20:00, the timer_off job at 23:00 was removed, and a new timer_off job was scheduled at 22:00. The new timer_off job ran as it should at 22:00, but then at 23:00, when the timer_off job that was removed would have run, a panic occurred. It seems that the job that was removed was still scheduled to run.

JohnRoesler · 2024-03-25T20:52:28Z

Thanks for confirming. I found and fixed that bug. I'll get it released.

JohnRoesler · 2024-03-25T20:53:35Z

v2.2.7

mhyllander added the bug Something isn't working label Mar 19, 2024

JohnRoesler mentioned this issue Mar 22, 2024

fix case where job removed causes panic when rescheduling #698

Merged

2 tasks

JohnRoesler closed this as completed Mar 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] - Panic when scheduler is about to reschedule a daily job #696

[BUG] - Panic when scheduler is about to reschedule a daily job #696

mhyllander commented Mar 19, 2024

chih-ming-lin commented Mar 22, 2024 •

edited

Loading

JohnRoesler commented Mar 22, 2024

mhyllander commented Mar 24, 2024

mhyllander commented Mar 24, 2024

mhyllander commented Mar 25, 2024

JohnRoesler commented Mar 25, 2024

JohnRoesler commented Mar 25, 2024

[BUG] - Panic when scheduler is about to reschedule a daily job #696

[BUG] - Panic when scheduler is about to reschedule a daily job #696

Comments

mhyllander commented Mar 19, 2024

Describe the bug

To Reproduce

Version

Expected behavior

Additional context

chih-ming-lin commented Mar 22, 2024 • edited Loading

JohnRoesler commented Mar 22, 2024

mhyllander commented Mar 24, 2024

mhyllander commented Mar 24, 2024

mhyllander commented Mar 25, 2024

JohnRoesler commented Mar 25, 2024

JohnRoesler commented Mar 25, 2024

chih-ming-lin commented Mar 22, 2024 •

edited

Loading