Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pgAgent "Job no steps to execute!" problem #25

Open
kocersema opened this issue Sep 20, 2022 · 7 comments
Open

pgAgent "Job no steps to execute!" problem #25

kocersema opened this issue Sep 20, 2022 · 7 comments

Comments

@kocersema
Copy link

Hello,
I use pgagent_14-4.2.1.
Jobs run every minute. Sometimes this error occurs but there are steps in pgagent.pga_jobstep.

Problem name: [2 / requestloginsert_minute] Job no steps to execute!

pgagent.pga_joblog :

pgagent_1

In the screenshot, 3 jobs start and end at the same time.

pgagent_2

How can I solve this problem?

Thank you.

@HalacliSeda
Copy link

Hello,
I am also getting the same error.
I use pgagent_14-4.2.1 and postgresql14-14.5.

I have a job that runs every minute.
Sometimes it works fine, but sometimes it gives an error that there is no step.

Here is my job step:

[postgres] # select * from pgagent.pga_jobstep where jstjobid='1';
jstid | jstjobid | jstname | jstdesc | jstenabled | jstkind | jstcode | jstconnstr | jstdbname | jstonerror | jscnextrun
-------+----------+---------+---------+------------+---------+--------------------------------------+------------+-----------+------------+------------
1 | 1 | steps | | t | s | CALL public.myfunction() | | postgres | f |
(1 row)

Here is job log:

[postgres] # select * from pgagent.pga_joblog order by jlgid desc limit 20;
jlgid | jlgjobid | jlgstatus | jlgstart | jlgduration
-------+----------+-----------+-------------------------------+-----------------
40069 | 1 | r | 2022-11-16 13:28:05.793086+02 |
40068 | 1 | s | 2022-11-16 13:27:03.447569+02 | 00:00:00.996283
40067 | 1 | s | 2022-11-16 13:26:04.197752+02 | 00:00:01.021981
40066 | 1 | s | 2022-11-16 13:25:05.454255+02 | 00:00:02.326982
40065 | 1 | s | 2022-11-16 13:24:04.219011+02 | 00:00:00.132577
40064 | 1 | s | 2022-11-16 13:23:02.819775+02 | 00:00:03.620179
40063 | 1 | s | 2022-11-16 13:22:04.076578+02 | 00:00:02.671583
40062 | 1 | s | 2022-11-16 13:21:04.036276+02 | 00:00:03.272034
40061 | 1 | s | 2022-11-16 13:20:02.968261+02 | 00:00:04.061977
40060 | 1 | s | 2022-11-16 13:19:04.412659+02 | 00:00:00.87664
40059 | 1 | s | 2022-11-16 13:18:03.230314+02 | 00:00:00.065278
40058 | 1 | s | 2022-11-16 13:17:04.748409+02 | 00:00:12.235386
40057 | 1 | i | 2022-11-16 13:16:08.327694+02 |
40056 | 1 | s | 2022-11-16 13:16:08.325891+02 | 00:00:00.1198
40055 | 1 | s | 2022-11-16 13:15:01.499926+02 | 00:00:00.087013
40054 | 1 | s | 2022-11-16 13:14:02.947203+02 | 00:00:04.059835
40053 | 1 | i | 2022-11-16 13:13:21.300041+02 |
40052 | 1 | i | 2022-11-16 13:13:21.19951+02 |
40051 | 1 | s | 2022-11-16 13:13:12.603287+02 | 00:00:16.804197
40050 | 1 | s | 2022-11-16 13:12:02.3888+02 | 00:00:01.504216

Could this be a bug?
Has anyone had the same problem and found a solution?

Thanks,
Seda

@akshay-joshi
Copy link
Contributor

Hi @kocersema and @HalacliSeda

I am not able to reproduce the issue on Windows 10, I have created a job that runs every minute and called the procedure as a part of the job step. I have tested two separate code
First:
BEGIN
PERFORM 1;
END;

Second:
BEGIN
PERFORM pg_sleep(80); //Sleep for 80 seconds
END;

In both the above cases, the issue was not reproducible. I have raised the PR 29 which contains some Debug/Warning logs to identify the issue.

@akshay-joshi
Copy link
Contributor

Hi @kocersema and @HalacliSeda

On which platform you are running pgAgent?

@HalacliSeda
Copy link

Hi @akshay-joshi ,
I got this error on Rocky 9. In my case, I run a procedure which is run 5-10 seconds max, and I run this job every minute.

@akshay-joshi
Copy link
Contributor

Hi @HalacliSeda

I have observed one more thing in your pga_joblog table. It seems that for some reason the jobs throwing an error 'No steps to execute' are scheduled at the same time or before a minute. It could be the reason pgAgent ignoring those steps. Please refer to the below screenshot:

pgagent

We have added the warning logs as part of the PR 29 to identify the exact issue.

@HalacliSeda
Copy link

Hi @akshay-joshi ,
yes, you're right. Success message came every time before giving this error. So is it possible to prevent this error that comes after the success message?

Thanks.

@bbourgier
Copy link

Hi
Reporting same behavior.
Success on scheduled time BUT successive logs just after with 'i' status.
This is not really a problem but after looking at the code I think something could be done to avoid those Job Run attempts falling down to 'i' exit status.

What I see is that there is a Job running loop at pgagent.cpp line 102 which relies on "jobnextrun <= now()" to try and run the Job.
But when the Job is run at the scheduled time, during the Job runtime (which can be long), the 'jobnextrun' column is updated ONLY when the Job run is finished. Which means that during the Job runtime, the condition is still fullfilled to try and run the Job in the loop.
For instance I have:
image
As you can see, 'joblastrun' is updated with the scheduled time (this is ok) but the 'jobnextrun' is still set to a time in the past.

And this is why you'll get in the job log table:
1 line with status 'r' while the Job is running
several lines with status 'i' created while the Job is running
And at last the line with status 'r' is updated with status 's' (on success of course)

@akshay-joshi Do you think it would be possible to avoid those 'i' lines by NOT trying and run the Job while it is already running. Either by updating the 'jobnextrun' column at the same moment you update the 'joblastrun' column, or by checking the Job status before trying to run it again?

Thanks in advance and sorry if not very clear in my explanations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants