Skip to content

Commit a89d544

Browse files
authored
fix: ensure jobs do not retry indefinitely by default, fix undefined values in error messages (#9605)
## Fix default retries By default, if no `retries` property has been set, jobs / tasks should not be retried. This was not the case previously, as the `maxRetries` variable was `undefined`, causing jobs to retry endlessly. This PR sets them to `0` by default. Additionally, this fixes some undesirable behavior of the workflow retries property. Workflow retries now act as **maximum**, workflow-level retries. Only tasks that do not have a retry property set will inherit the workflow-level retries. ## Fix error messages Previously, you were able to encounter error messages with undefined values like these: ![CleanShot 2024-11-28 at 15 23 37@2x](https://github.com/user-attachments/assets/81617ca8-11de-4d35-b9bf-cc6c5bc515be) Reason is that it was always using `job.workflowSlug` for the error messages. However, if you queue a task directly, without a workflow, `job.workflowSlug` is undefined and `job.taskSlug` should be used instead. This PR then gets rid of the second undefined value by ensuring that `maxRetries´ is never undefined
1 parent e4c3c5b commit a89d544

File tree

12 files changed

+722
-119
lines changed

12 files changed

+722
-119
lines changed

docs/jobs-queue/tasks.mdx

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,7 @@ Simply add a task to the `jobs.tasks` array in your Payload config. A task consi
3030
| `label` | Define a human-friendly label for this task. |
3131
| `onFail` | Function to be executed if the task fails. |
3232
| `onSuccess` | Function to be executed if the task succeeds. |
33-
| `retries` | Specify the number of times that this step should be retried if it fails. |
33+
| `retries` | Specify the number of times that this step should be retried if it fails. If this is undefined, the task will either inherit the retries from the workflow or have no retries. If this is 0, the task will not be retried. By default, this is undefined. |
3434

3535
The logic for the Task is defined in the `handler` - which can be defined as a function, or a path to a function. The `handler` will run once a worker picks picks up a Job that includes this task.
3636

docs/jobs-queue/workflows.mdx

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22
title: Workflows
33
label: Workflows
44
order: 30
5-
desc: A Task is a distinct function declaration that can be run within Payload's Jobs Queue.
5+
desc: A Task is a distinct function declaration that can be run within Payload's Jobs Queue.
66
keywords: jobs queue, application framework, typescript, node, react, nextjs
77
---
88

@@ -30,6 +30,7 @@ To define a JS-based workflow, simply add a workflow to the `jobs.wokflows` arra
3030
| `interfaceName` | You can use interfaceName to change the name of the interface that is generated for this workflow. By default, this is "Workflow" + the capitalized workflow slug. |
3131
| `label` | Define a human-friendly label for this workflow. |
3232
| `queue` | Optionally, define the queue name that this workflow should be tied to. Defaults to "default". |
33+
| `retries` | You can define `retries` on the workflow level, which will enforce that the workflow can only fail up to that number of retries. If a task does not have retries specified, it will inherit the retry count as specified on the workflow. You can specify `0` as `workflow` retries, which will disregard all `task` retry specifications and fail the entire workflow on any task failure. You can leave `workflow` retries as undefined, in which case, the workflow will respect what each task dictates as their own retry count. By default this is undefined, meaning workflows retries are defined by their tasks |
3334

3435
Example:
3536

packages/payload/src/queues/config/types/taskTypes.ts

Lines changed: 21 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -60,7 +60,14 @@ export type TaskHandlerResults = {
6060
// Helper type to create correct argument type for the function corresponding to each task.
6161
export type RunTaskFunctionArgs<TTaskSlug extends keyof TypedJobs['tasks']> = {
6262
input?: TaskInput<TTaskSlug>
63-
retries?: number | RetryConfig
63+
/**
64+
* Specify the number of times that this task should be retried if it fails for any reason.
65+
* If this is undefined, the task will either inherit the retries from the workflow or have no retries.
66+
* If this is 0, the task will not be retried.
67+
*
68+
* @default By default, tasks are not retried and `retries` is `undefined`.
69+
*/
70+
retries?: number | RetryConfig | undefined
6471
}
6572

6673
export type RunTaskFunction<TTaskSlug extends keyof TypedJobs['tasks']> = (
@@ -76,7 +83,14 @@ export type RunInlineTaskFunction = <TTaskInput extends object, TTaskOutput exte
7683
taskID: string,
7784
taskArgs: {
7885
input?: TTaskInput
79-
retries?: number | RetryConfig
86+
/**
87+
* Specify the number of times that this task should be retried if it fails for any reason.
88+
* If this is undefined, the task will either inherit the retries from the workflow or have no retries.
89+
* If this is 0, the task will not be retried.
90+
*
91+
* @default By default, tasks are not retried and `retries` is `undefined`.
92+
*/
93+
retries?: number | RetryConfig | undefined
8094
// This is the same as TaskHandler, but typed out explicitly in order to improve type inference
8195
task: (args: { input: TTaskInput; job: RunningJob<any>; req: PayloadRequest }) =>
8296
| {
@@ -162,8 +176,12 @@ export type TaskConfig<
162176
outputSchema?: Field[]
163177
/**
164178
* Specify the number of times that this step should be retried if it fails.
179+
* If this is undefined, the task will either inherit the retries from the workflow or have no retries.
180+
* If this is 0, the task will not be retried.
181+
*
182+
* @default By default, tasks are not retried and `retries` is `undefined`.
165183
*/
166-
retries?: number | RetryConfig
184+
retries?: number | RetryConfig | undefined
167185
/**
168186
* Define a slug-based name for this job. This slug needs to be unique among both tasks and workflows.
169187
*/

packages/payload/src/queues/config/types/workflowJSONTypes.ts

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,11 @@ export type WorkflowStep<
1414
* Each task needs to have a unique ID to track its status
1515
*/
1616
id: string
17+
/**
18+
* Specify the number of times that this workflow should be retried if it fails for any reason.
19+
*
20+
* @default By default, workflows are not retried and `retries` is `0`.
21+
*/
1722
retries?: number | RetryConfig
1823
} & (
1924
| {

packages/payload/src/queues/config/types/workflowTypes.ts

Lines changed: 7 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -114,9 +114,14 @@ export type WorkflowConfig<TWorkflowSlugOrInput extends keyof TypedJobs['workflo
114114
*/
115115
queue?: string
116116
/**
117-
* Specify the number of times that this workflow should be retried if it fails for any reason.
117+
* You can define `retries` on the workflow level, which will enforce that the workflow can only fail up to that number of retries. If a task does not have retries specified, it will inherit the retry count as specified on the workflow.
118+
*
119+
* You can specify `0` as `workflow` retries, which will disregard all `task` retry specifications and fail the entire workflow on any task failure.
120+
* You can leave `workflow` retries as undefined, in which case, the workflow will respect what each task dictates as their own retry count.
121+
*
122+
* @default undefined. By default, workflows retries are defined by their tasks
118123
*/
119-
retries?: number | RetryConfig
124+
retries?: number | RetryConfig | undefined
120125
/**
121126
* Define a slug-based name for this job.
122127
*/

packages/payload/src/queues/operations/runJobs/index.ts

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -171,7 +171,8 @@ export const runJobs = async ({
171171
workflowHandler = await importHandlerPath<typeof workflowHandler>(workflowConfig.handler)
172172

173173
if (!workflowHandler) {
174-
const errorMessage = `Can't find runner while importing with the path ${workflowConfig.handler} in job type ${job.workflowSlug}.`
174+
const jobLabel = job.workflowSlug || `Task: ${job.taskSlug}`
175+
const errorMessage = `Can't find runner while importing with the path ${workflowConfig.handler} in job type ${jobLabel}.`
175176
req.payload.logger.error(errorMessage)
176177

177178
await updateJob({

packages/payload/src/queues/operations/runJobs/runJob/getRunTaskFunction.ts

Lines changed: 14 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -107,7 +107,7 @@ export async function handleTaskFailed({
107107
}
108108
}
109109

110-
if (taskStatus && !taskStatus.complete && taskStatus.totalTried >= maxRetries) {
110+
if (!taskStatus?.complete && (taskStatus?.totalTried ?? 0) >= maxRetries) {
111111
state.reachedMaxRetries = true
112112

113113
await updateJob({
@@ -182,9 +182,21 @@ export const getRunTaskFunction = <TIsInline extends boolean>(
182182
throw new Error(`Task ${taskSlug} not found in workflow ${job.workflowSlug}`)
183183
}
184184
}
185-
const maxRetries: number =
185+
let maxRetries: number =
186186
typeof retriesConfig === 'object' ? retriesConfig?.attempts : retriesConfig
187187

188+
if (maxRetries === undefined || maxRetries === null) {
189+
// Inherit retries from workflow config, if they are undefined and the workflow config has retries configured
190+
if (workflowConfig.retries !== undefined && workflowConfig.retries !== null) {
191+
maxRetries =
192+
typeof workflowConfig.retries === 'object'
193+
? workflowConfig.retries.attempts
194+
: workflowConfig.retries
195+
} else {
196+
maxRetries = 0
197+
}
198+
}
199+
188200
const taskStatus: null | SingleTaskStatus<string> = job?.taskStatus?.[taskSlug]
189201
? job.taskStatus[taskSlug][taskID]
190202
: null

packages/payload/src/queues/operations/runJobs/runJob/handleWorkflowError.ts

Lines changed: 26 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -23,13 +23,25 @@ export function handleWorkflowError({
2323
}): {
2424
hasFinalError: boolean
2525
} {
26+
const jobLabel = job.workflowSlug || `Task: ${job.taskSlug}`
27+
2628
let hasFinalError = state.reachedMaxRetries // If any TASK reached max retries, the job has an error
27-
const maxRetries =
28-
typeof workflowConfig.retries === 'object'
29+
const maxWorkflowRetries: number =
30+
(typeof workflowConfig.retries === 'object'
2931
? workflowConfig.retries.attempts
30-
: workflowConfig.retries
32+
: workflowConfig.retries) ?? undefined
33+
34+
if (
35+
maxWorkflowRetries !== undefined &&
36+
maxWorkflowRetries !== null &&
37+
job.totalTried >= maxWorkflowRetries
38+
) {
39+
hasFinalError = true
40+
state.reachedMaxRetries = true
41+
}
42+
3143
// Now let's handle workflow retries
32-
if (!hasFinalError && workflowConfig.retries) {
44+
if (!hasFinalError) {
3345
if (job.waitUntil) {
3446
// Check if waitUntil is in the past
3547
const waitUntil = new Date(job.waitUntil)
@@ -38,26 +50,22 @@ export function handleWorkflowError({
3850
delete job.waitUntil
3951
}
4052
}
41-
if (job.totalTried >= maxRetries) {
42-
state.reachedMaxRetries = true
43-
hasFinalError = true
44-
} else {
45-
// Job will retry. Let's determine when!
46-
const waitUntil: Date = calculateBackoffWaitUntil({
47-
retriesConfig: workflowConfig.retries,
48-
totalTried: job.totalTried ?? 0,
49-
})
5053

51-
// Update job's waitUntil only if this waitUntil is later than the current one
52-
if (!job.waitUntil || waitUntil > new Date(job.waitUntil)) {
53-
job.waitUntil = waitUntil.toISOString()
54-
}
54+
// Job will retry. Let's determine when!
55+
const waitUntil: Date = calculateBackoffWaitUntil({
56+
retriesConfig: workflowConfig.retries,
57+
totalTried: job.totalTried ?? 0,
58+
})
59+
60+
// Update job's waitUntil only if this waitUntil is later than the current one
61+
if (!job.waitUntil || waitUntil > new Date(job.waitUntil)) {
62+
job.waitUntil = waitUntil.toISOString()
5563
}
5664
}
5765

5866
req.payload.logger.error({
5967
err: error,
60-
msg: `Error running job ${job.workflowSlug} ${job.taskSlug} id: ${job.id} attempt ${job.totalTried}/${maxRetries}`,
68+
msg: `Error running job ${jobLabel} id: ${job.id} attempt ${job.totalTried + 1}${maxWorkflowRetries !== undefined ? '/' + (maxWorkflowRetries + 1) : ''}`,
6169
})
6270

6371
return {

0 commit comments

Comments
 (0)