Description
Hi all,
We have been stress-testing the workflow system in our application with a deployment that runs 100 iterations of a group of about 5 steps.
- Each iteration is executed sequentially, and every step is an external API call.
- The workflow has 5 steps that call external APIs. One of the steps is long-running (~30 minutes). The 5 steps run in a loop whose iteration count comes from a user-supplied quantity.
- We executed 1 workflow with 100 iterations (5 steps * 100), which is 500 steps in Vercel World. The run has been going for 2+ hours.
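For reference, the shape of the loop described above can be sketched in plain TypeScript. This is only an illustration of the structure and the step count, not our actual code and not the Workflow SDK API: the `Step` type, the placeholder step names, and the `totalSteps` and `runIterations` helpers are all hypothetical.

```typescript
// Hypothetical step signature: each step stands in for one external API call.
type Step = (iteration: number) => Promise<void>;

// Placeholder names only; the five real steps are not named in this issue.
const STEP_NAMES = ["stepA", "stepB", "stepC", "stepD", "stepE"] as const;

// Number of step executions for a given user-supplied quantity.
function totalSteps(
  qty: number,
  stepsPerIteration: number = STEP_NAMES.length
): number {
  return qty * stepsPerIteration;
}

// Runs every step of every iteration strictly in sequence and
// returns how many step executions completed.
async function runIterations(qty: number, steps: Step[]): Promise<number> {
  let executed = 0;
  for (let i = 0; i < qty; i++) {
    for (const step of steps) {
      await step(i); // one external API call per step
      executed++;
    }
  }
  return executed;
}
```

With a quantity of 100, `totalSteps(100)` gives the 500 step executions mentioned above, all awaited one after another inside a single run.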
The run produced 1000 events in the workflow as it went through its iterations. Once it reached 1000 events, it broke completely and hung. The run is still in the in-progress state, but it only produces the following errors in the Vercel logs:
Queue callback error: Error [WorkflowAPIError]: workflow step step_01KH7ATNEGDTGCRZP97F42ZYEZ not found at ignore-listed frames { cause: undefined, status: 404, code: undefined, url: 'https://vercel-workflow.com/api/v1/runs/wrun_01KH7ATFXVM4Z6WWZG57EYJXN7/steps/step_01KH7ATNEGDTGCRZP97F42ZYEZ?remoteRefBehavior=resolve' }
and
Error: Failed to find Server Action "009b8da430e8bc0ffef7dfbbe7a31524830b1a36a2". This request might be from an older or newer deployment. Read more: https://nextjs.org/docs/messages/failed-to-find-server-action at ignore-listed frames
We also noticed an increasing number of "Vercel Runtime Timeout Error: Task timed out after 800 seconds" errors. These do not affect the workflow itself, but they sometimes cause the stream that communicates progress updates to the UI to hang.
This is a real scenario that we would run in production every day. What is the recommended approach for implementing Vercel workflows for such large runs (5 steps of long-running external API calls that need to be done in groups of 100-200 daily)?