Random crashes due to possible bug in WaitTracker.ts #6624

Closed
nmf2 opened this issue Jul 7, 2023 · 19 comments

@nmf2

nmf2 commented Jul 7, 2023

Describe the bug
Our self-hosted main n8n instance randomly crashes with the following error trace:

2023-07-07T18:24:33.059Z | debug    | Wait tracker found 2 executions. Setting timer for IDs: 4692368, 4692374 "{ file: 'WaitTracker.js', function: 'getWaitingExecutions' }"

/usr/local/lib/node_modules/n8n/dist/WaitTracker.js:82
                const triggerTime = execution.waitTill.getTime() - new Date().getTime();
                                                       ^
TypeError: execution.waitTill.getTime is not a function
    at WaitTracker.getWaitingExecutions (/usr/local/lib/node_modules/n8n/dist/WaitTracker.js:82:56)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)

To Reproduce
Steps to reproduce the behavior:
No idea yet... it crashes randomly with that error above.

Environment (please complete the following information):

  • OS: Using official n8n docker image n8nio/n8n:0.233.1
  • n8n Version: Using official n8n docker image n8nio/n8n:0.233.1
  • Node.js Version: Using official n8n docker image n8nio/n8n:0.233.1
  • Database system: Postgres
  • Operation mode: queue, 3 workers, 1 main instance.

Additional context
I tracked the error to this line of n8n's code:

const triggerTime = execution.waitTill!.getTime() - new Date().getTime();

I'm not sure what else to provide you with to help solve the issue so feel free to ask!

@netroy
Member

netroy commented Jul 7, 2023

@nmf2 Do you have (or could you please create) a small workflow to reproduce this?
That'd help a lot with fixing this quickly.

@nmf2
Author

nmf2 commented Jul 7, 2023

@netroy, thanks for the quick response!

Honestly, I wish I could provide a workflow, but I don't even know how to reproduce this. It happens randomly and isn't tied to the executions of a specific workflow. I do think it has to do with the Wait node, but only because of where the error happens in the code...

[Edit]:
Given how the following code is written, the waitTill property should always have a getTime() function, but that is not what happens in practice:

const triggerTime = execution.waitTill!.getTime() - new Date().getTime();
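
For context on why this line can still throw: the `!` after `waitTill` is a TypeScript non-null assertion, which is erased at compile time and performs no runtime check. The sketch below assumes, purely for illustration, that the database layer returned `waitTill` as a string while the value is typed as a `Date`. The `ExecutionRow` type and the `rowFromDb` value are hypothetical and are not n8n code.

```typescript
// Hypothetical row shape; the real n8n entity has more fields.
interface ExecutionRow {
  id: string;
  waitTill: Date | null;
}

// Assumption: the driver handed back a raw string, but the value is typed as a Date.
const rowFromDb = {
  id: '4692368',
  waitTill: '2023-07-07 18:25:00',
} as unknown as ExecutionRow;

let message = '';
try {
  // The `!` only silences the compiler; nothing is verified at runtime.
  const triggerTime = rowFromDb.waitTill!.getTime() - Date.now();
  message = `timer would fire in ${triggerTime} ms`;
} catch (error) {
  // A string has no getTime method, so the call above throws a TypeError.
  message = error instanceof Error ? error.message : String(error);
}

console.log(message);
```

On Node.js the logged message should end with "getTime is not a function", matching the trace in the report.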

@netroy
Member

netroy commented Jul 8, 2023

@nmf2 Are you using the webhook option in the Wait node?

@nmf2
Author

nmf2 commented Jul 9, 2023

@netroy we don't use that in any of our workflows, but we do use the Wait node with a long delay (days), so the execution goes into the waiting status.

Hope this helps.

How can I be of further assistance?

@DRIMOL

DRIMOL commented Jul 26, 2023

This happens to me too, but only if the wait is longer than 65 seconds. I'm currently on version 1.0.5, but I believe it was also happening in version 0. I'll leave a simple workflow here so you can test it.

{
  "meta": {
    "instanceId": "33738330930e3881dd5571eca013f36ddf8aab20e4ea5c1f2ebaf4a2b4668ac6"
  },
  "nodes": [
    {
      "parameters": {
        "values": {
          "string": [
            {}
          ]
        },
        "options": {}
      },
      "id": "9f467564-55b1-44c3-93d2-a7aec275a713",
      "name": "Set",
      "type": "n8n-nodes-base.set",
      "typeVersion": 2,
      "position": [
        340,
        400
      ]
    },
    {
      "parameters": {
        "amount": 10,
        "unit": "seconds"
      },
      "id": "ea2b1005-2a2c-4866-abd4-2cafa6893ee5",
      "name": "Wait",
      "type": "n8n-nodes-base.wait",
      "typeVersion": 1,
      "position": [
        560,
        400
      ],
      "webhookId": "afe8deba-8749-4d3c-ae8f-c0887ba05770"
    },
    {
      "parameters": {
        "values": {
          "string": [
            {}
          ]
        },
        "options": {}
      },
      "id": "5c89b28b-523d-409f-8720-92a3ed8379ed",
      "name": "Set1",
      "type": "n8n-nodes-base.set",
      "typeVersion": 2,
      "position": [
        740,
        400
      ]
    },
    {
      "parameters": {
        "amount": 70,
        "unit": "seconds"
      },
      "id": "4fc941ff-4069-4aec-8549-f3c4d449c96f",
      "name": "Wait1",
      "type": "n8n-nodes-base.wait",
      "typeVersion": 1,
      "position": [
        920,
        400
      ],
      "webhookId": "a0f3ebfb-8d6d-4711-958a-0c90438ce3c6"
    },
    {
      "parameters": {
        "httpMethod": "POST",
        "path": "d7541d94-e26f-4495-9299-13f3055c9688",
        "options": {}
      },
      "id": "3736585e-2f82-4630-870c-a87efd30cf7c",
      "name": "Webhook",
      "type": "n8n-nodes-base.webhook",
      "typeVersion": 1,
      "position": [
        160,
        400
      ],
      "webhookId": "d7541d94-e26f-4495-9299-13f3055c9688"
    },
    {
      "parameters": {},
      "id": "4941ce6a-a3f5-4498-b50c-8eb7beb40c6c",
      "name": "No Operation, do nothing",
      "type": "n8n-nodes-base.noOp",
      "typeVersion": 1,
      "position": [
        1140,
        400
      ]
    }
  ],
  "connections": {
    "Set": {
      "main": [
        [
          {
            "node": "Wait",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Wait": {
      "main": [
        [
          {
            "node": "Set1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Set1": {
      "main": [
        [
          {
            "node": "Wait1",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Wait1": {
      "main": [
        [
          {
            "node": "No Operation, do nothing",
            "type": "main",
            "index": 0
          }
        ]
      ]
    },
    "Webhook": {
      "main": [
        [
          {
            "node": "Set",
            "type": "main",
            "index": 0
          }
        ]
      ]
    }
  }
}

Also, here is a screenshot:

WhatsApp Image 2023-07-26 at 16 49 38

@netroy
Member

netroy commented Jul 28, 2023

So far I've been unable to reproduce this in either regular mode or queue mode, with the workflow linked above or with a number of other workflows containing Wait nodes with wait times longer than 65 seconds.

@stwonary
Contributor

image

Hi. We are having the same problem. Our waiting time is 3 minutes.

Possible log trace related to this issue:

(node:7) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 global:completed listeners added to [Queue]. Use emitter.setMaxListeners() to increase limit
(Use `node --trace-warnings ...` to show where the warning was created)
(node:7) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 global:failed listeners added to [Queue]. Use emitter.setMaxListeners() to increase limit
(node:7) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 global:completed listeners added to [Queue]. Use emitter.setMaxListeners() to increase limit
(Use `node --trace-warnings ...` to show where the warning was created)
(node:7) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 global:failed listeners added to [Queue]. Use emitter.setMaxListeners() to increase limit
This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). The promise rejected with the reason:
TypeError: execution.waitTill.getTime is not a function
    at WaitTracker.getWaitingExecutions (/usr/local/lib/node_modules/n8n/dist/WaitTracker.js:83:56)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)
[The same rejection message and stack trace repeat seven more times, followed by one more entry that is cut off after the TypeError line.]
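
The log above says the rejection was never handled, which is why a single bad row can take down the process. As a rough sketch of the general pattern (not n8n's actual code), a polling step can be wrapped so that a failure is logged instead of escaping as an unhandled rejection. The names `pollOnce`, `safePoll`, and `WaitingRow` are hypothetical.

```typescript
// Hypothetical row shape used only for this sketch.
type WaitingRow = { id: string; waitTill: unknown };

// Stand-in for a polling step that throws on a malformed row.
async function pollOnce(rows: WaitingRow[]): Promise<Map<string, number>> {
  const delays = new Map<string, number>();
  for (const row of rows) {
    // Throws a TypeError if waitTill is not actually a Date.
    const delay = (row.waitTill as Date).getTime() - Date.now();
    delays.set(row.id, delay);
  }
  return delays;
}

// Wrap the poll so an error is logged rather than left as an unhandled rejection.
async function safePoll(rows: WaitingRow[]): Promise<Map<string, number>> {
  try {
    return await pollOnce(rows);
  } catch (error) {
    console.error('poll failed:', error instanceof Error ? error.message : error);
    return new Map();
  }
}
```

With this shape, `safePoll` always resolves; a malformed row results in an empty map and a logged error rather than a crash. Whether skipping the whole batch is acceptable is a design choice that depends on the application.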

@stwonary
Contributor

Found something interesting:

const findQuery: FindManyOptions<ExecutionEntity> = {
  select: ['id', 'waitTill'],
  where: {
    // Executions whose wait ends within the next 70 seconds
    waitTill: LessThanOrEqual(new Date(Date.now() + 70000)),
    status: Not('crashed'),
  },
  order: {
    waitTill: 'ASC',
  },
};

image

Here, you can see some instances of waitTill with null values. These are causing the crashes.

@netroy could you please release a hotfix for 0.236.3? We are not yet ready to migrate to a 1.x version.

@stwonary
Contributor

Maybe update using this code in the meantime?

for (const execution of executions) {
  const executionId = execution.id;
  if (this.waitingExecutions[executionId] === undefined) {
    if (execution.waitTill instanceof Date) {
      const triggerTime = execution.waitTill.getTime() - new Date().getTime();
      this.waitingExecutions[executionId] = {
        executionId,
        timer: setTimeout(() => {
          this.startExecution(executionId);
        }, triggerTime),
      };
    } else {
      Logger.error(`execution.waitTill is not a Date object for execution ID: ${executionId}. Value: ${execution.waitTill}`);
    }
  }
}

@netroy
Member

netroy commented Oct 17, 2023

I think this was probably being caused by time-zone issues between n8n and Postgres. If that's the case, this issue should be resolved in n8n versions 1.9.0 or later.
Can any of you upgrade n8n and check if this issue is fixed for you?

@netroy
Member

netroy commented Oct 17, 2023

If you can't upgrade to 1.x, I've created a custom docker image with all the postgres related fixes: n8nio/n8n:fix-0.237.

@netroy
Member

netroy commented Oct 18, 2023

One of the customers facing this issue has confirmed that both the custom image and upgrading fix this issue.
If anyone is still seeing this issue on versions after 1.9, please let us know, and we can re-open this ticket to investigate further.

netroy closed this as completed Oct 18, 2023
@caiquezanetoni

@netroy
I have this issue in version 1.11.2.
It only happens when using a Postgres node on the latest version.

@caiquezanetoni

@netroy
image

@netroy
Member

netroy commented Oct 30, 2023

@caiquezanetoni Can you please share with us:

  1. What version of Postgres is the n8n database on (not the version of Postgres the Postgres node is talking to)?
  2. What timezone are you in?
  3. Have you customized the timezone for Postgres or n8n, or are they picking up the system defaults?

@janober
Member

janober commented Jan 17, 2024

Fix got released with n8n@1.25.0

@DRIMOL

DRIMOL commented Feb 6, 2024

This was only resolved for regular n8n. For those running n8n in queue mode with Postgres, it is not possible to update to version 1.25.1:

https://community.n8n.io/t/n8n-worker-get-error-n8n-encryption-key-var-after-update-new-version/35878/3
https://community.n8n.io/t/error-when-upgrading-to-1-25-1/39631

@netroy
Member

netroy commented Feb 16, 2024

@DRIMOL The links you posted are related to the encryption-key mismatch, and not related to the WaitTracker (which this issue is for).

@netroy
Member

netroy commented Feb 16, 2024

For anyone still seeing this error, please upgrade to 1.29.1, which includes a fix for a bug in the PostgreSQL driver where timestamps were sometimes not converted to a valid Date object.
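
For readers who cannot upgrade right away, the failure mode described here (a timestamp arriving as something other than a valid `Date`) can be guarded against defensively. The following is only a sketch under that assumption and is not what the 1.29.1 fix does; `toValidDate` and `msUntil` are hypothetical helper names.

```typescript
// Hypothetical helper: coerce a driver value into a valid Date, or return null.
function toValidDate(value: unknown): Date | null {
  if (value instanceof Date) {
    return Number.isNaN(value.getTime()) ? null : value;
  }
  if (typeof value === 'string' || typeof value === 'number') {
    const parsed = new Date(value);
    return Number.isNaN(parsed.getTime()) ? null : parsed;
  }
  return null; // null, undefined, plain objects, etc.
}

// Milliseconds until the wait ends, clamped at 0; null when the value is unusable.
function msUntil(value: unknown, now: number = Date.now()): number | null {
  const date = toValidDate(value);
  return date === null ? null : Math.max(0, date.getTime() - now);
}
```

A caller could skip and log any execution for which `msUntil` returns `null` instead of calling `getTime()` on an unchecked value. Note that parsing non-ISO timestamp strings with `new Date(...)` is implementation-dependent, so this is a stopgap rather than a substitute for the driver fix.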

This issue was closed.