Waiting Executions #8136
Timezone configs: Workflow is on default
Hey @arthurrferroni, thank you for reporting - we saw this issue before but could never reproduce it. Do you have database access? Can you check the
Yes I do, but I think it is saving correctly. If I restart n8n's pm2, the executions that have already passed the scheduled time and are in waiting status are executed when pm2 starts. I've excluded ID 594672, but I created a new one, ID 596515, and this is its row in the database: 596515 | false | "manual" | "2023-12-22 12:10:38.888+00" | "2023-12-22 12:10:38.896+00" | "2023-12-22 08:59:38.896+00" | "waiting" | "TB3PVn7m7Tdf6IVU"
Hey @arthurrferroni, Out of interest, do you get the same issue if you use the Docker image or Node 18, which is our currently recommended version? Can you share the configuration for Postgres as well? Are you self-hosting that too, and does it have any timezone configuration, or was it just an apt install postgres?
Hello @janober @Joffcom, I hope 2024 is another year of health, achievements and success for you and your family. I have the same problem here. I was able to test up to version 1.14 with the same result. I am using Docker Swarm, n8n in queue mode. If the Wait is less than 60 seconds it works perfectly; if you set it to 65 seconds or more, it waits 60 seconds and considers the workflow completed, but the nodes following the Wait are not executed. I did the same test using RabbitMQ (I use it in all my workflows); if there is a Wait within a sub-workflow it is considered completed as well. No errors are displayed and no logs are generated. Last week I ran tests with my friend @cbalbinos and we came to the conclusion that if the wait takes more than 65 seconds, n8n considers it to have been executed successfully and does not execute the rest of the nodes. He opened #8167. I believe it is related to #7699
I have the same problem, see the post
Hey @luizeof, If a wait is over a certain time it goes into the database in a waiting state, but the workflow isn't really considered "finished". From the testing I did this morning and from the images in #8167 we can see that the waiting executions are working. In your test, can you share any error output you are getting in the logs? The issue with RabbitMQ is not the same as the issue reported here, and I don't want this thread to get mixed up with different issues being reported.
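The two paths described in this comment can be sketched as follows. This is an illustration of the behaviour reported in this thread only, not n8n's actual source: the 65-second cutoff and the resume mechanism are assumptions taken from the user reports above.

```shell
# Hypothetical sketch of the reported behaviour: short waits are resumed
# in memory, longer ones are persisted with status 'waiting' plus a
# waitTill timestamp, to be picked up again later by the instance.
wait_seconds=300   # the 5-minute wait from the reproduction steps
threshold=65       # cutoff reported by users in this thread (assumed)

if [ "$wait_seconds" -le "$threshold" ]; then
  echo "resume in memory after $wait_seconds s"
else
  echo "persist execution with status=waiting and a waitTill timestamp"
  # -> this is the path that users report never resuming
fi
```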
@RuanMD do you see the same error message in your log? |
I don't know where the error is, but I did several tests: when the wait is longer than 65 seconds it just stays waiting and doesn't execute the automation
Hey @RuanMD, The log will be in the docker output of n8n. I would not be surprised if there are a couple of different issues here.
I haven't tested with Node 18, only 20, running on pm2. The execution was saved with waiting status, not finished, but when the set time expires it doesn't execute the next nodes, and it logs the error in the pm2 logs.
Hey @arthurrferroni, Are you able to test with 18 which is the version we officially support? Do you also have your database configuration handy? |
I'll spin up another VPS and test this. I'm running on PostgreSQL, 6 CPUs and 16 GB RAM
@RuanMD if you are using Docker it will be Node 18 already. In your case what I want to know is if you see the same error in the docker logs, although we already have someone internally looking at your report, so it may be worth keeping to your post as it could be a different issue and we may end up repeating things.
hello @Joffcom I was able to reproduce on the latest version: I set 85 seconds of waiting. The workflow runs fine from the editor. In production it is not executed. I'm running in queue mode:
Hey @luizeof, If you look at your error message it is not the same as the one earlier in this topic. But looking at that error I have to ask... was the workflow saved before you ran it?
the workflow was saved. |
Hey @luizeof, That is interesting, So I can see your timezone is Sao Paulo and I know another report of this is also using that Timezone. I wonder if that is related as I am not able to reproduce this with Europe/London with either the unsaved or the other error message. I will do some testing on Monday and see if it is timezone related. |
@Joffcom run some tests changing this:
@luizeof We already know one of the issues comes from where the data is saved to the database to be picked up again later; the bit we don't know is why it can't always be reproduced, and why there are different errors. For example, your error is n8n saying the workflow was not saved, while the original error in this thread is about something not being a function that should be. Once we can reproduce the different errors here we will be able to resolve them.
thank you luiz |
The fix in 1.22.4 fixes something that I think was introduced in 1.22.2 or 1.22.3, so if you were seeing this before those releases, this change should make no difference 🤔
@luizeof what version of postgres are you running? |
postgres 16 |
Thanks. Can you also please share the output of this query 🙏🏽

SELECT status, "waitTill" FROM "execution_entity" WHERE "waitTill" IS NOT NULL;
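For anyone else asked to do this in a Docker-based queue-mode stack, the query can be run through the database container. This is a hypothetical invocation: the container name "postgres", database "n8n" and user "n8n" are assumptions, so adjust them to your own stack.

```shell
# Run the diagnostic query against n8n's Postgres container
# (container, database and user names are assumed - adjust as needed)
docker exec -it postgres psql -U n8n -d n8n \
  -c 'SELECT status, "waitTill" FROM "execution_entity" WHERE "waitTill" IS NOT NULL;'
```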
Quick update, the issue originally reported in this post has been seen in If your log file does not mention that |
Hey @DRIMOL and @luizeof, interesting that you mention using Recife's timezone solved it - it leads me to think that there's something wrong with the São Paulo timezone. Daylight saving time has been abolished there since 2019. I'm intrigued whether this could be the issue, like times being broken because of this.
The report I have been looking into is using Berlin, but let's not forget again that there is likely more than one issue here.
I didn't test SP; I just used Recife because it's closer to me. What I changed that resolved it: I had only been using the parameter GENERIC_TIMEZONE=America/Recife; when I added the second one, TZ=América/Recife, it worked. You can call me on WhatsApp to exchange ideas if you want: +557999977-1721
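Note the accented spelling in the comment above: "América/Recife" is not a valid IANA zone name (the canonical name is America/Recife, ASCII only). On glibc systems an unrecognised TZ value silently falls back to UTC, which could explain inconsistent behaviour depending on which variable a process actually reads. A quick check with GNU date:

```shell
# America/Recife is a fixed UTC-3 zone (no DST), so the offset is always -0300
TZ="America/Recife" date +%z     # prints -0300

# The accented "América/Recife" is not a valid IANA zone name, so glibc
# silently falls back to UTC and the offset becomes +0000
TZ="América/Recife" date +%z     # prints +0000
```

If the working configuration depended on this value, it may have been working *because* the invalid zone degraded to UTC, not because Recife itself differs from São Paulo.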
I tested it again now and it's not working. This has happened before: apparently as soon as I deploy it to my n8n queue stack it works, but after a few hours it stops working again
So an update on this issue... Can you upgrade to |
my editor doesn't stop loading when I put it on version 1.25.0; I had to revert to version 1.22.6 to get it working again. About the logs, I have a question: in my case, as I use n8n in queue mode I have 6 workers, so how will I know which one has the error log? Gravacao.de.Tela.2024-01-20.as.11.09.01.mov
Hey @DRIMOL, That error is odd and unexpected. Does the browser dev console show any errors? When it comes to checking the log output I would check all of the workers and the main instance to see what is there.
I've upgraded to version 1.27.1. What do I need to do? I am not a programmer :( |
Can you enable debug logging and share the output? |
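For anyone else asked to do this, n8n's log level is controlled through environment variables. A minimal sketch for a Docker-based install; the variable names come from n8n's logging configuration, and the rest of the command is a placeholder to adapt to your own deployment:

```shell
# Enable debug logging on the main instance (repeat for each worker in
# queue mode) - adjust the image tag and the rest of the run options
# to match your own stack
docker run -d \
  -e N8N_LOG_LEVEL=debug \
  -e N8N_LOG_OUTPUT=console \
  n8nio/n8n
```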
here follows the log https://gist.github.com/DRIMOL/6e963cccbac8c814b9f6203b1e119ecd |
1 - I just did a test here, and setting the workflow time zone to UTC worked correctly.
2 - Time difference error showing in the editor.
3 - I believe the errors reported in the link below also have to do with the same bug
I think there are still different issues here, as in your log snippet I couldn't see the message originally reported or the text from the fix we put in place, so I suspect the wait has not executed or there was no issue. The times seen in the UI can vary if n8n is set to one locale with the TZ env option while an external database is in use with a different TZ option set. There is another issue open about this.
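The locale effect mentioned here is easy to demonstrate with GNU date: the same stored UTC instant renders three hours earlier under the São Paulo offset, which matches the kind of drift visible in the editor. The timestamp below is the one from the database row quoted earlier in this thread.

```shell
# One instant, two zones: a waitTill stored as 12:10 UTC displays as
# 09:10 when the viewer applies America/Sao_Paulo (UTC-3, no DST since 2019)
TZ=UTC date -d "2023-12-22 12:10:38 UTC" +%H:%M               # prints 12:10
TZ=America/Sao_Paulo date -d "2023-12-22 12:10:38 UTC" +%H:%M # prints 09:10
```

Neither display is wrong; they are the same instant. The bug would only be real if the offset leaked into the *stored* value or into the comparison that decides when to resume the wait.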
I believe it's the same error; the log just stopped displaying the same message. My regular n8n, after I returned to version 1.24.1, has apparently been working correctly since yesterday, but for that to happen I had to put it on version 1.25.1 and then go back to 1.24.1. I don't understand how this happened, since before, on version 1.24.1, I had the error. As for the n8n queue, I haven't been able to test it on version 1.25.1 yet because this error occurred when updating and I can't touch it anymore; luckily I didn't update the one I use in production, just a test one. https://community.n8n.io/t/erro-ao-atualizar-para-1-25-1/39631
The fix we put in for the original issue will output a special log message if that specific condition is hit. I didn't see that message in your log file, which is what I am basing my thoughts on; it makes sense to me that if we are not seeing that message, it is not the same cause or issue. Annoyingly, we are not able to actually reproduce this one, and we have not yet been given a configuration that reproduces it; the three I have tried all work as expected, which is very odd.
https://gist.github.com/DRIMOL/d86ff759ade0b80098f21d408650d78d It didn't work again on the regular n8n. As you can see in the logs above, it is on version 1.24.1, but after I downgraded it took more than 24 hours for the error to appear again. I believe it's not a matter of time but of the amount of work on the n8n; something causes it to stop working after some amount of use. Perhaps this is why you are not able to reproduce the error: the n8n needs to be in production and in constant use to reproduce it. I will update to version 1.25.1
@DRIMOL I would expect the issue to be in 1.24.1, as we didn't put the fix in until after that release. My instance of n8n handles thousands of production executions a day and our internal instance handles a lot more, so I suspect I would have seen a usage limit if that was the case; but as I mentioned, your log from the newer version does not contain the log output for the fix for the original issue. This is why I have been saying there are likely to be two different issues here. As you are able to reproduce this, can you provide your n8n configuration so we can test it, along with the external database configuration? We know from the original message what the error is, which is how we were able to fix it, but if your logs do not output that error or the message about the fix then it has to be something else. The more information we can get, the more likely it is that we can actually fix this second issue.
https://gist.github.com/DRIMOL/a08f6c0bdda5c74e27b292024c5828a1 This is the stack I use for my n8n queue, but I can't update to version 1.25.1 because in some tests I did on another n8n in the same stack I got an error: if I need to downgrade, the worker doesn't start, as I reported here.
@DRIMOL if you can reliably reproduce this, would it be possible for you to reproduce it locally on your personal computer? If yes, would it also be possible for you to jump on a call with us to help us debug this? This is likely a weird random bug in the Postgres driver code that's very difficult to fix without being able to debug it.
The problem is the following: I did another installation of n8n with the same stack and on the same VPS from the company HOSTHATCH, with only test flows, to try to catch the error, but it doesn't reproduce even on versions prior to 1.25.0. So I believe a local installation on my Mac also won't reproduce the error. Would it be possible to set up a video call so I can show you everything you need on the two n8n instances that are in production, one regular and one queued? Or would the only way be for me to reproduce the bug on my local machine?
The issue isn't that I don't believe you. I actually do. We really need a way to reproduce this in a non-production environment. |
@DRIMOL Would it be possible for you to run a custom docker image? |
@DRIMOL Can you please try setting |
Hey @DRIMOL can you send me an email to omar@n8n.io? We can try to set up a call. You can write in Portuguese as I am also Brazilian :)
For anyone still seeing this error, please upgrade to |
Describe the bug
If a workflow has a Wait node and the status is WAITING (> 65 seconds), n8n does not execute it and logs errors
To Reproduce
1 - Create a basic workflow with a Wait node, set it to 5 minutes.
2 - It will not execute and will print an error in the log
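One way to confirm the hang after following the steps above, assuming database access (table and column names are the ones used earlier in this thread; the psql invocation itself is a hypothetical sketch to adapt to your setup):

```shell
# A stuck execution keeps status 'waiting' even after its waitTill
# has passed; any rows returned here after the 5 minutes are suspect
psql -d n8n <<'SQL'
SELECT id, status, "waitTill"
FROM "execution_entity"
WHERE status = 'waiting' AND "waitTill" < now();
SQL
```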
Expected behavior
Expected to execute at the set time
Environment (please complete the following information):
Additional context