Nodejs app that previously ran fine under pm2 will no longer start, even though same app with same args and user works #5767
Comments
Nearly forgot: Running Ubuntu 23.10 and Node v21.6.2
Unless someone comes up with a surprise solution to this one in the next couple of days, I think I'm going to reinstall the entire OS and everything from scratch, possibly while performing the lesser banishing ritual of the pentagram of Earth, while waving chicken bones at it, then hoping the problem doesn't re-appear - then I'll try to forget it happened. I figure this is either something ridiculously simple and silly that I did, or it's some darn thing stuck in an arcane location that is messed up, and not fixed by standard purging and reinstalling.
I mean, other than that, my thoughts on pm2 are basically "where have you been all my life? Where were you when I was still working?" : ) I feel like the problem is likely not with PM2 per se, I'm figuring either I'm missing something, or something got corrupted somewhere, and the problem shows up with PM2, but other launch methods don't trigger whatever it is for some reason. I don't know. It's weird.
Okay, this one was dumb. Apparently, while doing some debugging, I got the error.log file for that server stun-locked. When it would try to write a warning message, it would apparently hit the log and die, instead of writing to it or failing politely. A victory for science. Not ghosts.
And it wasn't PM2's fault: a file path was generating a warning, and that was only happening with that server.
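For anyone who lands here with similar symptoms, below is a minimal sketch (not from the original thread) of a pre-flight check that a log file can be opened for appending by the user the process runs as. The helper name `canAppend` and the example path are hypothetical; substitute whatever log path your app actually uses. It only catches failures that surface as an error from the open call, so it may not detect every kind of stuck or locked file.

```javascript
const fs = require('node:fs');

// Hypothetical helper: returns { ok: true } if `path` can be opened for
// appending, or { ok: false, code } with the errno code (e.g. 'EACCES').
// Side effect: the 'a' flag creates the file if it does not exist yet.
function canAppend(path) {
  let fd;
  try {
    fd = fs.openSync(path, 'a');
    return { ok: true };
  } catch (err) {
    return { ok: false, code: err.code };
  } finally {
    // Close the descriptor only if the open succeeded.
    if (fd !== undefined) fs.closeSync(fd);
  }
}

// Example with a placeholder path (not the real location from this issue):
// console.log(canAppend('/path/to/data/Logs/error.log'));
```

Running this as the same user that PM2 launches the app under is the point of the check; a result like `{ ok: false, code: 'EACCES' }` would suggest a permissions problem on the log file rather than anything in PM2 itself.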
What's going wrong?
First: Apologies for this being so long. It's weird and difficult to succinctly describe, and I wanted to make sure I provided as complete a description and supporting files as possible.
This one is weird.
I run a virtual tabletop app named Foundry vtt. Previously, I'd always written a small systemd service file to launch the nodejs apps as a service, but I wanted to give PM2 a go, both because it's a little friendlier, and because it's so easy to set up clusters.
Anyway, I have multiple copies of this thing running. Each is completely identical except for the name, and they run in a different folder tree. Each is essentially an identical copy, just running under different subdomains.
Well, at some point, after I stopped and then restarted the services in PM2, one just refuses to function or make any connections. If I run the exact same nginx startup script either manually from the command line, or as a systemd service, it works flawlessly, and it had been running for days, through multiple restarts of the machine before it stopped working via pm2.
It's running the app as the same user, with the exact same parameters. I literally can copy the command line used to start the service, remove the pm2 bits, and it runs fine. I have even tried removing first the application's app directory, then its data directory, and replacing them with defaults, and it will not run. If I target the other application's code and data, with the same name, it won't run. But it also won't run the troublesome one if I change the name, so it doesn't appear to be a namespace issue.
When I load the site, it returns a 502 error, and the nginx logs say the connection was refused, but it loads the favicon. And only the favicon.
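A "connection refused" in the nginx log usually means nothing is accepting connections on the upstream port. As a rough sketch (my own, not part of the original report), the following probes a port directly, bypassing nginx. The helper name `isListening` and the port in the example are assumptions; use the port from your own nginx proxy configuration.

```javascript
const net = require('node:net');

// Hypothetical helper: resolves true if something accepts a TCP connection
// on host:port, and false on refusal, any other socket error, or timeout.
function isListening(port, host = '127.0.0.1', timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host });
    const finish = (result) => {
      socket.destroy(); // always release the socket before reporting
      resolve(result);
    };
    socket.setTimeout(timeoutMs, () => finish(false));
    socket.once('connect', () => finish(true));
    socket.once('error', () => finish(false));
  });
}

// Example: 30000 is an assumed port, not one confirmed in this issue.
// isListening(30000).then((up) => console.log(up ? 'listening' : 'not reachable'));
```

If this reports the port as not reachable while PM2 shows the process as online, the app has probably died or stalled before it started listening, which points at the app's own startup path rather than at nginx.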
I'm not sure if I need tech support or an exorcist.
How could we reproduce this issue?
I am not sure. Presumably by using my haunted server. I would not be surprised if it is just happening to me, and that the error originated between my monitor and chair (i.e., I'm doing something screwy, and can't figure out what).
Supporting information
The app I'm working with has an application directory (of course), plus a data directory that can be placed arbitrarily wherever you like. My setup looks like:
App Directories:
(Note: these are all identical copies, with identical permissions and owner (0775/root; I also tried running not as root, with things set to group html-data), set with sudo chown/chmod -fR, and compared via diff and visually to ensure they are absolutely identical.)
Data Directories:
Hosts file:
The pm2 commands used to launch are:
Run the following commands
$ pm2 report
When I go to any of the above sites, except the sadboys one, they work fine. If I go to the sadboys one, I get 502 Bad Gateway nginx/1.24.0 (Ubuntu)
If I launch the exact same app, using:
That works fine, and as expected.
If I instead activate the app via a systemd service like this (again, all are identical, except for the one word, vtt/sadboys/twb/bacon), it runs without issue:
That also works fine.
Just that copy of the app will not run, named as sadboys, or as any other name. The other identical ones work fine.
Also, all of them were working fine previously.
nginx config files linked from sites-enabled:
default.txt
bacon.mydomain.org.conf.txt
sadboys.mydomain.org.conf.txt
twb.mydomain.org.conf.txt
vtt.mydomain.org.conf.txt
I can run this app as literally everything but a pm2 service, even though I run several functionally-identical ones, and the same service ran for days with no problem previously.
I even tried uninstalling and reinstalling pm2, and even reinstalling everything with node version manager, then a fresh install of pm2. No dice.
It's like it hates this one install. Its siblings are fine, but it hates this one.
And yes, I have also tried changing the name (incl. references within the file), and it still fails. I have also tried launching with an empty/default data directory, to make sure it's not the size or content of the data, and even running it against the application/main.js from another install. It will not run if it even thinks I'm trying to run the sadboys instance.
I kid you not. I tried replacing first the app directory. Then the data directory. Then even the name, so that by the end, it had different (otherwise functioning) app directories and data directories and name, and it still would not work. It felt like that old joke about if you replace the axe, the axe handle, and the axe grip, is it still the same axe?
I would absolutely be delighted if someone can help me sort this out, even if it's that I've done something embarrassingly silly and it's all my fault. I don't mind looking foolish if you can at least stop making me think my box has bad faeries living in it or something. At this point, I just want it to make sense! : )
Thoughts?