process.argv behaves differently for pm2-runtime + cluster mode without pm2 daemon running #4950

Open
rocwind opened this issue Jan 1, 2021 · 5 comments

Comments

@rocwind

rocwind commented Jan 1, 2021

What's going wrong?

When pm2-runtime start process.json is executed without a running pm2 daemon, "start" and "process.json" are included in the application's process.argv.

How could we reproduce this issue?

  1. create an app.js with contents below:
console.log(process.argv)
  1. create a process.json with:
{
  "apps": [{
    "name": "test-server",
    "script": "app.js",
    "instances": "1",
    "exec_mode": "cluster"
  }]
}
  1. make sure the pm2 daemon is not running, then execute pm2-runtime start process.json; it logs something like:
2021-01-01T17:29:41: PM2 log: Launching in no daemon mode
2021-01-01T17:29:41: PM2 log: App [test-server:0] starting in -cluster mode-
2021-01-01T17:29:41: PM2 log: App [test-server:0] online
[
  '/usr/local/bin/node',
  '/usr/local/lib/node_modules/pm2/lib/ProcessContainer.js',
  'start',
  'process.json'
]
  1. execute pm2 ps && pm2-runtime start process.json (so that a daemon is already running); it logs something like:
[PM2] Spawning PM2 daemon with pm2_home=/root/.pm2
[PM2] PM2 Successfully daemonized
┌────┬────────────────────┬──────────┬──────┬───────────┬──────────┬──────────┐
│ id │ name               │ mode     │ ↺    │ status    │ cpu      │ memory   │
└────┴────────────────────┴──────────┴──────┴───────────┴──────────┴──────────┘
[
  '/usr/local/bin/node',
  '/usr/local/lib/node_modules/pm2/lib/ProcessContainer.js'
]

pm2 ps is used here only to spawn the daemon process.

Supporting information

pm2 and pm2-runtime + fork mode both work fine; the problem only occurs with pm2-runtime + cluster mode.

--- PM2 report ----------------------------------------------------------------
Date                 : Fri Jan 01 2021 17:35:17 GMT+0800 (Central Standard Time)
===============================================================================
--- Daemon -------------------------------------------------
pm2d version         : 4.5.0
node version         : 14.15.1
node path            : not found
argv                 : /usr/local/bin/node,/usr/local/lib/node_modules/pm2/lib/Daemon.js
argv0                : node
user                 : undefined
uid                  : 0
gid                  : 0
uptime               : 0min
===============================================================================
--- CLI ----------------------------------------------------
local pm2            : 4.5.0
node version         : 14.15.1
node path            : not found
argv                 : /usr/local/bin/node,/usr/local/bin/pm2,report
argv0                : node
user                 : undefined
uid                  : 0
gid                  : 0
===============================================================================
--- System info --------------------------------------------
arch                 : x64
platform             : linux
type                 : Linux
cpus                 : Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
cpus nb              : 2
freemem              : 1295839232
totalmem             : 2612682752
home                 : /root
===============================================================================
--- PM2 list -----------------------------------------------
┌────┬────────────────────┬──────────┬──────┬───────────┬──────────┬──────────┐
│ id │ name               │ mode     │ ↺    │ status    │ cpu      │ memory   │
└────┴────────────────────┴──────────┴──────┴───────────┴──────────┴──────────┘
===============================================================================
--- Daemon logs --------------------------------------------
/root/.pm2/pm2.log last 20 lines:
PM2        | 2021-01-01T17:32:39: PM2 log: App name:test-server id:0 disconnected
PM2        | 2021-01-01T17:32:39: PM2 log: App [test-server:0] exited with code [0] via signal [SIGINT]
PM2        | 2021-01-01T17:32:39: PM2 log: pid=837 msg=process killed
PM2        | 2021-01-01T17:32:40: PM2 log: PM2 successfully stopped
PM2        | 2021-01-01T17:35:17: PM2 log: ===============================================================================
PM2        | 2021-01-01T17:35:17: PM2 log: --- New PM2 Daemon started ----------------------------------------------------
PM2        | 2021-01-01T17:35:17: PM2 log: Time                 : Fri Jan 01 2021 17:35:17 GMT+0800 (Central Standard Time)
PM2        | 2021-01-01T17:35:17: PM2 log: PM2 version          : 4.5.0
PM2        | 2021-01-01T17:35:17: PM2 log: Node.js version      : 14.15.1
PM2        | 2021-01-01T17:35:17: PM2 log: Current arch         : x64
PM2        | 2021-01-01T17:35:17: PM2 log: PM2 home             : /root/.pm2
PM2        | 2021-01-01T17:35:17: PM2 log: PM2 PID file         : /root/.pm2/pm2.pid
PM2        | 2021-01-01T17:35:17: PM2 log: RPC socket file      : /root/.pm2/rpc.sock
PM2        | 2021-01-01T17:35:17: PM2 log: BUS socket file      : /root/.pm2/pub.sock
PM2        | 2021-01-01T17:35:17: PM2 log: Application log path : /root/.pm2/logs
PM2        | 2021-01-01T17:35:17: PM2 log: Worker Interval      : 30000
PM2        | 2021-01-01T17:35:17: PM2 log: Process dump file    : /root/.pm2/dump.pm2
PM2        | 2021-01-01T17:35:17: PM2 log: Concurrent actions   : 2
PM2        | 2021-01-01T17:35:17: PM2 log: SIGTERM timeout      : 1600
PM2        | 2021-01-01T17:35:17: PM2 log: ===============================================================================

$ pm2 report
@rocwind rocwind changed the title process.argv behaves differently for pm2-runtime + cluster mode with or without pm2 daemon process.argv behaves differently for pm2-runtime + cluster mode without pm2 daemon running Jan 4, 2021
@mattpr

mattpr commented Nov 29, 2022

Another data point on this...

Next.js crashes with: Unknown or unexpected option: --no-daemon

This happens when running Next.js via pm2 (pm2 5.2.2, Node 16.15) on Ubuntu 20.04 LTS.

app.pm2.json

{
    "name"               : "app_name",
    "script"             : "/opt/app_dir/node_modules/next/dist/bin/next",
    "args"               : "start",
    "instances"          : "1",
    "exec_mode"          : "cluster",
    "cwd"                : "/opt/app_dir",
    "out_file"           : "/dev/null",
    "error_file"         : "/dev/null",
    "wait_ready"         : false,
    "listen_timeout"     : 5000,
    "kill_timeout"       : 30000,
    "max_restarts"       : 1000000,
    "restart_delay"      : 100,
    "max_memory_restart" : "1G",
    "watch"              : false
}

Run with:

HOME=/tmp node pm2 start --no-daemon /path/to/app.pm2.json

We run the above, and set up the environment, via a provisioned systemd service unit, which is why we run with --no-daemon. Maybe most people are running pm2 by hand in production?

Added a console.log(process.argv); to the top of /opt/app_dir/node_modules/next/dist/bin/next to see what was going on and got the following...

[
    '/opt/nodejs/node-v16.15.0/bin/node',
    '/opt/nodejs/node-v16.15.0/lib/node_modules/pm2/lib/ProcessContainer.js',
    'start',
    '--no-daemon',
    '/path/to/app.pm2.json',
    'start'
]

As far as I can tell, this issue will break any node apps that rely on process.argv when running in pm2 in --no-daemon mode (e.g. when using custom systemd units).

@mattpr

mattpr commented Nov 29, 2022

I thought I had solved it with some combination of additional environment variables, but it was just a coincidence. It doesn't look like environment variables have anything to do with this.

If I run pm2 ls first and then run pm2 start --no-daemon app.pm2.json, the issue goes away. That isn't really a solution, though, as reliably instrumenting start/reload/restart in systemd would be difficult.

According to the docs, args shouldn't be passed through to the script being run by pm2 unless they follow a --.

@mattpr

mattpr commented Nov 29, 2022

Okay, so the problem is that the worker process is launched differently depending on whether a running daemon is detected. The docs even hint at this:

Make sure you kill any PM2 instance before starting PM2 in no daemon mode (pm2 kill).

They mean kill any running Daemon (the "God" process). Running almost any pm2 command (like pm2 ls) will start a God daemon for the current environment (i.e. the configured .pm2 directory) if one isn't already running.

case: pre-existing Daemon ("God" process)

Basically, when the Client starts it does a pingDaemon to see if there is an alive Daemon process. If yes, it does a launchRPC and returns, so the whole no-daemon init code gets skipped. This is where the code paths diverge depending on whether a daemon is already running.

In this case the existing daemon process fields the RPC and makes a prepare call to launch the missing application. Because this happens over RPC, the original process.argv is not retained.

case: no Daemon found

In this case we hit the no-daemon init code.

If you have instances set or specified clustering mode, the script is ultimately started by cluster.fork in ClusterMode.

If you are not using clustered multi-instance mode, then see ForkMode which uses child_process.spawn.

The cluster is set up here. Basically it gets a default script to execute, which is ProcessContainer.js. That script just wraps our script as an ES or CJS module.

The thing to note here from the docs for cluster.settings is:

args <string[]> arguments passed to worker. Default: process.argv.slice(2)

args is NOT set by pm2, so in the default cluster.fork scenario all of the original args (minus the first 2) are sent along to the worker. In other words, node and pm2 get chopped off the front and replaced with node and ProcessContainer, but the rest of the original pm2 args stick around and get passed to the worker.
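
To make the effect of that default concrete, here is a minimal sketch that models how the worker's argv ends up being assembled. It is a simplified model, not pm2 or Node.js source: workerArgv is a hypothetical helper, and the /usr/local/bin/pm2-runtime path is assumed for illustration.

```javascript
// Simplified model (not pm2 or Node.js source) of how a cluster worker's
// argv is assembled, based on the documented default for
// cluster.settings.args: process.argv.slice(2).
function workerArgv(parentArgv, execScript, settingsArgs) {
  // When args is left unset, the parent's own CLI arguments are reused.
  const args = settingsArgs === undefined ? parentArgv.slice(2) : settingsArgs;
  // The node binary is kept, the exec script is swapped in, args are appended.
  return [parentArgv[0], execScript, ...args];
}

// Assumed parent command line: pm2-runtime start process.json
const parentArgv = ['/usr/local/bin/node', '/usr/local/bin/pm2-runtime', 'start', 'process.json'];
const container = '/usr/local/lib/node_modules/pm2/lib/ProcessContainer.js';

// args unset: 'start' and 'process.json' leak into the worker
console.log(workerArgv(parentArgv, container));
// args explicitly set to []: the worker sees only node and the container script
console.log(workerArgv(parentArgv, container, []));
```

The first call reproduces the four-element argv from the original report; the second matches what the app sees when a daemon is already running.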

patching...

You can just set the default for the worker args on the cluster to be an empty array. Any args specified in the pm2 environment json will still get passed to the child worker.

If you are using the "Variadic" (--) feature to pass through args to the child then you might need to be a little fancier about which args you keep from process.argv.

Right before you cluster.fork

cluster.settings.args = [];  //  don't pass child args (any args from your pm2 json environment will get passed)

Or when cluster is initialized.

cluster.setupMaster({
  windowsHide: true,
  exec : path.resolve(path.dirname(module.filename), 'ProcessContainer.js'),
  args: []
});
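
For the "Variadic" (--) case mentioned above, one possible approach is to keep only what follows the first -- instead of passing an empty array. This is a sketch only: passThroughArgs is a hypothetical helper, not part of pm2, and I have not verified how pm2 itself forwards those arguments.

```javascript
// Hypothetical helper (not part of pm2): keep only the arguments that
// follow the first "--" separator, and drop everything else.
function passThroughArgs(argv) {
  const separator = argv.indexOf('--');
  return separator === -1 ? [] : argv.slice(separator + 1);
}

// With a separator, only the trailing arguments survive.
console.log(passThroughArgs(['node', 'pm2', 'start', 'app.js', '--', '--port', '3000']));
// Without a separator, nothing is passed to the worker.
console.log(passThroughArgs(['node', 'pm2', 'start', '--no-daemon', 'app.pm2.json']));
```

Under that assumption, the patch line would become cluster.settings.args = passThroughArgs(process.argv) rather than an empty array.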

mattpr added a commit to mattpr/pm2 that referenced this issue Nov 30, 2022
argv is now populated correctly when running in no-daemon mode.
@mattpr mattpr mentioned this issue Nov 30, 2022
@chalermpong

Hi

I'm also facing this issue. I'm using pm2-runtime to start next.js server. I'm using pm2-runtime v5.2.0.

My config file is the following:

#app.yml
apps:
  - script: next
    args: start

I logged process.argv when running pm2-runtime app.yml on its own. There are 4 items:

process.argv:  [
  '/Users/me/.nvm/versions/node/v18.16.0/bin/node',
  '/Users/me/.config/yarn/global/node_modules/pm2/lib/ProcessContainer.js',
  'app.yml',
  'start'
]

But if I run pm2 list first and then run pm2-runtime app.yml, there are 3 items:

process.argv:  [
  '/Users/me/.nvm/versions/node/v18.16.0/bin/node',
  '/Users/me/.config/yarn/global/node_modules/pm2/lib/ProcessContainer.js',
  'start'
]

Next.js looks for its command in process.argv. In the first case, it incorrectly picks up app.yml as the command.
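
To illustrate why the extra element matters, here is a toy command dispatcher. It is not Next.js's real argument parser; pickCommand and the list of known commands are assumptions made for the example.

```javascript
// Toy CLI dispatcher (NOT Next.js's actual parser): it treats the first
// positional argument after the script path as the command name.
function pickCommand(argv, knownCommands) {
  const first = argv.slice(2).find((arg) => !arg.startsWith('-'));
  return knownCommands.includes(first) ? first : `unknown: ${first}`;
}

const known = ['build', 'dev', 'start']; // assumed command list
const node = '/Users/me/.nvm/versions/node/v18.16.0/bin/node';
const container = '/Users/me/.config/yarn/global/node_modules/pm2/lib/ProcessContainer.js';

// No daemon running: pm2's own 'app.yml' argument comes first.
console.log(pickCommand([node, container, 'app.yml', 'start'], known));
// Daemon already running: only the configured 'start' argument is present.
console.log(pickCommand([node, container, 'start'], known));
```

Any CLI that reads its command positionally is exposed to the same problem, whatever its real parsing logic looks like.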

@mattpr

mattpr commented Oct 7, 2023

@chalermpong -- the open PR fixes this, at least for no-daemon mode.
The PR has been open for a long time without any response from Unitech, and there are 35 open PRs.
So I get the feeling this project isn't actively maintained, or at least they aren't interested in fixing or responding to known issues.

You can take my PR and make your own build, fix it yourself, or move away from using pm2. We are doing the latter because of a variety of issues running pm2 in a serious devops environment. It is great for developers that want to get a node app into "production" without knowing anything about running servers.

If you ask pm2 to just be a node process load balancer behind systemd without a global "god" process (daemon mode), then things seem to start falling apart. I'm guessing this is because the Unitech folks really don't test these edge cases very well because that isn't how they expect pm2 to be used by most folks.

We often have 5-10 different pm2 users/apps on a single box. Every app gets its own user, home directory, etc., because there is no reason to give them all the same permissions and access to each other's data. Our developers/devops never interact with pm2 directly, because they would likely run pm2 as their own user by accident rather than as the correct pm2 user for that particular app (permissions problems, missing ENV set by systemd, etc.). Instead, developers are limited to systemctl restart|reload|start|stop <app-name>, which ensures the correct home dir and user are used to keep pm2 happy.

Even running something like pm2 ls appears to try to start a God process under the current user in the current user's home directory, so running pm2 <any-command> directly is "dangerous" in production unless you are a single developer running all your apps under a single pm2, under your own user/login, with pm2's config directory in your own home directory. Even if they run the right incantation to set the right pm2 HOME and user (e.g. HOME=/opt/node-pm2/data-puller sudo -E -u pm2-data-puller pm2 ls), there are still problems. First, pm2 starts a God process automatically when running commands you expect to be "read" rather than "execute" (like ls). Second, if you actually start the app this way, systemd won't know about it (which breaks monitoring of systemd service states), and app environment set on the systemd unit may be missing (the unit is a more generic mechanism for passing ENV than the pm2 ecosystem file, which only supports pm2 apps).

For an alternative: you can run multiple instances of a single systemd service using an index number, and have that index number passed into the unit file, for instance to set/increment ports. Of course that only gets you so far: you also need a notion of health checks for your app, adding and removing individual app instances to and from a load balancer based on those health checks, and incremental rollout and validation of app deployments. However, these topics are all quite specific to the app you are building and your devops tooling: what load balancer you use (LVS, HAProxy, nginx, ELB, hardware, etc.) and how you validate your app's health before passing more traffic. For us, pm2 is just doing the node process load balancing (e.g. I have 2 cores and want 2 instances of the app running in round robin or with percentages of traffic). Doing deploys from CI assets, updating load balancer configs, and doing health checks is just a small amount of scripting specific to your devops tools and the nature of your apps.
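
As a rough sketch of the index-based port idea, the app could derive its port from an instance index supplied by the service manager. INSTANCE_INDEX, portForInstance, and the base port 3000 are all assumptions for illustration; in a systemd template unit the variable could be set with something like Environment=INSTANCE_INDEX=%i.

```javascript
// Hypothetical sketch: derive a listen port from an instance index.
// INSTANCE_INDEX is an assumed variable name, e.g. set in a systemd
// template unit via Environment=INSTANCE_INDEX=%i.
function portForInstance(basePort, rawIndex) {
  const index = Number.parseInt(rawIndex ?? '0', 10);
  if (!Number.isInteger(index) || index < 0) {
    throw new RangeError(`invalid instance index: ${rawIndex}`);
  }
  // Instance 0 gets basePort, instance 1 gets basePort + 1, and so on.
  return basePort + index;
}

const port = portForInstance(3000, process.env.INSTANCE_INDEX);
console.log(`this instance would listen on port ${port}`);
```

The load balancer would then need to know about each port in that range; how that is wired up depends on the tooling described above.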

Good luck.
