PM2 reload vs start vs resurrect in systemd #2914

Closed
dylancwood opened this issue Jun 1, 2017 · 11 comments

Comments

@dylancwood

What's going wrong?
There is no bug, this is a question about best practices. Please see the background and questions below. I sincerely appreciate any time spent on helping me to understand PM2 better!

Supporting information

Background

(My understanding)
PM2 offers a startup command, which will install a systemd unit file.
The unit file configures systemd to start PM2 on boot using the pm2 resurrect command.
The resurrect command reads a dump file in PM2_HOME, which records the state of PM2 managed processes: PM2 settings, environment vars, etc...
The dump file is created when pm2 startup is run, and when pm2 save is run.

Our servers use a single PM2 manifest to manage all node.js processes (usually just one process).
We are interested in directly starting PM2 using pm2 reload /path/to/pm2/manifest.json --no-daemon from within the systemd unit file, rather than relying on the resurrect/dump-file system.

Here is an example systemd unit file that we might use:

[Unit]
Description=PM2 process manager
Documentation=https://pm2.keymetrics.io/
After=network.target

[Service]
User=app
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=8
Environment=PATH=/usr/bin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
Environment=PM2_HOME=/home/app/.pm2
Restart=always
RestartSec=8

ExecStart=/usr/lib/node_modules/pm2/bin/pm2 reload /services/pm2_config.json --no-daemon
ExecReload=/usr/lib/node_modules/pm2/bin/pm2 reload all
ExecStop=/usr/lib/node_modules/pm2/bin/pm2 kill

[Install]
WantedBy=multi-user.target

Reasons for avoiding the dump file and resurrect are:

  1. It does not play well with our server provisioning system (see below).
  2. We are concerned about unintentionally preserving state like environment vars within the dumpfile.

Questions

  1. Are there any issues that the community has already run into with this approach? For instance, perhaps resurrect has some additional protections that make it play well with systemd.
  2. Why is this not the standard approach? I understand that pm2 can be used to manage many microservices on the same machine, and that pm2 save can capture a heterogeneous environment well, but having to call pm2 save as part of our deployment procedure doesn't sit well.
  3. When do others call pm2 save? The only way that we can think of to get this to work with our provisioning and deployment system is to make our deployment bounce command be pm2 reload && pm2 save. This feels odd, so we would like to know what others are doing.

More background than you probably wanted

This model for integration with systemd is tricky to fit into our server provisioning system. During provisioning, we use Puppet to install Node.js, npm and PM2. We also place a standardized pm2 manifest on disk that will be used to start the node services once the source code is deployed. This is when we would ideally run pm2 startup to set up the systemd files; however, our services have not been started with PM2 yet.

That's not a problem really, as we can just run pm2 save after running pm2 reload as part of our deployment. Thus, the command to bounce our services becomes pm2 reload {{/path/to/pm2/manifest}} && pm2 save.

Still, the team and I have decided that it would be simpler to hard-code pm2 reload /path/to/pm2/manifest.json into the systemd unit file than to add pm2 save to our deployment process.

PM2 version: 2.4.6
Node version: v4.5.0
OS: Linux (CentOS 7)
@soyuka
Collaborator

soyuka commented Jun 1, 2017

We are concerned about unintentionally preserving state like environment vars within the dumpfile.

We should definitely do something about this.

Why is this not the standard approach? I understand that pm2 can be used to manage many microservices on the same machine, and that pm2 save can capture a heterogeneous environment well, but having to call pm2 save as part of our deployment procedure doesn't sit well.

This is usually how I do it, and I'd say it's how pm2 is supposed to work:

  1. pm2 start foo.js, then pm2 start bar.js
  2. Run pm2 save only once; as long as script paths haven't changed, there is no reason to run it a second time.
  3. Run pm2 startup; likewise, this only has to be run once.

Are there any issues that the community has already run into with this approach? For instance, perhaps resurrect has some additional protections that make it play well with systemd.

I haven't had any, and no, resurrect is straightforward: it just reads the dump to start processes on startup.

When do others call pm2 save? The only way that we can think of to get this to work with our provisioning and deployment system is to make our deployment bounce command be pm2 reload && pm2 save. This feels odd, so we would like to know what others are doing.

Indeed, this is odd. You shouldn't need to run pm2 save more than once (unless your script paths have changed). As for environment variables, I don't think they should be saved in the dump file, and I'm pretty sure there's no point in doing so (needs more investigation). The pm2.dump file should only help in restarting the scripts.

Still, the team and I have decided that it would be simpler to hard-code pm2 reload /path/to/pm2/manifest.json into the systemd unit file than to add pm2 save to our deployment process.

Because you have a single manifest.json, I'd also agree that this is the easiest/cleanest solution. Especially if you want to keep the following behavior:

We are interested in directly starting PM2 using pm2 reload /path/to/pm2/manifest.json --no-daemon from within the systemd unit file, rather than relying on the resurrect/dump-file system.

In your case I may even go further by not using pm2 startup and by providing my own systemd file.

@ghost

ghost commented Jun 1, 2017

Hi Dylan,

We use a similar setup to deploy our Node.js microservices (1 microservice per instance). We use a CentOS AMI and we provision the instance with Puppet. Our instances are in an autoscaling group, so they can be scaled up or down when we need to with no issues.

We manage the node service with pm2, and we manage pm2 via a systemd service. We have a templated systemd service file, which we place in /etc/systemd/system/pm2.service with Puppet, and Puppet then ensures that the service is running. We template the manifest.json too.

We don't use the startup/resurrect behaviour as it doesn't fit our use case. Once you have the service running, to update we just run yum update node-service && systemctl reload pm2, which does a graceful reload of the service.

It's important to treat pm2 and the node service as its own app. When you normally start pm2, for example, it will create its HOME (and its various log/dump files) inside the home directory of the user who ran it. This doesn't lend itself well to automation; however, it's easy to overcome.

For example, we set PM2_HOME to /etc/$node_service_name, logs go to /var/log/$node_service_name, and the pid file goes to /var/run/$node_service_name.

Here's a copy of our pm2.service file:

[Unit]
Description=PM2 process manager
Documentation=https://pm2.keymetrics.io/
After=network.target

[Service]
User=nodejs
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=8
Environment=PATH=/usr/bin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
Environment=PM2_HOME=/etc/pm2
Restart=always
RestartSec=8

ExecStart=/usr/lib/node_modules/pm2/bin/pm2 start /etc/<%= @service_name %>/ecosystem.json --no-daemon
ExecReload=/usr/lib/node_modules/pm2/bin/pm2 reload /etc/<%= @service_name %>/ecosystem.json
ExecStop=/usr/lib/node_modules/pm2/bin/pm2 stop /etc/<%= @service_name %>/ecosystem.json

[Install]
WantedBy=multi-user.target

and our ecosystem.json (or in your case, the manifest.json):

{
    "name"        : "<%= @service_name %>",
    "script"      : "index.js",
    "instances"   : "2",
    "exec_mode"   : "cluster",
    "cwd" : "/opt/<%= @service_name %>/",
    "log_file": "/var/log/<%= @service_name %>/<%= @service_name %>.log",
    "out_file": "/var/log/<%= @service_name %>/<%= @service_name %>.out",
    "err_file": "/var/log/<%= @service_name %>/<%= @service_name %>.err",
    "pid_file": "/var/run/<%= @service_name %>/<%= @service_name %>.pid"
}
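
One way to sanity-check a template like this before Puppet ships it is to render it locally and parse the result. Below is a minimal Ruby sketch, not part of the original setup. It assumes the template is inlined as a string, and it uses a small TemplateScope class (hypothetical, not a Puppet or PM2 API) to stand in for the Puppet scope that normally supplies @service_name. The value node-service is only an example.

```ruby
require 'erb'
require 'json'

# The ecosystem.json template from above, inlined for the sketch.
ECOSYSTEM_TEMPLATE = <<~'TEMPLATE'
  {
      "name"        : "<%= @service_name %>",
      "script"      : "index.js",
      "instances"   : "2",
      "exec_mode"   : "cluster",
      "cwd" : "/opt/<%= @service_name %>/",
      "log_file": "/var/log/<%= @service_name %>/<%= @service_name %>.log",
      "out_file": "/var/log/<%= @service_name %>/<%= @service_name %>.out",
      "err_file": "/var/log/<%= @service_name %>/<%= @service_name %>.err",
      "pid_file": "/var/run/<%= @service_name %>/<%= @service_name %>.pid"
  }
TEMPLATE

# Hypothetical stand-in for the Puppet scope: it only exposes @service_name.
class TemplateScope
  def initialize(service_name)
    @service_name = service_name
  end

  # Render the ERB template with this object's instance variables in scope.
  def render(template)
    ERB.new(template).result(binding)
  end
end

rendered = TemplateScope.new('node-service').render(ECOSYSTEM_TEMPLATE)

# JSON.parse raises JSON::ParserError if the rendered file is malformed.
manifest = JSON.parse(rendered)
puts manifest['cwd']
puts manifest['pid_file']
```

Running this as part of a template test catches a missing quote or stray comma before the file reaches an instance.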

This is all pretty simple to do via puppet. Install nodejs, npm, pm2 and the application package. Create the pm2.service file (with the template), the directories we use, and then the .json file. Then just ensure the pm2 service is running.

Let me know if you have any more questions. We weren't exactly sure of how to do this either, but we've been running and deploying nodejs services this way for a couple of months now and it seems to be doing well.

@soyuka
Collaborator

soyuka commented Jun 1, 2017

@cam-itv this is a really nice job! Would you mind if I add some of your examples to the official documentation to help future users? Thanks for taking the time to share ❤️.

@ghost

ghost commented Jun 1, 2017

@soyuka Thanks! Yeah sure, go for it!

I'm not a systemd expert, so there might be better ways to do it.

A couple of caveats: for the ExecReload, using pm2 startOrGracefulReload might be better (especially if your application handles SIGINT properly). Also, if you want to use pm2 via the command line, you'll have to export PM2_HOME in your current shell session.

@vmarchaud
Contributor

It's definitely better to write a custom unit file that better fits your needs.

The pm2 startup based approach is mainly designed for people who don't know how their OS starts processes at boot; in that case, leaving the whole process configuration on disk isn't much of a problem.

@soyuka
Collaborator

soyuka commented Jun 1, 2017

Still @vmarchaud any reason we also dump env variables on pm2 save?

@vmarchaud
Contributor

vmarchaud commented Jun 1, 2017

Because you generally expect your process state after a reboot to be the same as when you started the process.
EDIT: especially when you inject environment variables with the CLI.

@dylancwood
Author

Thank you for the thoughtful replies everyone (especially @cam-itv). We will use an approach that is very similar to that of @cam-itv. Here are our files for reference. The main difference is that our pm2 manifest (ecosystem.json) is always in the same location, so there is nothing dynamic about the systemd unit file.

Regarding startOrGracefulReload, I believe that one can just call reload as of pm2 v2.

puppet/modules/common/service/templates/pm2-systemd.service.erb

[Unit]
Description=PM2 process manager
Documentation=https://pm2.keymetrics.io/
After=network.target

[Service]
User=app
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TimeoutStartSec=8
Environment=PATH=/usr/bin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
Environment=PM2_HOME=/home/app/.pm2
Restart=always
RestartSec=8

ExecStart=/usr/lib/node_modules/pm2/bin/pm2 reload <%= @pm2_manifest_path %> --no-daemon
ExecReload=/usr/lib/node_modules/pm2/bin/pm2 reload all
ExecStop=/usr/lib/node_modules/pm2/bin/pm2 kill

[Install]
WantedBy=multi-user.target

puppet/modules/common/service/templates/pm2_config.service.erb (pm2 manifest)

{
  "apps": [
    <% @service_cfg.each do |app_name, app_cfg| %>
    {
      "name"           : "<%= app_name %>",
<% unless app_cfg['interpreter'].nil? -%>
      "interpreter"    : "<%= app_cfg['interpreter'] %>",
<% end -%>
      "script"         : "<%= app_cfg['start_script'] %>",
<% unless app_cfg['args'].nil? -%>
      "args"           : "<%= app_cfg['args'] %>",
<% end -%>
      "merge_logs"     : true,
      "error_file"     : "<%= @log_directory %>/<%= @service_name %>/<%= app_name %>_pm2.log",
      "out_file"       : "<%= @log_directory %>/<%= @service_name %>/<%= app_name %>_pm2.log",
      "exec_mode"      : "cluster",
      "instances"      : <%= @instance_count %>,
      "wait_ready"     : true,
      "listen_timeout" : <% unless app_cfg['listen_timeout'].nil? -%>"<%= app_cfg['listen_timeout'] %>"<% else -%>15000<% end -%>,
      "kill_timeout"   : <% unless app_cfg['kill_timeout'].nil? -%>"<%= app_cfg['kill_timeout'] %>"<% else -%>60000<% end -%>,
    },
    <% end -%>
  ]
}

An example service_cfg hash would look like this:

$service_cfg = {
  app => {
    'start_script'   => '/services/services/start.js',
    'args'           => '-c /services/services/cfgs/build/mobile-api.json',
    'instance_count' => 2,
  },
  ops_broadcast => {
    'start_script'   => $broadcast_path,
    'interpreter'    => $sh_path,
    'args'           => '-c /services/services/cfgs/build/broadcast_ops_mobile-api.json',
    'instance_count' => 1,
  },
  request_broadcast => {
    'start_script'   => $broadcast_path,
    'interpreter'    => $sh_path,
    'args'           => '-c /services/services/cfgs/build/broadcast_req_mobile-api.json',
    'instance_count' => 1,
  },
  log_broadcast => {
    'start_script'   => $broadcast_path,
    'interpreter'    => $sh_path,
    'args'           => '-c /services/services/cfgs/build/broadcast_log_mobile-api.json',
    'instance_count' => 1,
  },
}
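
The hand-written manifest template above leaves a trailing comma after the last key and after the last app, which strict JSON parsers reject. An alternative is to build the manifest as a data structure and serialize it. The following is a minimal Ruby sketch under stated assumptions, not our production code. SERVICE_CFG mirrors only two entries of the hash above, app_entry and build_manifest are hypothetical helper names, and the paths standing in for $broadcast_path and $sh_path are placeholders. It also reads instance_count per app, which may differ from how @instance_count is set in the real template.

```ruby
require 'json'

# Hypothetical Ruby mirror of two entries from the Puppet $service_cfg hash.
SERVICE_CFG = {
  'app' => {
    'start_script'   => '/services/services/start.js',
    'args'           => '-c /services/services/cfgs/build/mobile-api.json',
    'instance_count' => 2
  },
  'ops_broadcast' => {
    'start_script'   => '/services/broadcast.sh', # placeholder for $broadcast_path
    'interpreter'    => '/bin/sh',                # placeholder for $sh_path
    'args'           => '-c /services/services/cfgs/build/broadcast_ops_mobile-api.json',
    'instance_count' => 1
  }
}.freeze

# Build one PM2 app entry. Defaults match the template (15000 and 60000 ms).
def app_entry(name, cfg, log_dir:, service_name:)
  log_path = File.join(log_dir, service_name, "#{name}_pm2.log")
  entry = {
    'name'           => name,
    'script'         => cfg.fetch('start_script'),
    'merge_logs'     => true,
    'error_file'     => log_path,
    'out_file'       => log_path,
    'exec_mode'      => 'cluster',
    'instances'      => cfg.fetch('instance_count', 1),
    'wait_ready'     => true,
    'listen_timeout' => cfg.fetch('listen_timeout', 15_000),
    'kill_timeout'   => cfg.fetch('kill_timeout', 60_000)
  }
  # Optional keys are only emitted when the app defines them.
  entry['interpreter'] = cfg['interpreter'] if cfg['interpreter']
  entry['args'] = cfg['args'] if cfg['args']
  entry
end

# Serialize the whole manifest; the generator never emits trailing commas.
def build_manifest(service_cfg, log_dir:, service_name:)
  apps = service_cfg.map do |name, cfg|
    app_entry(name, cfg, log_dir: log_dir, service_name: service_name)
  end
  JSON.pretty_generate({ 'apps' => apps })
end

puts build_manifest(SERVICE_CFG, log_dir: '/var/log', service_name: 'mobile-api')
```

Because the timeouts stay numeric whether or not an app overrides them, this also avoids the template's mix of quoted and unquoted values.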

@dylancwood
Author

Closing this issue since I've gotten enough useful feedback. Thanks again to everyone!

@soyuka: feel free to use our configs as examples as well. I'd also be happy to contribute a puppet + pm2 tutorial if you want. Ping me on slack if you're interested.

@vmarchaud
Contributor

Just adding that gracefulReload is deprecated in favor of reload

@ghost

ghost commented Jun 2, 2017

Ah ok, thanks for that - good to know 😄

alanorth added a commit to ilri/rmg-ansible-public that referenced this issue Sep 23, 2023
It seems my difficulty with start / stop was not misplaced. PM2
wants to save the state and this is problematic when managing the
starting and stopping via systemd. Running without the PM2 daemon
seems to work better.

Also, I am removing Type=forking, which will cause this to use the
simple type. Not sure about that, but it seems to be what people
are using with PM2 and systemd.

See: Unitech/pm2#2914