Proper way to upgrade salt-minions / salt-master packages without losing minion connectivity #7997

@shantanub

We went through the right way to do this in Salt training with Seth, but I think I'm still missing something, and I'm not sure whether this is a bug. I tried to follow the recommended sequence (upgrade the master first, then use Salt to upgrade the minions) to go from 0.17 to 0.17.1, and ended up losing access to most of my minions.

Long story short, I need a reliable way to upgrade all of the salt-minion and salt-master packages without losing access to the minions. From what I can tell, every time I perform such an upgrade I lose access to some if not all of my minions and need to log in to each host/VM and restart the salt-minion service. This is doable in test/dev, where we manage 30 nodes, but not once I move this infrastructure to prod, where I have over 200 nodes to manage. I need the upgrade path not to break the remote execution framework established between the minions and the master.

So without further ado here's what I did:

Update the master.

[root@salt-master ~]# yum list updates
Loaded plugins: security 
epel                                                                                                | 3.0 kB     00:00
epel/primary_db                                                                                     | 6.2 MB     00:00
epel-testing                                                                                        | 2.9 kB     00:00
epel-testing/primary_db                                                                             | 2.2 MB     00:00
rhel-localrepo                                                                                      | 3.0 kB     00:00
rhel-localrepo/primary_db                                                                           |  26 MB     00:00
Updated Packages         
glibc.x86_64                                         2.12-1.107.el6_4.5                                      rhel-localrepo
glibc-common.x86_64                                  2.12-1.107.el6_4.5                                      rhel-localrepo
glibc-devel.x86_64                                   2.12-1.107.el6_4.5                                      rhel-localrepo
glibc-headers.x86_64                                 2.12-1.107.el6_4.5                                      rhel-localrepo
java-1.6.0-openjdk.x86_64                            1:1.6.0.0-1.65.1.11.13.el6_4                            rhel-localrepo
kernel.x86_64                                        2.6.32-358.23.2.el6                                     rhel-localrepo
kernel-firmware.noarch                               2.6.32-358.23.2.el6                                     rhel-localrepo
kernel-headers.x86_64                                2.6.32-358.23.2.el6                                     rhel-localrepo
libtar.x86_64                                        1.2.11-17.el6_4.1                                       rhel-localrepo
nscd.x86_64                                          2.12-1.107.el6_4.5                                      rhel-localrepo
perf.x86_64                                          2.6.32-358.23.2.el6                                     rhel-localrepo
salt.noarch                                          0.17.1-1.el6                                            epel-testing
salt-master.noarch                                   0.17.1-1.el6                                            epel-testing
salt-minion.noarch                                   0.17.1-1.el6                                            epel-testing
setup.noarch                                         2.8.14-20.el6_4.1                                       rhel-localrepo
tzdata.noarch                                        2013g-1.el6                                             rhel-localrepo
tzdata-java.noarch                                   2013g-1.el6                                             rhel-localrepo
You have new mail in /var/spool/mail/root
[root@salt-master ~]# yum update -y

I restart the master and minion on my master VM.

[root@salt-master ~]# service salt-master restart
Stopping salt-master daemon:                               [  OK  ]
Starting salt-master daemon:                               [  OK  ]
[root@salt-master ~]# service salt-minion restart
Stopping salt-minion daemon:                               [  OK  ]
Starting salt-minion daemon:                               [  OK  ]

Try to upgrade some of my test minion VMs.

[root@salt-master ~]# salt 'salt-minion*' pkg.upgrade
[root@salt-master ~]# salt 'salt-minion*' pkg.list_upgrades

[root@salt-master ~]# salt -v 'salt-minion*' test.ping
Executing job with jid 20131021102016190263
-------------------------------------------

salt-minion-00:
    Minion did not return
salt-minion-01:
    Minion did not return
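
With 200 nodes it helps to pull the unresponsive minion IDs out of output like the above instead of reading it by eye. The following is a minimal sketch, assuming the text layout shown here (an ID line ending in a colon, followed by an indented result line); the function name `unresponsive_minions` is hypothetical and not part of Salt.

```shell
#!/bin/sh
# Hypothetical helper (not part of Salt): reads the text output of
# `salt -v <target> test.ping` on stdin and prints the ID of every
# minion whose result line is "Minion did not return".
unresponsive_minions() {
    awk '
        # An unindented line ending in ":" is a minion ID header.
        /^[^ \t].*:$/ { id = substr($0, 1, length($0) - 1); next }
        # An indented "did not return" line belongs to the last ID seen.
        /Minion did not return/ && id != "" { print id; id = "" }
    '
}
```

It could be used as `salt -v 'salt-minion*' test.ping | unresponsive_minions`, which would give a list of hosts that still need a manual restart.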

I log in to each minion VM and restart the salt-minion service.

[root@salt-minion-01 ~]# service salt-minion restart
Stopping salt-minion daemon:                               [FAILED]
Starting salt-minion daemon:                               [  OK  ]
[root@salt-minion-01 ~]# chkconfig --list | grep salt-minion
salt-minion     0:off   1:off   2:on    3:on    4:on    5:on    6:off

Now I can ping the VMs again.

[root@salt-master ~]# salt -v 'salt-minion*' test.ping
Executing job with jid 20131021102229314417
-------------------------------------------

salt-minion-01:
    True                 
salt-minion-00:
    True  

Version reports:

[root@salt-master ~]# salt --versions-report
           Salt: 0.17.1
         Python: 2.6.6 (r266:84292, May 27 2013, 05:35:12)
         Jinja2: 2.2.1
       M2Crypto: 0.20.2
 msgpack-python: 0.1.13
   msgpack-pure: Not Installed
       pycrypto: 2.0.1
         PyYAML: 3.10
          PyZMQ: 2.2.0.1
            ZMQ: 3.2.4

[root@salt-minion-00 ~]# salt-call --versions-report
           Salt: 0.17.1
         Python: 2.6.6 (r266:84292, May 27 2013, 05:35:12)
         Jinja2: 2.2.1
       M2Crypto: 0.20.2
 msgpack-python: 0.1.13
   msgpack-pure: Not Installed
       pycrypto: 2.0.1
         PyYAML: 3.10
          PyZMQ: 2.2.0.1
            ZMQ: 3.2.4

[root@salt-minion-01 ~]# salt-call --versions-report
           Salt: 0.17.1
         Python: 2.6.8 (unknown, Nov  7 2012, 14:47:45)
         Jinja2: unknown
       M2Crypto: 0.21.1
 msgpack-python: 0.1.12
   msgpack-pure: Not Installed
       pycrypto: 2.3
         PyYAML: 3.08
          PyZMQ: 2.1.9
            ZMQ: 2.2.0

You'll notice that the upgrade itself proceeded correctly: the packages were upgraded, but the salt-minion services were not restarted as part of the upgrade process on either minion VM (one is RHEL5 and the other is RHEL6). Unfortunately, I didn't think to run the package upgrade command in verbose mode at the time.

Do I need to find some external remote-execution method (mussh, omnitty, etc.) to restart all of the minions post-upgrade? This is probably not a bug, but it's still very frustrating. I'm unlikely to upgrade again until I can figure out how to do this properly.
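
One approach that may avoid an external tool is to have each minion schedule its own restart through `at`, so the restart runs outside the minion process that is being replaced. This is a sketch under assumptions, not a confirmed fix for this issue: it assumes the `at` daemon is installed and running on every minion, and the `restart_minions` wrapper and its `DRY_RUN` switch are hypothetical names added for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: ask each targeted minion to schedule its own
# restart one minute from now via `at`, so the job is not killed when
# the salt-minion process stops.
# Assumption: atd is installed and running on every target.
restart_minions() {
    target="$1"
    remote_cmd='echo "service salt-minion restart" | at now + 1 minute'
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it.
        printf "salt '%s' cmd.run '%s'\n" "$target" "$remote_cmd"
    else
        salt "$target" cmd.run "$remote_cmd"
    fi
}
```

Running `DRY_RUN=1 restart_minions 'salt-minion*'` prints the command without contacting any minion. The intended order would be `pkg.upgrade` first, then this call, then a `test.ping` a couple of minutes later.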

Labels: P1 (Priority 1), documentation (Relates to Salt documentation), help-wanted (Community help is needed to resolve this)
