Minions apparently doing *way* too much (crypto) work #2359

Closed
madduck opened this Issue Oct 26, 2012 · 5 comments

Comments

Projects
None yet
4 participants
@madduck
Contributor

madduck commented Oct 26, 2012

One of the machines I am currently using to evaluate salt doesn't have much
Oomph. It can act as a OpenVPN gateway and sustain 50Mbps across the tunnel,
so it's really not that bad, but for instance, salt key generation on first
run takes on the order of 20 seconds.

Starting the minion reveals some interesting stuff:

# salt-minion --log-level=trace 2>&1 | ts
Oct 26 10:55:43 [INFO    ] Loaded configuration file: /etc/salt/minion
Oct 26 10:55:43 [WARNING ] Setting up the Salt Minion "clegg.lehel.madduck.net"
Oct 26 10:55:44 [DEBUG   ] Attempting to authenticate with the Salt Master at 77.109.139.93
Oct 26 10:55:44 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:55:45 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:55:45 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:55:47 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:55:47 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:55:49 [INFO    ] Authentication with master successful!
Oct 26 10:55:49 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:55:50 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:55:50 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:55:53 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:55:53 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:55:55 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:55:57 [INFO    ] Minion is starting as user 'root'
Oct 26 10:55:57 [DEBUG   ] Minion "clegg.lehel.madduck.net" trying to tune in
Oct 26 10:55:57 [DEBUG   ] Minion PUB socket URI: ipc:///var/run/salt/minion_event_31785418bf63d3d77308ed0266112dd2_pub.ipc
Oct 26 10:55:57 [DEBUG   ] Minion PULL socket URI: ipc:///var/run/salt/minion_event_31785418bf63d3d77308ed0266112dd2_pull.ipc

It seems like it's loading the minion key 7 times and decrypting the master
AES key 4 times, and each of those takes between 1 and 3 seconds, while the
minion process is at 100 % CPU load. Wouldn't loading/decrypting each key once
suffice?

This is not only the case during startup, a simple highstate call to this
machine takes more than a minute to complete (all it does is ensure absence of
a file and creation of another one, really just trivial):

# time su salt -s /bin/sh -c 'salt clegg\* state.highstate'
clegg.lehel.madduck.net:
----------
State: - file
Name:      /etc/motd.tail
Function:  absent
    Result:    True
    Comment:   File /etc/motd.tail is not present
    Changes:   
----------
State: - file
Name:      /etc/motd
Function:  managed
    Result:    True
    Comment:   File /etc/motd is in the correct state
    Changes:   


real    1m3.772s
user    0m0.252s
sys     0m0.092s

Here is the corresponding trace output

[now executing the salt job on the master]

Oct 26 10:56:07 [INFO    ] User salt Executing command state.highstate with jid 20121026105606943737
Oct 26 10:56:07 [DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20121026105606943737', 'tgt': 'cleg*', 'ret': '', 'user': 'salt', 'arg': [], 'fun': 'state.highstate'}
Oct 26 10:56:07 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:09 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:09 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:11 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:11 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:14 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:15 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:15 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:17 [INFO    ] User salt Executing command saltutil.find_job with jid 20121026105616967018
Oct 26 10:56:17 [DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20121026105616967018', 'tgt': 'cleg*', 'ret': '', 'user': 'salt', 'arg': ['20121026105606943737'], 'fun': 'saltutil.find_job'}
Oct 26 10:56:17 [INFO    ] Executing command 'ps -efH' in directory '/root'
Oct 26 10:56:17 [DEBUG   ] output: UID        PID  PPID  C STIME TTY          TIME CMD
[…]
Oct 26 10:56:17 [INFO    ] Returning information for job: 20121026105616967018
Oct 26 10:56:17 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:17 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:20 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:22 [INFO    ] Loading fresh modules for state activity
Oct 26 10:56:23 [INFO    ] Fetching file 'salt://top.sls'
Oct 26 10:56:23 [DEBUG   ] Jinja search path: '/var/cache/salt/files/base'
Oct 26 10:56:23 [DEBUG   ] Syncing all
Oct 26 10:56:23 [INFO    ] Creating module dir '/var/cache/salt/extmods/modules'
Oct 26 10:56:23 [INFO    ] Syncing modules for environment 'base'
Oct 26 10:56:23 [INFO    ] Loading cache from salt://_modules, for base)
Oct 26 10:56:23 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:23 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:23 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:25 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:25 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:27 [INFO    ] User salt Executing command saltutil.find_job with jid 20121026105627216015
Oct 26 10:56:27 [DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20121026105627216015', 'tgt': 'cleg*', 'ret': '', 'user': 'salt', 'arg': ['20121026105606943737'], 'fun': 'saltutil.find_job'}
Oct 26 10:56:27 [INFO    ] Executing command 'ps -efH' in directory '/root'
Oct 26 10:56:27 [DEBUG   ] output: UID        PID  PPID  C STIME TTY          TIME CMD
[…]
Oct 26 10:56:27 [INFO    ] Returning information for job: 20121026105627216015
Oct 26 10:56:28 [INFO    ] Caching directory '_modules' for environment 'base'
Oct 26 10:56:30 [DEBUG   ] Local cache dir: '/var/cache/salt/files/base/_modules'
Oct 26 10:56:30 [INFO    ] Creating module dir '/var/cache/salt/extmods/states'
Oct 26 10:56:30 [INFO    ] Syncing states for environment 'base'
Oct 26 10:56:30 [INFO    ] Loading cache from salt://_states, for base)
Oct 26 10:56:30 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:30 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:30 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:32 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:32 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:35 [INFO    ] Caching directory '_states' for environment 'base'
Oct 26 10:56:35 [DEBUG   ] Local cache dir: '/var/cache/salt/files/base/_states'
Oct 26 10:56:35 [INFO    ] Creating module dir '/var/cache/salt/extmods/grains'
Oct 26 10:56:35 [INFO    ] Syncing grains for environment 'base'
Oct 26 10:56:35 [INFO    ] Loading cache from salt://_grains, for base)
Oct 26 10:56:35 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:35 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:35 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:37 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:37 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:40 [INFO    ] Caching directory '_grains' for environment 'base'
Oct 26 10:56:40 [DEBUG   ] Local cache dir: '/var/cache/salt/files/base/_grains'
Oct 26 10:56:40 [INFO    ] Creating module dir '/var/cache/salt/extmods/renderers'
Oct 26 10:56:40 [INFO    ] Syncing renderers for environment 'base'
Oct 26 10:56:40 [INFO    ] Loading cache from salt://_renderers, for base)
Oct 26 10:56:40 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:41 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:41 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:43 [INFO    ] User salt Executing command saltutil.find_job with jid 20121026105642467539
Oct 26 10:56:43 [DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20121026105642467539', 'tgt': 'cleg*', 'ret': '', 'user': 'salt', 'arg': ['20121026105606943737'], 'fun': 'saltutil.find_job'}
Oct 26 10:56:43 [INFO    ] Executing command 'ps -efH' in directory '/root'
Oct 26 10:56:43 [DEBUG   ] output: UID        PID  PPID  C STIME TTY          TIME CMD
[…]
Oct 26 10:56:43 [INFO    ] Returning information for job: 20121026105642467539
Oct 26 10:56:43 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:43 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:46 [INFO    ] Caching directory '_renderers' for environment 'base'
Oct 26 10:56:46 [DEBUG   ] Local cache dir: '/var/cache/salt/files/base/_renderers'
Oct 26 10:56:46 [INFO    ] Creating module dir '/var/cache/salt/extmods/returners'
Oct 26 10:56:46 [INFO    ] Syncing returners for environment 'base'
Oct 26 10:56:46 [INFO    ] Loading cache from salt://_returners, for base)
Oct 26 10:56:46 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:47 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:47 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:49 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:49 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:52 [INFO    ] Caching directory '_returners' for environment 'base'
Oct 26 10:56:52 [DEBUG   ] Local cache dir: '/var/cache/salt/files/base/_returners'
Oct 26 10:56:52 [INFO    ] Loading fresh modules for state activity
Oct 26 10:56:52 [INFO    ] Fetching file 'salt://motd.sls'
Oct 26 10:56:52 [INFO    ] Fetching file 'salt://motd/init.sls'
Oct 26 10:56:52 [DEBUG   ] Jinja search path: '/var/cache/salt/files/base'
Oct 26 10:56:52 [INFO    ] Executing state file.absent for /etc/motd.tail
Oct 26 10:56:52 [INFO    ] No changes made for /etc/motd.tail
Oct 26 10:56:52 [INFO    ] Executing state file.managed for /etc/motd
Oct 26 10:56:52 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:53 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:53 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:55 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:55 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:58 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:56:59 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:56:59 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:57:01 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:57:01 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:57:03 [INFO    ] User salt Executing command saltutil.find_job with jid 20121026105703268745
Oct 26 10:57:03 [DEBUG   ] Command details {'tgt_type': 'glob', 'jid': '20121026105703268745', 'tgt': 'cleg*', 'ret': '', 'user': 'salt', 'arg': ['20121026105606943737'], 'fun': 'saltutil.find_job'}
Oct 26 10:57:03 [INFO    ] Executing command 'ps -efH' in directory '/root'
Oct 26 10:57:03 [DEBUG   ] output: UID        PID  PPID  C STIME TTY          TIME CMD
[…]
Oct 26 10:57:04 [INFO    ] Returning information for job: 20121026105703268745
Oct 26 10:57:04 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:57:05 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:57:05 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:57:07 [DEBUG   ] Decrypting the current master AES key
Oct 26 10:57:07 [DEBUG   ] Loaded minion key: /etc/salt/pki/minion.pem
Oct 26 10:57:10 [INFO    ] Fetching file 'salt://motd/tail.j2'
Oct 26 10:57:10 [DEBUG   ] Jinja search path: '/var/cache/salt/files/base'
Oct 26 10:57:10 [INFO    ] No changes made for /etc/motd
Oct 26 10:57:10 [INFO    ] Returning information for job: 20121026105606943737

During this run, it loaded the minion key 31 times, decrypted the master key
20 times, and ran /bin/ps 4 times, while it executed saltutil.find_job 4 times
as well.

So I think there is a whole lot of duplicate work being done, but foremost,
a lot of that work seems to be completely redundant crypto work.

The minion maintains a connection with the master. The really expensive step
is setting up that connection using public-key crypto. What follows should be
AES using a session key without the need to ever touch the keys on disk again
while the connection is up.

@thatch45

This comment has been minimized.

Show comment
Hide comment
@thatch45

thatch45 Oct 26, 2012

Member

Thanks, I will look into this

Member

thatch45 commented Oct 26, 2012

Thanks, I will look into this

@madduck

This comment has been minimized.

Show comment
Hide comment
@madduck

madduck Jun 10, 2013

Contributor

Any news for this issue?

Contributor

madduck commented Jun 10, 2013

Any news for this issue?

@JeremieCharest

This comment has been minimized.

Show comment
Hide comment
@JeremieCharest

JeremieCharest Jan 30, 2015

Contributor

I'm still able to reproduce it with 2014.7.1 even when forcing an agressive 1 minute keep-alive probe. I have a mine function schedule at every minute and it generated 3 decryptions for a single job.

I'm not sure, I just read code from the github, but I think that the minion key isn't cached (salt.crypto::get_keys). In my case it's was loaded 4 times and decrypted at least 3 without counting all the master key operations.

salt-minion -l debug (minion is installed on the salt-master)

[INFO    ] User xxx Executing command mine.get with jid 20150130184934263142
[DEBUG   ] Command details {'tgt_type': 'grain', 'jid': '20150130184934263142', 'tgt': 'roles:master', 'ret': '', 'user': 'xxx', 'arg': ['*', 'management.healthcheck'], 'fun': 'mine.get'}
[INFO    ] Starting a new job with PID 4452
[DEBUG   ] Decrypting the current master AES key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] Decrypting the current master AES key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[INFO    ] Returning information for job: 20150130184934263142
[DEBUG   ] Decrypting the current master AES key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
Contributor

JeremieCharest commented Jan 30, 2015

I'm still able to reproduce it with 2014.7.1 even when forcing an agressive 1 minute keep-alive probe. I have a mine function schedule at every minute and it generated 3 decryptions for a single job.

I'm not sure, I just read code from the github, but I think that the minion key isn't cached (salt.crypto::get_keys). In my case it's was loaded 4 times and decrypted at least 3 without counting all the master key operations.

salt-minion -l debug (minion is installed on the salt-master)

[INFO    ] User xxx Executing command mine.get with jid 20150130184934263142
[DEBUG   ] Command details {'tgt_type': 'grain', 'jid': '20150130184934263142', 'tgt': 'roles:master', 'ret': '', 'user': 'xxx', 'arg': ['*', 'management.healthcheck'], 'fun': 'mine.get'}
[INFO    ] Starting a new job with PID 4452
[DEBUG   ] Decrypting the current master AES key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[DEBUG   ] Decrypting the current master AES key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
[INFO    ] Returning information for job: 20150130184934263142
[DEBUG   ] Decrypting the current master AES key
[DEBUG   ] Loaded minion key: /etc/salt/pki/minion/minion.pem
@madduck

This comment has been minimized.

Show comment
Hide comment
@madduck

madduck Jan 30, 2015

Contributor

Yeah, as sad as it sounds, but in 2.5 years since I first looked at Salt and noted a few serious problems (e.g. also #2361 and #3436), I don't think the team has done anything in this direction. Salt is still NIH-ridden and especially its unreliable transport mechanism (incl. crypto) have made me look elsewhere… sadly, because I liked the community :/

Contributor

madduck commented Jan 30, 2015

Yeah, as sad as it sounds, but in 2.5 years since I first looked at Salt and noted a few serious problems (e.g. also #2361 and #3436), I don't think the team has done anything in this direction. Salt is still NIH-ridden and especially its unreliable transport mechanism (incl. crypto) have made me look elsewhere… sadly, because I liked the community :/

@stale

This comment has been minimized.

Show comment
Hide comment
@stale

stale bot Oct 2, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

stale bot commented Oct 2, 2017

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label Oct 2, 2017

@stale stale bot closed this Oct 9, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment