Sleep for 0.1s while waiting for command output (fixes ProxyCommand CPU usage) #414

Closed
wants to merge 1 commit into master

Conversation

@jasondunsmore

jasondunsmore commented Oct 3, 2014

Avoid using 100% CPU by sleeping for 0.1 seconds while waiting for command
output in ProxyCommand.recv().
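
Roughly, the change amounts to the following (an illustrative sketch only, not the actual paramiko code; recv_sketch and its arguments are stand-ins):

```python
import select
import time

def recv_sketch(process, size, sleep_interval=0.1):
    """Poll the ProxyCommand subprocess's stdout, backing off briefly
    whenever no data is available instead of busy-looping."""
    chunks = []
    while len(chunks) < size:
        readable, _, _ = select.select([process.stdout], [], [], 0.0)
        if readable:
            b = process.stdout.read(1)   # ProxyCommand reads one byte at a time
            if not b:
                break                    # EOF: the proxy command exited
            chunks.append(b)
        else:
            time.sleep(sleep_interval)   # avoid spinning at 100% CPU
    return b"".join(chunks)
```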

@jasondunsmore

jasondunsmore commented Oct 3, 2014

This is a fix for #413.

@lndbrg

Contributor

lndbrg commented Oct 5, 2014

Could you add some comments in the code documenting why we are doing this?

@lndbrg lndbrg added the Bug label Oct 5, 2014

@jasondunsmore

jasondunsmore commented Oct 6, 2014

@lndbrg Done. Please see my latest patch.

@lndbrg lndbrg added the Reviewed label Oct 6, 2014

@bitprophet

Member

bitprophet commented Oct 7, 2014

Does it make sense to use the built-in IO sleep setting (paramiko.common.io_sleep) instead of a hardcoded value?

That setting is used for nearly identical purposes (polling) in an upstream lib (Fabric) as well as the agent module. When we set it up for Fabric, we found that the current default (0.01s, vs. your 0.1s) struck a decent balance between responsiveness and efficiency: 0.01s was a significant CPU savings over having no sleep at all, while 0.1s or so didn't bring CPU usage down much further and was noticeably less responsive.

At the very least it feels worth trying with 0.01s and comparing for this specific use case.
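
For reference, reusing that setting instead of a literal would look roughly like this (a sketch only; wait_for_data and its arguments are made up for illustration):

```python
import select
import time

from paramiko.common import io_sleep  # shared polling interval, 0.01s by default

def wait_for_data(fd, timeout=None):
    """Poll a file descriptor, sleeping io_sleep between checks rather
    than a hardcoded 0.1s. Illustrative helper only."""
    deadline = None if timeout is None else time.time() + timeout
    while True:
        readable, _, _ = select.select([fd], [], [], 0.0)
        if readable:
            return True
        if deadline is not None and time.time() >= deadline:
            return False
        time.sleep(io_sleep)
```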

@bitprophet bitprophet added this to the 1.15.2 et al milestone Oct 7, 2014

@lndbrg

Contributor

lndbrg commented Oct 7, 2014

It certainly makes sense to use that instead! Totally slipped my mind.

@lndbrg

Contributor

lndbrg commented Oct 7, 2014

@jasondunsmore could you patch it up according to @bitprophet's comment? :)

@jasondunsmore

jasondunsmore commented Oct 7, 2014

@lndbrg Yep. I have a patch. Just need to test it. Will submit it as soon as I'm done.

@jasondunsmore

jasondunsmore commented Oct 7, 2014

Yappi report with sleep time of 0.1:

Clock type: CPU
Ordered by: totaltime, desc

name                                  ncall  tsub      ttot      tavg      
..ket.py:333 Packetizer.read_message  6/5    0.000295  0.054750  0.009125
../packet.py:185 Packetizer.read_all  12/11  0.004959  0.054132  0.004511
..miko/proxy.py:74 ProxyCommand.recv  313..  0.018856  0.048483  0.000155
..:1450 Transport._send_user_message  4      0.000074  0.003174  0.000793
..nnel.py:1039 Channel._handle_close  1      0.000035  0.003031  0.003031
..rt.py:1447 Transport._send_message  4      0.000021  0.003029  0.000757
..ket.py:289 Packetizer.send_message  4      0.000163  0.003008  0.000752
..paramiko/packet.py:39 compute_hmac  10     0.002102  0.002995  0.000299
..ient.py:322 SSHClient.exec_command  1      0.000067  0.002396  0.002396
..port.py:590 Transport.open_session  1      0.000011  0.001712  0.001712
..port.py:657 Transport.open_channel  1      0.000153  0.001701  0.001701
..util.py:315 constant_time_bytes_eq  6      0.000543  0.000953  0.000159
[...]

Yappi report with sleep time of 0.01:

Clock type: CPU
Ordered by: totaltime, desc

name                                  ncall  tsub      ttot      tavg      
..ket.py:333 Packetizer.read_message  6/5    0.000253  0.177254  0.029542
../packet.py:185 Packetizer.read_all  12/11  0.004654  0.176926  0.014744
..miko/proxy.py:74 ProxyCommand.recv  307..  0.055892  0.171120  0.000557
..ient.py:322 SSHClient.exec_command  1      0.000063  0.002291  0.002291
..port.py:590 Transport.open_session  1      0.000012  0.001539  0.001539
..port.py:657 Transport.open_channel  1      0.000161  0.001526  0.001526
..:1450 Transport._send_user_message  4      0.000071  0.001124  0.000281
..rt.py:1447 Transport._send_message  4      0.000022  0.000983  0.000246
..ket.py:289 Packetizer.send_message  4      0.000144  0.000961  0.000240
..paramiko/packet.py:39 compute_hmac  10     0.000099  0.000913  0.000091
..nnel.py:1039 Channel._handle_close  1      0.000024  0.000798  0.000798
..util.py:315 constant_time_bytes_eq  6      0.000430  0.000783  0.000131
..py:420 Packetizer._check_keepalive  295    0.000652  0.000652  0.000002
..python2.7/hmac.py:27 HMAC.__init__  10     0.000304  0.000599  0.000060
..amiko/channel.py:51 Channel._check  1      0.000022  0.000547  0.000547
[...]
@@ -93,6 +95,11 @@ def recv(self, size):
# timeouts; this makes us act more like a real socket
# (where timeouts don't actually drop data.)
self.buffer.append(b)
# If we haven't received any data yet, wait 0.1

@bitprophet

bitprophet Oct 8, 2014

Member

Comment now disagrees with the actual value! Maybe just say "wait a short period of time" to avoid the problem entirely?

@jasondunsmore

jasondunsmore Oct 9, 2014

Fixed that. Thanks.

@bitprophet

Member

bitprophet commented Oct 8, 2014

Not familiar with Yappi, does that bear out my memory? Assuming 'yes' since you updated the PR? :) Left a brief note but otherwise this looks good, thanks!

Sleep for 0.1s while waiting for command output
Avoid using 100% CPU by sleeping for 0.01 seconds while waiting for command
output in ProxyCommand.recv().
@jasondunsmore

jasondunsmore commented Oct 9, 2014

@bitprophet Yep, the lower wait time seemed worth it: 0.17s CPU time with a 0.01s wait, vs. 0.05s CPU time with a 0.1s wait, vs. 30s CPU time with no wait at all.

@bitprophet bitprophet changed the title from Sleep for 0.1s while waiting for command output to Sleep for 0.1s while waiting for command output (fixes ProxyCommand CPU usage) Oct 15, 2014

@acdha

acdha commented Nov 7, 2014

This doesn't seem to be particularly effective in my testing: it went from 100% to 98% CPU usage, because it's essentially a select() plus single-byte read() loop. I think fixing that would require a blocking read with a timeout, so you could cap the maximum delay while reading considerably more than one byte when more data is available.
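
In other words, something along these lines (an illustrative sketch of the select-with-timeout idea, not the code from this PR or #454; recv_with_timeout and its arguments are placeholders):

```python
import os
import select
import socket

def recv_with_timeout(fd, size, timeout=10.0):
    """Block in select() for up to `timeout`, then read as many bytes as
    are available instead of one byte per iteration. Illustrative only."""
    buf = b""
    while len(buf) < size:
        readable, _, _ = select.select([fd], [], [], timeout)
        if not readable:
            raise socket.timeout()
        chunk = os.read(fd, size - len(buf))  # grab whatever is available, up to `size`
        if not chunk:
            break  # EOF: the proxy process went away
        buf += chunk
    return buf
```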

@bitprophet

Member

bitprophet commented Nov 12, 2014

I still need to integrate & test this on my end, but I'll probably ask you for a test patch if my testing bears out your assertion, @acdha :) It sounded like the overall "rub some sleep on it" approach paid big dividends for @jasondunsmore, though.

@bitprophet

Member

bitprophet commented Dec 16, 2014

@acdha's notes seem to have turned into #454, which is similar to this but uses the select + timeout option, plus a few other orthogonal changes which we should rub onto this one if we decide to go with the sleep option.

@bitprophet

Member

bitprophet commented Dec 18, 2014

Doing my own tests on this with both the sleep option & the timeout option, using the sample yappi-based snippet provided by @jasondunsmore in the description of #413.
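
That harness looks roughly like this (a hypothetical reconstruction, since the actual snippet lives in #413; hostnames and the command below are placeholders):

```python
import paramiko
import yappi

# Hypothetical reconstruction of the profiling harness; hostnames and the
# command are placeholders, not the actual test setup.
yappi.set_clock_type("cpu")
yappi.start()

proxy = paramiko.ProxyCommand("ssh bastion.example.com -W server.example.com:22")
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect("server.example.com", sock=proxy)
stdin, stdout, stderr = client.exec_command("sleep 30")
stdout.read()
client.close()

yappi.stop()
stats = yappi.get_func_stats()
stats.sort("ttot", "desc")
stats.print_all()
```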

Hitting one of my prod servers via my prod bastion system from my localhost, with no changes applied, bears out the "all 30 wallclock seconds are spent in recv" observation:

name                                  ncall  tsub      ttot      tavg    
..miko/proxy.py:73 ProxyCommand.recv  321..  13.68937  30.96013  0.096449
..ket.py:333 Packetizer.read_message  6/5    0.000225  30.36787  5.061312
../packet.py:185 Packetizer.read_all  12/11  0.000000  30.36775  2.530646
..ient.py:322 SSHClient.exec_command  1      0.000071  0.003540  0.003540

With this PR (#414) applied, using the 0.01s io_sleep value, I confirm Jason's results: recv is still at the top, but is only active for 0.46 seconds instead of 30ish:

name                                  ncall  tsub      ttot      tavg      
..miko/proxy.py:75 ProxyCommand.recv  303..  0.189383  0.464352  0.001533
..ket.py:333 Packetizer.read_message  6/5    0.000359  0.462989  0.077165
../packet.py:185 Packetizer.read_all  12/11  0.000000  0.462476  0.038540
..ient.py:322 SSHClient.exec_command  1      0.000074  0.003831  0.003831

With #454 applied (this includes not only the select timeout but also a more efficient timekeeping function call & reading >1 byte at a time), things are another order of magnitude faster, with recv only active for 0.057 seconds:

name                                  ncall  tsub      ttot      tavg    
..ket.py:333 Packetizer.read_message  6/5    0.000349  0.057388  0.009565
../packet.py:185 Packetizer.read_all  12/11  0.008463  0.055538  0.004628
..miko/proxy.py:74 ProxyCommand.recv  318..  0.022239  0.044484  0.000140
..ient.py:322 SSHClient.exec_command  1      0.000067  0.003249  0.003249

If I steal the orthogonal improvements from #454 and apply them to #414, we see an improvement, but not nearly as marked as with the select-timeout approach ("only" 0.36 seconds). This implies the select-timeout approach is significantly better than sleeping (and though we could potentially tune the sleep interval, I doubt it'd ever get that good):

name                                  ncall  tsub      ttot      tavg    
..ket.py:333 Packetizer.read_message  6/5    0.000381  0.363533  0.060589
../packet.py:185 Packetizer.read_all  12/11  0.000000  0.361403  0.030117
..miko/proxy.py:75 ProxyCommand.recv  300..  0.151985  0.360740  0.001202
..ient.py:322 SSHClient.exec_command  1      0.000089  0.004060  0.004060

I went back and ran the test a few more times for each setup to confirm things:

  • #414 by itself: 0.44, 0.46, 0.49
  • #414 w/ cherry-picked #454 changes: 0.36, 0.38, 0.40
  • #454: 0.06, 0.06, 0.06

So while either approach is a huge fix, at this point it looks like the select-timeout approach is the best, so I'll merge that (closing this) & give credit to all involved.

@bitprophet bitprophet closed this Dec 18, 2014
