Increase ProxyCommand performance, reduce CPU utilization #454

Closed
wants to merge 1 commit into
from

Projects

None yet

3 participants

@jwm
Contributor
jwm commented Dec 12, 2014

This should fix #420:

  • Passing the remaining time as the timeout argument to select() keeps it from blocking beyond the allowed time
  • Reading >1 bytes at a time lowers CPU consumption. AFAIK, read()'s length argument is the maximum number of bytes to return, and as long as there is at least one byte of data available, read() will return immediately, without blocking. Thus, since select() just ensured that data is available on that file descriptor, this shouldn't blow past the timeout.

I also moved to time.time() for the other performance reasons cited in #420; I'm not sure if this has a meaningful impact on Windows, since I think time.time() has millisecond resolution there, not microsecond.

@jwm jwm read in >1 byte chunks, and set a useful timeout on select()
this way, we're not rolling this loop over nearly as much, while still
preserving the overall timeout semantics
d02422b
@jwm
Contributor
jwm commented Dec 12, 2014

These changes made transferring a few mbytes of data over a ProxyCommand session go from an elapsed time of tens of minutes to under a minute.

@lndbrg
Contributor
lndbrg commented Dec 13, 2014

Nice, would be good to get someone to test on windows too.

@bitprophet
Member

@jwm - Thanks for this. Can you elaborate on why using the timeout arg to select is/is not better than the sleep approach seen in the older #414 patch?

The time.time and read changes seem like they could apply regardless, though.

@lndbrg I'm not actually sure this functionality works on Windows at all, given select claims it only works on sockets, not pipes, on Windows? Forget if we've explicitly confirmed this though. (TBH I tend to be happy letting Windows support slide until an interested Windows user is available to confirm brokenness/fixes.)

@bitprophet
Member

As per this comment on #414 I determined yup, this approach is noticeably more efficient than the sleep approach from #414. Either one is way better than before, but we might as well go with this one since it's nearly 10x faster.

@bitprophet bitprophet added a commit that closed this pull request Dec 18, 2014
@bitprophet bitprophet Changelog re #413, closes #454 741b4fb
@bitprophet
Member

Cherry-picked back to the 1.13 branch, then merged upwards. Thanks again!

@jwm
Contributor
jwm commented Dec 19, 2014

Rad. Thanks very much, Jeff. I love having changes merged so quickly. :-)

The select() approach is probably better for throughput too - when you enter a sleep, you're committed for sleeping for that entire duration. The select() timeout lets you wake up before the timeout expires when the system receives new socket data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment