Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

socket.settimeout() conusming a lot of cpu #116

Closed
oxtoacart opened this issue Feb 6, 2014 · 10 comments
Closed

socket.settimeout() conusming a lot of cpu #116

oxtoacart opened this issue Feb 6, 2014 · 10 comments
Assignees
Milestone

Comments

@oxtoacart
Copy link

I've been doing a little profiling of fteproxy and noticed that in my YouTube streaming test, fteproxy was spending nearly 18% of its time on calling 'socket.settimeout()'.

Looking at the code, I noticed that we have several of these calls on the critical path, such as this one and this one.

For fun, I tried increasing the default timeout from 0.01 to 0.1. This caused my cpu utilization to drop from 35% to 9% while playing a YouTube video.

I then implemented some logic to settimeout only if it's actually changed. That was good for another couple of percent, bumping me down to 7% cpu usage.

Using just the conditional setting trick puts me at about 20% cpu usage.

I don't know what other effects these changes have, but doing something about the excessive settimeout calls might be worth it.

@oxtoacart
Copy link
Author

Increasing the select_timeout in recvall_from_socket() helped too. With timeouts increased and the conditional settimeout logic, I'm down to around 5% cpu utilization.

@oxtoacart
Copy link
Author

Would be good to get independent approval of this, btw, as I'm a little surprised by the size of these gains.

kpdyer added a commit that referenced this issue Feb 7, 2014
@kpdyer
Copy link
Owner

kpdyer commented Feb 7, 2014

Your analysis is absolutely spot on. In retrospect it's really silly to call settimeout like that. Thanks for spotting that!

See my commit against this issue. Let me know how it compares to what you've implemented. Will keep this open until you confirm.

@kpdyer
Copy link
Owner

kpdyer commented Feb 7, 2014

Ooops. My first commit broke fteproxy on OSX. Try the latest, and let me know if that works for you!

@lanterndev
Copy link
Contributor

Just built successfully from 8fc0a74 on a Macbook Air running 10.9, and all tests pass with the following results:

> make test
testLTB (fte.tests.bit_ops.TestEncoders) ... ok
testLTB2 (fte.tests.bit_ops.TestEncoders) ... ok
testLTB3 (fte.tests.bit_ops.TestEncoders) ... ok
testRegexEncoderRequest (fte.tests.encoder.TestEncoders) ... ok
testEncryptDecryptOneBlock (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_1 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_2 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptNoOp (fte.tests.encrypter.TestEncoders) ... ok
testOneStream (fte.tests.relay.TestRelay) ... ok
testTenSerialStreams (fte.tests.relay.TestRelay) ... ok
testReclayer_basic (fte.tests.record_layer.TestEncoders) ... ok
testReclayer_concat (fte.tests.record_layer.TestEncoders) ... ok
testUnrankRank (fte.tests.dfa.TestDFA) ... ok
testMakeDFA (fte.tests.cDFA.TestcDFA) ... ok

----------------------------------------------------------------------
Ran 14 tests in 7.672s

OK

Testing formats in /Users/_pants/Code/fteproxy/fte/defs/20131224.json
 + SUCCESS, format_name="manual-http"
    - client-server regex: "^GET\ \/([a-zA-Z0-9\.\/]*) HTTP/1\.1\r\n\r\n$"
    - server-client regex: "^HTTP/1\.1\ 200 OK\r\nContent-Type:\ ([a-zA-Z0-9]+)\r\n\r\n\C*$"
    - test duration: 6.29 seconds
    - goodput: 81.37Mbps
 + SUCCESS, format_name="dummy"
    - client-server regex: "^\C+$"
    - server-client regex: "^\C+$"
    - test duration: 6.79 seconds
    - goodput: 75.35Mbps
 + SUCCESS, format_name="manual-smb"
    - client-server regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
    - server-client regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
    - test duration: 6.87 seconds
    - goodput: 74.54Mbps
 + SUCCESS, format_name="manual-ssh"
    - client-server regex: "^SSH\-2\.0\C*$"
    - server-client regex: "^SSH\-2\.0\C*$"
    - test duration: 7.25 seconds
    - goodput: 70.66Mbps
****** ALL TESTS COMPLETE, SUCCESS!!

When I revert to 9f1b9e7, the tests give these results:

testLTB (fte.tests.bit_ops.TestEncoders) ... ok
testLTB2 (fte.tests.bit_ops.TestEncoders) ... ok
testLTB3 (fte.tests.bit_ops.TestEncoders) ... ok
testRegexEncoderRequest (fte.tests.encoder.TestEncoders) ... ok
testEncryptDecryptOneBlock (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_1 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptDecrypt_2 (fte.tests.encrypter.TestEncoders) ... ok
testEncryptNoOp (fte.tests.encrypter.TestEncoders) ... ok
testOneStream (fte.tests.relay.TestRelay) ... ok
testTenSerialStreams (fte.tests.relay.TestRelay) ... ok
testReclayer_basic (fte.tests.record_layer.TestEncoders) ... ok
testReclayer_concat (fte.tests.record_layer.TestEncoders) ... ok
testUnrankRank (fte.tests.dfa.TestDFA) ... ok
testMakeDFA (fte.tests.cDFA.TestcDFA) ... ok

----------------------------------------------------------------------
Ran 14 tests in 7.311s

OK
Testing formats in /Users/_pants/Code/fteproxy/fte/defs/20131224.json
 + SUCCESS, format_name="manual-http"
    - client-server regex: "^GET\ \/([a-zA-Z0-9\.\/]*) HTTP/1\.1\r\n\r\n$"
    - server-client regex: "^HTTP/1\.1\ 200 OK\r\nContent-Type:\ ([a-zA-Z0-9]+)\r\n\r\n\C*$"
    - test duration: 1.34 seconds
    - goodput: 95.63Mbps
 + SUCCESS, format_name="dummy"
    - client-server regex: "^\C+$"
    - server-client regex: "^\C+$"
    - test duration: 1.37 seconds
    - goodput: 93.66Mbps
 + SUCCESS, format_name="manual-smb"
    - client-server regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
    - server-client regex: "^\x00\x00\x00\x7c\xFF\x53\x4d\x42[\x25\x72]\x00\x00\x00\x00\C{115}$"
    - test duration: 1.45 seconds
    - goodput: 88.23Mbps
 + SUCCESS, format_name="manual-ssh"
    - client-server regex: "^SSH\-2\.0\C*$"
    - server-client regex: "^SSH\-2\.0\C*$"
    - test duration: 1.33 seconds
    - goodput: 96.12Mbps
****** ALL TESTS COMPLETE, SUCCESS!!

So test durations have increased and goodput went down... maybe that's expected?

When I tested the integration with our FTEMain proxy before and after (by loading (and reloading) our customary heavy Wikipedia page), the new version performed much better, so that's the important thing. Looking better to you, @oxtoacart?

Thanks for the quick fixes on this, @kpdyer!

@oxtoacart
Copy link
Author

Yes, just pulled and tested with our FTEMain, and cpu utilization is near what I was seeing with my hacks.

@kpdyer - thanks for the quick turnaround!

kpdyer added a commit that referenced this issue Feb 8, 2014
@kpdyer
Copy link
Owner

kpdyer commented Feb 8, 2014

Actually, I think I have one more improvement in this space.

@Skivvies @oxtoacart Can you try 5eb380e with values of 0.01 or 0.001 for runtime.fte.relay.socket_timeout in fte/conf.py?

@kpdyer kpdyer reopened this Feb 8, 2014
@oxtoacart
Copy link
Author

@kpdyer This last change drops CPU utilization even further! From now on I'm leaving fteproxy bugs open, just to see what you'll come up with.

@kpdyer kpdyer added this to the 0.2.7 milestone Feb 8, 2014
@kpdyer kpdyer self-assigned this Feb 8, 2014
@kpdyer kpdyer modified the milestone: 0.2.7 Feb 8, 2014
@myleshorton
Copy link

Just wanted to chime in to say bravo on the great collaboration everyone. Does anyone say "bravo" these days? Somehow makes me feel like I should be retiring =). Good stuff.

@kpdyer
Copy link
Owner

kpdyer commented Feb 13, 2014

Going to set runtime.fte.relay.socket_timeout to 0.001, as that works best for me.

Going to close this ticket for now.

@kpdyer kpdyer closed this as completed Feb 13, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants