-
Notifications
You must be signed in to change notification settings - Fork 22
Make send thread safe #53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make send thread safe #53
Conversation
|
i mentioned this over in an open issue about this point: #12 (comment) tldr; the client is fairly light weight, you can simply use a separate client for every thread |
Just curious then why m_buffer was made mutable and memory is reserved in construction, instead of just a local buffer variable. cpp-statsd-client/include/cpp-statsd-client/StatsdClient.hpp Lines 199 to 200 in 6efdb16
Would a nice middle-ground be a way to turn on thread safety if someone needs it? Or the alternative could be to move m_buffer to a local variable in |
|
everything was done with the idea of performance in mind. doing a reservation upfront and allowing us to modify it on send lets us reuse that chunk of memory and avoid de/reallocs. we want the client to be as fast as possible, since its basically like a logger but pushing logs over the network. no one wants a logger that they have to worry about perfwise everyone assumes its basically free. so yeah that, at least in my opinion, was a design goal |
|
yeah we totally could have opt-in thread safety. we could also just perf test this and see if modern compilers are smart enough to know when its use is in a single threaded context |
|
Alright, I did some benchmarking. TLDR; Using local buffer is much better than std::mutex, and does not add much performance overhead over a mutable class variable, while providing thread safety. This is the test script Here are the results: Test 2: no mutex, 10 threads (causes data race issues leading to stat parsing issues - Test 3: std::mutex, 1 thread Test 4: std::mutex, 10 threads (no data races, obviously) Test 5: Local string buffer, 1 thread Test 6: Local string buffer, 10 threads (no data races) |
|
And if you ended up instantiating StatsdClient every time you sent it (with mutable std::string m_buffer) it would be 10x slower |
|
Note that all of this testing was done with a Telegraf server running. |
| std::string buffer; | ||
| buffer.reserve(256); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i still cant believe this would be faster than keeping the buffer around! allocating once in the constructor and then using clear (which is O(1)) should always be faster than this.
maybe we could just make a static thread local buffer to achieve the same result and avoid the allocations/deallocations? something like:
| std::string buffer; | |
| buffer.reserve(256); | |
| // the thread keeps this buffer around and reuses it, clear should be O(1) and reserve should only have to do so the first time, after that its a no-op | |
| static thread_local std::string buffer; | |
| buffer.clear(); | |
| buffer.reserve(256); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It isn't though. Single threaded performance of allocating once in the constructor is 3673.82 ms, whereas local allocation is 4002.7 ms. And we can't trust the multi-threaded performance of allocating in the constructor because it does the wrong thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yeah, static thread_local is a lot better.
user@group:~$ g++ -o statsd_example statsd_example.cpp -I./cpp-statsd-client/ -lpthread && ./statsd_example 1 1000000
Total metrics sent: 1000000
Time taken: 2750.9 ms
Metrics sent per second: 363517
user@group:~$ g++ -o statsd_example statsd_example.cpp -I./cpp-statsd-client/ -lpthread && ./statsd_example 10 100000
Total metrics sent: 1000000
Time taken: 547.708 ms
Metrics sent per second: 1.82579e+06
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And we can't trust the multi-threaded performance of allocating in the constructor because it does the wrong thing.
Yeah, I wasn't concerned about multithreaded perf because testing thread-unsafe code with threads makes no sense 😄
|
@AadityaRavindran can you merge master into this branch |
kevinkreiser
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thanks for trying my suggestion!
|
Thank you for the quick responses! |
Running sending metrics through TSan reveals data races when we try to clear the buffer. Adding a mutex lock makes it thread safe.